CoreDNS
是kubernetes的御用DNS组件,负责集群内域名解析,服务注册发现,外部域名转发等,基础组件中的核心组件,稍微有点抖动,感觉全世界的 RD 都会找上你。
1.官仓
CoreDNS is a DNS server that chains plugins
1.1.能做啥
- Serve zone data from a file; both DNSSEC (NSEC only) and DNS are supported (file and auto).
- Retrieve zone data from primaries, i.e., act as a secondary server (AXFR only) (secondary).
- Sign zone data on-the-fly (dnssec).
- Load balancing of responses (loadbalance).
- Allow for zone transfers, i.e., act as a primary server (file + transfer).
- Automatically load zone files from disk (auto).
- Caching of DNS responses (cache).
- Use etcd as a backend (replacing SkyDNS) (etcd).
- Use k8s (kubernetes) as a backend (kubernetes).
- Serve as a proxy to forward queries to some other (recursive) nameserver (forward).
- Provide metrics (by using Prometheus) (prometheus).
- Provide query (log) and error (errors) logging.
- Integrate with cloud providers (route53).
- Support the CH class:
version.bind
and friends (chaos). - Support the RFC 5001 DNS name server identifier (NSID) option (nsid).
- Profiling support (pprof).
- Rewrite queries (qtype, qclass and qname) (rewrite and template).
- Block ANY queries (any).
- Provide DNS64 IPv6 Translation (dns64).
- 更多功能,看插件…
大部分功能我们是用不到的,其中加粗的会用到
2.部署
2.1.配置文件
.:53 {
errors
health {
lameduck 15s
}
ready
kubernetes cluster.local. in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . 10.0.0.1 10.0.0.2 10.0.0.3 {
health_check 1s
max_fails 3
policy round_robin
}
cache 60
loop
reload
loadbalance
}
3.维护
3.1.压测
ipvsadm --set 900 120 10
./dnstrace -s 10.96.0.10 -t A -n 100000 -c 10 --rate-limit=2000 sre.wiki --color
4.插件
4.1.forward
forward facilitates proxying DNS messages to upstream resolvers.
- 到upstreams能复用sockets
- 支持 UDP,TCP,DNS-over-TLS
- 支持健康检测,解析error触发周期性检测,使用TO的协议,当达到max_fails就摘掉upstream;如果所有upstream都不健康,会随机挑选一个upstream(兜底)
- 异常响应包括(REFUSED, NOTIMPL, SERVFAIL),NXDOMAIN是正常响应!!!
- 到upstream支持负载均衡策略,默认为random,可调为round_robin
4.2.health
health enables a health check endpoint.
Enabled process wide health endpoint. When CoreDNS is up and running this returns a 200 OK HTTP status code. The health is exported, by default, on port 8080/health.
health :8080 {
lameduck 15s
}
Where lameduck will delay shutdown for DURATION. /health will still answer 200 OK. Note: The ready plugin will not answer OK while CoreDNS is in lame duck mode prior to shutdown.
- 提供健康检测功能,如果不健康到阈值,就会被杀重启,恢复工作
- lameduck,中文“跛脚鸭”,引入这个是为了处理coredns停止时流量,见:issues/199,具体值需要参考ready探针的配置,能够在超时之前,pod变为
not ready
即可,使k8s集群能够感知到负载要下线。
4.3.errors
errors enables error logging.
4.4.reload
reload allows automatic reload of a changed Corefile.
reload [INTERVAL] [JITTER]
In some environments (for example, Kubernetes), there may be many CoreDNS instances that started very near the same time and all share a common Corefile. To prevent these all from reloading at the same time, some jitter is added to the reload check interval. This is jitter from the perspective of multiple CoreDNS instances; each instance still checks on a regular interval, but all of these instances will have their reloads spread out across the jitter duration. This isn’t strictly necessary given that the reloads are graceful, and can be disabled by setting the jitter to 0s.
INTERVAL and JITTER are Golang durations. The default INTERVAL is 30s, default JITTER is 15s, the minimal value for INTERVAL is 2s, and for JITTER it is 1s. If JITTER is more than half of INTERVAL, it will be set to half of INTERVAL.
5.风险
5.1.滚动更新或者reload总是有少量解析失败
- 引入跛脚鸭,解析失败大幅减少
- ipvs调整udpTimeout,默认300s,改为10s。参考:阿里云CoreDNS升级
ipvs的默认会话保持策略会使UDP协议后端在摘除后五分钟内出现概率性丢包的问题。如果您业务依赖于CoreDNS,当CoreDNS组件升级或所在节点重启时,您可能会在五分钟内遇到业务接口延迟、请求超时等现象。
5.2.coredns默认的资源限制太低,会导致oomkilled
分配到独立节点,并且调整到合理的资源限制。
6.源码
配置文件
.:53 {
errors
health {
lameduck 15s
}
ready
kubernetes cluster.local. in-addr.arpa ip6.arpa {
pods insecure
fallthrough in-addr.arpa ip6.arpa
}
prometheus :9153
forward . 192.168.224.1
cache 30
loop
reload
loadbalance
}
参数解析
- lameduck 跛脚鸭,用作优雅关闭,服务最后的的部分流量
- ready 独立ready就绪状态接口
- health 暴露health接口,用于存活检测
- prometheus 暴露coredns指标
- forward 对于未匹配kubernetes关联zone的域名,forward到上游dns服务器解析 1.6.2之前还支持proxy参数,但是之后就完全不支持proxy参数了,需要注意!
- cache 缓存非kubernetes域名解析记录30秒,有点短,可以适当调整
- loop 防止循环解析
- reload 支持配置文件变更热加载,并非watch configmap资源,而是感知mount的配置文件变更,一般比较滞后
- loadbalance 如果forward后有多个地址,那么负载均衡,均摊流量
- kubernetes kubernetes专属配置,用作内部服务发现
- errors 记录错误日志
CoreDNS压测
在CoreDNS缩容的时候,会因为udp会话保持的问题,导致很少量的DNS请求失败,缩短udpTimeout即可
升级说明
- 从1.7.0版本起,CoreDNS的默认内存限制会被调整至2 GB,正常情况下不会出现Pod内存溢出OOM(Out of Memory)情况,无需对内存限制再次修改。
- 如果使用了
kube-proxy
IPVS
模式,IPVS
的会话保持策略会导致整个集群在升级完成后五分钟内出现概率性解析失败的问题。您可以按以下方式降低IPVS UDP类型的会话保持超时时间至10秒,以减少解析失败的次数。
1.18集群:
apiVersion: v1
data:
config.conf: |
apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
mode: ipvs
ipvs:
udpTimeout: 10s
设置udpTimeout: 10s
,其实没啥鸟用,缩容还是会丢 3-4req
每实例
yum install -y ipvsadm
ipvsadm -L --timeout
1.16集群:
ipvsadm --set 900 120 10
Comments