Deep dive into CoreDNS

A Comprehensive Look at CoreDNS in the Kubernetes Ecosystem

Introduction

CoreDNS is a flexible, extensible DNS server that is written in Go and can be used for service discovery, configuration management, and other networking tasks, offering advanced features and plugins for customization.

Miek Gieben created CoreDNS in 2016 as the successor to SkyDNS, an earlier DNS server he worked on, and built it on top of Caddy, a Go-based web server. CoreDNS inherits Caddy's user-friendly configuration syntax and flexible plugin architecture, and is written in Go. This combination has contributed to the widespread adoption of CoreDNS as a reliable DNS server.

Importance and role in Kubernetes: CoreDNS is a crucial component in Kubernetes that provides DNS-based service discovery, with plugins for further network customization. It reached general availability as a cluster DNS add-on in Kubernetes v1.11, when kubeadm began installing it by default, and it has been the default cluster DNS server since v1.13.

Comparison with other DNS servers

DNS Server | Features | Pros | Cons
CoreDNS | Modular, pluggable, and easy to configure. | Fast, reliable, and scalable. | Can be complex to set up and manage.
Kube-DNS# | Simple and easy to use. | Comes pre-installed with Kubernetes. | Not as fast or scalable as CoreDNS.
BIND | Powerful and feature-rich. | Well-tested and reliable. | Can be complex to set up and manage.
Knot | Scalable and high-performance. | Well-tested and reliable. | Not as easy to use as CoreDNS.
PowerDNS | Flexible and extensible. | Well-tested and reliable. | Not as fast or scalable as CoreDNS.

#: Kube-DNS is deprecated and has been replaced by CoreDNS as the default cluster DNS in current Kubernetes releases.

CoreDNS Architecture

CoreDNS has a simple configuration model and a plugin-based architecture, both inherited from Caddy, a web server written in Go. Plugins carry out the different tasks involved in answering a DNS query. They form a chain: when a query arrives, each plugin can answer it, pass it on untouched, or do some work on it and then hand it to the next plugin.
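
To make the chain idea concrete, here is a minimal sketch in Python. It is not how CoreDNS is implemented (CoreDNS is written in Go); the names Query, log_plugin, hosts_plugin, forward_plugin, and build_chain are hypothetical and exist only for this illustration. Each plugin receives the query plus a callable for the rest of the chain, and decides whether to answer or pass the query on.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Query:
    name: str
    qtype: str = "A"


# The rest of the chain, seen from one plugin: takes a query, returns an answer or None.
Next = Callable[[Query], Optional[str]]
Plugin = Callable[[Query, Next], Optional[str]]


def log_plugin(seen: List[str]) -> Plugin:
    """Does some work (records the name) and always passes the query on."""
    def handle(query: Query, next_handler: Next) -> Optional[str]:
        seen.append(query.name)
        return next_handler(query)
    return handle


def hosts_plugin(table: dict) -> Plugin:
    """Answers names it knows and passes everything else on."""
    def handle(query: Query, next_handler: Next) -> Optional[str]:
        if query.name in table:
            return table[query.name]
        return next_handler(query)
    return handle


def forward_plugin(query: Query, next_handler: Next) -> Optional[str]:
    """Last resort: pretends to forward the query upstream."""
    return f"forwarded:{query.name}"


def _bind(plugin: Plugin, next_handler: Next) -> Next:
    return lambda query: plugin(query, next_handler)


def build_chain(plugins: List[Plugin]) -> Next:
    """Link the plugins so that each one receives the remainder of the chain."""
    def end_of_chain(query: Query) -> Optional[str]:
        return None  # nobody answered

    next_handler: Next = end_of_chain
    for plugin in reversed(plugins):
        next_handler = _bind(plugin, next_handler)
    return next_handler


seen: List[str] = []
chain = build_chain([log_plugin(seen), hosts_plugin({"db.example.com.": "10.0.0.5"}), forward_plugin])
print(chain(Query("db.example.com.")))
print(chain(Query("other.org.")))
```

The first query is answered by the hosts-style plugin and never reaches the forwarder; the second falls through to the end of the chain. Both are recorded by the logging plugin because it sits first.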

The core components of CoreDNS are:

  1. Core server: This is the main component of CoreDNS which is responsible for starting up the plugins and handling DNS queries.

  2. Plugins: As we discussed above, CoreDNS is all about plugins. These are the components that are responsible for actually doing the work of resolving DNS queries. There are several different plugins available, each of which performs a different task.

  3. Corefile: This is a simple text file that lists the plugins that the CoreDNS server should use and how they should be configured.

The CoreDNS plugins can be divided into three main categories:

  1. Zone plugins: These plugins are responsible for looking up names in zones. Zones are lists of names and IP addresses.

  2. Forwarder plugins: These plugins are responsible for forwarding DNS queries to other DNS servers.

  3. Cache plugins: These plugins are responsible for caching DNS responses. This can improve the performance of the CoreDNS server by reducing the number of times it has to contact other DNS servers.

Corefile

The Corefile is the main configuration file for CoreDNS. A single CoreDNS setup can use several of them, but this is unusual and best avoided where possible. Each Corefile consists of one or more server blocks; each server block lists one or more plugins, and each plugin can be further configured with directives.

We will dive deeper into the Corefile with examples later in this article; first, let's look at the options it offers.

  1. Server Blocks: A server block defines the zone (or zones) a server is responsible for and, optionally, the port it listens on. If no port is given, CoreDNS uses port 53. Below is a sample with two different server blocks.

     . {
         errors
         health
         kubernetes cluster.local in-addr.arpa ip6.arpa {
             pods insecure
             fallthrough in-addr.arpa ip6.arpa
         }
         prometheus :9153
         forward . /etc/resolv.conf
         cache 30
         loop
         reload
         loadbalance
     }
    
     example.com {
         errors
         log
         file /etc/coredns/example.db {
             reload 60s
         }
     }
    

    a. The first server block handles all domains (.). This block is typical in a Kubernetes environment where CoreDNS is configured as the cluster's DNS server. It uses the errors, health, kubernetes, prometheus, forward, cache, loop, reload, and loadbalance plugins to handle various aspects of DNS resolution within the cluster.

    b. The second server block is specific to the example.com domain. The log plugin logs every query, and the file plugin serves the zone data from /etc/coredns/example.db. The reload 60s option of the file plugin makes CoreDNS check the zone file every 60 seconds and load it again when its SOA serial has changed.

  2. Plugins: Plugins extend CoreDNS's functionality. They are listed in each server block and can be further configured with directives. The order of plugins in the Corefile does not determine the order of execution; that order is fixed at build time by the plugin.cfg file. The CoreDNS team maintains this ordering for you unless you build a customized binary. Over time, plugins may be added, deprecated, or removed, and your CoreDNS configuration should reflect these changes. Many plugins can be used within each server block, and each has its own configuration syntax, mostly key-value pairs. The documentation of each plugin includes enough examples to get started.

  3. Directives: Directives configure plugins. The usual syntax is plugin_name configuration_options, optionally followed by a block of further options in braces, which controls the behaviour of the plugin. Below is a kubernetes plugin configured with a number of directives.

     # CoreDNS configuration using the Kubernetes plugin
     .:53 {
         kubernetes cluster.local {
             endpoint https://kubernetes.default.svc.cluster.local # Specify the remote k8s API endpoint
             tls /etc/coredns/tls.crt /etc/coredns/tls.key /etc/coredns/ca.crt # TLS cert, key, and CA cert for remote k8s connection
             kubeconfig /etc/coredns/kubeconfig # Authenticate connection to a remote k8s cluster using kubeconfig
             namespaces kube-system # Specify the namespaces to watch for Kubernetes resources
             labels app=web # Filter Kubernetes resources based on labels
             pods verified # Pod record mode: disabled, insecure, or verified
             endpoint_pod_names # Use the Pod name as the endpoint name in A records
             ttl 300 # Set TTL for DNS responses to 300 seconds
             noendpoints # Do not watch Endpoints; endpoint and headless Service queries return NXDOMAIN
             fallthrough in-addr.arpa ip6.arpa # Pass unanswered reverse-lookup queries to the next plugin
             ignore empty_service # Return NXDOMAIN for Services without any ready endpoints
         }
     }
    

    Refer to each plugin's documentation for the full list of options and directives it supports. Here's the documentation of the Kubernetes plugin.

  4. Environment Variables: CoreDNS supports environment variable substitution in its configuration, and variables can be used anywhere in the Corefile. The syntax is {$ENV_VAR} or {%ENV_VAR%}; the former is the standard form, while the latter is a Windows-like form that is also supported. For example, if you want to use an environment variable named "MY_VAR" with the value "example.com" in your Corefile, you can define it as follows:

     example.com {
         some_directive {$MY_VAR}
         other_directive some_value
     }
    

    In this example, the value of the "MY_VAR" environment variable will be substituted into the Corefile when CoreDNS starts, and the configuration will look like this

     example.com {
         some_directive example.com
         other_directive some_value
     }
    

    This allows you to customize CoreDNS configurations without hard-coding values in the Corefile. The variables come from the environment of the CoreDNS process: they can be set manually with the shell's export command, in shell startup files such as .bashrc or .bash_profile, or, in Kubernetes, through the env field of the CoreDNS container.

  5. Comments: Comments in a Corefile start with a #. The rest of the line is then treated as a comment.

  6. Protocol Configuration: CoreDNS accepts four different protocols: DNS, DNS over TLS (DoT), DNS over HTTPS (DoH, served over HTTP/2), and DNS over gRPC. The protocol is selected by prefixing the zone name of a server block with a scheme. Below is a basic example; configure further options per your requirements.

     # dns:// (or no scheme) serves plain DNS, tls:// DoT, https:// DoH, and grpc:// DNS over gRPC
     https://example.com {
         tls /etc/coredns/tls.crt /etc/coredns/tls.key # Certificate and key for the HTTPS listener (example paths)
         whoami
     }
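
The environment variable substitution described in item 4 can be sketched in a few lines of Python. This is only an illustration of the idea, not CoreDNS's own code: the function name expand_env is hypothetical, and replacing an unset variable with an empty string is an assumption made for this sketch.

```python
import re

# Matches both documented forms: {$NAME} and the Windows-like {%NAME%}.
_ENV_PATTERN = re.compile(r"\{\$(\w+)\}|\{%(\w+)%\}")


def expand_env(corefile_text: str, env: dict) -> str:
    """Return the Corefile text with environment placeholders replaced."""
    def replace(match: re.Match) -> str:
        name = match.group(1) or match.group(2)
        # Assumption for this sketch: an unset variable expands to "".
        return env.get(name, "")

    return _ENV_PATTERN.sub(replace, corefile_text)


corefile = """example.com {
    some_directive {$MY_VAR}
    other_directive {%MY_VAR%}
}"""
print(expand_env(corefile, {"MY_VAR": "example.com"}))
```

Passing the environment as a plain dictionary keeps the function easy to test; in a real process you would pass os.environ instead.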
    

CoreDNS Performance

Performance metrics

Every system in distributed computing should be monitored for performance bottlenecks, and CoreDNS is no exception. Monitoring CoreDNS and understanding its performance metrics are essential for operations teams to ensure its optimal operation. Several metrics are available that provide insights into its health, efficiency, and potential issues. CoreDNS ships a metrics plugin, enabled with the prometheus directive, that exposes Prometheus metrics, by default on localhost:9153 under /metrics. Once it is enabled, Prometheus can scrape and collect the following metrics:

  1. coredns_build_info{version, revision, goversion}: This metric provides information about CoreDNS itself, including its version, revision, and Go version.

  2. coredns_panics_total{}: This metric represents the total number of panics that occurred in CoreDNS. Panics are unexpected runtime errors that can affect the stability and availability of the service.

  3. coredns_dns_requests_total{server, zone, view, proto, family, type}: This metric counts the total number of DNS queries processed by CoreDNS. It helps to monitor the overall query load on the DNS server.

  4. coredns_dns_request_duration_seconds{server, zone, view, type}: This metric measures the time taken by CoreDNS to process each DNS query. It provides insights into the query processing time and helps identify potential performance bottlenecks.

  5. coredns_dns_request_size_bytes{server, zone, view, proto}: This metric indicates the size of the DNS request in bytes. It helps to understand the request payload and its impact on server resources.

  6. coredns_dns_do_requests_total{server, view, zone}: This metric tracks the number of DNS queries that have the DO (DNSSEC OK) bit set. DNSSEC is a security extension for DNS, and monitoring DO queries can help assess DNSSEC adoption.

  7. coredns_dns_response_size_bytes{server, zone, view, proto}: This metric measures the size of the DNS response in bytes. It gives insights into the response payload and its impact on network traffic.

  8. coredns_dns_responses_total{server, zone, view, rcode, plugin}: This metric records the total number of DNS responses sent by CoreDNS, categorized by response code and plugin used. It helps to identify DNS errors and their frequency.

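  9. coredns_dns_https_responses_total{server, status}: This metric tracks the number of responses per server and HTTP status code for DNS over HTTPS (DoH) requests.

Here's some simplified sample data for these metrics. The values are illustrative only; the duration and size metrics are histograms, so a real scrape exposes them as _bucket, _sum, and _count series rather than a single value.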

coredns_build_info{version="1.8.3", revision="abcdef", goversion="go1.17"} 1
coredns_panics_total{} 0
coredns_dns_requests_total{server="dns1", zone="example.com", view="internal", proto="udp", family="1", type="A"} 150
coredns_dns_request_duration_seconds{server="dns1", zone="example.com", view="internal", type="A"} 0.023
coredns_dns_request_size_bytes{server="dns1", zone="example.com", view="internal", proto="udp"} 85
coredns_dns_do_requests_total{server="dns1", view="internal", zone="example.com"} 25
coredns_dns_response_size_bytes{server="dns1", zone="example.com", view="internal", proto="udp"} 145
coredns_dns_responses_total{server="dns1", zone="example.com", view="internal", rcode="NOERROR", plugin="cache"} 125
coredns_dns_https_responses_total{server="dns1", status="200"} 50

We can send this data to visualization tools like Grafana for continuous monitoring and configure alerting services based on the metrics as well.
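
Before wiring up Grafana, it can help to see how such samples turn into a number you would actually watch. The sketch below is a deliberately small parser for the label/value lines shown above, written for illustration only (it is not a full parser for the Prometheus text format, and the helper names parse_metrics and total are hypothetical). It computes the share of responses written by the cache plugin from two of the sample lines.

```python
import re

SAMPLE = """\
coredns_dns_requests_total{server="dns1", zone="example.com", view="internal", proto="udp", family="1", type="A"} 150
coredns_dns_responses_total{server="dns1", zone="example.com", view="internal", rcode="NOERROR", plugin="cache"} 125
coredns_panics_total{} 0
"""

# metric_name{label="value", ...} number
_LINE = re.compile(r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def parse_metrics(text):
    """Return a list of (name, labels, value) tuples, skipping comments and blanks."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = _LINE.match(line)
        if match is None:
            continue
        labels = dict(_LABEL.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


def total(samples, name, **wanted):
    """Sum all samples of a metric whose labels match the given key/value pairs."""
    return sum(
        value
        for metric, labels, value in samples
        if metric == name and all(labels.get(k) == v for k, v in wanted.items())
    )


samples = parse_metrics(SAMPLE)
requests = total(samples, "coredns_dns_requests_total")
from_cache = total(samples, "coredns_dns_responses_total", plugin="cache")
print(f"{from_cache / requests:.0%} of responses were written by the cache plugin")
```

In a real setup you would express the same ratio as a query in Prometheus over a time window rather than parsing the text yourself; the point here is only to show what the labels let you slice by.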

Benchmarking and comparison with other DNS servers

While there are commercial tools like Sysdig that provide detailed benchmark testing options, I found, through a pull request, an open-source project that might be used for benchmark tests. Honestly, I haven't tested it myself, but some benchmark tests and results are published using Bencher.

Performance optimization techniques

To make CoreDNS work better and faster, try these performance improvement methods.

Autopath Plugin: Enabling the autopath plugin in CoreDNS can mitigate Kubernetes' ndots:5 issue (summarized in the Miscellaneous section), in which a single lookup from a Pod fans out into several queries as the resolver walks its search list. With autopath, CoreDNS follows the search path on the server side and returns the final answer in one round trip, which improves DNS resolution performance for Pods. The trade-off is memory: when used with the kubernetes plugin, autopath requires pods verified mode, so CoreDNS has to watch every Pod, which matters in large-scale clusters. Here's an example of how the autopath plugin can be used in the CoreDNS Corefile.

autopath example.com /path/to/resolv.conf

In the above example, the autopath plugin is configured to be authoritative for the zone "example.com". The "/path/to/resolv.conf" argument points to a resolv.conf-like file, or uses a special syntax to point to another plugin. For instance, using "@kubernetes" in place of the file path calls out to the kubernetes plugin (for each query) to retrieve the search list it should use.
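
To see why the search list matters, here is a small Python sketch of how a stub resolver expands a name when ndots is set. It models the usual resolver rule (names with fewer dots than ndots are tried against the search domains first) and is not code from CoreDNS or Kubernetes; the function name candidate_names and the example search list for a Pod in the default namespace are assumptions for illustration.

```python
def candidate_names(name, search, ndots=5):
    """Return the fully qualified names a resolver would try, in order."""
    if name.endswith("."):
        return [name]  # already fully qualified: no search list expansion

    expanded = [f"{name}.{domain}." for domain in search]
    absolute = name + "."
    if name.count(".") >= ndots:
        return [absolute] + expanded  # enough dots: try the name as-is first
    return expanded + [absolute]      # too few dots: walk the search list first


# Typical search list of a Pod in the "default" namespace (illustrative).
search = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for candidate in candidate_names("example.com", search):
    print(candidate)
```

With ndots:5, the external name example.com (one dot) is only tried as-is after three cluster-internal candidates have failed, and each candidate is typically queried for both A and AAAA records. This is the fan-out that autopath collapses into a single exchange.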

Proper Resource Allocation: Ensure that CoreDNS is allocated with sufficient resources, including memory and CPU, to handle the DNS resolution requests effectively. In Kubernetes, you can configure resource requests and limits for CoreDNS pods to avoid resource contention and improve overall performance. Here's the sample YAML file for CoreDNS with proper resource allocations

apiVersion: apps/v1
kind: Deployment
metadata:
  name: coredns-deployment
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels:
      k8s-app: coredns
  template:
    metadata:
      labels:
        k8s-app: coredns
    spec:
      serviceAccountName: coredns
      containers:
        - name: coredns
          image: coredns/coredns:latest
          resources:
            limits:
              cpu: "200m"   # Limit the CPU usage to 200 milliCPU (0.2 CPU cores)
              memory: "200Mi"  # Limit the memory usage to 200 MiB
            requests:
              cpu: "100m"   # Request 100 milliCPU (0.1 CPU cores)
              memory: "100Mi"  # Request 100 MiB
          args:
            - "-conf"
            - "/etc/coredns/Corefile"
          volumeMounts:
            - name: config-volume
              mountPath: /etc/coredns
      volumes:
        - name: config-volume
          configMap:
            name: coredns-configmap
            items:
              - key: Corefile
                path: Corefile

In this example, the CPU limit is set to 200 milliCPU (0.2 CPU cores), and the CPU request is set to 100 milliCPU (0.1 CPU cores). The memory limit is set to 200 MiB, and the memory request is set to 100 MiB. Although I have specified CPU limits above, setting them is widely considered an anti-pattern, and I concur with Natan's views in the articles linked here and here.

CoreDNS Configuration: Fine-tune CoreDNS configuration to meet the specific requirements of your Kubernetes cluster. Pay attention to DNS caching, cache sizes, and negative caching settings, which can have a significant impact on performance. Below is a code example demonstrating how to use the cache plugin in CoreDNS configuration to adjust caching parameters

apiVersion: v1
kind: ConfigMap
metadata:
  name: coredns-configmap
  namespace: kube-system
data:
  Corefile: |
    .:53 {
        errors
        health {
          lameduck 5s
        }
        ready
        cache 3600 {
          success 4096
          denial 512
          prefetch 10 1h
          serve_stale 24h
          servfail 5s
        }
        loop
        forward . /etc/resolv.conf
        reload
        loadbalance
    }

In the above ConfigMap, the cache plugin options used are:

  • cache 3600: Cache responses for at most 3600 seconds (1 hour); records with a longer TTL are capped at this value.

  • success 4096: Set the capacity of the cache for successful DNS responses to 4096 items.

  • denial 512: Set the capacity of the cache for negative (NXDOMAIN and NODATA) DNS responses to 512 items.

  • prefetch 10 1h: Refresh popular entries before they expire. An entry counts as popular once it has been queried 10 times with no gap of 1 hour or more between queries.

  • serve_stale 24h: Serve expired entries for up to 24 hours if the backend (upstream) DNS server is unavailable.

  • servfail 5s: Cache SERVFAIL (server failure) responses for 5 seconds to limit their impact on DNS resolution.

Warning: The cache plugin can only be used once per server block in CoreDNS.

CoreDNS Version: Use the latest stable version of CoreDNS, available from the project's releases page, to take advantage of performance improvements and bug fixes. Keeping CoreDNS up to date ensures that you benefit from the latest optimizations and enhancements.

Here's an example of how you can update the CoreDNS deployment to the latest stable version:

  • First, check the current version of CoreDNS running in your cluster: kubectl get deployment coredns -n kube-system -o=jsonpath='{.spec.template.spec.containers[0].image}'

  • Now, check the CoreDNS GitHub releases page and find the latest stable version, which was v1.10.1 as of the date of writing this article.

  • Update the CoreDNS deployment by replacing the image tag in the deployment configuration with the latest version. Here's an example of how the updated CoreDNS deployment configuration might look:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: coredns
        namespace: kube-system
      spec:
        replicas: 2
        selector:
          matchLabels:
            k8s-app: kube-dns
        template:
          metadata:
            labels:
              k8s-app: kube-dns
          spec:
            containers:
            - name: coredns
              image: coredns/coredns:v1.10.1  # Update this line with the latest version
              resources:
                limits:
                  memory: 170Mi
                requests:
                  cpu: 100m
                  memory: 70Mi
              args:
              - -conf
              - /etc/coredns/Corefile
              volumeMounts:
              - name: config-volume
                mountPath: /etc/coredns
                readOnly: true
            volumes:
            - name: config-volume
              configMap:
                name: coredns
                items:
                - key: Corefile
                  path: Corefile
    
  • Apply the updated CoreDNS deployment configuration to your cluster: kubectl apply -f coredns-deployment.yaml

Monitor CoreDNS Metrics: Regularly monitor CoreDNS metrics to identify any performance bottlenecks or issues. By tracking metrics such as response times, query rates, cache hit rates, and memory usage, you can proactively address performance concerns. We discussed these metrics, and how to visualize them with tools like Grafana, above.

Network Performance: Optimize the network setup to reduce latency and improve DNS resolution times, and ensure that CoreDNS can communicate efficiently with other nodes and DNS servers. One way to do this in a Kubernetes cluster is NodeLocal DNSCache, which runs a DNS caching agent on each cluster node. Caching results on the node reduces the average DNS lookup time and can significantly improve the cluster's DNS resolution performance. Here's an example of the configuration involved:

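  • Create a nodelocaldns.yaml manifest file with the following content. Note that this ConfigMap only holds the CoreDNS configuration; the NodeLocal DNSCache agent itself runs as a DaemonSet, for which the Kubernetes repository ships a manifest under cluster/addons/dns/nodelocaldns.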

      apiVersion: v1
      kind: ConfigMap
      metadata:
        name: coredns-custom-config
        namespace: kube-system
      data:
        Corefile: |
          .:53 {
              errors
              health {
                  lameduck 5s
              }
              ready
              kubernetes cluster.local in-addr.arpa ip6.arpa {
                  pods insecure
                  fallthrough in-addr.arpa ip6.arpa
                  ttl 30
              }
              prometheus :9153
              forward . /etc/resolv.conf {
                  prefer_udp
              }
              cache 30
              loop
              reload
              loadbalance
          }
    
  • Apply the nodelocaldns.yaml manifest to create the coredns-custom-config ConfigMap: kubectl apply -f nodelocaldns.yaml

  • Update the CoreDNS deployment to use the custom configuration by pointing its config-volume at the coredns-custom-config ConfigMap: kubectl patch deployment coredns -n kube-system -p '{"spec":{"template":{"spec":{"volumes":[{"name":"config-volume","configMap":{"name":"coredns-custom-config","items":[{"key":"Corefile","path":"Corefile"}]}}]}}}}'

  • As the last step, customize the DNS settings of the workloads to use the NodeLocal DNSCache address. Edit the Deployment or Pod manifest for your workload and add the dnsConfig section:

      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: my-app
        labels:
          app: my-app
      spec:
        replicas: 3
        selector:
          matchLabels:
            app: my-app
        template:
          metadata:
            labels:
              app: my-app
          spec:
            dnsConfig:
              nameservers:
              - <node-local-address-of-NodeLocal-DNSCache> # Replace with the link-local IP address that NodeLocal DNSCache listens on (169.254.20.10 in the Kubernetes documentation's examples)
              options:
              - name: ndots
                value: "5"
            containers:
            - name: my-app-container
              image: my-app-image:latest
              # ... rest of your container configuration
    

DNS Queries Optimization: In cases where DNS resolution timeouts and failures occur, consider optimizing DNS queries submitted by clients to reduce resolution latency. Properly configure container images, node operating systems, and NodeLocal DNSCache to mitigate DNS resolution errors.

Scaling CoreDNS Resources: In larger Kubernetes clusters, scaling CoreDNS resources, such as memory and CPU, can be beneficial to handle increased DNS resolution demands. Adjusting resource requirements based on the number of Pods and Services can help ensure optimal performance
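
As a rough starting point for sizing, the CoreDNS deployment repository publishes a rule of thumb that ties memory to the number of Pods and Services. The sketch below encodes that rule as I understand it; the constants were derived for older CoreDNS releases with default settings, so treat them as assumptions and re-measure against your own version and workload. The function name estimate_coredns_memory_mb is hypothetical.

```python
def estimate_coredns_memory_mb(pods: int, services: int, autopath: bool = False) -> float:
    """Rule-of-thumb memory estimate (in MB) for one CoreDNS instance.

    Assumed formulas, taken from the CoreDNS scaling guidance:
      default settings: (pods + services) / 1000 + 54
      with autopath:    (pods + services) / 250  + 56
    """
    objects = pods + services
    if autopath:
        return objects / 250 + 56
    return objects / 1000 + 54


# Example: a cluster with 5000 Pods and 1000 Services.
print(estimate_coredns_memory_mb(5000, 1000))
print(estimate_coredns_memory_mb(5000, 1000, autopath=True))
```

The gap between the two results reflects the extra state CoreDNS keeps when autopath is enabled; use the estimate to pick an initial memory request, then adjust it based on observed usage.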

CoreDNS Troubleshooting

Common issues and solutions

  • Intermittent Delays or Slow DNS Resolution: Users may experience intermittent delays or slow DNS resolution, resulting in increased response times for applications. To troubleshoot and diagnose the problem, we can use the dnsutils Pod as a test environment. Then, verify the status and performance of CoreDNS using commands like kubectl exec and checking the Corefile configuration.

    • Here's a sample code to troubleshoot DNS issues with CoreDNS using the dnsutils Pod as a test environment and checking the Corefile configuration. Create the YAML manifest file for the dnsutils Pod

        apiVersion: v1
        kind: Pod
        metadata:
          name: dnsutils
          namespace: default
        spec:
          containers:
          - name: dnsutils
            image: registry.k8s.io/e2e-test-images/jessie-dnsutils:1.3
            command:
            - sleep
            - "infinity"
            imagePullPolicy: IfNotPresent
          restartPolicy: Always
      
    • Apply the YAML manifest to create the dnsutils Pod: kubectl apply -f dnsutils.yaml

    • Check the status of the dnsutils Pod: kubectl get pods dnsutils

    • Once the dnsutils Pod is running, execute nslookup inside the Pod to verify DNS resolution: kubectl exec -ti dnsutils -- nslookup <domain_name_to_resolve>. Alternatively, open a shell in the Pod with kubectl exec -ti dnsutils -- sh and then run nslookup <domain_name_to_resolve>. Both options give the same result.

  • DNS Server Unavailability: The DNS server (CoreDNS) becomes unavailable, causing DNS resolution failures for the cluster. Investigate the logs and events related to CoreDNS to identify the root cause of its unavailability. Ensure that the CoreDNS Deployment has sufficient resources and is not experiencing crashes or restart loops.

  • DNS Configuration Conflicts: DNS configuration conflicts or misconfigurations may lead to DNS queries being forwarded incorrectly or not at all. Check the CoreDNS ConfigMap and verify the forwarding configuration. Ensure that CoreDNS is forwarding queries to the appropriate upstream DNS server, and there are no errors in the configuration.

  • DNS Caching and TTL Issues: DNS caching may cause outdated or stale DNS records to persist, leading to potential connectivity problems. Review the CoreDNS caching settings and ensure that the TTL (Time-to-Live) values are appropriately set. Lowering the TTL can reduce the time DNS records remain cached, promoting faster updates.

  • CoreDNS Resource Constraints: CoreDNS may experience resource constraints, leading to performance degradation or unresponsiveness. Monitor the resource usage of CoreDNS pods and ensure that it has sufficient CPU and memory resources to handle the DNS queries of the cluster.

  • Pod DNS Configuration Errors: Misconfigurations in Pod DNS settings may cause DNS resolution failures for individual workloads. Check the DNS settings in the Pod or Deployment manifests. Ensure that the correct nameservers are specified and that the Pod can access the DNS service.

  • DNS Security and Access Control: Unauthorized access to CoreDNS or the DNS service can lead to DNS hijacking or other malicious activity. Network Policies define rules that control how Pods are allowed to communicate with each other within the cluster, so you can use them to restrict which workloads may reach CoreDNS. Here's a sample Network Policy to restrict access to CoreDNS:

    • Create a YAML manifest file for the Network Policy

        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        metadata:
          name: coredns-policy
          namespace: kube-system
        spec:
          podSelector:
            matchLabels:
              k8s-app: kube-dns
          policyTypes:
          - Ingress
          ingress:
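          - from:
            - namespaceSelector: {}   # Match Pods in any namespace, not only kube-system
              podSelector:
                matchLabels:
                  app: your-app-label   # Replace 'your-app-label' with the label of your allowed application
            ports:
            - protocol: UDP
              port: 53
            - protocol: TCP   # DNS also uses TCP, for example for large responses
              port: 53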
      
    • Apply the Network Policy to restrict access to CoreDNS: kubectl apply -f coredns-policy.yaml

    • This Network Policy restricts DNS access to the CoreDNS Pods in the kube-system namespace to Pods labeled app: your-app-label. Adjust your-app-label to the label of your allowed application. When applying Network Policies, test and validate them first to avoid accidentally blocking legitimate traffic.

Debugging tools and techniques

To debug CoreDNS issues in a Kubernetes cluster, you can utilize various tools and techniques to identify and troubleshoot potential problems. Here are some commonly used debugging tools and techniques for CoreDNS

  1. kubectl commands: Use kubectl commands to interact with the Kubernetes cluster and get information about the CoreDNS Pods and their status. For example:

    • Get the list of running Pods: kubectl get pods -n kube-system

    • View the logs of a CoreDNS Pod: kubectl logs <coredns-pod-name> -n kube-system

  2. DNS Query Testing: Perform DNS query tests to check the resolution and response of CoreDNS, using tools like nslookup or dig to test specific domain names. For example:

    • Resolve a Service name from the dnsutils Pod: kubectl exec -ti dnsutils -- nslookup kubernetes.default

  3. Telnet: Use telnet to check whether you can open a TCP connection to the CoreDNS server from both control plane nodes and worker nodes. For example:

    • Test connection to CoreDNS server (replace 10.96.0.10 with the ClusterIP of your cluster DNS Service): telnet 10.96.0.10 53

  4. Prometheus Metrics: If you have enabled Prometheus metrics for CoreDNS, you can use Prometheus and Grafana to monitor and analyze CoreDNS metrics. This can provide insights into the performance, latency, and error rates of CoreDNS.

  5. Debugging with Hubble: Hubble is a tool that can be used to identify and inspect DNS issues in Kubernetes clusters. It allows you to monitor DNS communication and diagnose problems related to DNS resolution.

  6. CoreDNS Configuration: Review and verify the CoreDNS configuration (Corefile) to ensure it is correctly set up for your cluster's requirements. Make sure that the relevant plugins and configurations are in place.

  7. Network Policies: Ensure that Network Policies are correctly configured to control access to CoreDNS and prevent unauthorized access.

  8. Cluster Health Checks: Check the overall health of the Kubernetes cluster to identify any issues that might be affecting CoreDNS. Look for possible resource constraints, node failures, or network problems.

  9. Kubernetes Events: Check Kubernetes events to identify any issues reported by the system that might be related to CoreDNS.

  10. Service Discovery: Validate if CoreDNS is properly providing service discovery within the cluster. Verify that services can be accessed using their DNS names.

Monitoring CoreDNS with Prometheus

The following is a sample; please customize it to your needs. To set up monitoring for CoreDNS with Prometheus in a Kubernetes cluster, perform the following steps:

  • Ensure Prometheus Setup: Set up Prometheus in your Kubernetes cluster. You can use Helm to install Prometheus. Create a values.yaml file to configure Prometheus and install it using Helm.

      # values.yaml
      prometheus:
        prometheusSpec:
          scrapeInterval: 15s   # Scrape interval for the Prometheus instance created by kube-prometheus-stack
    
      alertmanager:
        persistentVolume:
          enabled: false
    
      kubeStateMetrics:
        enabled: true
    
      nodeExporter:
        enabled: true
    
      prometheusOperator:
        enabled: true
    

    Install Prometheus using Helm

    helm repo add prometheus-community https://prometheus-community.github.io/helm-charts

    helm install prometheus prometheus-community/kube-prometheus-stack -f values.yaml

  • Install CoreDNS with Prometheus Metrics Plugin: Deploy CoreDNS with the Prometheus metrics plugin enabled. The plugin is switched on in the Corefile (for example with the line prometheus :9153), which is supplied to the Deployment below through a ConfigMap:

      # coredns.yaml
      apiVersion: apps/v1
      kind: Deployment
      metadata:
        name: coredns
        namespace: kube-system
      spec:
        replicas: 2
        selector:
          matchLabels:
            k8s-app: coredns
        template:
          metadata:
            labels:
              k8s-app: coredns
          spec:
            containers:
              - name: coredns
                image: coredns/coredns:latest
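                args:
                  - -conf
                  - /etc/coredns/Corefile
                ports:
                  - name: metrics   # Port opened by the prometheus directive in the Corefile
                    containerPort: 9153
                    protocol: TCP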
                volumeMounts:
                  - name: config-volume
                    mountPath: /etc/coredns
                    readOnly: true
            volumes:
              - name: config-volume
                configMap:
                  name: coredns-config
                  items:
                    - key: Corefile
                      path: Corefile
    

    Create the ConfigMap for CoreDNS kubectl create configmap coredns-config --from-file=Corefile=path/to/your/Corefile

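  • Configure Prometheus to Scrape CoreDNS Metrics: With the Prometheus Operator, scrape targets are declared through ServiceMonitor objects. The Prometheus resource below picks up every ServiceMonitor labelled k8s-app: coredns, so a ServiceMonitor with that label pointing at the CoreDNS metrics port (9153) must also exist for CoreDNS to appear as a target

      # prometheus-prometheus.yaml (located in the same directory as coredns.yaml)
      apiVersion: monitoring.coreos.com/v1
      kind: Prometheus
      metadata:
        name: prometheus
        namespace: kube-system
      spec:
        # Selects ServiceMonitors by label; it does not select Pods or Services directly
        serviceMonitorSelector:
          matchLabels:
            k8s-app: coredns
        resources:
          requests:
            memory: "400Mi"

    Apply the Prometheus configuration

    kubectl apply -f prometheus-prometheus.yaml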

  • Verify Metrics Collection: Verify that Prometheus is scraping CoreDNS metrics and that CoreDNS is listed as a target

    kubectl port-forward -n kube-system svc/prometheus-operated 9090:9090

    Access Prometheus UI at localhost:9090 and check the "Targets" section to ensure CoreDNS is listed as UP

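  • Set Up Grafana for Visualization: Set up Grafana in your cluster and configure it to use Prometheus as a data source. The kube-prometheus-stack chart installed earlier already bundles Grafana by default; if you prefer a standalone instance, you can install it with Helm.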

    helm repo add grafana https://grafana.github.io/helm-charts

    helm install grafana grafana/grafana

  • Create CoreDNS Dashboards in Grafana: Create Grafana dashboards to visualize CoreDNS metrics. You can import pre-existing dashboards from Grafana Labs or create custom dashboards.

  • Optional: Set Up Alerts: You can set up alerts in Prometheus or Grafana to notify you of any critical CoreDNS issues.
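
Before wiring up dashboards or alerts, it can help to see what the raw metrics look like. The sketch below parses the text that the CoreDNS metrics endpoint returns and computes the share of SERVFAIL responses from the coredns_dns_responses_total counter. The sample text and its numbers are invented for illustration, and the function names are hypothetical; in a real setup you would let Prometheus do this with a rate() query instead.

```python
import re
from collections import defaultdict

# Hypothetical excerpt of CoreDNS /metrics output; the values are invented.
SAMPLE = """\
# HELP coredns_dns_responses_total Counter of response status codes.
# TYPE coredns_dns_responses_total counter
coredns_dns_responses_total{plugin="kubernetes",rcode="NOERROR",server="dns://:53",zone="."} 900
coredns_dns_responses_total{plugin="forward",rcode="NXDOMAIN",server="dns://:53",zone="."} 80
coredns_dns_responses_total{plugin="forward",rcode="SERVFAIL",server="dns://:53",zone="."} 20
"""

# Matches one sample line of the counter and captures its labels and value
LINE = re.compile(r'^coredns_dns_responses_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$')
RCODE = re.compile(r'rcode="([^"]*)"')


def responses_by_rcode(metrics_text):
    """Sum the response counter per DNS response code."""
    totals = defaultdict(float)
    for line in metrics_text.splitlines():
        match = LINE.match(line.strip())
        if not match:
            continue  # comments (# HELP, # TYPE) and other metrics are skipped
        rcode = RCODE.search(match.group("labels"))
        if rcode:
            totals[rcode.group(1)] += float(match.group("value"))
    return dict(totals)


def servfail_ratio(metrics_text):
    """Return SERVFAIL responses as a fraction of all responses."""
    totals = responses_by_rcode(metrics_text)
    total = sum(totals.values())
    return totals.get("SERVFAIL", 0.0) / total if total else 0.0


print(servfail_ratio(SAMPLE))
```

Note that the counters are cumulative since CoreDNS started, so this ratio covers the whole process lifetime. An alert would normally be based on the rate over a recent time window.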

CoreDNS Security

Securing CoreDNS in Kubernetes is an essential task to ensure the stability, integrity, and privacy of DNS resolution within the cluster.

  • Enable CoreDNS with DNSSEC: DNSSEC (Domain Name System Security Extensions) adds an extra layer of security by signing DNS data with cryptographic signatures. It helps protect against DNS spoofing and tampering. In CoreDNS this is handled by the dnssec plugin, which signs responses on the fly with a key you supply; it does not validate answers received from upstream resolvers. To enable it, modify the Corefile, which is the configuration file for CoreDNS.

      # Corefile
      .:53 {
          errors
          health
          kubernetes cluster.local in-addr.arpa ip6.arpa {
             pods insecure
             fallthrough in-addr.arpa ip6.arpa
             ttl 30
          }
          prometheus :9153
          forward . /etc/resolv.conf {
              max_concurrent 1000
          }
          cache 30
          loop
          reload
          loadbalance
          # Sign answers for cluster.local; the key path below is a placeholder
          dnssec cluster.local {
              key file /etc/coredns/keys/Kcluster.local.+013+<keytag>
          }
      }

    In the Corefile, the dnssec plugin is given the zone to sign and a key file directive. The key file value is the base name of a DNSSEC key pair (the .key and .private files), for example one generated with dnssec-keygen. Replace the placeholder path with the location of your own key and mount the key files into the CoreDNS Pod. The upstream option of the kubernetes plugin is left out because it has been deprecated and removed in newer CoreDNS releases.

  • Secure CoreDNS Configuration: Ensure that the CoreDNS configuration is secure by following best practices. Limit access to the CoreDNS configuration files to authorized users only.

  • Enable RBAC (Role-Based Access Control): RBAC provides granular access control for Kubernetes resources, including CoreDNS. Ensure that RBAC is enabled and properly configured in your cluster, and grant CoreDNS only the permissions it needs. The kubernetes plugin watches Services, Endpoints, Pods, and Namespaces across the whole cluster, so it needs read-only, cluster-wide access to those resources and nothing else (in particular, no access to Secrets). Here's an example of a CoreDNS ServiceAccount, ClusterRoleBinding, and ClusterRole:

      # coredns-serviceaccount.yaml
      apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: coredns
        namespace: kube-system

      # coredns-rolebinding.yaml
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRoleBinding
      metadata:
        name: coredns-rolebinding
      subjects:
      - kind: ServiceAccount
        name: coredns
        namespace: kube-system
      roleRef:
        kind: ClusterRole
        name: coredns-role
        apiGroup: rbac.authorization.k8s.io

      # coredns-role.yaml
      apiVersion: rbac.authorization.k8s.io/v1
      kind: ClusterRole
      metadata:
        name: coredns-role
      rules:
      # Read-only access to the objects the kubernetes plugin watches
      - apiGroups: [""]
        resources: ["endpoints", "services", "pods", "namespaces"]
        verbs: ["list", "watch"]
      - apiGroups: ["discovery.k8s.io"]
        resources: ["endpointslices"]
        verbs: ["list", "watch"]
    
  • Monitor CoreDNS for Security Breaches: Implement monitoring and alerting for CoreDNS to detect any security breaches or anomalies. Use tools like Prometheus and Grafana to monitor CoreDNS metrics, including query rates, errors, and response times.

  • Keep CoreDNS Up-to-Date: Regularly update CoreDNS to the latest version to ensure that security patches and bug fixes are applied.

  • Restrict CoreDNS Pod Access: If possible, restrict direct access to CoreDNS Pods from outside the cluster. Limiting access to only internal communications within the cluster adds an extra layer of security.

CoreDNS Use Cases

  • DNS Resolution: CoreDNS serves as a flexible and extensible DNS server, capable of resolving domain names to IP addresses and vice versa.

  • DNS Forwarding: CoreDNS can forward DNS requests to external DNS servers if it does not have the required data in its own database.

  • Custom DNS Entries: CoreDNS allows administrators to add custom DNS entries to serve specific domain names or override default DNS resolution behavior.

CoreDNS Use Cases in Kubernetes

  • Kubernetes Cluster DNS: CoreDNS reached general availability in Kubernetes v1.11, where kubeadm began installing it by default, and it has been the default DNS service for all clusters since v1.13. It serves DNS records for Services and Pods; other resources such as Deployments do not get DNS names of their own.

  • Service Discovery: CoreDNS enables service discovery within the Kubernetes cluster, allowing Pods and Services to communicate with each other using domain names.

  • DNS for Kubernetes Networking: CoreDNS is responsible for handling DNS requests related to Kubernetes networking, ensuring Pods can reach each other through their DNS names.

  • DNSSEC Support: CoreDNS in Kubernetes can be configured to support DNSSEC (Domain Name System Security Extensions), adding an extra layer of security to DNS resolution within the cluster.

  • Custom DNS Entries for Services: Kubernetes allows adding custom DNS entries using CoreDNS to map specific services to external domain names or IP addresses.

  • DNS Monitoring and Troubleshooting: CoreDNS exposes metrics that can be monitored to confirm it is functioning properly and to troubleshoot DNS-related issues within the Kubernetes cluster.

Conclusion

Summary of CoreDNS

  • Features

    • Fast and efficient, flexible, customizable, automatic service discovery, open source, and DNSSEC support
  • Benefits

    • Efficient networking, service discovery, flexible integration, simplified configuration, improved scalability, and enhanced security

Miscellaneous

Kubernetes' ndots:5 issue

When a container needs to resolve a domain name (e.g., www.example.com) into an IP address (e.g., 192.168.1.1), it follows a process guided by the /etc/resolv.conf configuration file. The ndots:5 option is part of that configuration. It sets a threshold: a name containing at least that many dots is tried first as an absolute (fully qualified) name, while a name with fewer dots is first expanded with each of the search domains listed in the file and only tried as-is once those lookups have failed.

Kubernetes sets ndots:5 for Pods by default, together with search domains such as namespace.svc.cluster.local, svc.cluster.local, and cluster.local. This lets short names like my-service or my-service.my-namespace resolve inside the cluster, but it has a cost for external names. For example, www.example.com contains only two dots, so the resolver first queries www.example.com.namespace.svc.cluster.local, www.example.com.svc.cluster.local, and www.example.com.cluster.local, receives a negative answer for each, and only then queries www.example.com itself. Every external lookup therefore generates several extra queries, which adds latency and puts more load on CoreDNS.

To avoid such issues, configure ndots based on the requirements of your Kubernetes environment. Writing external names with a trailing dot (www.example.com.) marks them as fully qualified and skips the search list entirely. Alternatively, lowering the value through the Pod's dnsConfig options (for example, ndots:1 or ndots:2) makes names with at least that many dots go out as absolute queries first. Keep in mind that in-cluster names containing dots, such as my-service.my-namespace, are then also tried as absolute names before the search domains are applied.
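
To make the effect of ndots concrete, here is a minimal sketch that lists the names a resolver would try, in order, for a given lookup. It is loosely modelled on how the glibc resolver applies ndots and the search list; other resolvers (musl, for instance) behave differently. The function name candidate_queries and the search list are assumptions made for this example, not part of any library.

```python
def candidate_queries(name, search_domains, ndots=5):
    """Return the names a resolver would try, in order, for name."""
    # A trailing dot marks the name as fully qualified: the search list is skipped
    if name.endswith("."):
        return [name]

    absolute = name + "."
    expanded = [f"{name}.{domain}." for domain in search_domains]

    if name.count(".") >= ndots:
        # Enough dots: try the name as-is first, then fall back to the search list
        return [absolute] + expanded
    # Too few dots: walk the search list first and try the name as-is last
    return expanded + [absolute]


# Hypothetical search list for a Pod in the "default" namespace
SEARCH = ["default.svc.cluster.local", "svc.cluster.local", "cluster.local"]

for query in candidate_queries("www.example.com", SEARCH, ndots=5):
    print(query)
```

With ndots=5 the absolute name www.example.com. comes last, after three cluster-internal attempts. Calling the same function with ndots=1, or with a trailing dot on the name, moves it to the front.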

Resources

CoreDNS Configuration https://coredns.io/manual/configuration/

CoreDNS Example Plugin https://github.com/coredns/example

CoreDNS Plugins Documentation https://coredns.io/manual/toc/#plugins

CoreDNS Plugin configuration file https://github.com/coredns/coredns/blob/master/plugin.cfg

CoreDNS Main Github Page https://github.com/coredns/coredns

Learning CoreDNS https://www.oreilly.com/library/view/learning-coredns/9781492047957/ch04.html

Custom domains using Kubernetes CoreDNS https://www.youtube.com/watch?v=kPDy7Nb32e4

Customizing CoreDNS on an OVHcloud Managed Kubernetes cluster https://help.ovhcloud.com/csm/fr-public-cloud-kubernetes-customizing-coredns?id=kb_article_view&sysparm_article=KB0054972

Working with the CoreDNS Amazon EKS add-on https://docs.aws.amazon.com/eks/latest/userguide/managing-coredns.html

CoreDNS Kubernetes Plugin https://coredns.io/plugins/kubernetes/