What is etcd?
Etcd is a highly available, distributed, and consistent key-value store. The name comes from the Linux /etc directory plus "d" for distributed; putting the two together gives etcd.
Etcd is also leader-based. So if we run etcd outside the control plane as its own cluster, we need to ensure that the leader sends its periodic heartbeats to all followers on time to keep the cluster stable.
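To make the heartbeat requirement concrete, here is a minimal Python sketch of how a follower might decide that its leader has gone silent. It is not etcd's implementation: the `Follower` class, its method names, and the timing values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Follower:
    """Toy follower that tracks when it last heard from the leader."""
    election_timeout_ms: int      # how long to wait before suspecting the leader
    last_heartbeat_ms: int = 0    # timestamp of the most recent heartbeat

    def on_heartbeat(self, now_ms: int) -> None:
        # A heartbeat from the leader resets the follower's timer.
        self.last_heartbeat_ms = now_ms

    def leader_suspected(self, now_ms: int) -> bool:
        # With no heartbeat inside the timeout window, a real follower
        # would start a new election (not modelled here).
        return now_ms - self.last_heartbeat_ms >= self.election_timeout_ms


follower = Follower(election_timeout_ms=1000)
follower.on_heartbeat(now_ms=100)
print(follower.leader_suspected(now_ms=600))   # False: heartbeat was recent
print(follower.leader_suspected(now_ms=1500))  # True: leader silent too long
```

If heartbeats arrive late, for example because of a slow disk or a congested network, followers may suspect a healthy leader and trigger an unnecessary election, which is the instability the paragraph above warns about.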
How does etcd typically work in multi-cluster?
Since etcd is built for distributed, highly available systems, running it on its own dedicated machines is desirable, at least in production. Etcd relies on the Raft consensus protocol, which uses a leader-follower model. At any given point in time the cluster has a single leader, and the rest of the etcd nodes are followers that carry out the instructions the leader issues.
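The leader-follower flow can be sketched as a toy simulation. This is a heavily simplified illustration, not the Raft algorithm itself: it omits terms, elections, and log matching, and the names `Node`, `Cluster`, and `put` are hypothetical. It assumes, as Raft does, that a write counts as committed once a majority of nodes have accepted it.

```python
class Node:
    """Toy cluster member holding an ordered log of (key, value) entries."""

    def __init__(self, name, reachable=True):
        self.name = name
        self.reachable = reachable
        self.log = []

    def append(self, entry):
        # An unreachable node cannot acknowledge the entry.
        if not self.reachable:
            return False
        self.log.append(entry)
        return True


class Cluster:
    """One leader plus followers; every write is routed through the leader."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def put(self, key, value):
        entry = (key, value)
        # The leader records the entry, then replicates it to each follower.
        acks = 1 if self.leader.append(entry) else 0
        acks += sum(1 for f in self.followers if f.append(entry))
        # Commit only if more than half of all nodes acknowledged.
        majority = (1 + len(self.followers)) // 2 + 1
        return acks >= majority


cluster = Cluster(Node("etcd-1"), [Node("etcd-2"), Node("etcd-3", reachable=False)])
print(cluster.put("/demo/key", "value"))  # True: 2 of 3 nodes acknowledged
```

In this sketch a three-node cluster keeps accepting writes with one follower down, but would reject them with two followers down, since the leader alone is not a majority.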
How does Kubernetes use Etcd?
Kubernetes uses etcd to store all of its cluster data, i.e., configuration data, state, Kubernetes resources, and metadata. Because Kubernetes is a distributed system, it needs a distributed data store like etcd. Etcd also lets clients read and write data through any of its members; within Kubernetes, components reach etcd through the API server.
When we run a command like the one below,
kubectl get pods
Kubernetes, through the API server, fetches the details of the pods from etcd and displays them as output.
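To illustrate what that lookup amounts to, here is a sketch that uses a plain Python dictionary as a stand-in for etcd. It assumes the default `/registry` key prefix and a `/registry/<resource>/<namespace>/<name>` layout. The stored values and the `list_prefix` helper are hypothetical: real Kubernetes stores serialized objects, and real etcd answers prefix reads with range queries.

```python
# A dict standing in for etcd's key space. Values are simplified placeholders.
store = {
    "/registry/pods/default/web-1": {"phase": "Running"},
    "/registry/pods/default/web-2": {"phase": "Pending"},
    "/registry/pods/kube-system/dns-1": {"phase": "Running"},
    "/registry/configmaps/default/app-config": {"keys": 2},
}


def list_prefix(kv, prefix):
    """Return every entry whose key starts with the prefix, in key order."""
    return {k: kv[k] for k in sorted(kv) if k.startswith(prefix)}


# Roughly what listing pods in the "default" namespace asks for.
for key, value in list_prefix(store, "/registry/pods/default/").items():
    print(key, value["phase"])
```

Because keys are grouped by resource type and namespace, a single prefix read is enough to collect all pods in one namespace without touching unrelated objects.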
Etcd also stores the actual and desired state of cluster-level resources. When a create, edit, apply, or other state-changing command is issued, Kubernetes uses etcd's watch functionality to monitor the desired versus the actual state. In this way Kubernetes keeps working to bring the actual state in line with the desired state until the two reach equilibrium.
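The watch-and-reconcile idea can be sketched as follows. This is an in-memory toy, not the etcd or Kubernetes API: `WatchableStore`, `watch`, `put`, and `reconcile` are hypothetical names, and the "actual state" is just a counter of running replicas.

```python
from typing import Callable


class WatchableStore:
    """Toy key-value store that notifies watchers on every write."""

    def __init__(self) -> None:
        self.data = {}
        self.watchers = []  # (prefix, callback) pairs

    def watch(self, prefix: str, callback: Callable[[str, int], None]) -> None:
        self.watchers.append((prefix, callback))

    def put(self, key: str, value: int) -> None:
        self.data[key] = value
        # Notify every watcher whose prefix matches the changed key.
        for prefix, callback in self.watchers:
            if key.startswith(prefix):
                callback(key, value)


actual = {"replicas": 0}  # stand-in for what is really running


def reconcile(key: str, desired: int) -> None:
    # Move the actual state toward the desired state one step at a time.
    while actual["replicas"] < desired:
        actual["replicas"] += 1   # "start" a replica
    while actual["replicas"] > desired:
        actual["replicas"] -= 1   # "stop" a replica


store = WatchableStore()
store.watch("/demo/desired/", reconcile)
store.put("/demo/desired/web", 3)   # desired state changes; the watcher reacts
print(actual["replicas"])           # 3
```

The point of the watch is that the reconciler does not have to poll: it is told when the desired state changes and only then does the work of closing the gap.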
Anyone talking about Kubernetes is, knowingly or unknowingly, also talking about etcd. Keeping etcd clusters stable is therefore critical to the overall stability of Kubernetes clusters. That is why, in production, it is preferable to run etcd clusters on separate, dedicated machines with guaranteed resources.
Etcd vs other key-value store databases
Since the main goal of this article is to understand etcd in the context of Kubernetes, I will leave this link for you to explore how it differs from other key-value stores.
Why is there no plugin interface, like CNI, CRI, or CSI, for etcd alternatives?
There was a fair bit of discussion around this alternative here. What was ultimately decided was not to complicate Kubernetes as a whole, and not to have the Kubernetes teams spend effort on supporting additional backends unnecessarily. So the discussion was closed once and for all with the conclusion that there won't be a pluggable alternative to etcd.
In the second part of this article we will go over the following topics:
- Set up etcd on a single node
- Set up etcd on multiple nodes for high availability
- How to interact with etcd using etcdctl