As we all know, workloads in a Kubernetes cluster ultimately run as docker containers. Docker has native logging drivers that provide basic logging functionality, such as capturing stdout and stderr or writing to JSON files with rotation.
However, that’s not enough for a full logging solution. For example, your API service may have 3 replicas, which means 3 containers with your logs spread across them. How do you aggregate all the logs and view them in one place? Another example: if your container crashes, it will be replaced with another one on the same node, or may even get scheduled on a different node, and the logs of the old container could be lost. For these reasons, Kubernetes promotes cluster-level logging: a separate backend stores, analyzes and queries the logs, so that they are independent of the lifecycle of any container, pod or node in the cluster.
Kubernetes does not ship with a logging facility that satisfies the cluster-level logging requirement, but there are many open source solutions you can leverage to implement a full logging service. Among them, fluentd + elasticsearch form a great combination. In this post, I’ll show how to get it done.
In a nutshell, the idea is:
Use fluentd as an agent on each node to scrape all the container logs
Push the logs to Elasticsearch for storing, analyzing and querying using logstash-ish indices.
We need a fluentd agent running on each node, which is exactly what a Kubernetes DaemonSet is for. To make sure the pods can only access the relevant resources in the cluster, we will create a service account and use it when we create the DaemonSet later.
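A minimal sketch of such a service account follows. The name fluentd and the kube-system namespace are assumptions for illustration, and the ClusterRole is my guess at the smallest useful permission set: read-only access to pods and namespaces, which is what fluentd needs to enrich logs with Kubernetes metadata.

```yaml
# Hypothetical names; adjust the namespace to wherever you run the agent.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  namespace: kube-system
---
# Read-only access to the objects used for metadata enrichment.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
rules:
  - apiGroups: [""]
    resources: ["pods", "namespaces"]
    verbs: ["get", "list", "watch"]
---
# Bind the role to the service account used by the DaemonSet.
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: fluentd
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: fluentd
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: kube-system
```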
By default you don’t need to make any configuration changes: the image will try to collect the following logs if they exist in the /var/log/ directory, enrich them with Kubernetes metadata, and push them to an elasticsearch backend with logstash-ish indices.
all your docker container logs
In our case, we are only interested in the containerized application logs of our workload, not the system applications. To achieve this, we can provide a customized fluentd configuration file, kubernetes.conf, to replace the default one, so that fluentd only follows the docker container logs we are interested in. Typically you will have dedicated namespaces for your applications instead of using default and kube-system.
We can create a config map that contains fluentd.conf and kubernetes.conf and mount it to the /fluentd/etc path in the fluentd pods. The configuration file is shown as follows.
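As a rough sketch of what such a config map could look like: the ConfigMap name fluentd-config and the namespace my-apps are hypothetical, and the plugin options assume a fluentd v1 image with the elasticsearch and kubernetes_metadata plugins installed. The main-file key is written as fluent.conf on the assumption that this is the file name the image loads; use whichever name your image expects.

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: fluentd-config          # hypothetical name
  namespace: kube-system
data:
  # Main file (assumed name): pulls in kubernetes.conf and ships everything to elasticsearch.
  fluent.conf: |
    @include kubernetes.conf

    <match **>
      @type elasticsearch
      host "#{ENV['FLUENT_ELASTICSEARCH_HOST']}"
      port "#{ENV['FLUENT_ELASTICSEARCH_PORT']}"
      scheme "#{ENV['FLUENT_ELASTICSEARCH_SCHEME'] || 'http'}"
      logstash_format true
    </match>
  kubernetes.conf: |
    # Tail only container logs from the hypothetical "my-apps" namespace.
    # Files in /var/log/containers are named <pod>_<namespace>_<container>-<id>.log.
    <source>
      @type tail
      path /var/log/containers/*_my-apps_*.log
      pos_file /var/log/fluentd-containers.log.pos
      tag kubernetes.*
      read_from_head true
      <parse>
        @type json
        time_format %Y-%m-%dT%H:%M:%S.%NZ
      </parse>
    </source>

    # Attach pod, namespace and label metadata to each record.
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
```

Filtering by file name keeps system namespaces out at the source, so fluentd never has to read and then discard those logs.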
You might wonder why the host path /var/lib/docker/containers is also mounted as a volume. It’s because the docker container logs in /var/log/ are actually symbolic links that point to /var/lib/docker/containers/(container_id)/*******.log. If you don’t mount /var/lib/docker/containers, fluentd won’t be able to read the actual log files.
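For reference, a DaemonSet along these lines might look like the sketch below. It assumes that a ServiceAccount named fluentd and a ConfigMap named fluentd-config already exist (both hypothetical names), and the image tag is an assumption that you should replace with one you have tested.

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  namespace: kube-system
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccountName: fluentd          # assumed to exist already
      containers:
        - name: fluentd
          # Assumed image and tag; pin the exact version you run.
          image: fluent/fluentd-kubernetes-daemonset:v1-debian-elasticsearch
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch"
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
            - name: FLUENT_ELASTICSEARCH_SCHEME
              value: "http"
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            # Needed because the files under /var/log are symlinks into this path.
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
            - name: config
              mountPath: /fluentd/etc
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
        - name: config
          configMap:
            name: fluentd-config           # assumed to exist already
```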
Wait for the DaemonSet to be deployed. You should find that the pod count equals your node count.
You might notice that we set FLUENT_ELASTICSEARCH_HOST to elasticsearch. This implies that an elasticsearch service is running within the cluster itself. If you are using an external elasticsearch service, just set the three environment variables to point to it and you will be all set:
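Assuming the three variables are the host, port and scheme settings read by the fluentd image, the container's env section could be sketched like this. The host value is a placeholder, and the port and scheme are only examples for an HTTPS endpoint.

```yaml
# Fragment of the fluentd container spec in the DaemonSet; values are placeholders.
env:
  - name: FLUENT_ELASTICSEARCH_HOST
    value: "your_elastic_service_url"
  - name: FLUENT_ELASTICSEARCH_PORT
    value: "443"
  - name: FLUENT_ELASTICSEARCH_SCHEME
    value: "https"
```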
You should start to receive logstash-ish logs in your elasticsearch cluster once you have the DaemonSet up and running.
curl -X GET 'http://your_elastic_service_url:9200/_cat/indices?v='
health status index               uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   logstash-2018.07.04 bRiQJCnrTZqOQQI4ol1mfw   5   1      16744            0     24.4mb         12.1mb
green  open   .kibana             6UUqrS5_RUSS5Ozy2tSYTg   1   1         19            1      115kb         57.5kb
If you want to use an in-cluster elasticsearch cluster, you obviously need to deploy one. Please continue reading.
This part assumes that your Kubernetes cluster is running on AWS EC2 instances across different availability zones to achieve HA. Even if it isn’t, it can still serve as a reference.
Here we’ll deploy a minimal HA elasticsearch cluster which consists of:
3 master nodes
2 data nodes
The 3 master nodes manage the cluster and expose API access to clients. At any time, 1 of the 3 is elected as the leader, giving us an active-passive HA master cluster.
To avoid putting the masters under pressure, we separate the data nodes from the master nodes. The 2 dedicated data nodes perform the resource-heavy data operations such as CRUD and search. We use 2 nodes so that each index has at least one replica on a different node; if one node goes offline, the cluster can still function.
Use Kubernetes Persistent Volume
In Kubernetes, your applications are usually stateless: it does not matter where the pods running them are deployed or how many times they are restarted, because you don’t rely on any storage that outlives the pods. An elasticsearch cluster is clearly the opposite case. It provides a data service, and you don’t want to lose your data when one or two elasticsearch pods get restarted.
This is where Persistent Volume (PV) comes into play. A PV in Kubernetes is a storage resource in the cluster, just like a node is a computing resource in the cluster. The lifecycle of a PV is independent of any pod that uses it.
A pod uses a PV via a Persistent Volume Claim (PVC), which translates to: I want to claim X size of storage with Y access mode from the cluster. PVCs are thus similar to pods: PVCs consume PV resources while pods consume node resources.
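To make the "X size, Y access mode" idea concrete, here is a small sketch of a claim and a pod that mounts it. All names, the 10Gi size and the busybox image are made up for illustration.

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-data            # hypothetical name
spec:
  accessModes:
    - ReadWriteOnce             # the "Y access mode"
  resources:
    requests:
      storage: 10Gi             # the "X size of storage"
---
apiVersion: v1
kind: Pod
metadata:
  name: example-pod             # hypothetical name
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "sleep 3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    # The pod refers to the claim, never to the PV directly.
    - name: data
      persistentVolumeClaim:
        claimName: example-data
```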
Dynamic Provisioning Of PV
When a pod requests a PV through a PVC and the cluster can’t find a matching PV, it will try to dynamically provision a PV and bind it to the PVC. This is called dynamic provisioning. To enable it, the API server must be configured to support it, and the PVC needs to specify a StorageClass, because that is what the cluster refers to when it provisions the PV.
AWSElasticBlockStore StorageClass and PV
Since the Kubernetes cluster is deployed on AWS EC2 instances, it’s natural to use EBS-backed PVs. To achieve HA, I assume that your nodes are distributed across multiple zones; in this example, let’s say we have us-east-1a, us-east-1b and us-east-1c. The first thing to do is to create a StorageClass for each zone.
Note: You can specify one StorageClass with zones: us-east-1a,us-east-1b,us-east-1c. However, when you claim a volume using this StorageClass, you can’t control which zone the EBS volume is created in, and it’s very likely that it ends up in a different zone from the node where the pod gets scheduled, so the binding can never succeed. This is why we create multiple StorageClass objects and point each of them to a specific zone.
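A per-zone StorageClass could be sketched as follows, assuming the in-tree kubernetes.io/aws-ebs provisioner with its zone parameter (newer clusters typically use the EBS CSI driver instead). The class names and the gp2 volume type are my own choices for illustration.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-us-east-1a          # hypothetical name, one class per zone
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-east-1a              # volumes from this class are created in this zone only
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: gp2-us-east-1b
provisioner: kubernetes.io/aws-ebs
parameters:
  type: gp2
  zone: us-east-1b
# Repeat the same pattern for us-east-1c.
```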
Deploy data nodes using Kubernetes StatefulSet
StatefulSet, in contrast to Deployment, is used for stateful applications. The major differences are that a StatefulSet:
Maintains a sticky identity for its pods.
Has a stable network ID (with the help of a headless service).
Has stable and persistent storage.
It makes perfect sense to use StatefulSet to deploy the elasticsearch data nodes. Because we want the nodes to be spread across different zones and the persistent EBS volumes to be created dynamically, we need to create two StatefulSets, 1 per zone.
A headless service is created and points to the pods.
Each StatefulSet is restricted to deploying pods in only 1 zone, with the PVC pointing to the StorageClass for that zone.
We turn off node.master so that they are dedicated data nodes.
To let the elasticsearch user access the mounted volume, we add an init container that grants access to the data path.
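The four points above could be put together roughly as in the sketch below, shown for one zone only. Everything here is an assumption for illustration: the names es-data, es-master and gp2-us-east-1a, the 50Gi size, the elasticsearch 6.x image and settings, and the zone label key (older clusters use failure-domain.beta.kubernetes.io/zone). Heap sizing and other tuning are left out.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: es-data                   # hypothetical headless service
spec:
  clusterIP: None                 # headless: gives each pod a stable DNS name
  selector:
    app: es-data
  ports:
    - name: transport
      port: 9300
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-data-1a                # create a second one for the other zone
spec:
  serviceName: es-data
  replicas: 1
  selector:
    matchLabels:
      app: es-data
      zone: us-east-1a
  template:
    metadata:
      labels:
        app: es-data
        zone: us-east-1a
    spec:
      # Pin the pod to the zone the StorageClass provisions volumes in.
      nodeSelector:
        topology.kubernetes.io/zone: us-east-1a
      initContainers:
        # Runs as root and hands the data path to the elasticsearch user (uid 1000).
        - name: fix-permissions
          image: busybox:1.36
          command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:6.3.0   # assumed version
          env:
            - name: cluster.name
              value: "logging"
            - name: node.master
              value: "false"      # dedicated data node
            - name: node.data
              value: "true"
            - name: discovery.zen.ping.unicast.hosts
              value: "es-master"  # assumed discovery service of the master nodes
          ports:
            - name: transport
              containerPort: 9300
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: ["ReadWriteOnce"]
        storageClassName: gp2-us-east-1a   # assumed per-zone StorageClass
        resources:
          requests:
            storage: 50Gi
```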
Deploy master nodes using Kubernetes Deployment
Deploying the master nodes is much easier than deploying the data nodes, as they don’t need persistent storage. We use a Deployment to manage them. To let clients access the API, we need to expose the service as well; in my case, I use Ingress to expose it.
A 60-second initial delay is set to give the pods time to come up before they are health checked.
Note: when you run kubectl apply to deploy the data nodes, you might encounter a timeout while binding the EBS volumes. Don’t panic, just run it again. It takes time for AWS to provision the volume and make it available to mount.
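A sketch of the master side, under the same caveats: the names, the elasticsearch 6.x image and settings, and the choice of probes are assumptions, and the Ingress is left out. The elasticsearch service name matches the value given to FLUENT_ELASTICSEARCH_HOST earlier.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: es-master                 # hypothetical discovery service
spec:
  clusterIP: None
  # Masters have to find each other before any of them can report ready.
  publishNotReadyAddresses: true
  selector:
    app: es-master
  ports:
    - name: transport
      port: 9300
---
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch             # the host name fluentd pushes to
spec:
  selector:
    app: es-master
  ports:
    - name: http
      port: 9200
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: es-master
spec:
  replicas: 3
  selector:
    matchLabels:
      app: es-master
  template:
    metadata:
      labels:
        app: es-master
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:6.3.0   # assumed version
          env:
            - name: cluster.name
              value: "logging"
            - name: node.master
              value: "true"
            - name: node.data
              value: "false"      # masters hold no index data
            - name: discovery.zen.ping.unicast.hosts
              value: "es-master"
            - name: discovery.zen.minimum_master_nodes
              value: "2"          # quorum of 3 masters
          ports:
            - name: http
              containerPort: 9200
            - name: transport
              containerPort: 9300
          # Wait 60 seconds before the first health checks.
          readinessProbe:
            httpGet:
              path: /_cluster/health
              port: http
            initialDelaySeconds: 60
          livenessProbe:
            tcpSocket:
              port: transport
            initialDelaySeconds: 60
```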
Deploy Kibana to visualise your elasticsearch data
Your fluentd + elasticsearch combination should be working fine now. Check the status by running:
Now, to play with your data, you might want to install Kibana. It’s quite straightforward: we just need to deploy 1 instance of Kibana using a Deployment, and expose the service using Ingress so that we can access it.
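A minimal sketch of such a Kibana setup follows. It assumes a Kibana 6.x image (which reads ELASTICSEARCH_URL), an in-cluster service reachable as elasticsearch on port 9200, and a hypothetical host name kibana.example.com. The Ingress uses the networking.k8s.io/v1 API, so older clusters will need the older Ingress API version instead.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
        - name: kibana
          image: docker.elastic.co/kibana/kibana:6.3.0   # assumed version
          env:
            - name: ELASTICSEARCH_URL
              value: "http://elasticsearch:9200"
          ports:
            - name: http
              containerPort: 5601
---
apiVersion: v1
kind: Service
metadata:
  name: kibana
spec:
  selector:
    app: kibana
  ports:
    - name: http
      port: 5601
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
spec:
  rules:
    - host: kibana.example.com    # hypothetical host name
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kibana
                port:
                  number: 5601
```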