
Ignore noisy logs with fluentd in EKS or other Kubernetes clusters

Recently, I decided to use the fluentd-kubernetes-daemonset project to ship all the logs from an Amazon EKS Kubernetes cluster to an Elasticsearch cluster running elsewhere.

The initial configuration worked great out of the box: fill in details like FLUENT_ELASTICSEARCH_HOST and any authentication info, deploy the RBAC rules and the DaemonSet into your cluster, and you're off to the races (assuming your Elasticsearch instance is configured to allow access from the cluster!).
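For reference, here's a minimal sketch of what that looks like in the DaemonSet's container environment. FLUENT_ELASTICSEARCH_HOST is the variable mentioned above; the others follow the project's FLUENT_ELASTICSEARCH_* naming, and the endpoint, credentials, and Secret name are placeholders to replace with your own:

    # Excerpt from the fluentd DaemonSet spec (Elasticsearch image variant).
    # Endpoint and credentials are placeholders; keep real passwords in a Secret.
    env:
      - name: FLUENT_ELASTICSEARCH_HOST
        value: "elasticsearch.example.com"
      - name: FLUENT_ELASTICSEARCH_PORT
        value: "9200"
      - name: FLUENT_ELASTICSEARCH_SCHEME
        value: "https"
      - name: FLUENT_ELASTICSEARCH_USER
        value: "fluentd"
      - name: FLUENT_ELASTICSEARCH_PASSWORD
        valueFrom:
          secretKeyRef:
            name: fluentd-es-credentials   # hypothetical Secret holding the password
            key: password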

But once I did that, I noticed the brand-new EKS cluster was sending over 16,000 log messages per second to Elasticsearch. Doing a tiny bit of analysis (not much was required, honestly), I found that over 98% of the logs were coming from two noisy, EKS-specific containers: efs-csi-node and ebs-snapshot-controller.
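One way to quiet things down is to drop those records with fluentd's grep filter before they ever reach the Elasticsearch output. The sketch below assumes the daemonset's default kubernetes_metadata filter has already added fields like kubernetes.pod_name to each record, and that you have a way to layer extra config into the image (for example, a ConfigMap mounted into the container's fluentd config directory):

    # Drop records from the two noisy pods before they reach Elasticsearch.
    # Assumes kubernetes.pod_name is present (added by the kubernetes_metadata filter).
    <filter kubernetes.**>
      @type grep
      <exclude>
        key $.kubernetes.pod_name
        pattern /^(efs-csi-node|ebs-snapshot-controller)/
      </exclude>
    </filter>

Filtering on pod name keeps the rule independent of log file paths; another option is the tail input's exclude_path parameter, which skips those containers' log files before they're even read.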

Cracks are showing in Enterprise Open Source's foundations

I've worked in open source my entire career. To say that I'm worried about the impact recent events have on the open source ecosystem would be an understatement.


In the past couple of months:

  • Red Hat effectively killed CentOS
  • Elastic effectively killed Elasticsearch

People may rightfully quibble with both statements, and the reality is more nuanced than they suggest: killing a project doesn't mean it vanishes overnight. But what has happened so far is that two very large companies in the 'enterprise open source' space have exposed the chinks in the armor of open source monetization.

For many years, everyone in the industry pointed to Red Hat as the shining example of 'how to build a company around open source'.

And for the past decade, the open source Elasticsearch, Logstash, and Kibana logging ecosystem has been on a tear, becoming a standard part of the open source cloud stack.