Since this is the third time I've burned more than a few hours on this particular problem, I thought I'd finally write up a blog post. Hopefully I find this post in the future, the fourth time I run into the problem.
What problem is that? Well, when I build a new Kubernetes cluster with multiple nodes in VirtualBox (usually orchestrated with Vagrant and Ansible, using my geerlingguy.kubernetes role), I get everything running.
kubectl works fine, all pods (including CoreDNS, Flannel or Calico, kube-apiserver, the scheduler) report
Running, and everything in the cluster seems right. But there are lots of strange networking issues.
Sometimes internal DNS queries work. Most of the time not. I can't ping other pods by their IP address. Some of the debugging I do includes: