Resolving intermittent Fedora DNF error "No such file or directory: '/var/lib/dnf/'"

For many of my Ansible playbooks and roles, I have CI tests which run over various distributions, including CentOS, Ubuntu, Debian, and Fedora. Many of my Docker Hub images for Ansible testing include systemd so I can test services that are installed inside. For the most part, systemd-related issues are rare, but it seems with Fedora and DNF, I often encounter random test failures which invariably have an error message like:

No such file or directory: '/var/lib/dnf/'

The full Ansible traceback is:

My DevOps books are free in April, thanks to Device42!

Last month I announced I was going to make my books Ansible for DevOps and Ansible for Kubernetes available free on LeanPub through the end of March, so people who are in self-isolation and/or who have lost their jobs could level up their automation skills.

The response floored me—in less than two weeks, I had given away over 40,000 copies of the two books, and they jumped to the top of LeanPub's bestseller lists.

Ansible for DevOps purchases - free and paid
Purchases (over 99% with price set to 'free') of both books spiked within hours of the announcement.

Ansible 101 by Jeff Geerling - YouTube streaming series

Ansible 101 Header Image

After the incredible response I got from making my Ansible books free for the rest of March to help people learn new automation skills, I tried to think of some other things I could do to help developers who may be experiencing hardship during the coronavirus pandemic and market upheaval.

So I asked on Twitter:

Ansible best practices: using project-local collections and roles

Note for Tower/AWX users: Currently, Tower requires role and collection requirements to be split out into different files; see Tower: Ansible Galaxy Support. Hopefully Tower will be able to support the requirements layout I outline in this post soon!

Since collections will be a major new part of every Ansible user's experience in the coming months, I thought I'd write a little about what I consider an Ansible best practice: that is, always using project-relative collection and role paths, so you can have multiple independent Ansible projects that track their own dependencies according to the needs of the project.

Early on in my Ansible usage, I used a global roles path and installed every role I used (whether private or from Ansible Galaxy) into it. I rarely had a playbook- or project-specific role, and rarely used a different playbook-local version of a role.
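As a rough illustration of the project-relative approach described above, a single requirements file can list both roles and collections for one project. This is only a sketch: the file name requirements.yml, the ./roles and ./collections directories, and the specific role and collection named here are assumptions chosen for illustration, not necessarily the layout the full post recommends.

```yaml
---
# requirements.yml, kept in the project root (illustrative sketch).
#
# Assumed companion settings in the project's ansible.cfg, under [defaults]:
#   roles_path = ./roles
#   collections_paths = ./collections
#
# Assumed install commands, run from the project root:
#   ansible-galaxy install -r requirements.yml
#   ansible-galaxy collection install -r requirements.yml

roles:
  # Example role only; add a `version:` key to pin a release for this project.
  - name: geerlingguy.java

collections:
  # Example collection only; `version:` can be used here as well.
  - name: community.kubernetes
```

With paths like these, each project keeps its own copies of its dependencies, so two projects can depend on different versions of the same role without conflict.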

Automatically building and publishing Ansible Galaxy Collections

I maintain a large number of Ansible Galaxy roles, and publish hundreds of new releases every year. If the process weren't fully automated, there would be no way I could keep up with it. For Galaxy roles, the process of tagging and publishing a new release is very simple, because Ansible Galaxy ties the role strongly to GitHub's release system. All that's needed is a webhook in your .travis.yml file (if using Travis CI):


For collections, Ansible Galaxy actually hosts an artifact—a .tar.gz file containing the collection contents. This offers some benefits that I won't get into here, but also a challenge: someone has to build and upload that artifact... and that takes more than one or two lines added to a .travis.yml file.
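The build-and-upload step can itself be automated with a short playbook that a CI job runs on a tagged release. The following is a minimal sketch and not necessarily the approach the full post takes: the variable names, the GALAXY_API_KEY environment variable, and the placeholder namespace, name, and version are all hypothetical. Only the `ansible-galaxy collection build` and `ansible-galaxy collection publish` commands are taken from the standard CLI.

```yaml
---
# Hypothetical playbook: build and publish the collection in this directory.
# Assumes a galaxy.yml file sits next to the playbook and that the CI system
# exports GALAXY_API_KEY (an assumed variable name) as a secret.
- hosts: localhost
  connection: local
  gather_facts: false

  vars:
    # Placeholder values; these must match what galaxy.yml declares.
    collection_namespace: my_namespace
    collection_name: my_collection
    collection_version: 1.0.0

  tasks:
    - name: Build the collection artifact (.tar.gz) from galaxy.yml.
      command: ansible-galaxy collection build --force
      args:
        chdir: "{{ playbook_dir }}"

    - name: Publish the built artifact to Ansible Galaxy.
      command: >
        ansible-galaxy collection publish
        ./{{ collection_namespace }}-{{ collection_name }}-{{ collection_version }}.tar.gz
        --api-key={{ lookup('env', 'GALAXY_API_KEY') }}
      args:
        chdir: "{{ playbook_dir }}"
      # Keep the API key out of the CI log output.
      no_log: true
```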

Until recently, I had been publishing collection releases manually. The process went something like:

Collections signal major shift in Ansible ecosystem

Every successful software project I've worked on reaches a point where architectural changes need to be made to ensure the project's continued success. I've been involved in the Drupal community for over a decade, and have written about the successes and failures resulting from a major rearchitecture in version 8. Apple's Macintosh OS had two major failed rewrites which were ultimately scrapped as Apple moved on to Mac OS X.

It's a common theme, and because change is hard, the first response to a major shift in a software project is often negative. Distrust of the project's stewards and anger about a voice not being heard are two common reactions. Even though it has nothing to do with the change (which was being discussed 3 years ago), the acquisition of Red Hat by IBM last year didn't do anything to assuage conspiracy theorists!

The Kubernetes Collection for Ansible

Opera-bull with Ansible bull looking on

The Ansible community has long been a victim of its own success. Since I got started with Ansible in 2013, the growth in the number of Ansible modules and plugins has been astronomical. That's what happens when you build a very simple but powerful tool—easy enough for anyone to extend into any automation use case.

When I started, I remember writing in Ansible for DevOps about 'hundreds' of modules—at the time, mostly covering Linux administration use cases. Today there are many thousands, covering Linux and Windows server administration, network automation, security automation, and even stranger use cases.

Jan-Piet Mens summed it up succinctly in a blog post last year, titled I care about Ansible:

In my opinion they’re being inundated.

Ansible for Kubernetes, my second self-published book

Ansible for Kubernetes book cover - by Jeff Geerling

Five years ago, I set out to write a book. For a topic, I picked Ansible, since I was familiar with the software, and noticed there weren't any other books about it. I struck gold with Ansible for DevOps, and have since sold over 22,000 copies across the eBook and paperback editions.

I've written about self-publishing before, and my opinion about publishing technical works is stronger than ever:

I wrote an entire article (Self-Publish, don't write for a Publisher) on the first topic. Regarding the second topic, I see writing a technical book on the same plane as building a software project:

How to idempotently change file attributes (e.g. immutable) with Ansible

I recently needed to force the /etc/resolv.conf file to be immutable on a set of CentOS servers, since the upstream provider's DHCP server was giving me a poorly-running set of default DNS servers, which were getting written to the resolv.conf file on every reboot.

There are a few different ways to force your own DNS servers (and override DHCP), but one of the simplest, at least for my use case, is to change the file attributes on /etc/resolv.conf to make the file immutable (unable to be overwritten, e.g. by the network service's DHCP on reboot).

Typically you would do this on the command line with:

chattr +i /etc/resolv.conf

And Ansible's file module has an attributes (alias: attr) parameter which allows the setting of attributes. For example, to set the attributes to i, you would use a task like:
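A task of the kind described above might look like the following. This is a sketch written from the file module's documented path and attributes parameters, not a copy of the task in the full post; the task name is made up for illustration.

```yaml
# Sketch only: set the attribute string on /etc/resolv.conf to "i" (immutable).
- name: Ensure /etc/resolv.conf is immutable.
  file:
    path: /etc/resolv.conf
    # The module assumes the "=" operator when none is given, so this
    # requests exactly "i". Writing "+i" instead would add the immutable
    # flag and leave any other attributes on the file alone.
    attributes: i
```

Because the module compares the requested string against what lsattr reports, the operator you choose can affect whether the task reports a change on repeat runs, so it is worth running the play twice to confirm it behaves idempotently on your systems.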

Run Ansible Tower or AWX in Kubernetes or OpenShift with the Tower Operator

Note: The Tower Operator this post references is currently in early alpha status, and has no official support from Red Hat. If you are planning on using Tower for production and have a Red Hat Ansible Automation subscription, you should use one of the official Tower installation methods. Someday the operator may become a supported install method, but it is not right now.

I have been building a variety of Kubernetes Operators using the Operator SDK. Operators make managing applications in Kubernetes (and OpenShift/OCP) clusters very easy, because you can capture the entire application lifecycle in the Operator's logic.

AWX Tower Operator SDK built with Ansible for Kubernetes