cache

Clearing Cloudflare and Nginx caches with Ansible

Since being DDoS continuously earlier this year, I've set up extra caching in front of my site. Originally I just had Nginx's proxy cache, but that topped out around 100 Mbps of continuous bandwidth and maybe 5-10,000 requests per second on my little DigitalOcean VPS.

So then I added Cloudflare's proxy caching service on top, and now I've been able to handle months with 5-10 TB of traffic (with multiple spikes of hundreds of mbps per second).

That's great, but caching comes with a tradeoff—any time I post a new article, update an old one, or a post receives a comment, it can take anywhere between 10-30 minutes before that change is reflected for end users.

I used to use Varnish, and with Varnish, you could configure cache purges directly from Drupal, so if any operation occurred that would invalidate cached content, Drupal could easily purge just that content from Varnish's cache.

Nginx serving up the wrong site content for a Drupal multisite install with https

I had a 'fun' and puzzling scenario present itself recently as I finished moving more of my Drupal multisite installations over to HTTPS using Let's Encrypt certificates. I've been running this website—along with six other Drupal 7 sites—on an Nginx installation for years. A few of the multisite installs use bare domains, (e.g. jeffgeerling.com instead of www. jeffgeerling.com), and because of that, I have some http redirects on Nginx to make sure people always end up on the canonical domain (e.g. example.com instead of www. example.com).

My Nginx configuration is spread across multiple .conf files, e.g.:

Dealing with Drupal 8 and a giant cache_render table

There are a number of scenarios in Drupal 8 where you might notice your MySQL database size starts growing incredibly fast, even if you're not adding any content. Most often, in my experience, the problem stems from a exponentially-increasing-in-size cache_render table. I've had enough private conversations about this issue that I figure I'd write this blog post to cover common scenarios, as well as short and long-term fixes if you run into this issue.

Consider the following scenarios I've seen where a cache_render table increased to 10, 50, 100 GB or more:

How to attach a CSS or JS library to a View in Drupal 8

File this one under the 'it's obvious, but only after you've done it' category—I needed to attach a CSS library to a view in Drupal 8 via a custom module so that, wherever the view displayed on the site, the custom CSS file from my module was attached. The process for CSS and JS libraries is pretty much identical, but here's how I added a CSS file as a library, and made sure it was attached to my view:

Add the CSS file as a library

In Drupal 8, drupal_add_css(), drupal_add_js(), and drupal_add_library() were removed (for various reasons), and now, to attach CSS or JS assets to views, nodes, etc., you need to use Drupal's #attached functionality to 'attach' assets (like CSS and JS) to rendered elements on the page.

In my custom module (custom.module) or custom theme (custom.theme), I added the CSS file css/custom_view.css:

Use Drupal 8 Cache Tags with Varnish and Purge

Varnish cache hit in Drupal 8

Over the past few months, I've been reading about BigPipe, Cache Tags, Dynamic Page Cache, and all the other amazing-sounding new features for performance in Drupal 8. I'm working on a blog post that more comprehensively compares and contrasts Drupal 8's performance with Drupal 7, but that's a topic for another day. In this post, I'll focus on cache tags in Drupal 8, and particularly their use with Varnish to make cached content expiration much easier than it ever was in Drupal 7.

Always getting X-Drupal-Cache: MISS? Check for messages

I spent about an hour yesterday debugging a Varnish page caching issue. I combed the site configuration and code for anything that might be setting cache to 0 (effectively disabling caching), I checked and re-checked the /admin/config/development/performance settings, verifying the 'Expiration of cached pages' (page_cache_maximum_age) had a non-zero value and that the 'Cache pages for anonymous users' checkbox was checked.

After scratching my head a while, I realized that the headers I was seeing when using curl --head [url] were specified as the defaults in drupal_page_header(), and were triggered any time there was a message displayed on the page (e.g. via drupal_set_message()):

X-Drupal-Cache: MISS
Expires: Sun, 19 Nov 1978 05:00:00 GMT
Cache-Control: no-cache, must-revalidate, post-check=0, pre-check=0
X-Content-Type-Options: nosniff

Boost Expire module being deprecated; how to switch to Cache Expiration

BoostI'm a huge fan of Boost for Drupal; the module generates static HTML pages for nodes and other pages on your Drupal site so Apache can serve anonymous visitors the static pages without touching PHP or Drupal, thus allowing a normal web server (especially on cheaper shared hosting) to serve thousands instead of tens of visitors per second (or worse!).

For Drupal 7, though, Boost was rewritten and substantially simplified. This was great in that it made Boost more stable, faster, and easier to configure, but it also meant that the integrated cache expiration functionality was dumbed down and didn't really exist at all for a long time. I wrote the Boost Expire module to make it easy for sites using Boost to have the static HTML cache cleared when someone created, updated, or deleted a node or comment, among other things.

APC Caching to Dramatically Reduce MySQL traffic

One Drupal site I manage has seen MySQL data throughput numbers rising constantly for the past year or so, and the site's page generation times have become progressively slower. After profiling the code with XHProf and monitoring query times on a staging server using Devel's query log, I found that there were a few queries that were running on pretty much every page load, grabbing data from cache tables with 5-10 MB in certain rows.

The two main culprits were cache_views and cache_field. These two tables alone contained more than 16MB of data, which was queried on almost every page request. There's an issue on drupal.org (_field_info_collate_fields() memory usage) to address the poor performance of field info caching for sites with more than a few fields, but I haven't found anything about better views caching strategies.

Knowing that these two tables, along with the system cache table, were queried on almost every page request, I decided I needed a way to cache the data so MySQL didn't have to spend so much time passing the cached data back to Drupal. Can you guess, in the following graph, when I started caching these things?

MySQL Throughput graph - munin

From OSC: Caching a Page; Saving a Server

I posted a story over on Open Source Catholic today concerning page caching and its importance for saving a server under a heavy load (read: the slashdot effect). You can save a lot of resources on your server by not only using built-in page caching on your favorite CMS, but also exploring further options (for Drupal, there's Boost (read our case study on Boost); for WordPress, there's WP Super Cache). From OSC:

A couple months ago, the Archdiocese of Saint Louis announced that a new Archbishop had been chosen (then-Archbishop-elect Robert J. Carlson). For the announcement, the Archdiocese streamed the press conference online, then posted pictures on the St. Louis Review website of the day's events (updated every hour or two).