memcached

Mcrouter Operator - demonstration of K8s Operator SDK usage with Ansible

It wouldn't surprise me if you've never heard of Mcrouter. Described by Facebook as "a memcached protocol router for scaling memcached deployments", it's not the kind of software that everyone needs.

There are many scenarios where a key-value cache is necessary, and in probably 90% of them, running a single Redis or Memcached instance would adequately serve the application's needs. There are more exotic use cases, though, where you need better horizontal scaling and consistency.

Dealing with Drupal 8 and a giant cache_render table

There are a number of scenarios in Drupal 8 where you might notice your MySQL database size starts growing incredibly fast, even if you're not adding any content. Most often, in my experience, the problem stems from a exponentially-increasing-in-size cache_render table. I've had enough private conversations about this issue that I figure I'd write this blog post to cover common scenarios, as well as short and long-term fixes if you run into this issue.

Consider the following scenarios I've seen where a cache_render table increased to 10, 50, 100 GB or more:

Highly-Available PHP infrastructure with Ansible

I just posted a large excerpt from Ansible for DevOps over on the Server Check.in blog: Highly-Available Infrastructure Provisioning and Configuration with Ansible. In it, I describe a simple set of playbooks that configures a highly-available infrastructure primarily for PHP-based websites and web applications, using Varnish, Apache, Memcached, and MySQL, each configured in a way optimal for high-traffic and highly-available sites.

Here's a diagram of the ultimate infrastructure being built:

Highly Available Infrastructure

APC Caching to Dramatically Reduce MySQL traffic

One Drupal site I manage has seen MySQL data throughput numbers rising constantly for the past year or so, and the site's page generation times have become progressively slower. After profiling the code with XHProf and monitoring query times on a staging server using Devel's query log, I found that there were a few queries that were running on pretty much every page load, grabbing data from cache tables with 5-10 MB in certain rows.

The two main culprits were cache_views and cache_field. These two tables alone contained more than 16MB of data, which was queried on almost every page request. There's an issue on drupal.org (_field_info_collate_fields() memory usage) to address the poor performance of field info caching for sites with more than a few fields, but I haven't found anything about better views caching strategies.

Knowing that these two tables, along with the system cache table, were queried on almost every page request, I decided I needed a way to cache the data so MySQL didn't have to spend so much time passing the cached data back to Drupal. Can you guess, in the following graph, when I started caching these things?

MySQL Throughput graph - munin