Introducing the Dramble - Raspberry Pi 2 cluster running Drupal 8

Dramble - 6 Raspberry Pi 2 model Bs running Drupal 8 on a cluster
Version 0.9.3 of the Dramble—running Drupal 8 on 6 Raspberry Pis

I've been tinkering with computers since I was a kid, but in the past ten or so years, mainstream computing has become more and more locked down, enclosed, lightweight, and, well, polished. I even wrote a blog post about how, nowadays, most computers are amazing. Long gone are the days when I had to worry about line voltage, IRQ settings, diagnosing bad capacitors, and replacing 40-pin cables that went bad!

But I'm always tempted back into my earlier years of more hardware-oriented hacking when I pull out one of my Raspberry Pi B+/A+ or Arduino Unos. These devices are as raw of modern computers as you can get—requiring you to actual touch the silicone chips and pins to be able to even use the devices. I've been building a temperature monitoring network that's based around a Node.js/Express app using Pis and Arduinos placed around my house. I've also been working a lot lately on a project that incorporates three of my current favorite technologies: The Raspberry Pi 2 model B (just announced earlier this month), Ansible, and Drupal!

In short, I'm building a cluster of Raspberry Pis, and designating it a 'Dramble'—a 'bramble' of Raspberry Pis running Drupal 8.

Getting Gigabit Networking on a Raspberry Pi 2, 3 and B+

tl;dr You can get Gigabit networking working on any current Raspberry Pi (A+, B+, Pi 2 model B, Pi 3 model B), and you can increase the throughput to at least 300+ Mbps (up from the standard 100 Mbps connection via built-in Ethernet).

Note about model 3 B+: The Raspberry Pi model 3 B+ includes a Gigabit wired LAN adapter onboard—though it's still hampered by the USB 2.0 bus speed (so in real world use you get ~224 Mbps instead of ~950 Mbps). So if you have a 3 B+, there's no need to buy an external USB Gigabit adapter if you want to max out the wired networking speed!

I received a shipment of some Raspberry Pi 2 model B computers for a project I'm working on (more on that to come!), and as part of my project, I've been performing a ton of benchmarks on every aspect of the 2, B+, and A+ Pis I have on hand—CPU, disk (microSD), external SSD, external HDD, memory, and networking.

NFS, rsync, and shared folder performance in Vagrant VMs

It's been a well-known fact that using native VirtualBox or VMWare shared folders is a terrible idea if you're developing a Drupal site (or some other site that uses thousands of files in hundreds of folders). The most common recommendation is to switch to NFS for shared folders.

NFS shared folders are a decent solution, and using NFS does indeed speed up performance quite a bit (usually on the order of 20-50x for a file-heavy framework like Drupal!). However, it has it's downsides: it requires extra effort to get running on Windows, requires NFS support inside the VM (not all Vagrant base boxes provide support by default), and is not actually all that fast—in comparison to native filesystem performance.

I was developing a relatively large Drupal site lately, with over 200 modules enabled, meaning there were literally thousands of files and hundreds of directories that Drupal would end up scanning/including on every page request. For some reason, even simple pages like admin forms would take 2+ seconds to load, and digging into the situation with XHProf, I found a likely culprit:

Diagnosing Disk I/O issues: swapping, high IO wait, congestion

One one small LEMP VPS I manage, I noticed munin graphs that showed anywhere between 5-50 MB/second of disk IO. Since the VM has an SSD instead of traditional spinning hard drive, performance wasn't too bad, but all that disk I/O definitely slowed things down.

I wanted to figure out what was the source of all the disk I/O, so I used the following techniques to narrow down the culprit (spoilers: it was MySQL, which was using some swap space because it was tuned to use a little too much memory).


First up was iotop, a handy top-like utility for monitoring disk IO in real-time. Install it via yum or apt, then run it with the command sudo iotop -ao to see an aggregated summary of disk IO over the course of the utility's run. I let it sit for a few minutes, then checked back in to find:

rsync in Vagrant 1.5 improves file performance and Windows usage

I've been using Vagrant for almost all development projects for the past two years, and for projects where I'm the only developer, Vagrant + VirtualBox has worked great, since I'm on a Mac. I usually use NFS shared folders so I can keep project data (Git/SVN repositories, assets, etc.) on my local computer, and share them to a folder on the VM, and not suffer the performance penalty of using VirtualBox's native shared folders.

However, this solution only scaled well to other Mac and Linux users with whom I shared development responsibilities. Windows users were left in a bit of a lurch. To extend an olive branch, I hackishly added SMB support by installing and configuring an SMB share from within the VM only on windows hosts, so Windows devs could mount the SMB share and work on files in their native editors.

Vagrant - NFS shared folders for Mac/Linux hosts, Samba shares for Windows

[Edit: I'm not using rsync shared folders (a new feature in 1.5+) instead of SMB/NFS - please see this post for more info: rsync in Vagrant 1.5 improves file performance and Windows usage].

[Edit 2: Some people have reported success using the vagrant-winnfsd plugin to use NFS in Windows.]

I've been using Vagrant to provision local development and testing VMs for a couple years, and on my Mac, NFS shared folders (which are supported natively by VirtualBox) work great; they're many, many times faster than native shared folders. To set up an NFS share in your Vagrantfile, just make sure the nfs-utils package is installed on the managed VM, and add the following:

    config.vm.synced_folder "~/Sites/shared", "/shared",
      :nfs => !is_windows,
      id: "shared"

Boost Expire module being deprecated; how to switch to Cache Expiration

BoostI'm a huge fan of Boost for Drupal; the module generates static HTML pages for nodes and other pages on your Drupal site so Apache can serve anonymous visitors the static pages without touching PHP or Drupal, thus allowing a normal web server (especially on cheaper shared hosting) to serve thousands instead of tens of visitors per second (or worse!).

For Drupal 7, though, Boost was rewritten and substantially simplified. This was great in that it made Boost more stable, faster, and easier to configure, but it also meant that the integrated cache expiration functionality was dumbed down and didn't really exist at all for a long time. I wrote the Boost Expire module to make it easy for sites using Boost to have the static HTML cache cleared when someone created, updated, or deleted a node or comment, among other things.

Moving Server functionality to Node.js increased per-server capacity by 100x

Just posted a new blog post to the Server blog: Moving functionality to Node.js increased per-server capacity by 100x. Here's a snippet from the post:

One feature that we just finished deploying is a small Node.js application that runs in tandem with Drupal to allow for an incredibly large number of servers and websites to be checked in a fraction of the time that we were checking them using only PHP, cron, and Drupal's Queue API.

If you need to do some potentially slow tasks very often, and they're either network or IO-bound, consider moving those tasks away from Drupal/PHP to a Node.js app. Your server and your overloaded queue will thank you!

Read more.

tl;dr Node.js is awesome for running through a large number of network or IO-bound tasks that would otherwise become burdensome at scale using Drupal's Queue API.

2013 VPS Benchmarks - Linode, Digital Ocean, Hot Drupal

Every year or two, I like to get a good overview of different hosting providers' VPS performance, and from time to time, I move certain websites and services to a new host based on my results.

In the past, I've stuck with Linode for many services (their end-to-end UX, and raw server performance is great!) that weren't intense on disk operations, and Hot Drupal for some sites that required high-performance IO (since Hot Drupal's VPSes use SSDs and are very fast). This year, though, after Digital Ocean jumped into the VPS hosting scene, I decided to give them a look.

Before going further, I thought I'd give a few quick benchmarks from each of the providers; these are all on middle-range plans (1 or 2GB RAM), and with the exception of Linode, the disks are all SSD, so should be super fast:

Disk Performance

Disk Performance Chart

Real User Monitoring (RUM) with Pingdom and Drupal

Edit: There's a module for that™ now: Pingdom RUM. The information below is for historical context only. Use the module instead, since it makes this a heck of a lot simpler.

Pingdom just announced that their Real User Monitoring service is now available for all Pingdom accounts—including monitoring on one site for free accounts!

This is a great opportunity for you to start making page-specific measurements of page load performance on your Drupal site.

To get started, log into your Pingdom account (or create one, if you don't have one already), then click on the "RUM" tab. Add a site for Real User Monitoring, and then Pingdom will give you a <script> tag, which you then need to insert into the markup on your Drupal site's pages.


Subscribe to RSS - performance