drupal 7

Solr for Drupal Developers, Part 3: Testing Solr locally

In earlier Solr for Drupal Developers posts, you learned about Apache Solr and it's history in and integration with Drupal. In this post, I'm going to walk you through a quick guide to getting Apache Solr running on your local workstation so you can test it out with a Drupal site you're working on.

The guide below is for those using Mac or Linux workstations, but if you're using Windows (or even if you run Mac or Linux), you can use Drupal VM instead, which optionally installs Apache Solr alongside Drupal.

As an aside, I am writing this series of blog posts from the perspective of a Drupal developer who has worked with large-scale, highly customized Solr search for Mercy (example), and with a variety of small-to-medium sites who are using Hosted Apache Solr, a service I've been running as part of Midwestern Mac since early 2011.

Installing Apache Solr in a Virtual Machine

Apache Solr can be run directly from any computer that has Java 1.7 or later, so technically you could run it on any modern Mac, Windows, or Linux workstation natively. But to keep your local workstation cleaner, and to save time and hassle (especially if you don't want to kludge your computer with a Java runtime!), this guide will show you how to set up an Apache Solr virtual machine using Vagrant, VirtualBox, and Ansible.

Let's get started:

Drupal on Mothballs - Convert Drupal 6 or 7 sites to static HTML

Drupal.org has an excellent resource page to help you create a static archive of a Drupal site. The page references tools and techniques to take your dynamically-generated Drupal site and turn it into a static HTML site with all the right resources so you can put the site on mothballs.

From time to time, one of Midwestern Mac's hosted sites is no longer updated (e.g. LOLSaints.com), or the event for which the site was created has long since passed (e.g. the 2014 DrupalCamp STL site).

I though I'd document my own workflow for converting typical Drupal 6 and 7 sites to static HTML to be served up on a simple Apache or Nginx web server without PHP, MySQL, or any other special software, since I do a few special things to preserve the original URL alias structure, keep CSS, JS and images in order, and make sure redirections still work properly.

NFS, rsync, and shared folder performance in Vagrant VMs

It's been a well-known fact that using native VirtualBox or VMWare shared folders is a terrible idea if you're developing a Drupal site (or some other site that uses thousands of files in hundreds of folders). The most common recommendation is to switch to NFS for shared folders.

NFS shared folders are a decent solution, and using NFS does indeed speed up performance quite a bit (usually on the order of 20-50x for a file-heavy framework like Drupal!). However, it has it's downsides: it requires extra effort to get running on Windows, requires NFS support inside the VM (not all Vagrant base boxes provide support by default), and is not actually all that fast—in comparison to native filesystem performance.

I was developing a relatively large Drupal site lately, with over 200 modules enabled, meaning there were literally thousands of files and hundreds of directories that Drupal would end up scanning/including on every page request. For some reason, even simple pages like admin forms would take 2+ seconds to load, and digging into the situation with XHProf, I found a likely culprit:

Fixing Drupal Fast - Using Ansible to deploy a security update on many sites

Earlier today, the Drupal Security Team announced SA-CORE-2014-005 - Drupal core - SQL injection, a 'Highly Critical' bug in Drupal 7 core that could result in SQL injection, leading to a whole host of other problems.

While not a regular occurrence, this kind of vulnerability is disclosed from time to time—if not in Drupal core, in some popular contributed module, or in some package you have running on your Internet-connected servers. What's the best way to update your entire infrastructure (all your sites and servers) against a vulnerability like this, and fast? High profile sites could be quickly targeted by criminals, and need to be able to deploy a fix ASAP... and though lower-profile sites may not be immediately targeted, you can bet there will eventually be a malicious bot scanning for vulnerable sites, so these sites need to still apply the fix in a timely manner.

In this blog post, I'll show how I patched all of Midwestern Mac's Drupal 7 sites in less than 5 minutes.

Multi-value spatial search with Solr 4.x and Drupal 7

For some time, Solr 3.x and Drupal 7 have been able to do geospatial search (using the location module, geofield, or other modules that stored latitude and longitude coordinates in Drupal that could be indexed by Apache Solr). Life was good—as long as you only had one location per node!

Sometimes, you may have a node (say a product, or a personality) affiliated with multiple locations. Perhaps you have a hammer that's available in three of your company's stores, or a speaker who is available to speak in two locations. When solr 3.x and Drupal 7 encountered this situation, you would either use a single location value in the index (so the second, third, etc. fields weren't indexed or searched), or if you put multiple values into solr's search index using the LatLonType, solr could throw out unexpected results (sometimes combining the closest latitude and closest longitude to a given point, meaning you get strange search results).

Multisite Apache Solr Search with Domain Access

Using one Apache Solr search core with more than one Drupal website isn't too difficult; you simply use a module like Apache Solr Multisite Search, or a technique like the one mentioned in Nick Veenhof's post, Let's talk Apache Solr Multisite. This kind of technique can save you time (and even money!) so you can use one Hosted Apache Solr subscription with multiple sites. The only caveat: any site using the solr core could see any other site's content (which shouldn't be a problem if you control all the sites and don't expose private data through solr).

There are two ways to make Apache Solr Search Integration work with Domain Access (one of which works similarly to the methods mentioned above for multisite), and which method you use depends on how your site's content is structured.

Hosted Apache Solr for Drupal

Midwestern Mac has been offering Apache Solr hosting for Drupal websites for the past three years, but this service has never been given too much attention or made easy to sign up for and use—until now!

Today we're announcing the re-launch of our service with a new website: Hosted Apache Solr.

Hosted Apache Solr home page - Drupal 7

The website was built on Drupal 7, and uses a custom base theme shared with Server Check.in (our server monitoring service built with Drupal and Node.js). We built a small payment integration module for PayPal subscriptions (though we're considering using Drupal Commerce, so we can use different payment processors more easily), and have built a very simple to use front-end for managing Solr core subscriptions.

Overriding a template file (.tpl.php) from a module

There are many times when a custom module provides functionality that requires a tweaked or radically altered template file, either for a node, a field, a view, or something else.

While it's often a better idea to use a preprocess or alter function to accomplish what you're doing, there are many times where you need to change the markup/structure of the HTML, and modifying a template directly is the only way to do it. In these cases, if you're writing a generic custom module that needs to be shared among different sites with different themes, you can't just throw the modified template into each theme, because you'd have to make sure each of the sites' themes has the same file, and updating it would be a tough proposition.

I like to keep module-based functionality inside modules themselves, so I put all templates that do specific things relating to that module into a 'templates' subdirectory.

Migrating Drupal 7 users from site to site while preserving password hashes

From time to time, I use the incredibly powerful Migrate module to migrate a subset of users from one Drupal 7 site to another.

Setting up the user migration class is pretty straightforward, and there are some great examples out there for the overall process. However, I couldn't find any particular documentation for how to preserve user passwords when migrating users from D7 to D7. It's simple enough to set the 'md5_passwords' boolean for Drupal 6 to Drupal 7 user migrations, so passwords will be updated when a user logs in the first time on the D7 site... but it's not as straightforward if you want to simply move the salted/hashed passwords from D7 to D7.

During the migration, when the user account is saved, Drupal will re-salt and re-hash the already-hashed-and-salted password you pass in through your field mappings, and users will have to reset their passwords to log in again.

To override this behavior, you need to implement the complete() function in your user migration, and manually overwrite the just-saved user account password field:

Boost Expire module being deprecated; how to switch to Cache Expiration

BoostI'm a huge fan of Boost for Drupal; the module generates static HTML pages for nodes and other pages on your Drupal site so Apache can serve anonymous visitors the static pages without touching PHP or Drupal, thus allowing a normal web server (especially on cheaper shared hosting) to serve thousands instead of tens of visitors per second (or worse!).

For Drupal 7, though, Boost was rewritten and substantially simplified. This was great in that it made Boost more stable, faster, and easier to configure, but it also meant that the integrated cache expiration functionality was dumbed down and didn't really exist at all for a long time. I wrote the Boost Expire module to make it easy for sites using Boost to have the static HTML cache cleared when someone created, updated, or deleted a node or comment, among other things.