Quick logrotate example for Apache logs and some gotchas

On one server, I have a custom directory where all the Apache (httpd) error and access logs are written, one set per virtual host. I noticed the folder had grown to multiple gigabytes in size (found using du -h --max-depth=1). For this situation, pretty much every Linux/UNIX system ships a handy utility called logrotate, which is made to keep log files from growing too large. It periodically rotates and optionally compresses the log files and deletes old logs, on a daily, weekly, monthly, or size-based schedule.

For this server, to quickly fix the problem of growing-too-large log files, I added a file 'httpd-custom' at /etc/logrotate.d/httpd-custom, with the following contents:

/home/user/log/httpd/*log
/home/user/log/httpd/*err
{
    rotate 5
    size 25M
    missingok
    notifempty
    sharedscripts
    compress
    postrotate
        /sbin/service httpd reload > /dev/null 2>/dev/null || true
    endscript
}
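
Before trusting a new rules file with real logs, it can help to dry-run it. The sketch below is one way to do that, assuming logrotate is installed: it copies the rules into a throwaway directory (the temp paths and the example-access_log file name are made up for illustration), creates a dummy log just over the 25M threshold, and runs logrotate in debug mode (-d), which only reports what it would do. The sharedscripts and postrotate lines are left out of the copy so nothing touches the running httpd.

```shell
#!/bin/sh
# Dry-run a copy of the rotation rules against a throwaway log file.
# All paths below are hypothetical stand-ins for /home/user/log/httpd.
workdir=$(mktemp -d)
mkdir -p "$workdir/httpd"
chmod 755 "$workdir/httpd"

# Create a dummy log of 26 MiB so the "size 25M" rule applies.
head -c 27262976 /dev/zero > "$workdir/httpd/example-access_log"
log_bytes=$(wc -c < "$workdir/httpd/example-access_log")

# Same rules as above, minus sharedscripts/postrotate.
cat > "$workdir/httpd-custom" <<EOF
$workdir/httpd/*log
$workdir/httpd/*err
{
    rotate 5
    size 25M
    missingok
    notifempty
    compress
}
EOF
chmod 644 "$workdir/httpd-custom"

if command -v logrotate >/dev/null 2>&1; then
    # -d: debug mode, prints the plan and changes nothing.
    # -s: keep the state file out of the system location.
    logrotate -d -s "$workdir/status" "$workdir/httpd-custom" \
        || echo "logrotate reported a problem with the rules"
else
    echo "logrotate is not installed; skipping the dry run"
fi

# Remove the throwaway files.
rm -rf "$workdir"
```

Once the debug output looks right, the same -d flag can be pointed at the real file in /etc/logrotate.d/ (as root) before the next scheduled run picks it up.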

Easily manage Apache VirtualHosts with Ansible and Jinja2

Server Check.in's entire infrastructure is managed via Ansible, which has helped tremendously as the service has grown from one to many servers.

[Image: Ansible Borg Cow]
cowsay and Ansible were made for each other.

One pain point with running Apache servers that host more than one website (using name-based virtual hosts) is that the virtual host configuration files quickly get unwieldy. You have to define the entire <VirtualHost></VirtualHost> block for every domain on the server, and apart from Apache's mod_macro, there's no easy way to define a simple structured array of information and have the vhost definitions built from it.

Apache Kerberos Authentication and basic authentication fallback

Many businesses and organizations use Active Directory or other LDAP-based authentication systems, and many web applications (like Drupal) can easily integrate with them for authentication and user account provisioning.

The Kerberos Module for Apache allows users to be automatically logged into your web application by passing their credentials through behind the scenes. This makes for a seamless user experience: users never need to log into your web application if they are already authenticated on their local machine.

A standard configuration for Kerberos authentication inside your Apache configuration file looks like:

<Directory>
    # By default, allow access to anyone.
    Order allow,deny
    Allow from All
</Directory>

3 Small Tweaks to make Apache fly

Apache is the venerable old-timer in the HTTP server world. There are many younger siblings like Nginx, Lighttpd, and even Node.js, which are often touted as faster, lighter, and more scalable alternatives to Apache.

[Image: Old computer and man]
Apache probably looks like this to many Nginx and Lighty users.

Though many alternatives are more lightweight and can be faster in certain circumstances, Apache offers many benefits (not the least of which is abundant documentation and widespread support) and is still a top-notch web server that can be tuned to fly.

Below I describe a few seemingly innocuous Apache configuration settings that can make a huge difference for your site's performance, and help Apache run as fast as or faster than alternative servers in many circumstances.

Force SSL (https://) for only one virtual host with .htaccess

Many of the servers I help administer host many websites, and every now and then someone wants me to set up a secure (SSL) certificate for one of them. Once the certificate is working in Apache and users can access the site at https://example.com/, they also ask that all traffic originally destined for either http://www.example.com/ or http://example.com/ be routed to the secure site.

This can be slightly tricky if you're using multiple VirtualHosts on the same server, or a multisite installation with something like WordPress or Drupal. If you just add something like the following to an .htaccess file that multiple sites are routed through, ALL sites will be redirected to the https version (which is not what's desired):

RewriteEngine On
RewriteCond %{SERVER_PORT} 80
RewriteRule ^(.*)$ https://example.com/$1 [R,L]

Drupal 6.x and PHP 5.3.x - Date Timezone warnings

This morning, I was presented with quite the conundrum: one of my servers suddenly started having about 4x its normal morning MySQL traffic, and I had no indication as to why this was happening. Traffic to the sites on the server was steady (no spikes), and I couldn't find any problems with any of the sites.

[Image: munin mysql traffic spike]

However, after inspecting the Apache (httpd) error logs for the Drupal 6 sites, I found a ton of PHP warnings on almost all the sites. Something like the following:

Drupal Performance Guide - Drupal and the LAMP/LEMP stack

[Image: LAMP Stack with Drupal - Druplicon, Linux, Apache, MySQL, PHP]

Drupal is a scalable, flexible, and open source content management system that is built to run on a variety of server architectures. The only real requirement is that PHP runs on your system. You can run Linux, Windows, Mac OS X, etc., along with Apache, IIS, nginx, MariaDB, MySQL, PostgreSQL, etc. if you're willing to do a few extra things.

However, the overwhelming majority of Drupal websites use the most popular LAMP stack on the backend: Linux, Apache, MySQL and PHP, or the 'LEMP' variation, with Nginx instead of Apache. This white paper (which is a living document – I'll be updating it as time progresses) provides my thoughts on performance considerations for Drupal on a LAMP stack, but this information can be used for pretty much any system on any server, if you look at the basic principles.


Gzip/mod_deflate not Working? Check your Proxy Server

Recently, I was troubleshooting performance issues on a few different websites, and was stymied by the fact that YSlow repeatedly reported an F for "Compress components with gzip," even though online sites like GIDNetwork's Gzip test were reporting successful Gzipping of text components on the site.

[Image: Gzip Failed]
YSlow results - not very happy.

After scratching my head for a while, I finally figured out the problem, hinted at by a comment on a question on Stack Overflow. Our work's proxy server was blocking the 'Accept-Encoding' HTTP header that is sent along with every file request. This prevented a gzipped transfer of any file, so YSlow gave an F.
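
One way to see the same effect without YSlow is to request a page while advertising gzip support and look for a Content-Encoding header in the reply, once through the proxy and once on a direct connection. The helper below is a small sketch using curl; the function name check_gzip and the example URL are made up for illustration.

```shell
#!/bin/sh
# check_gzip is a hypothetical helper name, not part of any tool.
# It requests a URL while advertising gzip support and prints the
# Content-Encoding response header, if the server sent one.
check_gzip() {
    curl -s -o /dev/null -D - -H 'Accept-Encoding: gzip' "$1" \
        | grep -i '^content-encoding:' \
        || echo "no Content-Encoding header returned"
}

# Example (placeholder URL):
#   check_gzip http://www.example.com/
```

If the header shows up on a direct connection but not through the proxy, the proxy is the likely culprit.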

I set up a secure tunnel (using SSH) from my computer to the web server directly, and then reloaded the page in Firefox, and re-ran YSlow:

Drupal Development Environment on Mac OS X 10.6 - Multisite Capable

I've begun working a lot more with Drupal multisites, as doing so saves a lot of time in certain situations (usually, when you have a large group of sites that use the same kinds of Drupal modules, but need to have separate databases and front-end information).

One problem I've finally overcome is the use of actual domain host names for development (i.e. typing in dev.example.com instead of localhost to get to a site). This is important when doing multisite work, as it lets you use Drupal's built-in multisite capabilities without having to hack your way around using the http://localhost/ url.

Here's what I did to use dev.example.com to access a dev.example.com multisite in a Drupal installation using MAMP (the dev.example.com folder is located within Drupal's /sites/ folder):

Running Apache Benchmarks: Drupal/Joomla core vs. Static Page Cache

I just discovered (after asking about it in the #drupal IRC channel) the wonderful little program ab, included in an Apache installation. This little nugget does one thing, and does it well: It beats the heck out of your server, then tells you how your server did in terms of page serving. I tested a few different configurations on a dedicated, 4-core, 4 GB RAM server from SoftLayer, and used the following two commands:

1. Download the specified URL 1,000 times, with KeepAlive turned off (each request gets a new http connection):

ab -n 1000 -c 5 http://ip.address.of.site/path-to-page.php

2. Download the specified URL 1,000 times, with KeepAlive turned on (thus allowing the connection to be maintained for as many http downloads as you have set in your httpd.conf file):

ab -n 1000 -kc 5 http://ip.address.of.site/path-to-page.php

I ran these tests a few different ways. Here are the results with KeepAlive on, with the pages per second reported by ab listed after each configuration:

  • Drupal - normal page caching turned on, css/js aggregation, 55kb page – 12.5 pages/sec
  • Joomla - no page caching (disabled due to buggy 1.x caching), 65kb page – 8.2 pages/sec
  • Drupal - boost module enabled, serving up the boost-cached file – 3,250 pages/sec
  • Joomla - custom page caching system enabled, serving static html file – 2,600 pages/sec

Speed boost due to caching: roughly 260x faster for Drupal and 317x for Joomla!
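
As a sanity check on the speedup, the ratios can be worked out from the pages/sec figures in the list above. This is just arithmetic on the reported numbers, done with awk because plain sh has no floating-point math; the variable names are made up for illustration.

```shell
#!/bin/sh
# Pages per second reported by ab, taken from the list above.
# Speedup = cached pages/sec divided by uncached pages/sec.
drupal_speedup=$(awk 'BEGIN { printf "%.0f", 3250 / 12.5 }')
joomla_speedup=$(awk 'BEGIN { printf "%.0f", 2600 / 8.2 }')

echo "Drupal: ${drupal_speedup}x, Joomla: ${joomla_speedup}x"
# prints "Drupal: 260x, Joomla: 317x"
```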