Highly-Available Infrastructure Provisioning and Configuration with Ansible

The following is an excerpt from Chapter 8 of Ansible for DevOps, a book on Ansible by Jeff Geerling. The example highlights Ansible's simplicity and flexibility by provisioning and configuring a highly available web application infrastructure on a local Vagrant-managed cloud, DigitalOcean droplets, and Amazon Web Services EC2 instances, with one set of Ansible playbooks.

tl;dr Check out the code on GitHub, and buy the book to learn more about Ansible!

Highly-Available Infrastructure with Ansible

Real-world web applications require redundancy and horizontal scalability with multi-server infrastructure. In the following example, we'll use Ansible to configure a complex infrastructure (illustrated below) on servers provisioned either locally via Vagrant and VirtualBox, or on a set of automatically-provisioned instances running on either DigitalOcean or Amazon Web Services:

Highly-Available Infrastructure.

Varnish acts as a load balancer and reverse proxy, fronting web requests and routing them to the application servers. We could just as easily use something like Nginx or HAProxy, or even a proprietary cloud-based solution like Amazon's Elastic Load Balancer or Linode's NodeBalancer, but for simplicity's sake, and for flexibility in deployment, we'll use Varnish.

Apache and mod_php run a PHP-based application that displays the entire stack's current status and outputs the current server's IP address for load balancing verification.

A Memcached server provides a caching layer that can be used to store and retrieve frequently-accessed objects in lieu of slower database storage.

Two MySQL servers, configured as a master and slave, offer redundant and performant database access; all data will be replicated from the master to the slave, and the slave can also be used as a secondary server for read-only queries to take some load off the master.

Directory Structure

In order to keep our configuration organized, we'll use the following structure for our playbooks and configuration:

lamp-infrastructure/
  inventories/
  playbooks/
    db/
    memcached/
    varnish/
    www/
  provisioners/
  configure.yml
  provision.yml
  requirements.txt
  Vagrantfile

Organizing things this way allows us to focus on each server configuration individually, then build playbooks for provisioning and configuring instances on different hosting providers later. This organization also keeps server playbooks completely independent, so we can modularize and reuse individual server configurations.

Individual Server Playbooks

Let's start building our individual server playbooks (in the playbooks directory). To make our playbooks more efficient, we'll use some contributed Ansible roles on Ansible Galaxy rather than install and configure everything step-by-step. We're going to target CentOS 6.x servers in these playbooks, but only minimal changes would be required to use the playbooks with Ubuntu, Debian, or later versions of CentOS.

Varnish

Create a main.yml file within the playbooks/varnish directory, with the following contents:

---
- hosts: lamp-varnish
  sudo: yes

  vars_files:
    - vars.yml

  roles:
    - geerlingguy.firewall
    - geerlingguy.repo-epel
    - geerlingguy.varnish

  tasks:
    - name: Copy Varnish default.vcl.
      template:
        src: "templates/default.vcl.j2"
        dest: "/etc/varnish/default.vcl"
      notify: restart varnish

We're going to run this playbook on all hosts in the lamp-varnish inventory group (we'll create this later), and we'll run a few simple roles to configure the server:

  • geerlingguy.firewall configures a simple iptables-based firewall using a couple variables defined in vars.yml.
  • geerlingguy.repo-epel adds the EPEL repository (a prerequisite for installing Varnish).
  • geerlingguy.varnish installs and configures Varnish.

Finally, a task copies over a custom default.vcl that configures Varnish, telling it where to find our web servers and how to load balance requests between the servers.

Let's create the two files referenced in the above playbook. First, vars.yml, in the same directory as main.yml:

---
firewall_allowed_tcp_ports:
  - "22"
  - "80"

varnish_use_default_vcl: false

The first variable tells the geerlingguy.firewall role to open TCP ports 22 and 80 for incoming traffic. The second variable tells the geerlingguy.varnish role that we will supply a custom default.vcl for Varnish's configuration.

Create a templates directory inside the playbooks/varnish directory, and inside, create a default.vcl.j2 file. This file will use Jinja2 syntax to build Varnish's custom default.vcl file:

vcl 4.0;

import directors;

{% for host in groups['lamp-www'] %}
backend www{{ loop.index }} {
  .host = "{{ host }}";
  .port = "80";
}
{% endfor %}

sub vcl_init {
  new vdir = directors.random();
{% for host in groups['lamp-www'] %}
  vdir.add_backend(www{{ loop.index }}, 1);
{% endfor %}
}

sub vcl_recv {
  set req.backend_hint = vdir.backend();

  # For testing ONLY; makes sure load balancing is working correctly.
  return (pass);
}

We won't study Varnish's VCL syntax in depth but we'll run through default.vcl and highlight what is being configured:

  1. (1-3) Indicate that we're using version 4.0 of the VCL syntax and import the directors Varnish module (which is used to configure load balancing).
  2. (5-10) Define each web server as a new backend, giving a host and a port through which Varnish can contact it.
  3. (12-17) vcl_init is called when Varnish starts and initializes any required Varnish modules. In this case, we're configuring a load balancer named vdir, and adding each of the www[#] backends we defined earlier so the load balancer can distribute requests among them. We use a random director so we can easily demonstrate Varnish's ability to distribute requests to both app backends, but other load balancing strategies are also available.
  4. (19-24) vcl_recv is called for each request and routes the request through Varnish. In this case, we route the request to the vdir backend defined in vcl_init, and indicate that Varnish should not cache the result.

According to #4, we're actually bypassing Varnish's caching layer, which is not helpful in a typical production environment. If you only need a load balancer without any reverse proxy or caching capabilities, there are better options. However, we need to verify our infrastructure is working as it should. If we used Varnish's caching, Varnish would only ever hit one of our two web servers during normal testing.

In terms of our caching/load balancing layer, this should suffice. For a true production environment, you should remove the final return (pass) and customize default.vcl according to your application's needs.

Apache / PHP

Create a main.yml file within the playbooks/www directory, with the following contents:

---
- hosts: lamp-www
  sudo: yes

  vars_files:
    - vars.yml

  roles:
    - geerlingguy.firewall
    - geerlingguy.repo-epel
    - geerlingguy.apache
    - geerlingguy.php
    - geerlingguy.php-mysql
    - geerlingguy.php-memcached

  tasks:
    - name: Remove the Apache test page.
      file:
        path: /var/www/html/index.html
        state: absent
    - name: Copy our fancy server-specific home page.
      template:
        src: templates/index.php.j2
        dest: /var/www/html/index.php

As with Varnish's configuration, we'll configure a firewall and add the EPEL repository (required for PHP's memcached integration), and we'll also add the following roles:

  • geerlingguy.apache installs and configures the latest available version of the Apache web server.
  • geerlingguy.php installs and configures PHP to run through Apache.
  • geerlingguy.php-mysql adds MySQL support to PHP.
  • geerlingguy.php-memcached adds Memcached support to PHP.

Two final tasks remove the default index.html home page included with Apache, and replace it with our PHP app.

As in the Varnish example, create the two files referenced in the above playbook. First, vars.yml, alongside main.yml:

---
firewall_allowed_tcp_ports:
  - "22"
  - "80"

Create a templates directory inside the playbooks/www directory, and inside, create an index.php.j2 file. This file will use Jinja2 syntax to build a (relatively) simple PHP script to display the health and status of all the servers in our infrastructure:

<?php
/**
 * @file
 * Infrastructure test page.
 *
 * DO NOT use this in production. It is simply a PoC.
 */

$mysql_servers = array(
{% for host in groups['lamp-db'] %}
  '{{ host }}',
{% endfor %}
);
$mysql_results = array();
foreach ($mysql_servers as $host) {
  if ($result = mysql_test_connection($host)) {
    $mysql_results[$host] = '<span style="color: green;">PASS</span>';
    $mysql_results[$host] .= ' (' . $result['status'] . ')';
  }
  else {
    $mysql_results[$host] = '<span style="color: red;">FAIL</span>';
  }
}

// Connect to Memcached.
$memcached_result = '<span style="color: red;">FAIL</span>';
if (class_exists('Memcached')) {
  $memcached = new Memcached;
  $memcached->addServer('{{ groups['lamp-memcached'][0] }}', 11211);

  // Test adding a value to memcached.
  if ($memcached->add('test', 'success', 1)) {
    $result = $memcached->get('test');
    if ($result == 'success') {
      $memcached_result = '<span style="color: green;">PASS</span>';
      $memcached->delete('test');
    }
  }
}

/**
 * Connect to a MySQL server and test the connection.
 *
 * @param string $host
 *   IP Address or hostname of the server.
 *
 * @return array
 *   Array with keys 'success' (bool) and 'status' ('slave' or 'master').
 *   Empty if connection failure.
 */
function mysql_test_connection($host) {
  $username = 'mycompany_user';
  $password = 'secret';
  try {
    $db = new PDO(
      'mysql:host=' . $host . ';dbname=mycompany_database',
      $username,
      $password,
      array(PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION));

    // Query to see if the server is configured as a master or slave.
    $statement = $db->prepare("SELECT variable_value
      FROM information_schema.global_variables
      WHERE variable_name = 'LOG_BIN';");
    $statement->execute();
    $result = $statement->fetch();

    return array(
      'success' => TRUE,
      'status' => ($result[0] == 'ON') ? 'master' : 'slave',
    );
  }
  catch (PDOException $e) {
    return array();
  }
}
?>

<!DOCTYPE html>
<html>
<head>
  <title>Host {{ inventory_hostname }}</title>
  <style>* { font-family: Helvetica, Arial, sans-serif }</style>
</head>
<body>
  <h1>Host {{ inventory_hostname }}</h1>
  <?php foreach ($mysql_results as $host => $result): ?>
    <p>MySQL Connection (<?php print $host; ?>): <?php print $result; ?></p>
  <?php endforeach; ?>
  <p>Memcached Connection: <?php print $memcached_result; ?></p>
</body>
</html>

Don't try transcribing this example manually; you can get the code from this book's repository on GitHub. Visit the ansible-for-devops repository and download the source for index.php.j2.

As this is the heart of the example application we're deploying to the infrastructure, it's necessarily a bit more complex than most examples in the book, but a quick run-through follows:

  • (9-23) Iterate through all the lamp-db MySQL hosts defined in the playbook inventory, test the ability to connect to each, and determine whether each is configured as a master or slave, using the mysql_test_connection() function defined later (41-76).
  • (25-39) Check the first lamp-memcached Memcached host defined in the playbook inventory, confirming the ability to connect and to create, retrieve, and delete a value from the cache.
  • (41-76) Define the mysql_test_connection() function, which tests the ability to connect to a MySQL server and also returns its replication status.
  • (78-91) Print the results of all the MySQL and Memcached tests, along with {{ inventory_hostname }} as the page title, so we can easily see which web server is serving the viewed page.

At this point, the heart of our infrastructure — the application that will test and display the status of all our servers — is ready to go.

Memcached

Compared to the earlier playbooks, the Memcached playbook is quite simple. Create playbooks/memcached/main.yml with the following contents:

---
- hosts: lamp-memcached
  sudo: yes

  vars_files:
    - vars.yml

  roles:
    - geerlingguy.firewall
    - geerlingguy.memcached

As with the other servers, we need to ensure only the required TCP ports are open using the simple geerlingguy.firewall role. Next we install Memcached using the geerlingguy.memcached role.

In our vars.yml file (again, alongside main.yml), add the following:

---
firewall_allowed_tcp_ports:
  - "22"
firewall_additional_rules:
  - "iptables -A INPUT -p tcp --dport 11211 -s {{ groups['lamp-www'][0] }} -j ACCEPT"
  - "iptables -A INPUT -p tcp --dport 11211 -s {{ groups['lamp-www'][1] }} -j ACCEPT"

We need port 22 open for remote access, and for Memcached, we're adding manual iptables rules to allow access on port 11211 for the web servers only. We add one rule per lamp-www server by drilling down into each item in the generated groups variable that Ansible uses to track all inventory groups currently available.

The principle of least privilege "requires that in a particular abstraction layer of a computing environment, every module ... must be able to access only the information and resources that are necessary for its legitimate purpose" (Source: Wikipedia). Always restrict services and ports to only those servers or users that need access!

MySQL

The MySQL configuration is more complex than the other servers because we need to configure MySQL users per-host and configure replication. Because we want to maintain an independent and flexible playbook, we also need to dynamically create some variables so MySQL will get the right server addresses in any potential environment.

Let's first create the main playbook, playbooks/db/main.yml:

---
- hosts: lamp-db
  sudo: yes

  vars_files:
    - vars.yml

  pre_tasks:
    - name: Create dynamic MySQL variables.
      set_fact:
        mysql_users:
          - {
            name: mycompany_user,
            host: "{{ groups['lamp-www'][0] }}",
            password: secret,
            priv: "*.*:SELECT"
          }
          - {
            name: mycompany_user,
            host: "{{ groups['lamp-www'][1] }}",
            password: secret,
            priv: "*.*:SELECT"
          }
        mysql_replication_master: "{{ groups['a4d.lamp.db.1'][0] }}"

  roles:
    - geerlingguy.firewall
    - geerlingguy.mysql

Most of the playbook is straightforward, but in this instance, we're using set_fact as a pre_task (to be run before the geerlingguy.firewall and geerlingguy.mysql roles) to dynamically create variables for MySQL configuration.

set_fact allows us to define variables at runtime, so we are guaranteed to have all server IP addresses available, even if the servers were freshly provisioned at the beginning of the playbook's run. We'll create two variables:

  • mysql_users is a list of users the geerlingguy.mysql role will create when it runs. This variable will be used on all database servers so both lamp-www servers get SELECT privileges on all databases.
  • mysql_replication_master is used to indicate to the geerlingguy.mysql role which database server is the master; it will perform certain steps differently depending on whether the server being configured is a master or slave, and ensure that all the slaves are configured to replicate data from the master.

We'll need a few other normal variables to configure MySQL, so we'll add them alongside the firewall variable in playbooks/db/vars.yml:

---
firewall_allowed_tcp_ports:
  - "22"
  - "3306"

mysql_replication_user: {name: 'replication', password: 'secret'}
mysql_databases:
  - { name: mycompany_database, collation: utf8_general_ci, encoding: utf8 }

We're opening port 3306 to anyone, but according to the principle of least privilege discussed earlier, you would be justified in restricting this port to only the servers and users that need access to MySQL (similar to the Memcached server configuration). In this case, the attack vector is mitigated because MySQL's own authentication layer is used through the mysql_users variable generated in main.yml.
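
If you did want to lock port 3306 down, here's a minimal sketch of the firewall portion of playbooks/db/vars.yml, mirroring the approach used for the Memcached server (this restriction is an assumption for illustration; the example in this chapter leaves 3306 open, and note the database servers still need to reach the master on 3306 for replication):

---
firewall_allowed_tcp_ports:
  - "22"
firewall_additional_rules:
  # Allow MySQL access from the web servers and the other database server only.
  - "iptables -A INPUT -p tcp --dport 3306 -s {{ groups['lamp-www'][0] }} -j ACCEPT"
  - "iptables -A INPUT -p tcp --dport 3306 -s {{ groups['lamp-www'][1] }} -j ACCEPT"
  - "iptables -A INPUT -p tcp --dport 3306 -s {{ groups['lamp-db'][0] }} -j ACCEPT"
  - "iptables -A INPUT -p tcp --dport 3306 -s {{ groups['lamp-db'][1] }} -j ACCEPT"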

The vars.yml file also defines two MySQL variables: mysql_replication_user, the account used for master-slave replication, and mysql_databases, a list of databases that will be created (if they don't already exist) on the database servers.

With the configuration of the database servers complete, the server-specific playbooks are ready to go.

Main Playbook for Configuring All Servers

A simple playbook including each of the group-specific playbooks is all we need for the overall configuration to take place. Create configure.yml in the project's root directory, with the following contents:

---
- include: playbooks/varnish/main.yml
- include: playbooks/www/main.yml
- include: playbooks/db/main.yml
- include: playbooks/memcached/main.yml

At this point, if you had some already-booted servers and statically defined inventory groups like lamp-www, lamp-db, etc., you could run ansible-playbook configure.yml and you'd have a full HA infrastructure at the ready!

But we're going to continue to make our playbooks more flexible and useful.

Getting the required roles

Ansible allows you to define all the required Ansible Galaxy roles for a given project in a requirements.txt file. Instead of having to remember to run ansible-galaxy install [role1] [role2] [role3] for each of the roles we're using, we can create requirements.txt in the root of our project, with the following contents:

geerlingguy.firewall
geerlingguy.repo-epel
geerlingguy.varnish
geerlingguy.apache
geerlingguy.php
geerlingguy.php-mysql
geerlingguy.php-memcached
geerlingguy.mysql
geerlingguy.memcached

To make sure all the required dependencies are available, just run ansible-galaxy install -r requirements.txt from within the project's root.

Ansible 1.8 and greater provide more flexibility in requirements files. If you use a YAML file (e.g. requirements.yml) to define a structured list of all the roles you need, you can source them from Ansible Galaxy, a git repository, a web-accessible URL (as a .tar.gz), or even a Mercurial repository! See the documentation for Advanced Control over Role Requirements Files.
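
As a rough sketch (not used in this example), a requirements.yml covering a few of the roles above might look like the following; the version pin and the git-sourced entry are hypothetical illustrations of that extra flexibility:

---
# Roles straight from Ansible Galaxy.
- src: geerlingguy.firewall
- src: geerlingguy.varnish
# A role pinned to a specific version (hypothetical pin).
- src: geerlingguy.mysql
  version: 1.0.0
# A role pulled from a git repository instead of Galaxy (hypothetical source).
- src: https://github.com/geerlingguy/ansible-role-memcached.git
  name: geerlingguy.memcached

You would then install everything with ansible-galaxy install -r requirements.yml, just as with the plain-text version.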

Vagrantfile for Local Infrastructure via VirtualBox

As with many other examples in this book, we can use Vagrant and VirtualBox to build and configure the infrastructure locally. This lets us test things as much as we want with zero cost, and usually results in faster testing cycles, since everything is orchestrated over a local private network on a (hopefully) beefy workstation.

Our basic Vagrantfile layout will be something like the following:

  1. Define a base box (in this case, CentOS 6.x) and VM hardware defaults.
  2. Define all the VMs to be built, with VM-specific IP addresses and hostname configurations.
  3. Define the Ansible provisioner along with the last VM, so Ansible can run once at the end of Vagrant's build cycle.

Here's the Vagrantfile in all its glory:

# -*- mode: ruby -*-
# vi: set ft=ruby :

Vagrant.configure("2") do |config|
  # Base VM OS configuration.
  config.vm.box = "geerlingguy/centos6"

  # General VirtualBox VM configuration.
  config.vm.provider :virtualbox do |v|
    v.customize ["modifyvm", :id, "--memory", 512]
    v.customize ["modifyvm", :id, "--cpus", 1]
    v.customize ["modifyvm", :id, "--natdnshostresolver1", "on"]
    v.customize ["modifyvm", :id, "--ioapic", "on"]
  end

  # Varnish.
  config.vm.define "varnish" do |varnish|
    varnish.vm.hostname = "varnish.dev"
    varnish.vm.network :private_network, ip: "192.168.2.2"
  end

  # Apache.
  config.vm.define "www1" do |www1|
    www1.vm.hostname = "www1.dev"
    www1.vm.network :private_network, ip: "192.168.2.3"

    www1.vm.provision "shell",
      inline: "sudo yum update -y"

    www1.vm.provider :virtualbox do |v|
      v.customize ["modifyvm", :id, "--memory", 256]
    end
  end

  # Apache.
  config.vm.define "www2" do |www2|
    www2.vm.hostname = "www2.dev"
    www2.vm.network :private_network, ip: "192.168.2.4"

    www2.vm.provision "shell",
      inline: "sudo yum update -y"

    www2.vm.provider :virtualbox do |v|
      v.customize ["modifyvm", :id, "--memory", 256]
    end
  end

  # MySQL.
  config.vm.define "db1" do |db1|
    db1.vm.hostname = "db1.dev"
    db1.vm.network :private_network, ip: "192.168.2.5"
  end

  # MySQL.
  config.vm.define "db2" do |db2|
    db2.vm.hostname = "db2.dev"
    db2.vm.network :private_network, ip: "192.168.2.6"
  end

  # Memcached.
  config.vm.define "memcached" do |memcached|
    memcached.vm.hostname = "memcached.dev"
    memcached.vm.network :private_network, ip: "192.168.2.7"

    # Run Ansible provisioner once for all VMs at the end.
    memcached.vm.provision "ansible" do |ansible|
      ansible.playbook = "configure.yml"
      ansible.inventory_path = "inventories/vagrant/inventory"
      ansible.limit = "all"
      ansible.extra_vars = {
        ansible_ssh_user: 'vagrant',
        ansible_ssh_private_key_file: "~/.vagrant.d/insecure_private_key"
      }
    end
  end
end

Most of the Vagrantfile is straightforward, and similar to other examples used in this book. The last block of code, which defines the ansible provisioner configuration, contains three extra values that are important for our purposes:

      ansible.inventory_path = "inventories/vagrant/inventory"
      ansible.limit = "all"
      ansible.extra_vars = {
        ansible_ssh_user: 'vagrant',
        ansible_ssh_private_key_file: "~/.vagrant.d/insecure_private_key"
      }
  1. ansible.inventory_path defines an inventory file to be used with the ansible.playbook. You could certainly create a dynamic inventory script for use with Vagrant, but because we know the IP addresses ahead of time, and are expecting a few specially-crafted inventory group names, it's simpler to build the inventory file for Vagrant provisioning by hand (we'll do this next).
  2. ansible.limit is set to all so Vagrant knows it should run the Ansible playbook connected to all VMs, and not just the current VM. You could technically use ansible.limit with a provisioner configuration for each of the individual VMs, and just run the VM-specific playbook through Vagrant, but our live production infrastructure will be using one playbook to configure all the servers, so we'll do the same locally.
  3. ansible.extra_vars contains the vagrant SSH user configuration for Ansible. It's more standard to include these settings in a static inventory file or use Vagrant's automatically-generated inventory file, but it's easiest to set them once for all servers here.

Before running vagrant up to see the fruits of our labor, we need to create an inventory file for Vagrant at inventories/vagrant/inventory:

[lamp-varnish]
192.168.2.2

[lamp-www]
192.168.2.3
192.168.2.4

[a4d.lamp.db.1]
192.168.2.5

[lamp-db]
192.168.2.5
192.168.2.6

[lamp-memcached]
192.168.2.7

Now cd into the project's root directory, run vagrant up, and after ten or fifteen minutes, load http://192.168.2.2/ in your browser. Voila!

Highly Available Infrastructure - Success!

You should see something like the above screenshot; the PHP app simply displays the current app server's IP address, the individual MySQL servers' status, and the Memcached server status. Refresh the page a few times to verify Varnish is distributing requests randomly between the two app servers.

We have local infrastructure development covered, and Ansible makes it easy to use the exact same configuration to build our infrastructure in the cloud.

Provisioner Configuration: DigitalOcean

In Chapter 7, we learned that provisioning and configuring DigitalOcean droplets in an Ansible playbook is fairly simple. Here, we'll take it a step further by provisioning multiple droplets (one for each server in our infrastructure) and dynamically grouping them so we can configure them after they are booted and online.

For the sake of flexibility, let's create a playbook for our DigitalOcean droplets in provisioners/digitalocean.yml. This will allow us to add other provisioner configurations later, alongside the digitalocean.yml playbook. As with our example in Chapter 7, we will use a local connection to provision cloud instances. Begin the playbook with:

---
- hosts: localhost
  connection: local
  gather_facts: false

Next we need to define some metadata to describe each of our droplets. For simplicity's sake, we'll inline the droplets variable in this playbook:

  vars:
    droplets:
      - { name: a4d.lamp.varnish, group: "lamp-varnish" }
      - { name: a4d.lamp.www.1, group: "lamp-www" }
      - { name: a4d.lamp.www.2, group: "lamp-www" }
      - { name: a4d.lamp.db.1, group: "lamp-db" }
      - { name: a4d.lamp.db.2, group: "lamp-db" }
      - { name: a4d.lamp.memcached, group: "lamp-memcached" }

Each droplet is an object with two keys:

  • name: The name of the Droplet for DigitalOcean's listings and Ansible's host inventory.
  • group: The Ansible inventory group for the droplet.

Next we need to add a task to create the droplets, using the droplets list as a guide, and as part of the same task, register each droplet's information in a separate dictionary, created_droplets:

  tasks:
    - name: Provision DigitalOcean droplets.
      digital_ocean:
        state: "{{ item.state | default('present') }}"
        command: droplet
        name: "{{ item.name }}"
        private_networking: yes
        size_id: "{{ item.size | default(66) }}" # 512mb
        image_id: "{{ item.image | default(6372108) }}" # CentOS 6 x64.
        region_id: "{{ item.region | default(4) }}" # NYC2
        ssh_key_ids: "{{ item.ssh_key | default('138954') }}" # geerlingguy
        unique_name: yes
      register: created_droplets
      with_items: droplets

Many of the options (e.g. size_id) are defined as {{ item.property | default('default_value') }}, which allows us to use optional variables per droplet. For any of the defined droplets, we could add size: 72 (or whatever valid value you'd like), and it would override the default value set in the task.

You could specify an SSH public key per droplet, or (as in this instance) use the same key for all hosts by providing a default. In this case, I added an SSH key to my DigitalOcean account, then used the DigitalOcean API to retrieve the key's numeric ID (as described in the previous chapter).
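
For example, a droplet entry with per-droplet overrides might look like the following sketch (the size and SSH key values are hypothetical; droplets without these keys simply fall back to the task's defaults):

    droplets:
      # Hypothetical overrides; omitted keys fall back to the defaults in the task.
      - { name: a4d.lamp.www.1, group: "lamp-www", size: 72, ssh_key: '123456' }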

It's best to use key-based authentication and add at least one SSH key to your DigitalOcean account so Ansible can connect using keys instead of insecure passwords, especially since these instances will be created with only a root account.

We loop through all the defined droplets using with_items: droplets, and as each droplet is created, its metadata (name, IP address, etc.) is added to the registered created_droplets variable. Next, we'll loop through that variable to build our inventory on-the-fly so our configuration applies to the correct servers:

    - name: Add DigitalOcean hosts to their respective inventory groups.
      add_host:
        name: "{{ item.1.droplet.ip_address }}"
        groups: "do,{{ droplets[item.0].group }},{{ item.1.droplet.name }}"
        # You can dynamically add inventory variables per-host.
        ansible_ssh_user: root
        mysql_replication_role: >
          {{ 'master' if (item.1.droplet.name == 'a4d.lamp.db.1')
          else 'slave' }}
        mysql_server_id: "{{ item.0 }}"
      when: item.1.droplet is defined
      with_indexed_items: created_droplets.results

You'll notice a few interesting things happening in this task:

  • This is the first time we've used with_indexed_items. The reason for using this less-common loop feature is to add a sequential and unique mysql_server_id. Though only the MySQL servers need a server ID set, it's simplest to dynamically create the variable for every server, so it's available when needed. with_indexed_items simply sets item.0 to the index of each item and item.1 to the item's value.
  • with_indexed_items also helps us reliably set each droplet's group. Because the v1 DigitalOcean API doesn't support features like tags for Droplets, we need to set up the groups on our own. Using the droplets variable we manually created earlier allows us to set the proper group for a particular droplet.
  • Finally we add inventory variables per-host in add_host by adding the variable name as a key, and the variable value as the key's value. Simple, but powerful!

There are a few different ways you can approach dynamic provisioning and inventory management for your infrastructure, and, especially if you are only targeting one cloud hosting provider, there are ways to avoid using Ansible's more exotic features (e.g. with_indexed_items) and complex if/else conditions. This example is slightly more complex because the playbook is designed to be interchangeable with other, similar provisioning playbooks.
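
If with_indexed_items is unfamiliar, here's a minimal standalone sketch (not part of this project) showing how it exposes each element's index and value:

---
- hosts: localhost
  connection: local
  gather_facts: false

  tasks:
    # item.0 is the element's index (0, 1, ...); item.1 is the element itself.
    - name: Show the index and value of each item.
      debug:
        msg: "Server ID {{ item.0 }} maps to {{ item.1 }}"
      with_indexed_items:
        - a4d.lamp.db.1
        - a4d.lamp.db.2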

The final step in our provisioning is to make sure all the droplets are booted and can be reached via SSH, so at the end of the digitalocean.yml playbook, add another play to be run on hosts in the do group we just defined:

- hosts: do
  remote_user: root

  tasks:
    - name: Wait for port 22 to become available.
      local_action: "wait_for port=22 host={{ inventory_hostname }}"

Once we know port 22 is reachable, we know the droplet is up and ready for configuration.

We're almost ready to provision and configure our entire infrastructure on DigitalOcean, but we need to create one last playbook to tie everything together. Create provision.yml in the project root with the following contents:

---
- include: provisioners/digitalocean.yml
- include: configure.yml

That's it! Now, assuming you set the environment variables DO_CLIENT_ID and DO_API_KEY, you can run $ ansible-playbook provision.yml to provision and configure the infrastructure on DigitalOcean.

The entire process should take about 15 minutes, and once it's complete, you should see something like:

PLAY RECAP *****************************************************************
107.170.27.137             : ok=19   changed=13   unreachable=0    failed=0
107.170.3.23               : ok=13   changed=8    unreachable=0    failed=0
107.170.51.216             : ok=40   changed=18   unreachable=0    failed=0
107.170.54.218             : ok=27   changed=16   unreachable=0    failed=0
162.243.20.29              : ok=24   changed=15   unreachable=0    failed=0
192.241.181.197            : ok=40   changed=18   unreachable=0    failed=0
localhost                  : ok=2    changed=1    unreachable=0    failed=0

Visit the IP address of the varnish server and you should be greeted with a status page similar to the one generated by the Vagrant-based infrastructure:

Highly Available Infrastructure on DigitalOcean.

Because everything in this playbook is idempotent, running $ ansible-playbook provision.yml again should report no changes, and helps you verify that everything is running correctly.

Ansible will also rebuild and reconfigure any droplets that may be missing from your infrastructure. If you're daring, and want to test this feature, just log into your DigitalOcean account, delete one of the droplets just created by this playbook (maybe one of the two app servers), then run the playbook again.

Now that we've tested our infrastructure on DigitalOcean, we can destroy the droplets just as easily (change the state parameter in provisioners/digitalocean.yml to default to 'absent' and run $ ansible-playbook provision.yml again).
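
As a sketch, that one-line change inside the digital_ocean task would look something like this:

    - name: Provision DigitalOcean droplets.
      digital_ocean:
        # Defaulting to 'absent' destroys any droplet not explicitly marked 'present'.
        state: "{{ item.state | default('absent') }}"
        # ...the rest of the task stays the same.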

Next up, we'll build the infrastructure a third time—on Amazon's infrastructure.

Provisioner Configuration: Amazon Web Services (EC2)

For Amazon Web Services, provisioning works slightly differently. Amazon has a broader ecosystem of services surrounding EC2 instances, and for our particular example, we will need to configure security groups prior to provisioning instances.

To begin, create aws.yml inside the provisioners directory, and start the playbook the same way as with DigitalOcean:

---
- hosts: localhost
  connection: local
  gather_facts: false

EC2 instances use security groups as an AWS-level firewall (which operates outside the individual instance's OS).
We will need to define a list of security_groups alongside our EC2 instances. First, the instances:

  vars:
    instances:
      - {
        name: a4d.lamp.varnish,
        group: "lamp-varnish",
        security_group: ["default", "a4d_lamp_http"]
      }
      - {
        name: a4d.lamp.www.1,
        group: "lamp-www",
        security_group: ["default", "a4d_lamp_http"]
      }
      - {
        name: a4d.lamp.www.2,
        group: "lamp-www",
        security_group: ["default", "a4d_lamp_http"]
      }
      - {
        name: a4d.lamp.db.1,
        group: "lamp-db",
        security_group: ["default", "a4d_lamp_db"]
      }
      - {
        name: a4d.lamp.db.2,
        group: "lamp-db",
        security_group: ["default", "a4d_lamp_db"]
      }
      - {
        name: a4d.lamp.memcached,
        group: "lamp-memcached",
        security_group: ["default", "a4d_lamp_memcached"]
      }

Inside the instances variable, each instance is an object with three keys:

  • name: The name of the instance, which we'll use to tag the instance and ensure only one instance is created per name.
  • group: The Ansible inventory group in which the instance should belong.
  • security_group: A list of security groups into which the instance will be placed. The default security group is added to your AWS account upon creation, and has one rule to allow outgoing traffic on any port to any IP address.

If you use AWS exclusively, it would be best to use Auto Scaling groups and change the design of this infrastructure a bit. For this example, we just need to ensure that the six instances we explicitly define are created, so we're using particular names and an exact_count to enforce the 1:1 relationship.

With our instances defined, we'll next define a security_groups variable containing all the required security group configuration for each server:

    security_groups:
      - name: a4d_lamp_http
        rules:
          - { proto: tcp, from_port: 80, to_port: 80, cidr_ip: 0.0.0.0/0 }
          - { proto: tcp, from_port: 22, to_port: 22, cidr_ip: 0.0.0.0/0 }
        rules_egress: []
      - name: a4d_lamp_db
        rules:
          - { proto: tcp, from_port: 3306, to_port: 3306, cidr_ip: 0.0.0.0/0 }
          - { proto: tcp, from_port: 22, to_port: 22, cidr_ip: 0.0.0.0/0 }
        rules_egress: []
      - name: a4d_lamp_memcached
        rules:
          - { proto: tcp, from_port: 11211, to_port: 11211, cidr_ip: 0.0.0.0/0 }
          - { proto: tcp, from_port: 22, to_port: 22, cidr_ip: 0.0.0.0/0 }
        rules_egress: []

Each security group has a name (which was used to identify the security group in the instances list), rules (a list of firewall rules like protocol, ports, and IP ranges to limit incoming traffic), and rules_egress (a list of firewall rules to limit outgoing traffic).

We need three security groups: a4d_lamp_http to open port 80, a4d_lamp_db to open port 3306, and a4d_lamp_memcached to open port 11211.

Now that we have all the data we need to set up security groups and instances, the first task needs to create or verify the existence of the security groups:

  tasks:
    - name: Configure EC2 Security Groups.
      ec2_group:
        name: "{{ item.name }}"
        description: Example EC2 security group for A4D.
        region: "{{ item.region | default('us-west-2') }}" # Oregon
        state: present
        rules: "{{ item.rules }}"
        rules_egress: "{{ item.rules_egress }}"
      with_items: security_groups

The ec2_group module requires a name, region, and rules for each security group. Security groups will be created if they don't exist, modified to match the supplied values if they do exist, or simply verified if they already exist and match the given values.

With the security groups configured, we can provision the defined EC2 instances by looping through instances with the ec2 module:

    - name: Provision EC2 instances.
      ec2:
        key_name: "{{ item.ssh_key | default('jeff_mba_home') }}"
        instance_tags:
          inventory_group: "{{ item.group | default('') }}"
          inventory_host: "{{ item.name | default('') }}"
        group: "{{ item.security_group | default('') }}"
        instance_type: "{{ item.type | default('t2.micro') }}" # Free Tier
        image: "{{ item.image | default('ami-11125e21') }}" # RHEL6 x64 hvm
        region: "{{ item.region | default('us-west-2') }}" # Oregon
        wait: yes
        wait_timeout: 500
        exact_count: 1
        count_tag:
          inventory_group: "{{ item.group | default('') }}"
          inventory_host: "{{ item.name | default('') }}"
      register: created_instances
      with_items: instances

This example is slightly more complex than the DigitalOcean example, and a few parts warrant a deeper look:

  • EC2 allows SSH keys to be defined by name—in my case, I have a key jeff_mba_home in my AWS account. You should set the key_name default to a key that you have in your account.
  • Instance tags are tags that AWS will attach to your instance, for categorization purposes. By giving a list of keys and values, I can then use that list later in the count_tag parameter.
  • t2.micro was used as the default instance type, since it falls within EC2's free tier usage. If you just set up an account and keep all AWS resource usage within free tier limits, you won't be billed anything.
  • exact_count and count_tag work together to ensure AWS provisions only one of each of the instances we defined. The count_tag tells the ec2 module to match the given group + host and then exact_count tells the module to only provision 1 instance. If you wanted to remove all your instances, you could set exact_count to 0 and run the playbook again.

Each provisioned instance will have its metadata added to the registered created_instances variable, which we'll use to build Ansible inventory groups for the server configuration playbooks.

    - name: Add EC2 instances to their respective inventory groups.
      add_host:
        name: "{{ item.1.tagged_instances.0.public_ip }}"
        groups: "aws,{{ item.1.item.group }},{{ item.1.item.name }}"
        # You can dynamically add inventory variables per-host.
        ansible_ssh_user: ec2-user
        mysql_replication_role: >
          {{ 'master' if (item.1.item.name == 'a4d.lamp.db.1')
          else 'slave' }}
        mysql_server_id: "{{ item.0 }}"
      when: item.1.instances is defined
      with_indexed_items: created_instances.results

This add_host example is slightly simpler than the one for DigitalOcean, because each registered result from the ec2 task includes the original item we looped over (e.g. item.1.item.group), which we can reuse when building groups or hostnames. We don't have to use list indexes to fetch group names from the original instances variable.

We still use with_indexed_items so we can use the index to generate a unique ID per server for use in building the MySQL master-slave replication.

The final step in provisioning the EC2 instances is to ensure we can connect to them before continuing, and to set SELinux to permissive mode so the configuration we supply will work correctly.

# Run some general configuration on all AWS hosts.
- hosts: aws
  gather_facts: false

  tasks:
    - name: Wait for port 22 to become available.
      local_action: "wait_for port=22 host={{ inventory_hostname }}"

    - name: Set selinux into 'permissive' mode.
      selinux: policy=targeted state=permissive
      sudo: yes

Since we defined ansible_ssh_user as ec2-user in the dynamically-generated inventory above, we need to ensure the selinux task runs with sudo explicitly.

Now, modify the provision.yml file in the root of the project folder, and change the provisioners include to look like the following:

---
- include: provisioners/aws.yml
- include: configure.yml

Assuming the environment variables AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY are set in your current terminal session, you can run $ ansible-playbook provision.yml to provision and configure the infrastructure on AWS.

The entire process should take about 15 minutes, and once it's complete, you should see something like:

PLAY RECAP *****************************************************************
54.148.100.44              : ok=24   changed=16   unreachable=0    failed=0
54.148.120.23              : ok=40   changed=19   unreachable=0    failed=0
54.148.41.134              : ok=40   changed=19   unreachable=0    failed=0
54.148.56.137              : ok=13   changed=9    unreachable=0    failed=0
54.69.160.32               : ok=27   changed=17   unreachable=0    failed=0
54.69.86.187               : ok=19   changed=14   unreachable=0    failed=0
localhost                  : ok=3    changed=1    unreachable=0    failed=0

Visit the IP address of the varnish server (the first server configured) and you should be greeted with a status page similar to the one generated by the Vagrant- and DigitalOcean-based infrastructure:

Highly Available Infrastructure on AWS EC2.

As with the earlier examples, running ansible-playbook provision.yml again should produce no changes, because everything in this playbook is idempotent. And if one of your instances were terminated, running the playbook again would recreate and reconfigure the instance in a few minutes.

To terminate all the provisioned instances, you can change the exact_count in the ec2 task to 0, and run $ ansible-playbook provision.yml again.
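
As with the DigitalOcean teardown, here's a sketch of that change inside the ec2 task:

    - name: Provision EC2 instances.
      ec2:
        # An exact_count of 0 terminates every instance matching the count_tag.
        exact_count: 0
        # ...the rest of the task stays the same.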

Summary

In the above example, an entire highly-available PHP application infrastructure was defined in a series of short Ansible playbooks, and provisioning configurations were then created to build the infrastructure on local VMs, DigitalOcean droplets, or AWS EC2 instances.

Once you start working on building infrastructure this way — abstracting individual servers, then abstracting cloud provisioning — you'll start to see some of Ansible's true power in being more than just a configuration management tool. Imagine being able to create your own multi-datacenter, multi-provider infrastructure with Ansible and some basic configuration.

While Amazon, DigitalOcean, Rackspace and other hosting providers have their own tooling and unique infrastructure merits, the agility and flexibility afforded by building infrastructure in a provider-agnostic fashion lets you treat hosting providers as commodities, and gives you freedom to build more reliable, performant, and simple application infrastructure.

Even if you plan on running everything within one hosting provider's network (or in a private cloud, or even on a few bare metal servers), Ansible provides deep stack-specific integration so you can do whatever you need to do and manage the provider's services within your playbooks.

You can find the entire contents of this example in the Ansible for DevOps GitHub repository, in the lamp-infrastructure directory.

Purchase Ansible for DevOps on LeanPub, Amazon, or iTunes.
