In a previous post I talked about how testers need to learn virtualization technology, using Vagrant as a good example. Here I want to talk about a related topic: provisioning the virtual machines that are used for testing and development purposes.
The problem of installing software on a booted up system is known as provisioning. This is often the job of shell scripts, configuration management systems, or manual command-line entry. In the context of virtualization, provisioning is the process of setting up a virtual machine so that it can be used for a specific purpose or project. Typically, this involves installing software, configuring the software, managing services running on the machine, and even setting up users and groups on the machine.
Automated provisioning offers you improved parity between development, test and production environments.
Out of the box, Vagrant supports provisioning from shell scripts, Chef, or Puppet. The easiest is no doubt shell scripts. More robust solutions are provided by tools like Chef and Puppet.
The Context
Let’s first recap the context you are operating in here. Vagrant relies on base boxes for the guest virtual machines. These boxes are specifically preconfigured virtual machine images of operating systems which have certain software packages pre-installed and preconfigured.
However, not everything will be pre-installed or preconfigured, which is where you have to step in and create a provisioning approach. Let’s do this via some shell scripting first. Specifically, let’s set up Apache to serve static files from the default Vagrant synced folder. In the previous post, this was done by manually running Python’s SimpleHTTPServer module. Here we’ll configure Apache to automatically start and serve the files from the synced folder.
Setting Up Your VM
If you haven’t followed the steps in the first post, make sure you do the following:
- Install VirtualBox.
- Install Vagrant.
- Create a project directory.
(Reminder: all of this was covered in the previous post.)
In the project directory, run the following command:
$ vagrant init precise64 http://files.vagrantup.com/precise64.box
Edit the created Vagrantfile and uncomment the line that looks like this:
config.vm.network :forwarded_port, guest: 80, host: 8080
Then type the following:
$ vagrant up
That should get you your virtual machine.
Perform Some VM Configuration
Now let’s SSH into the box:
$ vagrant ssh
Keep in mind this is an Ubuntu Linux box, so Apt is the package manager used to install things. So you install Apache just as you would on a “normal” (non-virtualized) Linux box:
$ sudo apt-get update
$ sudo apt-get install apache2
Apache defaults to serving files from the /var/www location. To ease configuration and keep things simple, I’ll set /var/www to be a symbolic link to the default shared folder directory /vagrant. What this will do is allow any files put into the synced folder to be served from Apache by default.
$ sudo rm -rf /var/www
$ sudo ln -fs /vagrant /var/www
If you visit http://localhost:8080 now, you should see a directory listing of the shared folder. If you want details of how this is working, see the previous post, particularly the part about port forwarding.
Using Apache From the Host
Let’s see if we can actually serve some HTML on the Apache web server. To do this, you can create an index.html file on the host machine and then visit the above link again. Exit SSH and create the index file:
$ logout
Create the index.html file in your project folder:
<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Testing Vagrant</title>
</head>

<body>
  <h1>Testing Vagrant</h1>
  <p>This page is stored on the host, but served on the guest.</p>
</body>
</html>
You should now see that page displayed when you visit http://localhost:8080.
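Since this is a check you will repeat after each provisioning approach in this post, it can also be scripted rather than done by eye in a browser. The following is only a minimal sketch: the function name page_has_heading is hypothetical, and the usage shown in the comment assumes curl is installed on your host.

```shell
#!/usr/bin/env bash

# Hypothetical helper: succeeds if the HTML read from stdin contains
# the given text wrapped in an <h1> element on a single line.
page_has_heading() {
  local expected="$1"
  # -F treats the pattern as a fixed string, -q suppresses output.
  grep -qF -- "<h1>${expected}</h1>"
}

# Example use from the host (assumes curl is available there):
#   curl -fsS http://localhost:8080/ | page_has_heading "Testing Vagrant" && echo "page OK"
```

The function only inspects what it is given on stdin, so the same check works against a saved copy of the page as well as against the forwarded port.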
Notice what this does! You are now able to essentially create or modify elements of the web application on your host machine but have those served up via the virtual environment, perhaps similar to how they would be served up in your production environment. Once you get the configuration you want, this will allow developers or testers to work completely in their host environment, without ever having to SSH into the guest machine.
However, what we did here was a purely manual process. What we want is automated provisioning.
Automated Provisioning via Script
Let’s make sure we destroy the machine we created so that we can start from a clean starting point.
$ vagrant destroy
Now we’ll create a shell script that basically does exactly what we just did earlier with the manual steps. In your project directory, create a file called provision.sh that has the following contents:
#!/usr/bin/env bash

apt-get update
apt-get install -y apache2
rm -rf /var/www
ln -fs /vagrant /var/www
This is pretty much exactly what we did before. You don’t need sudo here because Vagrant will run this script as root. I also added a -y flag to the Apache install command. This is done so that apt-get will automatically answer “yes” to any prompts that come up.
Do note that even though this is a shell script, which would normally be associated with Unix-like systems, you should do this even when your host is Windows, because Vagrant runs the script inside the Linux guest.
Now you have to configure Vagrant to actually use this script. Open up your Vagrantfile and put the following line somewhere in it:
config.vm.provision "shell", path: "provision.sh"
With all that in place, let’s recreate the machine:
$ vagrant up
Since we destroyed the machine, Vagrant has to re-create it from the base box image. However, the box itself will not need to be downloaded again.
Once it completes, open your browser and visit http://localhost:8080. You should see the same web page that you saw when you manually set up Apache. Notice that your index.html file was not deleted during the destroy procedure. As covered in the previous post, files in the synced folders are never deleted.
You’ll note that you get a lot of output during the creation and provisioning of the box. With the script approach, you can use standard techniques to essentially nullify that output. For example, you could change provision.sh to look like this:
#!/usr/bin/env bash

apt-get update >/dev/null 2>&1
apt-get install -y apache2 >/dev/null 2>&1
rm -rf /var/www
ln -fs /vagrant /var/www
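Discarding output entirely can make a failed provisioning run hard to diagnose. As an alternative, here is a sketch of my own (not something Vagrant requires) that appends each command's output to a log file and only shows the log when a command fails. The helper name run_quiet and the PROVISION_LOG variable are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical log location; override it by setting PROVISION_LOG.
PROVISION_LOG="${PROVISION_LOG:-/tmp/provision.log}"

# Run a command, appending its stdout and stderr to the log file.
# On failure, print the log to stderr and return the command's exit status.
run_quiet() {
  local status=0
  "$@" >>"$PROVISION_LOG" 2>&1 || status=$?
  if [ "$status" -ne 0 ]; then
    echo "Command failed (exit $status): $*" >&2
    cat "$PROVISION_LOG" >&2
  fi
  return "$status"
}

# Intended use inside provision.sh:
#   run_quiet apt-get update
#   run_quiet apt-get install -y apache2
```

With this approach a successful run stays quiet, while a failing apt-get still leaves you with its full output.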
This has been a pretty simple example. But imagine if you were provisioning an environment that had a web server, an application server, a messaging server, and a database. Further, that environment had specific configurations that required running scripts written in Python and Ruby, thus needing those languages to be installed, and having the configuration scripts executed by each language’s respective interpreter. Imagine how simple your provisioning process just got in that context.
Now let’s do the same steps with two other provisioning tools.
Puppet and Chef
Both Puppet and Chef are provisioning tools which you can use to set up a server for a project. The configuration which determines how the server needs to be set up can be stored within your Vagrant project and shared with teammates through version control. Vagrant has built-in support for both tools: it provides its own interface to them from the host machine, while the tools themselves run on the guest (the precise64 base box used here comes with both pre-installed). This means you can provide some configuration in your Vagrantfile and Vagrant will pass this information to the relevant provisioner on the guest VM.
While both Puppet and Chef use Ruby, you will find that Puppet uses a domain-specific language, which makes it look and feel like its own language, whereas Chef is structured more like Ruby itself. In both cases, however, you most certainly don’t need to be any sort of Ruby expert to use these tools.
Puppet and Chef are idempotent, which means that running Puppet or Chef on a machine multiple times has the same effect as running them only once. Both tools ensure that conditions are met, and if they are not, the tools will perform actions to ensure they are. So, for example, if your environment requires having a specific web server and database set up, Puppet or Chef would install those if they were not already installed. If they were already installed Puppet or Chef would do nothing.
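To make the "check the condition, act only if it is not met" idea concrete, here is a rough shell approximation of it for the symbolic link step from provision.sh. This is only a sketch for illustration, not how Puppet or Chef are implemented, and the function name ensure_symlink is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: make LINK a symbolic link to TARGET, acting only
# when the current state differs from the desired state.
ensure_symlink() {
  local target="$1" link="$2"
  # Desired state already met: report it and change nothing.
  if [ -L "$link" ] && [ "$(readlink "$link")" = "$target" ]; then
    echo "unchanged: $link"
    return 0
  fi
  # Otherwise remove whatever is there (file, directory, or stale link)
  # and create the link.
  rm -rf "$link"
  ln -s "$target" "$link"
  echo "changed: $link -> $target"
}

# Intended use inside provision.sh:
#   ensure_symlink /vagrant /var/www
```

Running the function a second time with the same arguments reports "unchanged" and leaves the filesystem alone, which is the behavior the provisioning tools give you for every resource without your having to write the checks yourself.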
The tools do work very similarly in many ways. For example, both hold information about how a server should be configured. In the case of Puppet, this information is stored in Puppet manifests. In the case of Chef, this information is written in files called recipes that are bundled together into cookbooks. The manifests are written using Puppet’s own language (a domain-specific language implemented in Ruby) while recipes are written as Ruby code files. Puppet compiles its manifest information into a catalog that is specific to the operating system it is being applied to. Chef matches its information to providers which execute the configuration on the operating system it is being applied to. So while there are some differences in terminology and artifacts, you can see that the basic ideas behind the two are the same.
You can use Puppet in standalone mode, which is how Vagrant uses it. Standalone mode means that everything runs from one machine and is sometimes called masterless. Puppet also has client-server capabilities (Puppet Master and Puppet Agent), where you can define the Puppet manifests for all the servers in your environment, on a central host, and it keeps your individual servers at the required level of configuration.
Similarly, you can use what’s called Chef Solo, which is its standalone mode. This is also how Vagrant uses it. Chef Solo means that everything runs from one machine. Chef also has client-server capabilities (Chef Client and Chef Server) where you can define the Chef cookbooks and roles for all the servers in your infrastructure on a central host, and it keeps your individual servers at the required level of configuration.
Provision with Chef
Chef is a tool with a lot of details. I don’t plan on covering many of those in this post. In fact, all I want to do is show you how to take the example we’ve been working with so far and convert it from the shell script to a Chef recipe.
Based on what we’ve done so far, I’m going to assume your Vagrantfile looks like this (with the comments removed):
Vagrant.configure(2) do |config|
  config.vm.box = "precise64"
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"
  config.vm.network "forwarded_port", guest: 80, host: 8080
  config.vm.provision "shell", path: "provision.sh"
end
Replace line 5 with the following line:
config.vm.provision "chef_solo", run_list: ["vagrant_test"]
By default Vagrant will look for cookbooks in the cookbooks directory relative to the project directory. The run_list takes an array of cookbooks. The specific cookbooks are directories within the cookbooks directory. So in this case the cookbook would be vagrant_test. To get this all to work, create the following directory under your project directory: cookbooks/vagrant_test/recipes/. Then in the recipes directory create a file called default.rb with the following contents:
execute "apt-get update"
package "apache2"
execute "rm -rf /var/www"
link "/var/www" do
  to "/vagrant"
end
This recipe matches up with the manual steps that we did with the provision.sh shell script.
To have some clue as to what all this means, understand that Chef uses resources to define the actions and operations that can be performed against the system. Resources are mapped to Chef code, which varies depending on the platform or operating system being used. For example, on an Ubuntu machine the package resource is mapped to apt-get. The code above starts with the execute resource, which instructs Chef to run the apt-get update command. Because the name of the resource (the part provided in quotes after the resource type) is the command to run, that is what will be executed.
You can use the package resource to ensure that, in this case, Apache is installed, and if it isn’t, have it be installed. Finally, the link resource allows you to create symbolic links to the existing files and folders on the filesystem.
With the Vagrantfile updated to use Chef as the provisioning tool and with a cookbook and recipe in place, you can now provision. Make sure that you vagrant destroy if there is a previous guest running so that you can start from a clean slate. Then, run vagrant up as you did before.
Once the process completes, open your browser and visit http://localhost:8080. You should see the same web page that you saw when you provisioned the virtual machine via the shell script.
What I showed you here was running Chef against a local cookbook. A common usage pattern of Chef, however, is using Chef Server where the server is responsible for determining the run list to be used and then the server sends the appropriate cookbooks to be used for provisioning. I won’t cover the details of all that here since that gets more into Chef than usage of Chef with Vagrant but just to give you an idea, your provision line in Vagrantfile would look something like this:
config.vm.provision "chef_client", chef_server_url: "http://company.chefserver.com:3000"
Notice here that “chef_solo” has become “chef_client” and a specific server is being referenced.
Provision with Puppet
Like Chef, Puppet is a tool with a lot of details. I don’t plan on covering many of those in this post but similar to the above provisioning approach with Chef, I want to show you how to take the example and put it in the context of a Puppet manifest.
If you’ve been following along at each point, your Vagrantfile should look like this:
Vagrant.configure(2) do |config|
  config.vm.box = "precise64"
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"
  config.vm.network "forwarded_port", guest: 80, host: 8080
  config.vm.provision "chef_solo", run_list: ["vagrant_test"]
end
To use Puppet, let’s replace line 5 with the following:
config.vm.provision "puppet"
This is using “standalone” Puppet. Next, you have to tell Vagrant where you have put Puppet manifests and modules and specifically what manifest should be executed. Vagrant expects manifests to be in the manifests folder and will run a manifest called default.pp in the manifests folder to kick off the Puppet run. So create the manifests directory under your project folder. Then create a default.pp file in that folder. Within that file put the following:
exec { "apt-get update":
  command => "/usr/bin/apt-get update",
}

package { "apache2":
  require => Exec["apt-get update"],
}

file { "/var/www":
  ensure => link,
  target => "/vagrant",
  force => true,
}
As you can see, there are similarities between the Puppet manifest and the Chef recipe. Puppet provides different resource types which you can use for different aspects of the configuration. For example, in order to run a command on the server you can use the exec resource. The package resource is used to ensure that Apache is installed, and if it isn’t, to install it. The file resource type allows you to create files, folders, and symlinks.
Make sure you use vagrant destroy to get rid of the previous guest configuration. Then run vagrant up. Once Vagrant finishes running, open your browser and visit http://localhost:8080. You should see the same web page you have been checking for each of these provisioning attempts.
In this example, I’m showing you the use of a simple Puppet manifest. Usually in the context of Puppet you’ll have a set of components that are called modules. This makes your provision line in Vagrantfile look something like this:
config.vm.provision "puppet", module_path: "modules"
The module_path is just what it sounds like: it’s the path to the Puppet modules on the host system. Vagrant will then copy all of the modules from that path to the guest machine.
Also, as with Chef Server, Vagrant can do something similar with Puppet by provisioning from a server called the Puppet Master. Just as the Chef Server decides on the appropriate cookbooks, the Puppet Master decides on the appropriate modules and/or manifests and then sends those down to the client to provision the guest machine. In these cases your Vagrantfile would contain a line like this:
config.vm.provision "puppet_server"
There’s a lot more to this but I won’t cover that here since, as with Chef, this gets more into how Puppet works rather than how Puppet works with Vagrant.
Provision with Multiple Provisioners
I do want to note that you can specify multiple provision directives in your Vagrantfile. Consider this example:
Vagrant.configure(2) do |config|
  config.vm.box = "precise64"
  config.vm.box_url = "http://files.vagrantup.com/precise64.box"
  config.vm.provision "shell", inline: "apt-get update"
  config.vm.provision "shell", inline: "apt-get install -y apache2"
  config.vm.provision "chef_solo", run_list: ["my_vagrant_stuff"]
  config.vm.provision "puppet"
end
This would be perfectly fine. Vagrant will provision the guest machine using each provisioner in the order that they are listed. Notice also that I’m using inline scripts here. You can do this when you simply want to execute a single command and don’t want to have a file that specifies that command (or set of commands, as the case may be).
Welcome to Automated Provisioning!
With this post and the previous one, you now have an introduction to three tools: Vagrant, Chef, and Puppet. These are tools that are well worth knowing and hopefully, even with the incredibly simple example I walked you through, you can see the power of these solutions. I’ve shown you Chef and Puppet solely in the context of Vagrant, but both can be installed and utilized entirely independently of Vagrant. In fact, I’ve barely begun to scratch the surface of these tools in terms of all their abilities. However, if you were able to get everything to work, you now have an ecosystem available to you in which you can practice. And this is the case regardless of your operating system of choice.
Beyond those points, learning tools like these, and becoming proficient with them, is a huge step in the direction of utilizing continuous integration tools as well as continuous delivery techniques, such as those found in the emerging DevOps movement, which is taking hold at many organizations.