Wednesday, November 4, 2009

Getting Testjour up and running - part 3

We were ultimately able to reduce our full test suite runtime from 5-12 minutes, depending on hardware, to consistently under 3 1/2 minutes. Now all our developers can test efficiently, and we can scale as our test suite grows. We used a combination of Testjour, Capistrano, and Git to accomplish this.

Using Testjour with Capistrano

I created the following Capistrano recipe excerpt and loaded it in our main deploy file. Our deploy is already split between staging and production, so adding a testing deploy made sense. We can now run all our features in a distributed fashion using only:

cap testing deploy

To run with migrations we do:

cap testing deploy -s migrations=true

To run with migrations and only a subset of the test suite we do:

cap testing deploy -s migrations=true -s features=./features/subfeatures

This recipe excerpt requires a config/testjour.yml configuration file in the following format:

config/testjour.yml
master: testmaster.domain.tld
slaves: testslave1.domain.tld,testslave2.domain.tld,testslave3.domain.tld,testslave4.domain.tld,testslave5.domain.tld,testslave6.domain.tld,testslave7.domain.tld
repository: "ssh://user@testmaster.domain.tld/user/myapp.git"
master_user: user
root_path: /user
dbusername: masterdbusername
dbpassword: masterdbpassword
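
Before running the deploy it can help to confirm the file contains the keys the recipe reads unconditionally. The following is a minimal sketch, not part of the recipe: the helper name missing_testjour_keys and the REQUIRED_KEYS list are my own, chosen on the assumption that master, repository, master_user and root_path are mandatory while slaves and the database credentials are optional (the recipe guards those with conditionals).

```ruby
require 'yaml'

# Keys the recipe excerpt reads without any guard (assumed to be mandatory).
REQUIRED_KEYS = %w[master repository master_user root_path].freeze

# Returns the required keys that are missing or blank in the parsed config.
def missing_testjour_keys(config)
  REQUIRED_KEYS.select { |key| config[key].to_s.strip.empty? }
end

# Sample document in the same format as config/testjour.yml (placeholder hosts).
SAMPLE_CONFIG = YAML.load(<<~YML)
  master: testmaster.domain.tld
  slaves: testslave1.domain.tld,testslave2.domain.tld
  repository: "ssh://user@testmaster.domain.tld/user/myapp.git"
  master_user: user
  root_path: /user
YML

missing = missing_testjour_keys(SAMPLE_CONFIG)
puts missing.empty? ? "config looks complete" : "missing keys: #{missing.join(', ')}"
```

To check a real file, pass YAML.load_file("config/testjour.yml") to the helper instead of the sample.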

lib/testjour.rb


require 'yaml'

# Load the Testjour settings if the configuration file is present.
TestJour_config = Hash.new
TestJour_config.merge! YAML.load_file("config/testjour.yml") if File.exist?("config/testjour.yml")

task :testing do
  # The Testjour master plays every role for the testing deploy.
  role :app, testjour_master
  role :web, testjour_master
  role :db, testjour_master, :primary => true
  set :application, "myapp"
  set :repository, TestJour_config["repository"]
  set :rails_env, 'test'
  set :branch, get_branch
  set :user, TestJour_config["master_user"]
  set :deploy_to, "#{TestJour_config["root_path"]}/#{application}"

  # Push the current local branch to the master's git remote before deploying.
  before :deploy, :role => :app do
    `git push testjour #{get_branch}`
  end

  after :deploy, :role => :app do
    # Rebuild the schema dump only when run with -s migrations=true;
    # the rescue covers the case where the variable is not set.
    begin
      if migrations
        run "cd #{current_path}; rake db:migrate:reset"
        run "cd #{current_path}; mysqldump#{TestJour_config["dbusername"] ? " -u " + TestJour_config["dbusername"] : ""}#{TestJour_config["dbpassword"] ? " --password=" + TestJour_config["dbpassword"] : ""} -n -d myapp_development > #{shared_path}/development_structure.sql"
      end
    rescue
    end

    # Copy the schema dump into the release, then start the distributed run.
    run "/bin/cp #{shared_path}/development_structure.sql #{release_path}/db/development_structure.sql"
    run "cd #{current_path}; testjour #{get_slaves} --max-local-slaves=1 --create-mysql-db --mysql-db-name=myapp_test #{get_features}"
  end
end


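# Name of the currently checked-out git branch (the line marked with "*").
def get_branch
  `git branch` =~ /^\*.(.*)/
  $1
end


def testjour_master
  TestJour_config["master"]
end


# Features path passed with -s features=..., defaulting to ./features.
def get_features
  begin
    features
  rescue
    "./features"
  end
end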


# One --on argument per slave listed in config/testjour.yml.
def get_slaves
  TestJour_config["slaves"].split(',').map { |slave| "--on=testjour://#{slave}/#{TestJour_config['root_path']}/#{application}/current/" }.join(" ") if TestJour_config["slaves"]
end
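
To see what the slave arguments look like without running a deploy, here is a standalone sketch of the same idea. The function name slave_arguments and its parameters are hypothetical. Unlike the recipe helper, it assumes root_path starts with a slash and therefore does not insert another one, and it returns an empty string when no slaves are configured.

```ruby
# Builds one --on argument per slave from a comma-separated host list.
# Assumes root_path begins with "/" (for example "/user").
def slave_arguments(slaves, root_path, application)
  return "" if slaves.nil? || slaves.strip.empty?

  slaves.split(',').map { |slave|
    "--on=testjour://#{slave.strip}#{root_path}/#{application}/current"
  }.join(" ")
end

puts slave_arguments("testslave1.domain.tld,testslave2.domain.tld", "/user", "myapp")
```

Printing the arguments this way makes it easy to paste them into a manual testjour invocation when debugging a single slave.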



Back to part 2

Getting Testjour up and running - part 2

Under the hood

When I set out to figure out how Testjour worked, I struggled to work out how to actually use it, since the existing README is a little out of date and, as I mentioned before, Google turned up nothing useful beyond "hey we're using Testjour and it cut our test time by X". After spending a day digging through the code, I had an epiphany and decided to look at the cucumber features. Voilà! With a couple of well-placed debug statements I was able to figure out the command-line switches and was off to the races. Let's look at a typical usage of testjour:

testjour --on=testjour://slave1/home/user/myapp --on=testjour://slave2/home/user/myapp ./features

As you can see here, testjour is an executable, which means you must first install Testjour on your system. To do this, run "rake install" from your testjour directory. This packages Testjour into a gem and installs it on your system. If you do not run this with sudo or as root, it will prompt you for your sudo/root password since it must install the executable. If you don't have root access, it should be fairly easy to instead run "rake gem", install pkg/testjour-x.x.x.gem, and put the testjour executable in your path.

Testjour rsyncs the directory you run it from to the slave1 and slave2 servers under /home/user/myapp (or whatever directory you specify). For this reason, you will want to ensure the data transfer between your master and slaves is as efficient as possible. Testjour runs:

ssh slave1 testjour --in=/home/user/myapp run:remote http://user@master.domain.tld/original/path

and

ssh slave2 testjour --in=/home/user/myapp run:remote http://user@master.domain.tld/original/path

You'll see that it uses ssh without password authentication. This means you must set up key authentication to all your slaves. A healthy Google search can assist you with this if you have never done this before. You'll also see that the master response URL is automatically constructed from the local user you run testjour from, what it thinks its hostname is, and the path you run testjour from. It is recommended that you add all your slaves to the /etc/hosts file on the master and the master to /etc/hosts on all your slaves.
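
To preview the URL your slaves will be asked to call back to, you can assemble it from the same three ingredients. This is only an illustrative sketch, not Testjour's own code; the function name master_url is hypothetical, and the URL shape is copied from the ssh commands above.

```ruby
require 'etc'
require 'socket'

# Assembles a callback URL from a user, a hostname and a path,
# following the shape of the URL in the ssh commands above.
def master_url(user, host, path)
  "http://#{user}@#{host}#{path}"
end

# Use the local login, the machine's own idea of its hostname,
# and the current working directory.
puts master_url(Etc.getlogin, Socket.gethostname, Dir.pwd)
```

If the hostname printed here is not one your slaves can resolve, that is a sign the /etc/hosts entries mentioned above are needed.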

Once the testjour command executes on the slave, it ends up running:

cd /home/user/myapp
testjour run:slave http://user@master.domain.tld/original/path

which runs

rsync -az -e "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no" --delete --exclude=.git --exclude=*.log --exclude=*.pid user@master.domain.tld:/original/path/ /home/user/myapp

and then starts running cucumber on all the features in ./features.

This is where your master's application is rsync'd to the slave. It does use rsync over SSH, so you will also need to make sure that your master server allows key authentication from all your slaves.
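
If you want to time or debug the sync by hand, it can be convenient to build the same rsync invocation as an argument list. The sketch below simply reassembles the command shown above; the function name rsync_command and its parameters are hypothetical.

```ruby
require 'shellwords'

# Returns the rsync invocation as an argument list, mirroring the
# command Testjour runs on a slave.
def rsync_command(master, source_path, target_path)
  ssh = "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no"
  ["rsync", "-az", "-e", ssh, "--delete",
   "--exclude=.git", "--exclude=*.log", "--exclude=*.pid",
   "#{master}:#{source_path}/", target_path]
end

cmd = rsync_command("user@master.domain.tld", "/original/path", "/home/user/myapp")
# Print a shell-escaped version for copy and paste.
puts Shellwords.join(cmd)
```

Passing the list to system(*cmd) avoids shell quoting problems with the --exclude patterns.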

If you are using SQLite, this should all run fine. Unfortunately, the only server-based database Testjour currently supports is MySQL, so you'll have to modify Testjour if you use PostgreSQL or any other server-based database. To get MySQL working you will need to re-run the original command as follows:

testjour --on=testjour://slave1/home/user/myapp --on=testjour://slave2/home/user/myapp --create-mysql-db ./features

This runs the following commands on your slaves:

/usr/local/mysql/bin/mysqladmin create testrunner_12391987
/usr/local/mysql/bin/mysql testrunner_12391987 < /home/user/myapp/db/development_structure.sql


This requires you to mysqldump your up-to-date development or test database to db/development_structure.sql (mysqldump -u 'username' --password='password' -n -d myapp_development > db/development_structure.sql, where myapp_development is the database name used in the part 3 recipe). You'll also notice that the paths to mysql and mysqladmin are fixed. Since this is not where my executables were located, I simply symlinked the original executables into /usr/local/mysql/bin.
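
The dump command can be assembled with optional credentials, in the same spirit as the part 3 recipe. This is a sketch under assumptions: the function name structure_dump_command is hypothetical and the database name myapp_development is taken from the recipe.

```ruby
# Builds a schema-only mysqldump command (-n: no CREATE DATABASE,
# -d: no row data). Username and password are optional.
def structure_dump_command(database, username = nil, password = nil,
                           output = "db/development_structure.sql")
  parts = ["mysqldump"]
  parts << "-u #{username}" if username
  parts << "--password=#{password}" if password
  parts << "-n -d" << database << "> #{output}"
  parts.join(" ")
end

puts structure_dump_command("myapp_development", "masterdbusername", "masterdbpassword")
```

Leaving out the credentials yields a command that relies on your local MySQL client defaults.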

Another issue you might notice is that Testjour generates a random database name. This was a problem for us since our application requires a specific database name of openphin_test. For this critical reason, I forked Testjour to Dishwasha/testjour and added the --mysql-db-name switch. So we now run the following:

testjour --on=testjour://slave1/home/user/myapp --on=testjour://slave2/home/user/myapp --create-mysql-db --mysql-db-name=my_db_name ./features

The last issue we ran into was that, by default, Testjour runs two instances of cucumber simultaneously on the master. The OpenPHIN project uses bmabey-cleaner to clean the MySQL database before every scenario, and tests can easily clobber each other if they are hitting the same table. Testjour has the switch --max-local-slaves, which sets how many cucumber instances run on the master server. Use --max-local-slaves=1 to include your master server as one of the cucumber processing slaves, or --max-local-slaves=0 to exclude the master so the tests run only on the slaves. You can also use --max-local-slaves=1 without specifying any slaves to run tests on just the master.

testjour --on=testjour://slave1/home/user/myapp --on=testjour://slave2/home/user/myapp --create-mysql-db --mysql-db-name=my_db_name --max-local-slaves=1 ./features


You can also run specific feature files or feature directories by replacing ./features with the appropriate path.
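
The switches discussed in this part combine in a predictable way, which the following sketch tries to capture. It is not part of Testjour; the function name testjour_command and its keyword arguments are hypothetical, --mysql-db-name exists only in the Dishwasha/testjour fork, and the database-creation switch is spelled as in the part 3 recipe.

```ruby
# Assembles a testjour command line from the options covered above.
def testjour_command(slaves: [], db_name: nil, max_local_slaves: nil, features: "./features")
  args = ["testjour"]
  slaves.each { |uri| args << "--on=#{uri}" }
  # Fixed database name (fork only), created on each slave before the run.
  args << "--create-mysql-db" << "--mysql-db-name=#{db_name}" if db_name
  args << "--max-local-slaves=#{max_local_slaves}" unless max_local_slaves.nil?
  args << features
  args.join(" ")
end

# Slaves only: the master runs no cucumber instances itself.
puts testjour_command(slaves: ["testjour://slave1/home/user/myapp"], max_local_slaves: 0)
# Master only: no slaves given, one local cucumber instance.
puts testjour_command(max_local_slaves: 1, features: "./features/subfeatures")
```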



Back to part 1 | Continue to part 3

Getting Testjour up and running - part 1

Introduction

Our OpenPHIN application is heavily tested using RSpec and Cucumber. At first, our tests ran fairly quickly, in under 3 minutes. Now that we are up to 217 scenarios and 3133 step definitions, we have reached a critical point where we run our tests less often during daily development, which leads to more commits without passing tests. This can become a real problem, especially if any kind of continuous integration is used. We were also seeing a wide range of Cucumber run times across our developers, from 5 minutes on a quad-core 3.33GHz machine to 12 minutes on an older MacBook. We had heard about Testjour but found no documentation or blog posts about getting it working, and others hadn't been able to get it to work in their environments. Regardless, we decided it was time to invest in getting Testjour working now to save much more development time later and improve the quality of committed code.

Testjour is a Ruby gem written by Bryan Helmkamp for distributing Cucumber tests across multiple systems. Since Ruby/Cucumber are not threaded, Testjour is useful for spreading tests across multiple processes, whether on a single system or on several. We used one dual quad-core 2.83GHz Xeon blade server with 16GB RAM running VMware ESXi and set up 8 virtual machines for our Testjour environment. Each virtual machine is a single-processor installation with its processor affinity set to its own core, since Testjour is more CPU bound than disk intensive. We found it most efficient to set up the first virtual machine exactly as needed, then copy it to a new virtual machine and change the hostname, IP, sshd key, and user ssh key. With this kind of setup we can scale very quickly by adding slaves as our test suite grows, and all our developers now see the same build time for our test suite.

Testjour uses rsync to synchronize the local repository with its slaves. Since our hardware was not at the same location as our development machines, we decided to create a head server (one of the 8 virtual machines) that would also serve as our Testjour master. We were losing about 1.5 minutes per run to the rsync from our development machine to our Testjour master. Rather than using rsync to reach the Testjour master, we set up a git repository on the master and added it as a remote to our local repositories. This gave us a much more efficient way of transferring the code we are currently working on to the testing environment. In the future we also plan to push git changes to the master and then use rsync to push any further changes that haven't been committed. Currently we must commit to our local branch and then push to the Testjour master remote for the changes to be tested.

Since we already use Capistrano for deployment, we decided to automate our testing with Capistrano as well. This also fit well with using a master testing git repository. Excerpts of our Capistrano recipes are included in part 3.

Testjour was missing one thing that is critical for our project, so I created my own fork, which is located at Dishwasha/testjour. I would recommend using this version when you follow this article. I also added a couple of other things that may not be necessary for your environment.

Wherever testjour is run from (i.e. the master), a redis server must be running. The redis server is used for master/slave communication and must run on the master server. It is not necessary on the slaves. I would recommend copying redis.conf to /etc, changing the daemonize line to yes, and adding "redis-server /etc/redis.conf" to your rc.local. You'll also need to open TCP port 6379, or whatever port you configure in your redis.conf, in your master's firewall configuration to allow slaves to communicate with the master. Both the master and all slaves must have the redis gem installed as well. Run "gem install redis" or get it from ezmobius/redis. You'll need to resolve any other dependencies as well. Once you have that done, you're ready to run testjour.
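
A quick way to confirm that the firewall change worked is to try a plain TCP connection to the redis port from one of the slaves. This is a minimal sketch using only the Ruby standard library; the function name port_open? is hypothetical, and the example checks 127.0.0.1, so substitute your master's hostname when running it from a slave.

```ruby
require 'socket'

# Returns true if a TCP connection to host:port succeeds within the timeout.
def port_open?(host, port, timeout = 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  # Connection refused, unreachable host, DNS failure or timeout.
  false
end

# 6379 is the default redis port mentioned above.
puts port_open?("127.0.0.1", 6379) ? "redis port reachable" : "redis port not reachable"
```

This only shows that something is listening on the port; it does not verify that the listener is actually redis.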

Continue to part 2