You know you might be a nerd if… you spend your free time writing a script to benchmark your Web site and graph the results. I’d like to say that I only spent my free time on this because Kelly and I have been sick and so we didn’t do anything this weekend but sit around, but honestly, I’ve been wanting to do it and I probably would have spent my time on it anyway.

So here’s what I did.

The Apache Web server ships with a tool called ab, which stands for Apache Bench(mark). What it does is simple: You give it a URL, and it hits that Web page as many times as you tell it to, then gives you back all kinds of metrics about how well the page performs.

So I wrote a little script to call the benchmark tool and parse the results out into a CSV file that I could then use in Excel to generate a graph to compare how well different pages perform and how well they scale as traffic increases.

Writing and Using the Script

I originally wrote a Bash script that parsed all of the metrics into a single spreadsheet. It had some fun shell scripting, but ultimately I didn’t need all of the metrics, so I was just making my job in Excel harder. Instead, I reworked it to parse out only the average “Time per request” and throw only that measurement into the spreadsheet. Then graphing was easy.
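To give a feel for the parsing step, here’s a minimal sketch (not the actual script) of pulling the average “Time per request” out of ab’s text report. The method name and the sample numbers are made up for illustration; the line format follows what ab prints, where the first “Time per request” line is tagged “(mean)”.

```ruby
# Pull the mean "Time per request" (in milliseconds) out of ab's text report.
# ab prints two "Time per request" lines; only the first ends in "(mean)",
# the second is averaged across all concurrent requests and is skipped here.
def mean_time_per_request(ab_output)
  match = ab_output.match(/^Time per request:\s+([\d.]+) \[ms\] \(mean\)\s*$/)
  match && match[1].to_f
end

# Made-up excerpt of an ab report, for illustration only.
sample = <<~REPORT
  Requests per second:    812.35 [#/sec] (mean)
  Time per request:       12.310 [ms] (mean)
  Time per request:       1.231 [ms] (mean, across all concurrent requests)
REPORT

puts mean_time_per_request(sample)
```

Returning nil when the line is missing makes it easy to treat a failed or truncated run as “no data” later on.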

The big problem I had was that I was limited by the shared hosting server where I rent Web space. The Bash script would crash once I got to about 400 concurrent page requests, but I wanted to test the performance with more users than that. As a solution, I converted the script to a Ruby program that didn’t use any intermediate files, and I was able to go a lot higher without problems. My server still can’t handle huge numbers of concurrent requests, so it would still fail sometimes, but I had the script handle the failures gracefully.
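One way to run ab from Ruby without intermediate files is to capture its output directly in memory. This is only a sketch under assumptions: the method name and the `ab_command` parameter are hypothetical, and treating every failure as nil is just one reading of “handle the failures gracefully.” The `-n` (total requests) and `-c` (concurrent requests) flags are ab’s own.

```ruby
require "open3"

# Run one ab benchmark and return its text report, or nil if the run failed.
# -n is the total number of requests and -c is how many are sent at once.
# `ab_command` is a hypothetical parameter so the binary can be swapped out.
def run_ab(url, requests:, concurrency:, ab_command: "ab")
  output, status = Open3.capture2e(
    ab_command, "-n", requests.to_s, "-c", concurrency.to_s, url
  )
  status.success? ? output : nil
rescue SystemCallError
  # ab is missing or could not be started; treat it like any other failed run.
  nil
end

# Example call (commented out so the sketch doesn't hit a real server):
# report = run_ab("http://example.com/", requests: 100, concurrency: 10)
```

Because the report comes back as a string, it can go straight to the parsing step with nothing written to disk.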

The end result is a Ruby script that compares the average request time of multiple Web pages as traffic increases. It accepts a list of URLs on the command line and increases the number of concurrent requests up to 1,000. It takes the average of 100 attempts at each concurrency level to try to get accurate readings. If a benchmark attempt fails at any of the concurrency levels, it slowly decreases the sample size and tries again, ultimately just letting Excel interpolate from the nearby data points if it fails with even a single sample. It also increases the concurrency at an increasing rate, so it doesn’t waste time on concurrency levels that aren’t significant. (Basically it increases by one user at a time until it gets to ten, then increases by ten at a time until it gets to 100, at which point it increases by 100 at a time. And so on and so on, if I were to let it run high enough.)
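The stepping and retry logic described above could be sketched roughly like this. It’s an illustration, not the script itself: both method names are invented, the benchmark is passed in as a block so the example runs without ab, and halving the sample size on each failure is an assumption (the real script may shrink it more gradually).

```ruby
# Concurrency levels: 1..10 by ones, then by tens up to 100, then by hundreds.
def concurrency_levels(max = 1_000)
  levels = []
  level = 1
  step = 1
  while level <= max
    levels << level
    step *= 10 if level >= step * 10 # reached the next power of ten
    level += step
  end
  levels
end

# Retry a benchmark with a shrinking sample size until one attempt succeeds.
# The block takes (concurrency, samples) and returns a time in ms, or nil on
# failure. Halving the sample size each time is an assumed schedule.
def measure_with_backoff(concurrency, samples: 100, &benchmark)
  while samples >= 1
    time = benchmark.call(concurrency, samples)
    return time if time
    samples /= 2
  end
  nil # give up and leave a gap for the spreadsheet to interpolate
end

# Stand-in benchmark that "fails" whenever more than 25 samples are requested.
flaky = ->(concurrency, samples) { samples > 25 ? nil : concurrency * 1.5 }

puts concurrency_levels.inspect
puts measure_with_backoff(200, &flaky)
```

With the default maximum, the schedule comes out to 28 levels instead of 1,000, which is where the time savings come from.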

If you’re interested in using it to make your own awesome graphs, you’ll have to download the source code and save it into a .rb file to run it. Obviously you’ll need the Apache benchmark tool installed, also. Then you can type something like this to generate your spreadsheet:

ruby ab-time-chart.rb > google-v-bing.csv

Just an idea.
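As for what ends up in that CSV file, one plausible layout is a row per concurrency level and a column per URL, with failed measurements left as empty cells. The sketch below assumes that layout; the method name, the example URLs, and the numbers are all made up, and it assumes the URLs contain no commas.

```ruby
# Build CSV text with one row per concurrency level and one column per URL.
# `results` maps each URL to a hash of { concurrency => mean ms, or nil }.
def to_csv(results, levels)
  header = ["Concurrency", *results.keys].join(",")
  rows = levels.map do |level|
    cells = results.values.map { |timings| timings[level] }
    [level, *cells].join(",") # a nil timing becomes an empty cell
  end
  [header, *rows].join("\n") + "\n"
end

# Made-up measurements, for illustration only.
results = {
  "http://example.com/a" => { 1 => 10.5, 2 => 11.0 },
  "http://example.com/b" => { 1 => 20.25, 2 => nil }
}

print to_csv(results, [1, 2])
```

Printing to standard output is what lets the `>` redirect in the command above capture the spreadsheet.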

Testing with a Simple Benchmark

I’ve got some ideas of how I’m going to use this, but I needed a simple test while I was running it a ton of times to make sure it worked. So, I thought I’d compare the performance of CGI to FastCGI. I know that’s a no-brainer (the answer is in their names), but this was more about testing my benchmark script than about the actual results.

What’s being tested here is the output of a simple Web page being generated in different ways. The content of the page in each case is simply the current Unix timestamp. The version measured by the blue line is generated by a Ruby script running over CGI, the red line is the same Ruby code running over FastCGI, and the green line is a static HTML version that I threw in just for comparison.

The graph shows the time (in milliseconds) that it took to generate the page as traffic increased. The lower lines mean it’s faster.

Performance and Scalability of HTML, CGI, and FastCGI

No big surprises here. Obviously the static HTML version is fastest, followed by FastCGI, with CGI being the slowest. I was a little surprised by how constant the increase was, especially with FastCGI, but there are no missing data points there. I did notice during my many (many) different runs as I was tweaking my script that a larger sample size generally produced straighter lines. I think they probably scale pretty linearly, and any bumpiness is due to unrelated activity on my server.

So, there you have it. If you’ve managed to read until the end of this post, I congratulate you on finishing my first blog post that really gives details about how big of a computer nerd I am. Sorry in advance, but it probably won’t be the last.