I currently manage a high traffic Image Hoster with 10 million Page-Impressions per day causing high load on the Web Frontend Server and the DB Backend Server for some months now. My budget did not allow me to scale horizontally so I had to optimize the web application by killing a slow MySQL query with the usage of Xcache. Due to the website’s structure I was not able to use the Smarty Caching function as this would easily generate 2 million files and cause high disk i/o.

Pictures are just more than words.. so have a look at the screenshot of the MySQL-Server’s load before and after the optimizations (which went online on 6th October) – its like day and night 😉

Our Web Server uses Lighttpd 1.5 / SVN + PHP-FPM 5.3.3 (guys.. spawn-fgci is deprecated 😉 ) + Xcache (PHP Accelerator and varcache) to deliver static files and dynamic pages which connect to a separate MySQL 5.0 Server.

Unfortunately with a load of 5-10 (8 CPU cores) and 60-100% CPU usage on each core (!!) our MySQL Server was pretty much overloaded 😉 .

A big downside when having bottlenecks in your PHP-Script – usually caused when relying on external resources (like file_get_contents, cURL, massive non-asynchronous DNS-Lookups, MySQL queries, etc.) – is obviously the much higher execution time. This results in having a lot of PHP (or even worse Apache) processes being spawned or in use. You will easily get a over filled backlog or in worse case your Web server will start swapping – either way your website will slow down dramatically and you will lose a lot of visitors.

At first I checked the php-fpm.log.slow for scripts with too long execution times, just to make sure that this was not a PHP problem. There were a lot of scripts hanging during mysql_query() – so it was pretty clear where to look next.

Next I took a look at the MySQL Slow Query log and summarized queries which appeared most of the time. I was able to filter out the following query (simplified):

1
SELECT DISTINCT col1, col2, FROM table WHERE col3 = col4 AND id IN (SELECT id FROM table2 WHERE x = $variable) AND (SELECT id FROM table3 WHERE a = $variable) OR col5 = 1

A query with two sub-SELECTs and DISTICNT did not sound fast to me – especially not in that frequency it was requested – which was the key factor, as querying it on an empty MySQL-Server did not cause any problems .

Before putting the query into Xcache I checked all conditions and figured out that “… OR col5 = 1″ was never true, as currently no data had that value. I decided that if some feature / function based on that condition was not used since years, it will not be needed in future anyway, so I removed it.

Now I was finally ready for Xcache. This is only a very simple example how to cache individual SQL-Queries like

1
SELECT * FROM table WHERE name = '$variable'

in your PHP-Script:

1
2
3
4
5
6
7
8
9
if(xcache_isset("prefix_" . md5($variable)))
{
    $result = xcache_get("prefix_" . md5($variable));
}
else
{
    $result = mysql_query("SELECT * FROM table WHERE name = '$variable'");
    xcache_set("prefix_" . md5($variable), $result, (60 * 60 * 6));
}

So from now on, every SQL-Query will only be done once every 6 hours. Remember: we don’t want to fill up our Xcache for no reason.. so try only to SELECT columns which we really need. The biggest advantage with Xcache in contrary to other caching systems (e.g. Smarty Cache) – it has a garbage collector! So you don’t need to worry about zombie cache entries. Just try not to go out of memory, e.g. assign enough memory for your needs in the php.ini under the xcache section.

And set a reasonable time-to-live, not too short so enough data gets cached and load goes down, but not too long which could cause a too high memory usage.

Thats all 🙂

I was able to lower the load and CPU usage of our MySQL server by approx. 850%! How about you, were you able to optimize your website? Show off your awesomeness! I did by committing with following comment into svn 🙂