Earlier in the year I was part of the team that launched a new site for my employer, moving from a predominantly flat HTML site with a little PHP to one using Drupal 7. Here I will go through some of the performance issues we encountered and the steps that I led to improve site performance over time and as new features were added.
Please note that this is not my own site, so at times I don’t go into much detail on numbers.
Out of the box, a Drupal 7 site with any level of complexity will, in my experience, perform pretty poorly in terms of page speed and resource usage. This site had a number of particular characteristics that made performance a problem and, in some cases, limited the responses available to us:
- Complex content types, templates & views that we had to use to display our content.
- A complicated user setup with 4 ways of user authentication and a significant amount of premium content that requires the user to be authenticated rather than anonymous.
- Fortnightly peaks in readership from a new issue being released.
XHProf is great for looking at bottlenecks in particular scenarios, and I also found Siege a good tool for some load testing. The MySQL tuner script gave some great clues, along with the slow query log.
But I must say a great investment was using New Relic for monitoring. Their web app made highlighting problem areas much quicker, with the added benefit of a quick drill-down to help identify the cause.
Before we went live, the obvious options were turned on: page caching for anonymous users, along with concatenation and compression of CSS. Initially there were issues with the JS that required some code changes before concatenation and compression could be turned on for it as well.
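As a rough sketch of those pre-launch switches: the options on Drupal 7's Performance admin page can also be pinned in settings.php. The variable names below are the Drupal 7 core ones, but the values are illustrative assumptions rather than the actual configuration of this site.

```php
// settings.php (fragment). Values are illustrative assumptions.

// Cache whole pages for anonymous users.
$conf['cache'] = 1;
// Serve cached pages gzip-compressed when the browser supports it.
$conf['page_compression'] = 1;
// Aggregate and compress CSS files.
$conf['preprocess_css'] = 1;
// Aggregate JavaScript files. In our case this only became safe
// after the JS code changes mentioned above.
$conf['preprocess_js'] = 1;
```

Setting these in settings.php rather than through the admin UI means they cannot be switched off by accident on the live site.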
Under load, we started to find the web server (Apache) waiting on an overstretched database (MySQL), and eventually Apache would chew up all the RAM, go into swap, kill Solr search and grind the system to a halt. With a single server there were limits to what we could do, so the first step was to double resources and split the web and db servers onto separate machines. This allowed us to tune MySQL and in particular throw a lot more RAM at it through the InnoDB buffer pool. Now all the db contents could come from RAM and not the disk. This change had a minor positive impact on standard site performance but made a huge improvement when it came to handling periods of higher load.
Getting the cache header settings right on static content like CSS, JS and images helped take the strain off the web server.
Our site visitors are spread around the world, but just over 50% of visits come from our home country, Australia. One big issue was that our servers were located in the USA, so a CDN was the obvious next step. My past positive experience had been with Amazon CloudFront, so we went with them. Again this helped page load overall, but the impact was mixed depending on location. The bigger boost came a couple of weeks later when AWS opened a new edge location in Sydney, which made a notable difference for our Australian readership.
An update to the Views module gave us stable caching of both the db results and the rendered HTML of views. This, along with block caching now being available to us, meant that even though whole pages could not be cached for authenticated users, the largest and most complicated parts of them now could be.
A large second phase of functionality was built and, given an expected increase in traffic, we moved to a web server with double the amount of RAM. The extra RAM and more stable Drupal modules allowed us to install 2 caching systems: APC & Memcache.
APC (Alternative PHP Cache) was used just as an opcode cache, which meant that the PHP scripts no longer needed to be parsed again on each page load. Within a few minutes of being turned on, all PHP scripts had been parsed and cached, and there was an immediate reduction in both CPU and memory usage. CPU usage essentially flatlined. In retrospect, allocating 128M to APC is something we should have done earlier.
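One way to check whether an allocation like that 128M is actually big enough is to ask APC how much of its shared memory is in use. The script below is a hypothetical diagnostic, not something from our setup; it assumes the APC extension is loaded and relies only on its `apc_sma_info()` function.

```php
<?php
// Hypothetical diagnostic script. Run it through the web server: the
// command-line PHP process does not share the web server's APC cache.
if (!function_exists('apc_sma_info')) {
  exit("APC is not loaded\n");
}

// TRUE asks for the summary only, skipping the detailed free-block list.
$sma = apc_sma_info(TRUE);

$total = $sma['num_seg'] * $sma['seg_size']; // bytes of shared memory
$used = $total - $sma['avail_mem'];          // bytes currently in use

printf(
  "APC shared memory: %.1f MB used of %.1f MB (%.0f%%)\n",
  $used / 1048576,
  $total / 1048576,
  100 * $used / $total
);
```

If usage sits close to 100%, APC will be evicting and re-caching scripts, and a larger shared memory size is probably worth trying.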
With all the extra Drupal caching, more work was being pushed back onto MySQL, as that is where Drupal stores its caches by default. Installing Memcached to store the Drupal cache bins had a noticeable effect on the time taken to generate HTML pages and reduced MySQL usage.
PHP/Apache has now become a small part of total page load time, with the biggest areas now being around DOM processing. Standard user performance has improved, with average page load time as measured by New Relic halved. The ability to handle load has also been verified: 2 large-scale email mail-outs during local business hours caused no noticeable degradation in site performance.
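For illustration, pointing the Drupal 7 cache bins at Memcached is done in settings.php. This is a minimal sketch that assumes the contributed Memcache module installed under sites/all/modules/memcache and a memcached daemon on localhost port 11211; both the path and the server address are assumptions, not details of our setup.

```php
// settings.php (fragment). Path and server address are assumptions.

// Load the cache backend provided by the contributed Memcache module.
$conf['cache_backends'][] = 'sites/all/modules/memcache/memcache.inc';
// Send every cache bin to memcached by default.
$conf['cache_default_class'] = 'MemCacheDrupal';
// Keep the form cache in the database: it needs persistent storage,
// and memcached may evict entries at any time.
$conf['cache_class_cache_form'] = 'DrupalDatabaseCache';
// Where the memcached daemon is listening.
$conf['memcache_servers'] = array('127.0.0.1:11211' => 'default');
```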