Optimizing Apache
brian at brianhouk dot com
for 1.3.0 and greater only.
For the organization I currently work for, Apache has met our needs 100% and shows no sign of failing to do so anytime soon. At times it serves upwards of 1000 requests per second for dynamic content and fills several megabits per second of our bandwidth, so it's safe to say that Apache can and will perform well for most organizations. Bear in mind, too, that this is just the default install of Apache that we're running. The only option it was installed with on this server was --prefix=/www.
Hardware And OS Issues
If you're going to have a webserver under heavy load, replying to hundreds of requests per second, hopefully you'll have the funds to put it on a machine that can handle it. Either way, you're going to want to optimize your Apache install to use the least RAM and CPU time and to get the most performance out of your webserver that you can. On the hardware side, the one thing that is definitely going to help you is RAM. Get as much RAM in that machine as you can afford; if the machine starts swapping, it's going to make a big difference in how fast requests can be handled. Aside from RAM, you're going to need a fast enough CPU and, of course, a fast enough network card.
Configuration Issues
HostnameLookups is one thing you're going to want to avoid in all scenarios. With HostnameLookups on, you're adding latency to every request, because the server has to do a DNS lookup on the requesting IP before the request is completed. Additionally, if you use hostnames in the allow from or deny from directives in your conf files, you're going to slow down your serving considerably: Apache does a reverse and then a forward lookup for security reasons (the forward lookup confirms that the reverse result wasn't spoofed). Hostnames in those directives hinder performance, so use IP addresses instead.
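As a rough sketch of the above, the relevant httpd.conf lines might look like the following. The /www/htdocs/private path and the 192.168.1 network are placeholders made up for illustration, not values from our servers.

```apache
# Don't resolve client IPs to hostnames on every request.
HostnameLookups Off

# Hypothetical restricted directory; substitute your own path.
<Directory /www/htdocs/private>
    # Access control by IP address, so no DNS lookups are needed.
    Order deny,allow
    Deny from all
    # Partial address: matches any client in 192.168.1.x (example network).
    Allow from 192.168.1
</Directory>
```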
Symlink checks are something you're going to want to avoid at all costs. Wherever Options FollowSymLinks is not set, or Options SymLinksIfOwnerMatch is set, every time Apache serves up /usr/local/apache/htdocs/index.html it has to run a system call (lstat) on /usr, /usr/local, /usr/local/apache, /usr/local/apache/htdocs and /usr/local/apache/htdocs/index.html to check for symlinks. In this scenario that means 5 more system calls before the file is served. Additionally, these results are never cached. Setting Options FollowSymLinks everywhere, and never setting SymLinksIfOwnerMatch, avoids the checks.
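According to the Apache performance notes, the extra per-component checks happen wherever FollowSymLinks is not in effect or SymLinksIfOwnerMatch is. A minimal sketch that turns the checks off for the whole filesystem, assuming you can live with symlinks being followed everywhere:

```apache
<Directory />
    # With FollowSymLinks set (and SymLinksIfOwnerMatch not set),
    # Apache has no need to lstat each component of the path.
    Options FollowSymLinks
</Directory>
```

Keep in mind that a more specific Directory section which sets Options without a leading + or - replaces this list, so it would need to name FollowSymLinks again.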
If possible, avoid using AllowOverride. Wherever in your server's filesystem you allow overrides (usually .htaccess files), Apache will attempt to open .htaccess in every directory leading up to the requested file. For example, if you had:
<Directory />
AllowOverride all
</Directory>
and then someone requested /usr/local/var/apache/htdocs/index.html, Apache would attempt to open /.htaccess, /usr/.htaccess, /usr/local/.htaccess, /usr/local/var/.htaccess, /usr/local/var/apache/.htaccess and /usr/local/var/apache/htdocs/.htaccess before serving up index.html. For best performance you're going to want AllowOverride None for your entire filesystem, if you can afford to do that.
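The cheaper version is the same block with None in place of all. A minimal sketch:

```apache
<Directory />
    # Never look for .htaccess files anywhere on the filesystem.
    AllowOverride None
</Directory>
```

Anything you used to put in .htaccess files then has to move into Directory sections in httpd.conf itself.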
Negotiation: if you're really interested in squeezing as much performance out of your machine as possible, avoid content negotiation. In your configuration, instead of DirectoryIndex index, use a line more like DirectoryIndex index.cgi index.html index.pl index.shtml index.php3, with the file most likely to be served first in the list.
You can tweak the MinSpareServers, MaxSpareServers, and StartServers settings to whatever you think they should be, then test on your own hardware to determine the best values. In most instances the default httpd.conf that comes with Apache will do more than fine. Other settings you may want to consider tweaking are MaxRequestsPerChild and KeepAliveTimeout. If you're going to change either of these, read up on them at Apache's website so you have a good understanding of what they do before you go monkeying with them.
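For orientation, these directives sit in httpd.conf as plain name and value lines. The numbers below are, as far as I know, the stock 1.3 defaults; treat them as a starting point for your own testing, not as recommendations.

```apache
# Number of child servers started at boot.
StartServers         5
# Keep at least this many idle children around for new requests...
MinSpareServers      5
# ...and kill off idle children beyond this many.
MaxSpareServers     10
# Requests a child handles before it is replaced; 0 means never replace.
MaxRequestsPerChild  0
# Seconds to wait for the next request on a kept-alive connection.
KeepAliveTimeout    15
```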
Logging: many sites do logging, but on some of our servers even the minimal amount of logging gives us HUGE logfiles and large amounts of I/O writing to them. How big, you ask? 500 to 700 megs in an hour. It's a waste of space, and most of the time it's information we're not going to need, so we just blackhole the logs and send them to /dev/null.
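A sketch of what blackholing the access log could look like, assuming the usual "common" log format nickname. This is an illustration, not a copy of our exact config.

```apache
# Standard Common Log Format, under the nickname "common".
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Write access log entries to /dev/null instead of a file on disk.
CustomLog /dev/null common
```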
Apache source modifications: by default Apache will only give you 256 servers running at the same time. In some situations that may not be enough to handle the requests. If it isn't enough to handle all the requests your webserver(s) get, you're going to need to optimize more than just Apache on your server; look back to my main page soon for a paper on optimizing Linux systems. You're limited to 256 servers on your Apache install unless you specifically change the HARD_SERVER_LIMIT setting in src/include/httpd.h. The default there is 256, and you can set it to whatever you want. I've never really had to modify this in the source, so I think I would just double it to 512 and recompile. From there you need to edit your httpd.conf, because there's a setting in it as well, "MaxClients 256", which you'll want to set to whatever value you gave HARD_SERVER_LIMIT when you recompiled.
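After the recompile, the httpd.conf side of the change is a single line. The 512 here simply follows the doubling suggested above; it is an untested example value.

```apache
# Must not exceed the HARD_SERVER_LIMIT the binary was compiled with
# (512 here, assuming httpd.h was edited as described above).
MaxClients 512
```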
There are patches for the 1.3.x Apache series that help speed up Apache. These patches, of course, have to be applied before compiling. They can be found at http://arctic.org/~dean/apache/1.3/ and I'm not going to explain them here, since you can grab a patch and read its explanation; chances are it will be better than any I can offer you. If little performance tweaks are what you're looking for, this is definitely something you're going to want to look into.
More patches: http://oss.sgi.com/projects/apache/ has Apache patches which have been stable for me; I haven't had a problem with them. I haven't benchmarked them, however. SGI claims that in some cases these patches can make Apache run up to 10 times faster, and on 2.0 up to four times faster. I believe it; SGI just plain rocks for the most part.
Note: If you do end up using these patches, please e-mail the ASF and tell them to merge the patches into the stable source tree.
Ramdisk: I've been trying to find time to toy with using a ramdisk as the root of the Apache filesystem, and it is something I plan to do in the near future. Most of our content fits in less than 256 megs of space, so I would be able to hold all of our data on our high-end servers in a ramdisk. I'm hoping I'll have time to attempt this and see if it works. For the cheap price of a little extra RAM and the cost of installing it, even a small performance increase would be well worth the work. I'll keep this updated if I do get around to attempting this.
If you have any questions or find any information contained in this document which is incorrect I would appreciate an e-mail. brian at brianhouk dot com