How to Migrate from Apache to NGINX
Friday, 2009-06-26 00:10, 1245975012 seconds since Unix epoch
Just imagine you’ve got a few web sites to look after, like me, have a tendency to over-engineer things, like me, want the best out of your hardware, like me, and have some free time on your hands, unlike me. What do you do? Right! Migrate from Apache to NGINX.
“Why NGINX?” you might ask. Well, Apache eats too much RAM for starters. If a website is lagging because of a faulty database, Apache will prefork itself to death, claiming all of your precious RAM in the process. NGINX is fast. Really fast. It uses fewer resources while doing way more useful work. It has its shortcomings too, of course. It can’t handle nearly as many configuration options. You can throw anything HTTP-related at Apache and it’ll have some kind of module that understands it. NGINX can understand HTTP. That’s about it. But on the other hand, that’s all I want a web server to do.
And finally, the logo. Whereas Apache has a purple feather, NGINX is the People’s Server of the Great Soviet Union. I mean, how cool is that? In Soviet Russia, NGINX serves you!
So first, we’ll need NGINX. My machines are running Debian GNU/Linux amd64, so it should just be an apt-get install. The latest stable NGINX is stuck in experimental, and because I didn’t want to be working with the legacy version, I rolled my own package. It’s available from the WasdaPuntEnEl apt repository if you’re too lazy to build your own. I’ve also uploaded the source, so you’re welcome to port it to your own platform.
The first thing you’ll need for a proper migration is a second IP address. You should know how to configure your OS to get that effect. If you can’t get a second external IP address, I’d suggest reading the SSH or OpenVPN documentation. With this dual-IP method you can test your web sites on both Apache and NGINX, and verify that they behave identically. Well, except for the obviously enhanced speed, that is. Reconfigure your Apache to listen only on the first IP address, the one you’re already serving from. Check for Listen 80 and friends. On Debian you can usually find these directives in /etc/apache2/ports.conf. Change them to listen only on your primary IP address, like this: Listen 1.2.3.4:80.
The Debian package I’ve forked shipped with a decent default configuration. It has, just like Debian’s Apache, the sites-available and sites-enabled directories, a conf.d and sane defaults. The /etc/nginx/nginx.conf file is quite understandable.
    user www-data;
    worker_processes 2;
    error_log /var/log/nginx/error.log;
    pid /var/run/nginx.pid;

    events {
        worker_connections 1024;
    }

    http {
        include /etc/nginx/mime.types;
        default_type application/octet-stream;
        access_log /var/log/nginx/access.log;
        sendfile on;
        keepalive_timeout 65;
        tcp_nodelay on;
        gzip off;
        include /etc/nginx/conf.d/*.conf;
        include /etc/nginx/sites-enabled/*;
    }
We only need two worker processes; that is really enough to completely saturate the 100 Mbit/s pipe this particular 2-core server is connected to. The rule of thumb is one worker process per available CPU, with a limit of six or so. Unless you’re serving 1×1 gifs, which NGINX can do insanely fast from RAM by the way, you’ll be fine. You can increase the worker_connections value to compensate for increased traffic. This config can serve up to 2048 parallel connections (2 workers × 1024 connections each), which is enough for this particular server.
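As a rough sketch of that tuning, the fragment below doubles the per-worker connection limit. The value 2048 is an assumption picked for illustration, not something measured on this server. The upper bound on parallel connections is simply worker_processes multiplied by worker_connections.

```nginx
# Illustrative values only: 2 workers x 2048 connections = 4096 parallel connections at most.
worker_processes 2;

events {
    worker_connections 2048;
}
```

Keep in mind that the per-process open file limit of the operating system may need raising as well before a higher value has any effect.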
Next up: virtual hosts. Just like with Apache, you can set up a virtual host in a file located in /etc/nginx/sites-available and symlink it into /etc/nginx/sites-enabled to enable that virtual host. The configuration of the virtual host is almost like Apache’s, with a few gotchas.
    server {
        listen 87.253.149.111:80;
        server_name jbisc.org www.jbisc.org;
        access_log /var/log/nginx/jbisc.access.log;

        if ($http_host !~ www.jbisc.org) {
            rewrite ^(.*)$ http://www.jbisc.org$1;
            break;
        }

        set $webroot /var/www/jbisc;

        location / {
            root $webroot;
            index index.html;
        }
    }
This little blurb configures NGINX to serve JBISC’s site from /var/www/jbisc/, the same place Apache reads it from. The rewrite syntax is more readable than Apache’s. Any code monkey can read, and understand, the www-forcing code. You can also use variables in configuration files, something you can’t do without once you’re used to it. I’ve set $webroot just for good practice, because you’ll need its location in more than one place in complex configurations.
Most of the sites hosted on my machines are written in PHP. Including this blog. So we need some kind of PHP support. Apache has mod_php, which embeds the PHP interpreter in the Apache process. It’s the fastest way to talk to PHP from the web server, but it just doesn’t scale that well. Every Apache process will have this interpreter on board, even if it’s sending you a 1×1 gif image. The best way that I know of to serve PHP from NGINX is to use FastCGI. Unlike CGI, which starts an interpreter for every request, FastCGI keeps a number of interpreter processes running, listening on a socket. Luckily PHP has a built-in FastCGI mode.
First, you’ll need PHP listening on a socket. To get this running we’ll use spawn-fcgi from Lighty. It’s a nice little wrapper program that makes it easier to manage the PHP processes. I’ve modified an init script that I found somewhere on the web to start a few PHP processes, which listen on a Unix socket for incoming FastCGI requests. It also drops PHP’s privileges to Debian’s www-data user. Copy it to /etc/init.d, make it executable and (if you wish) add it to your boot sequence using update-rc.d. Don’t forget to start it before adding the following config to NGINX.
    location ~ \.php$ {
        fastcgi_pass unix:/var/run/php-fastcgi.sock;
        fastcgi_index index.php;
        fastcgi_param SCRIPT_FILENAME $webroot$fastcgi_script_name;
        include fastcgi_params;
    }
This bit of configuration should go into your existing server { } section. It matches any URL ending in .php and sends the request to the eagerly waiting PHP FastCGI socket. The fastcgi_params include is another configuration file listing all of the default fastcgi_param directives. It’s just a shorthand. Don’t forget to add index.php to your index directive. Otherwise you’ll end up with a 403.
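For illustration, the location / block from the virtual host above might end up looking like this once index.php is added. This is a minimal sketch that assumes the $webroot variable is set earlier in the same server { } section.

```nginx
location / {
    root $webroot;
    # Try index.html first, then fall back to index.php, which the
    # \.php$ location hands to the FastCGI socket.
    index index.html index.php;
}
```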
You won’t miss mod_rewrite at all. I’ve easily migrated all of the mod_rewrite configuration to NGINX rewrites. They both share basically the same regular expression syntax. You should get rid of, or block, the .htaccess files. NGINX doesn’t support those.
    if (-f $request_filename) {
        break;
    }

    if (!-e $request_filename) {
        rewrite ^(.+)$ /index.php?q=$1 last;
    }
These two should go in the location / { } part of your virtual host. Rewrites like these are used to serve WordPress, for instance. Everything that maps to an existing file gets served directly; otherwise the path becomes the q argument for index.php.
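Blocking the leftover .htaccess files mentioned earlier can be done with one more location block in the same server { } section. A minimal sketch, assuming the files still sit inside the web root:

```nginx
# Refuse any request for a file whose name starts with ".ht",
# such as .htaccess or .htpasswd.
location ~ /\.ht {
    deny all;
}
```

Without something like this, NGINX would treat those files as ordinary static files and serve them to anyone who asks.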
There’s one final little piece of config I’d like to share. I love mod_userdir. It’s just a nice way of putting stuff online without difficult configuration. The following should go into a file, which can be included in the server { } part of your virtual host. It realizes the same behavior, including PHP support. This won’t work server-wide, just for the domains you include this file for.
    location ~ /~([^/]+)(.*\.php)$ {
        alias /home/$1/public_html$2;
        fastcgi_pass unix:/var/run/php-fastcgi.sock;
        fastcgi_param SCRIPT_FILENAME $request_filename;
        include fastcgi_params;
    }

    location ~ /~([^/]+)(.*)$ {
        autoindex on;
        index index.html index.htm index.php;
        alias /home/$1/public_html$2;
    }
I’ve also managed to get both Rails and Django running, using mostly the same technique. The only difference between those frameworks and PHP scripts is that the frameworks ship with either their own web server you can proxy to (like Thin for Rails) or a FastCGI server (like manage.py runfcgi for Django).
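As a rough sketch of what those two setups could look like: the first server block proxies to a Thin instance, the second talks FastCGI to Django. The host names, the port 3000 and the socket path are all made up for this example, and the PATH_INFO line follows what Django FastCGI setups commonly use. Your values and required parameters may well differ.

```nginx
# Rails: hand every request to a Thin instance listening locally.
server {
    listen 80;
    server_name rails.example.org;          # hypothetical host name

    location / {
        proxy_pass http://127.0.0.1:3000;   # assumed Thin address and port
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}

# Django: pass every request to a process started with manage.py runfcgi.
server {
    listen 80;
    server_name django.example.org;         # hypothetical host name

    location / {
        fastcgi_pass unix:/var/run/django-fastcgi.sock;  # assumed socket path
        include fastcgi_params;
        fastcgi_param PATH_INFO $fastcgi_script_name;
    }
}
```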
You can test your migrated sites by making the hostname resolve to the second IP address on your own machine. You can either change your local hosts file, or make your local DNS caching server apply the changes. If everything checks out, reconfigure NGINX to listen on the primary IP address and Apache to stop listening on it. A quick restart of both finishes your migration.
This entire server was migrated in a little under four hours, including some relatively complex web sites. It’s also still running Apache to host the Subversion and Trac sites. Subversion over HTTP is one of those things NGINX just isn’t designed for (yet).
Angela Says:
Wat is ties abaut? It saunds laik joe are praten abaut indians wit wigwams und so. But ai doe not get iet. Wat is NGINX? Ies tat some sort of tent werrr teh indians r smoking bat stuff und so? Das is nicht nais.
Ant ai doe not get teh story about rails ant trains and so. Is ties about cowjongens?