Saturday, September 3, 2011

Add an Nginx Reverse Proxy to Your LAMP Setup


Apache is a reliable HTTP server that still holds more than 65% of the web server market, according to Netcraft. Unfortunately, Apache was not designed with performance or scalability in mind. Alternative servers might be more efficient, but switching is not always the best option. That doesn’t mean you are out of ways to improve your web server’s performance.
Switching away from Apache can be more trouble than it’s worth. Almost all popular web applications – including content management systems (CMS) such as WordPress and Drupal – officially support only Apache servers. This can cause headaches for admins of other web servers. Most CMSes use .htaccess files for URL rewrite rules to create pretty permalinks or to set other directory-specific server options on the fly without accessing the main server configuration files. Alternative servers lack direct support for .htaccess files and Apache configuration directives, so you need to manually recreate the .htaccess in each virtual host configuration. This can be time-consuming, especially if you host a number of websites. Plus, each time a website’s users or developers need to change an option they must contact a server admin. Another problem is that some useful features, such as bandwidth consumption monitoring, are typically available only for Apache through third-party modules, and all popular hosting control panels support Apache exclusively. Finally, switching to another web server means learning a new product and creating a new setup, with a corresponding investment in time and money.
Rather than give up Apache, you can speed up your current HTTP server while keeping your current setup (almost) intact by installing a reverse proxy server in front of it. A reverse proxy fetches resources from one or more servers and returns them to the client as if they originated from the proxy server itself. Apache can act as a reverse proxy with the mod_proxy module, but there is no actual benefit to running mod_proxy on the same system the Apache web server runs on, plus it consumes more system resources. Therefore, for this setup, we will use an alternative web server, nginx, which is lighter and more efficient. With this approach you can have Apache serve all dynamic content and nginx handle all static files without consuming lots of system resources, combining the benefits of both servers without changing the whole infrastructure.

Nginx Installation and Configuration

Debian and Ubuntu users can install the nginx package straight from the official repositories. No official package is available for CentOS, but if you enable the EPEL repository you can install nginx with a regular yum install nginx.
Stop the nginx server if it was started automatically by the package manager and create a new nginx.conf configuration file – installed in /etc/nginx by default – by pasting the following and adjusting the paths to those of your installation:
user apache; #change to the same user apache runs as
worker_processes 2; #change to the number of your CPUs/Cores
worker_rlimit_nofile 8192;

error_log /var/log/nginx/error.log;
pid /var/run/nginx.pid;

events {
  worker_connections 1024;
  use epoll;
  accept_mutex off;
}

http {
  server_names_hash_bucket_size 64;
  include /etc/nginx/mime.types;
  default_type application/octet-stream;
  access_log /var/log/nginx/access.log;
  sendfile on;
  tcp_nopush on;
  keepalive_timeout 65;

  # reverse proxy options
  proxy_redirect off;
  proxy_set_header Host $host;
  proxy_set_header X-Real-IP $remote_addr;
  proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

  # gzip compression options
  gzip on;
  gzip_http_version 1.0;
  gzip_comp_level 6;
  gzip_min_length 0;
  gzip_buffers 16 8k;
  gzip_proxied any;
  gzip_types text/plain text/css text/xml text/javascript application/xml application/xml+rss application/javascript application/json;
  gzip_disable "MSIE [1-6]\.";
  gzip_vary on;

  # include virtual hosts configuration
  include /etc/nginx/virtual.d/*.conf;
}
Nginx should run as the same user as Apache to avoid file permission problems. Replace apache with the Apache user of your setup, for instance www-data in default Debian and Ubuntu installations. Besides the proxy setup, this configuration file includes some generic performance tuning. The use epoll line selects epoll as the event notification method, which works efficiently on Linux 2.6+ kernels. The next line, accept_mutex off, lets all worker processes accept new connections at once instead of taking turns, which can improve performance a bit more under load. Enabling sendfile lets nginx use the kernel’s sendfile support to copy a file straight from disk to the client socket, without first reading it into nginx’s own memory. This helps most with large static files, such as images, which are sent unmodified. Enabling gzip compression for static files can make a big performance difference. The lines starting with gzip enable compression for common web files, such as .css and .js files, on supported browsers. You can find more information about these options, as well as the complete documentation for nginx, on the project’s wiki.
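The two values marked for adjustment at the top of nginx.conf can be looked up from a shell. This is a minimal sketch and not part of the original setup: the function names cpu_count and apache_user are made up for illustration, and the second one assumes the Apache processes are named apache2 or httpd.

```shell
# Print the number of online CPUs/cores, for worker_processes.
cpu_count() {
  nproc 2>/dev/null || getconf _NPROCESSORS_ONLN
}

# Read `ps -eo user,comm` output on stdin and print the first non-root
# owner of an Apache process, for the "user" directive.
# Assumption: the process is called apache2 (Debian/Ubuntu) or httpd (CentOS).
apache_user() {
  awk '($2 == "apache2" || $2 == "httpd") && $1 != "root" { print $1; exit }'
}
```

With Apache running, cpu_count prints the core count and ps -eo user,comm | apache_user prints the account to put on the user line.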
Create the /etc/nginx/virtual.d directory and in it create a file named DOMAIN.TLD.conf for each virtual host you have defined in Apache, with the following content:
server {
  listen 80;
  server_name DOMAIN.TLD www.DOMAIN.TLD;
  access_log off;
  error_log /dev/null crit; #"error_log off" would log to a file named "off"
  location / { proxy_pass http://127.0.0.1:8080; }
  location ~* ^.+\.(htm|html|jpg|jpeg|gif|png|ico|css|zip|tgz|gz|bz2|pdf|odt|txt|tar|bmp|rtf|js|swf|avi|mp4|mp3|ogg|flv)$ {
    expires 30d; #adjust to your static content's update frequency
    root /srv/DOMAIN.TLD/html;
  }
}
Replace DOMAIN.TLD with the actual domain name and /srv/DOMAIN.TLD/html with that domain’s public HTML directory. Also add or remove file extensions that represent your static content; these files will be served by nginx.
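If you host many sites, the per-domain files can be generated instead of edited by hand. The following is a sketch under stated assumptions: render_vhost is a hypothetical helper name, it expects every site to answer on both DOMAIN.TLD and www.DOMAIN.TLD, and it writes error_log /dev/null crit rather than error_log off, because nginx treats off in that directive as a file name.

```shell
# Print an nginx server block for one domain to stdout.
# $1 = domain name, $2 = that domain's public HTML directory.
render_vhost() {
  vhost_domain=$1
  vhost_docroot=$2
  # Unquoted heredoc: the two variables are expanded, "\$" yields a literal "$".
  cat <<EOF
server {
  listen 80;
  server_name ${vhost_domain} www.${vhost_domain};
  access_log off;
  error_log /dev/null crit;
  location / { proxy_pass http://127.0.0.1:8080; }
  location ~* ^.+\.(htm|html|jpg|jpeg|gif|png|ico|css|zip|tgz|gz|bz2|pdf|odt|txt|tar|bmp|rtf|js|swf|avi|mp4|mp3|ogg|flv)\$ {
    expires 30d;
    root ${vhost_docroot};
  }
}
EOF
}
```

A call such as render_vhost example.com /srv/example.com/html > /etc/nginx/virtual.d/example.com.conf would then produce one file; example.com is only a placeholder.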

Apache Configuration

Since nginx now acts as the front-end web server, waiting for requests on port 80, you need to configure Apache to listen on a different port (8080 in this case) and preferably only on localhost. On Debian and Ubuntu systems open the file /etc/apache2/ports.conf and change the line Listen 80 to Listen 127.0.0.1:8080. On CentOS the same changes should be applied to the /etc/httpd/conf/httpd.conf file. Regardless of your distribution, if you use name-based virtual hosts you should have a line NameVirtualHost *:80 in the same file. Change that to NameVirtualHost *:8080, and change the opening <VirtualHost *:80> line of each virtual host to <VirtualHost *:8080> so that it still matches. Otherwise Apache answers every proxied request with its default site.
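The port change can also be scripted. Here is a cautious sketch: retarget_apache is a hypothetical helper that prints a rewritten copy of a configuration file and never edits it in place. It handles only the simple forms Listen 80, NameVirtualHost ADDRESS:80 and <VirtualHost ADDRESS:80>. The VirtualHost openers are included because they must agree with the NameVirtualHost address for name-based hosts to be selected.

```shell
# Print the given Apache config file(s), or stdin, with port 80 moved to 8080.
# Lines such as "Listen 443" or "Listen 8080" are left untouched.
retarget_apache() {
  sed -e 's/^\([[:space:]]*\)Listen[[:space:]]\{1,\}80[[:space:]]*$/\1Listen 127.0.0.1:8080/' \
      -e 's/^\([[:space:]]*NameVirtualHost[[:space:]]\{1,\}[^:[:space:]]*\):80[[:space:]]*$/\1:8080/' \
      -e 's/^\([[:space:]]*<VirtualHost[[:space:]]\{1,\}[^:>[:space:]]*\):80>/\1:8080>/' \
      "$@"
}
```

One way to use it is retarget_apache /etc/apache2/ports.conf > /tmp/ports.conf.new, followed by a diff against the original before copying the result into place.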
If you have configured Keep-Alive support in Apache you should disable it since it is already enabled in nginx. Change KeepAlive On to KeepAlive Off in /etc/httpd/conf/httpd.conf (CentOS) or /etc/apache2/apache2.conf (Debian/Ubuntu). You can also disable the mod_deflate module since nginx already provides gzip compression. A good practice is to disable all unused Apache modules to reserve more system resources.
Both servers should start now without problems. If either one fails to start, check your configuration with nginx -t and apachectl configtest (or apache2ctl configtest in Debian/Ubuntu).
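Once both servers are up, a quick way to see which one answers on which port is to compare the Server response header. This sketch assumes curl is installed; server_header is a hypothetical helper that only parses header text, so it can be used with any tool that prints raw response headers.

```shell
# Read raw HTTP response headers on stdin and print the Server header value.
server_header() {
  awk 'tolower($1) == "server:" { sub(/\r$/, ""); $1 = ""; sub(/^ /, ""); print; exit }'
}

# Usage (not run here; replace DOMAIN.TLD with a real site):
#   curl -sI http://DOMAIN.TLD/ | server_header
#       should report nginx, the front end on port 80
#   curl -sI -H 'Host: DOMAIN.TLD' http://127.0.0.1:8080/ | server_header
#       should report Apache, the back end on port 8080
```

If the second command returns Apache's default page rather than the site, the virtual host for that domain is probably not bound to port 8080.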
At this point if you check the Apache access log files you should see that all incoming requests are coming from 127.0.0.1. To fix this you need to install mod_rpaf, the reverse proxy add forward module for Apache. Debian and Ubuntu users can install the libapache2-mod-rpaf package and restart Apache. On CentOS you need to install the module from source. Additional required packages are httpd-devel and gcc. After grabbing and unpacking the current version (0.6 at the time of this writing) you should type /usr/sbin/apxs -i -c -n mod_rpaf-2.0.so mod_rpaf-2.0.c in the source directory. Then create the file /etc/httpd/conf.d/mod_rpaf.conf with the following content:
LoadModule rpaf_module modules/mod_rpaf-2.0.so

RPAFenable On
RPAFsethostname On
RPAFproxy_ips 127.0.0.1
Debian users should also check in /etc/apache2/mods-enabled/rpaf.conf that the line RPAFproxy_ips looks exactly like the CentOS example above. After restarting Apache your access logs should show the correct client IP addresses, but also contain less information than before. This is normal, because all static files are now served by nginx.
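To confirm that mod_rpaf is doing its job, you can count requests per client address in the Apache access log. This is a small sketch, with client_ip_counts as a hypothetical helper name. It assumes the client address is the first field of each line, as in the common and combined log formats.

```shell
# Print "count address" pairs, busiest client first, from the given
# access log file(s) or from stdin.
client_ip_counts() {
  awk '{ print $1 }' "$@" | sort | uniq -c | sort -rn
}

# Usage (typical default log locations, adjust to your setup):
#   client_ip_counts /var/log/apache2/access.log | head    # Debian/Ubuntu
#   client_ip_counts /var/log/httpd/access_log | head      # CentOS
```

If 127.0.0.1 still dominates the requests logged after the restart, the module is not loaded or RPAFproxy_ips does not list the proxy address.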
If you use a log monitoring or visitor analysis program you should change the access_log line in the nginx virtual hosts configuration to log to actual files, and periodically merge those files with the Apache access logs before having them parsed by the analytics program.
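The merge step can be sketched in a few lines of shell. merge_access_logs is a hypothetical helper, and it makes two assumptions: both servers log in the common or combined format, where the fourth field looks like [03/Sep/2011:10:15:42, and both write timestamps in the same time zone, which holds when they run on the same machine.

```shell
# Merge access logs from several files into one stream ordered by timestamp.
merge_access_logs() {
  awk '
    BEGIN {
      # Map month names to two-digit numbers so dates sort as plain text.
      split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", names, " ")
      for (i = 1; i <= 12; i++) month[names[i]] = sprintf("%02d", i)
    }
    {
      # Drop the leading "[" and split day/month/year:hour:minute:second.
      split(substr($4, 2), t, "[/:]")
      # Prefix each line with a YYYYMMDDhhmmss key and a tab.
      printf "%s%s%s%s%s%s\t%s\n", t[3], month[t[2]], t[1], t[4], t[5], t[6], $0
    }
  ' "$@" | sort -s -k1,1 | cut -f2-
}
```

For example, merge_access_logs /var/log/nginx/DOMAIN.TLD.access.log /var/log/httpd/DOMAIN.TLD.access_log > merged.log would produce a single file for the analytics program; both paths are placeholders for whatever you configure.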
After enabling mod_rpaf you should have no problem with CMS plugins that use the client IP address, such as Akismet for WordPress. To further improve the performance of your hosted CMS installations you should also use a caching plugin such as W3 Total Cache for WordPress and convert dynamic pages to static HTML, which allows them to be served directly by nginx and minimizes database access. Another option is to add caching to the nginx reverse proxy; you can learn how to do that from the nginx documentation.
This setup might not be as lightweight as straight nginx (with PHP-FPM) but is comparable in performance and easier than migrating away from Apache altogether. With it, your hosted websites should scale to serve many more visitors than with a standard LAMP setup, without dumping Apache or upgrading your hardware.

2 comments:

  1. No matter what I do I can't get this to work. I only get my site's favicon and the default apache greeting index.html to show. This has been an ongoing frustration for weeks now no matter whos instructions I use on the web.

  2. I have no problem to help; just give me a ring :)
