High availability web applications are often touted as the solution for staying in business in the face of failing hardware. By having multiple instances of every critical component of your application stack, you can stay open for business even when core components unexpectedly cease to function. However, the high availability of your web application offers another big advantage that is often overlooked.
Active system administration of web applications in this day and age implies keeping up with a constant stream of software updates. Many of these updates require a restart of your applications or even a reboot of an entire server. Without a resilient infrastructure and high availability features, this could mean that the application is unavailable for a period of time, which is unfortunately often a reason to postpone updates.
Leveraging redundancy in web applications enables us to apply software updates without needing to wait for a scheduled maintenance window and without interrupting the service we deliver for our customers.
In this post, I aim to focus on one component of a Liferay cluster built for high availability. I will take an in-depth look at the configuration for Nginx running as a reverse proxy for this cluster.
Highly available Liferay portal
Let’s first describe the cluster I’ll be writing about. The main application is a Liferay 6.2 EE portal running in Tomcat on 2 servers. Only one of these is active at any time, while the other one is on standby. Sessions are replicated by Tomcat, meaning that a visitor who is logged in to the Liferay portal on the active server is also logged in to Liferay on the standby server. The Liferay document library is shared between both instances using an NFS mount.
As a database service, we are currently using Percona XtraDB Cluster version 5.5. This flavour of MySQL uses a version of the InnoDB storage engine enhanced by Percona, combined with the synchronous replication layer provided by Galera Cluster from Codership. The database cluster runs on 3 servers and the Liferay application can send its queries to any of them. Should one of the database servers become unavailable, Liferay will automatically use one of the two remaining servers.
In between the public internet and the Liferay application servers, there are two servers running Nginx configured to pass on requests to the Tomcat backends. These reverse proxy servers are tasked with a few things to make the Liferay application’s life a bit easier:
- they encrypt and decrypt SSL traffic over HTTPS;
- they compress outgoing responses, so pages, CSS stylesheets and JavaScript files are a bit faster to download;
- they cache a large number of requests, meaning the Liferay application servers do not have to be bothered with a request if Nginx determines it can serve a cached version of the same response.
Redundancy between the two Nginx servers is provided by IP failover, managed by Pacemaker with Corosync.
Connecting to the upstream backends
First, we will have to tell Nginx where it can find the upstream backend servers. You declare this in an upstream configuration block, which we keep in its own file under /etc/nginx/upstreams.d/. The main Nginx configuration picks these files up with an include directive:
include /etc/nginx/upstreams.d/*.conf;
This is the upstream configuration file in our example:
upstream liferay_upstream {
    server 10.0.0.3:8080 fail_timeout=3m weight=2000000000;
    server 10.0.0.4:8080 fail_timeout=3m weight=1;
    ip_hash;
}
In Nginx, an upstream block starts with the keyword upstream followed by a name of your own choosing, in this case liferay_upstream. Individual backends are declared with a server directive, followed by the address and port of the backend. An upstream server declaration can accept many parameters. In this case, we picked a fail_timeout of 3 minutes: when an attempt to reach a backend fails, Nginx considers that backend unavailable for the next 3 minutes before trying it again.
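To illustrate these parameters, here is a minimal, hypothetical server declaration (not part of our configuration) that combines fail_timeout with the related max_fails parameter, which defaults to 1 when left out:

# Hypothetical example for illustration only: after 3 failed attempts
# within 3 minutes, consider the backend unavailable and stop sending
# it traffic for the following 3 minutes.
upstream example_upstream {
    server 10.0.0.3:8080 max_fails=3 fail_timeout=3m;
}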
Backup parameter effects
The weight parameter determines how requests are distributed over the backends. By giving the first server (10.0.0.3) an extremely high weight and the second one a weight of 1, virtually all traffic is sent to the first backend, while the second one effectively acts as a standby. Nginx also offers a backup parameter for exactly this scenario: a server marked as backup only receives requests once the other servers are unavailable.
In testing, however, we noticed that users would receive an HTTP 50x error page while Nginx was switching over to the backup backend, before actually getting the content returned from the backup backend. Therefore we don't use the backup parameter, but achieve the active/standby behaviour with the weights shown above, combined with the proxy_next_upstream directive discussed further below.
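For comparison, a variant using the backup parameter would look roughly like the sketch below. This is a hypothetical example, and note that Nginx does not allow backup to be combined with the ip_hash balancing method we use, so session affinity would have to be arranged differently:

# Hypothetical variant relying on the backup parameter instead of weights.
upstream liferay_backup_variant {
    server 10.0.0.3:8080 fail_timeout=3m;
    # only receives traffic once the primary backend is considered unavailable
    server 10.0.0.4:8080 fail_timeout=3m backup;
}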
The ip_hash directive, finally, enables session affinity: Nginx hashes the visitor's IP address and uses the result to pick the backend, so requests from the same client keep going to the same backend server for as long as that server is available.
Adding caching to improve performance
Next up I’ll quickly mention one line of the configuration relating to caching, found in the configuration file at /etc/nginx/conf.d/caching.conf:
proxy_cache_path /var/cache/nginx/liferay_cache levels=1:2 keys_zone=liferay_cache:10m inactive=60m max_size=256M;
Right after the keyword proxy_cache_path comes the directory where Nginx stores the cached responses on disk, in this case /var/cache/nginx/liferay_cache. The levels parameter defines the layout of the subdirectories inside that cache directory. With keys_zone we name the shared memory zone that holds the cache keys and metadata, here called liferay_cache, and give it 10 MB of memory. The inactive parameter makes Nginx remove cached items that have not been requested for 60 minutes, and max_size caps the total size of the cache on disk at 256 MB.
By using Nginx to cache certain requests you prevent the Liferay application from having to generate a response every time; instead, Nginx quickly looks up the response in its cache and returns it directly. In fact, Nginx is nearly as fast as Varnish for caching. If you have more complex caching requirements, Varnish does offer a lot more flexibility in its caching rules. With that flexibility also comes complexity and, in some sense, another potential component that can fail in your stack.
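As an illustration of how the cache zone is then used, the sketch below shows a few cache-related directives one could place inside a location block. Apart from proxy_cache and the X-Cached header, which appear in our virtual host configuration further below, these directives and their values are assumptions for the sake of the example:

# Hypothetical cache tuning inside a location block; values are illustrative.
proxy_cache liferay_cache;                  # use the zone defined in caching.conf
proxy_cache_valid 200 302 10m;              # keep successful responses for 10 minutes
proxy_cache_valid 404 1m;                   # cache "not found" responses briefly
proxy_cache_bypass $cookie_JSESSIONID;      # skip the cache for requests with a session cookie
add_header X-Cached $upstream_cache_status; # expose HIT/MISS for debugging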
Looking at the virtual host
Let’s round things off by looking at the server configuration for the virtual host in question. In this case, the configuration is located in the file /etc/nginx/sites-available/example.firelay.com.conf:
server {
    listen 443 backlog=4096;
    server_name example.firelay.com;
    access_log /opt/www/sites/example.firelay.com/logs/access.log main_timed;

    # See https://mozilla.github.io/server-side-tls/ssl-config-generator/ for appropriate SSL settings
    ssl on;
    ssl_certificate /opt/ssl/example.firelay.com/example.firelay.com.crt;
    ssl_certificate_key /opt/ssl/example.firelay.com/example.firelay.com.key;
    add_header Strict-Transport-Security max-age=15768000;

    location / {
        proxy_pass http://liferay_upstream;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 180s;
        proxy_connect_timeout 10s;
        proxy_redirect http:// https://;
        proxy_next_upstream error timeout invalid_header http_502 http_503 http_504;

        add_header X-Cached $upstream_cache_status;
        proxy_cache_use_stale off;
        proxy_cache liferay_cache;

        gzip_comp_level 3;
        gzip_proxied any;
        gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    }
}

server {
    listen 80 backlog=4096;
    server_name example.firelay.com;
    rewrite ^ https://$server_name$request_uri? redirect;
}
The important part of this configuration is the location / block. The proxy_pass directive passes every request on to the liferay_upstream backends we defined earlier. With a couple of proxy_set_header directives we make sure Liferay knows the original client IP address, host name and protocol of each request. The proxy_next_upstream directive tells Nginx when it may retry a request on the next upstream server: on connection errors, timeouts, invalid headers and HTTP 502, 503 or 504 responses. This is what allows us to take one Tomcat backend down for maintenance without visitors noticing.
Some performance tweaking
To improve performance there are also two noteworthy bits of configuration. With the proxy_cache directive we point this virtual host at the liferay_cache zone defined earlier, so cacheable responses are served straight from Nginx's cache. The accompanying add_header X-Cached $upstream_cache_status line adds an X-Cached header to each response, which makes it easy to verify whether a response was a cache hit or miss. The gzip_* directives, the second tweak, compress the proxied responses so that pages, stylesheets and scripts are smaller on the wire.
By taking advantage of this redundancy we were able to deploy critical patches, for instance for the Heartbleed and Ghost vulnerabilities, as soon as possible and without noticeable impact for the end users of these clusters.
Further opportunities
Does this mean that we are satisfied? Never! I am sure we can and will be improving our deployments further. Testing has shown us that our setups are still struggling with the C10K problem. While we are halfway there (so in essence more like “C5K”), it would be nice to be able to cope with the onslaught of 10,000 concurrent connections stampeding through our stack. I’m sure we can also do additional tweaking on the caching and compression configuration and squeeze out a little more performance. Nginx’s commercial version (Nginx Plus) also appears to have many additional features tailored to improving high availability, which is definitely worth investigating a bit more. And, as a stack is only as strong as its weakest component, we will keep working on improving the other components, like Liferay, Tomcat and the database, as well.
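As a rough sketch of the kind of connection tuning we have in mind for the C10K front, the snippet below shows the usual suspects in nginx.conf. The values are assumptions to illustrate the idea, not our production settings:

# Illustrative worker and connection limits; values are assumptions.
worker_processes auto;        # one worker process per CPU core
worker_rlimit_nofile 40000;   # raise the open file descriptor limit

events {
    worker_connections 10240; # allow roughly 10k connections per worker
    multi_accept on;          # accept multiple new connections at once
}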
In the end, there will always be opportunities for improvement to strive for!