High availability web applications are often touted as the solution for staying in business in the face of failing hardware. By having multiple instances of every critical component of your application stack, you can stay open for business even when core components unexpectedly cease to function. However, the high availability of your web application offers another big advantage that is often overlooked.
Active system administration of web applications in this day and age implies keeping up with a constant stream of software updates. Many of these updates require a restart of your applications or even a reboot of an entire server. Without a resilient infrastructure and high availability features, this could mean that the application is unavailable for a period of time, which is unfortunately often a reason to postpone updates.
Leveraging redundancy in web applications enables us to apply software updates without needing to wait for a scheduled maintenance window and without interrupting the service we deliver for our customers.
In this post, I aim to focus on one component of a Liferay cluster built for high availability. I will take an in-depth look at the configuration for Nginx running as a reverse proxy for this cluster.
Highly available Liferay portal
Let’s first describe the cluster I’ll be writing about. The main application is a Liferay 6.2 EE portal running in Tomcat on 2 servers. Only one of these is active at any time, while the other one is on standby. Sessions are replicated by Tomcat, meaning that a visitor who is logged in to the Liferay portal on the active server is also logged in to Liferay on the standby server. The Liferay document library is shared between both instances using an NFS mount.
As a database service, we are currently using Percona XtraDB Cluster version 5.5. This flavour of MySQL uses a version of the InnoDB storage engine enhanced by Percona, combined with the synchronous replication layer provided by Galera Cluster from Codership. The database cluster runs on 3 servers and the Liferay application can send its queries to any of them. Should one of the database servers become unavailable, Liferay will automatically use one of the two remaining servers.
In between the public internet and the Liferay application servers, there are two servers running Nginx configured to pass on requests to the Tomcat backends. These reverse proxy servers are tasked with a few things to make the Liferay application’s life a bit easier:
- they encrypt and decrypt SSL traffic over HTTPS;
- they compress outgoing responses, so pages, CSS stylesheets and JavaScript files are a bit faster to download;
- they cache a large number of requests, meaning the Liferay application servers do not have to be bothered with a request if Nginx determines it can serve a cached version of the same response.
Redundancy between the two Nginx servers is provided by IP failover, managed by Pacemaker with Corosync.
Connecting to the upstream backends
First, we will have to tell Nginx where it can find the upstream backend servers. You declare this in an upstream configuration block, which we keep in its own file under /etc/nginx/upstreams.d/. The main Nginx configuration picks these files up with an include directive:
include /etc/nginx/upstreams.d/*.conf;
This is the upstream configuration file in our example:
upstream liferay_upstream {
    server 10.0.0.3:8080 fail_timeout=3m weight=2000000000;
    server 10.0.0.4:8080 fail_timeout=3m weight=1;
    ip_hash;
}
In Nginx, an upstream block starts with the keyword upstream followed by a name of your own choosing, in this case liferay_upstream. Individual backends are declared with a server directive, followed by the address and port of the backend. An upstream server declaration can accept many parameters. In this case, we picked a fail_timeout of 3 minutes: when an attempt to reach a backend fails, Nginx considers that backend unavailable for the next 3 minutes before trying it again.
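To illustrate these parameters, here is a minimal, hypothetical server declaration (not part of our configuration) that combines fail_timeout with the related max_fails parameter, which defaults to 1 when left out:

# Hypothetical example for illustration only: after 3 failed attempts
# within 3 minutes, consider the backend unavailable and stop sending
# it traffic for the following 3 minutes.
upstream example_upstream {
    server 10.0.0.3:8080 max_fails=3 fail_timeout=3m;
}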
Backup parameter effects
The weight parameter determines how requests are distributed over the backends. By giving the first server (10.0.0.3) an extremely high weight and the second one a weight of 1, virtually all traffic is sent to the first backend, while the second one effectively acts as a standby. Nginx also offers a backup parameter for exactly this scenario: a server marked as backup only receives requests once the other servers are unavailable.
In testing, however, we noticed that users would receive an HTTP 50x error page while Nginx was switching over to the backup backend, before actually getting the content returned from the backup backend. Therefore we don't use the backup parameter, but achieve the active/standby behaviour with the weights shown above, combined with the proxy_next_upstream directive discussed further below.
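For comparison, a variant using the backup parameter would look roughly like the sketch below. This is a hypothetical example, and note that Nginx does not allow backup to be combined with the ip_hash balancing method we use, so session affinity would have to be arranged differently:

# Hypothetical variant relying on the backup parameter instead of weights.
upstream liferay_backup_variant {
    server 10.0.0.3:8080 fail_timeout=3m;
    # only receives traffic once the primary backend is considered unavailable
    server 10.0.0.4:8080 fail_timeout=3m backup;
}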
The ip_hash directive, finally, enables session affinity: Nginx hashes the visitor's IP address and uses the result to pick the backend, so requests from the same client keep going to the same backend server for as long as that server is available.
Adding caching to improve performance
Next up I’ll quickly mention one line of the configuration relating to caching, found in the configuration file at /etc/nginx/conf.d/caching.conf:
proxy_cache_path /var/cache/nginx/liferay_cache levels=1:2 keys_zone=liferay_cache:10m inactive=60m max_size=256M;
Right after the keyword proxy_cache_path comes the directory where Nginx stores the cached responses on disk, in this case /var/cache/nginx/liferay_cache. The levels parameter defines the layout of the subdirectories inside that cache directory. With keys_zone we name the shared memory zone that holds the cache keys and metadata, here called liferay_cache, and give it 10 MB of memory. The inactive parameter makes Nginx remove cached items that have not been requested for 60 minutes, and max_size caps the total size of the cache on disk at 256 MB.
By using Nginx to cache certain requests you prevent the Liferay application from having to generate a response every time; instead, Nginx quickly looks up the response in its cache and returns it directly. In fact, Nginx is nearly as fast as Varnish for caching. If you have more complex caching requirements, Varnish does offer a lot more flexibility in its caching rules. With that flexibility also comes complexity and, in some sense, another potential component that can fail in your stack.
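As an illustration of how the cache zone is then used, the sketch below shows a few cache-related directives one could place inside a location block. Apart from proxy_cache and the X-Cached header, which appear in our virtual host configuration further below, these directives and their values are assumptions for the sake of the example:

# Hypothetical cache tuning inside a location block; values are illustrative.
proxy_cache liferay_cache;                  # use the zone defined in caching.conf
proxy_cache_valid 200 302 10m;              # keep successful responses for 10 minutes
proxy_cache_valid 404 1m;                   # cache "not found" responses briefly
proxy_cache_bypass $cookie_JSESSIONID;      # skip the cache for requests with a session cookie
add_header X-Cached $upstream_cache_status; # expose HIT/MISS for debugging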
Looking at the virtual host
Let’s round things off by looking at the server configuration for the virtual host in question. In this case, the configuration is located in the file /etc/nginx/sites-available/example.firelay.com.conf:
server {
    listen 443 backlog=4096;
    server_name example.firelay.com;
    access_log /opt/www/sites/example.firelay.com/logs/access.log main_timed;

    # See https://mozilla.github.io/server-side-tls/ssl-config-generator/ for appropriate SSL settings
    ssl on;
    ssl_certificate /opt/ssl/example.firelay.com/example.firelay.com.crt;
    ssl_certificate_key /opt/ssl/example.firelay.com/example.firelay.com.key;
    add_header Strict-Transport-Security max-age=15768000;

    location / {
        proxy_pass http://liferay_upstream;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
        proxy_read_timeout 180s;
        proxy_connect_timeout 10s;
        proxy_redirect http:// https://;
        proxy_next_upstream error timeout invalid_header http_502 http_503 http_504;

        add_header X-Cached $upstream_cache_status;
        proxy_cache_use_stale off;
        proxy_cache liferay_cache;

        gzip_comp_level 3;
        gzip_proxied any;
        gzip_types text/plain text/css application/json application/x-javascript text/xml application/xml application/xml+rss text/javascript;
    }
}

server {
    listen 80 backlog=4096;
    server_name example.firelay.com;
    rewrite ^ https://$server_name$request_uri? redirect;
}
The important part of this configuration is the location / block. The proxy_pass directive passes every request on to the liferay_upstream backends we defined earlier. With a couple of proxy_set_header directives we make sure Liferay knows the original client IP address, host name and protocol of each request. The proxy_next_upstream directive tells Nginx when it may retry a request on the next upstream server: on connection errors, timeouts, invalid headers and HTTP 502, 503 or 504 responses. This is what allows us to take one Tomcat backend down for maintenance without visitors noticing.
Some performance tweaking
To improve performance there are also two noteworthy bits of configuration. With the proxy_cache directive we point this virtual host at the liferay_cache zone defined earlier, so cacheable responses are served straight from Nginx's cache. The accompanying add_header X-Cached $upstream_cache_status line adds an X-Cached header to each response, which makes it easy to verify whether a response was a cache hit or miss. The gzip_* directives, the second tweak, compress the proxied responses so that pages, stylesheets and scripts are smaller on the wire.
By taking advantage of this redundancy we were able to deploy critical patches, for instance for the Heartbleed and Ghost vulnerabilities, as soon as possible and without noticeable impact for the end users of these clusters.
Further opportunities
Does this mean that we are satisfied? Never! I am sure we can and will be improving our deployments further. Testing has shown us that our setups are still struggling with the C10K problem. While we are halfway there (so in essence more like “C5K”), it would be nice to be able to cope with the onslaught of 10,000 concurrent connections stampeding through our stack. I’m sure we can also do additional tweaking on the caching and compression configuration and squeeze out a little more performance. Nginx’s commercial version (Nginx Plus) also appears to have many additional features tailored to improving high availability, which is definitely worth investigating a bit more. And, as a stack is only as strong as its weakest component, we will keep working on improving the other components, like Liferay, Tomcat and the database, as well.
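As a rough sketch of the kind of connection tuning we have in mind for the C10K front, the snippet below shows the usual suspects in nginx.conf. The values are assumptions to illustrate the idea, not our production settings:

# Illustrative worker and connection limits; values are assumptions.
worker_processes auto;        # one worker process per CPU core
worker_rlimit_nofile 40000;   # raise the open file descriptor limit

events {
    worker_connections 10240; # allow roughly 10k connections per worker
    multi_accept on;          # accept multiple new connections at once
}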
In the end, there will always be opportunities for improvement to strive for!