Saturday, February 26, 2011

Load balancing with Apache: a tutorial on mod_proxy_balancer

Load balancing is a technique for distributing workload across a computer network in order to use resources efficiently, avoid overload and maximize throughput.
Computer clusters rely on load balancing to distribute workload across network links, CPUs, web servers, etc.
A server farm is a common application of load balancing, where multiple servers seamlessly provide a single Internet service. In this case the load balancer accepts requests from external clients and forwards each one to an available backend server according to a scheduling algorithm (e.g. round robin, random choice, or reported load).
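As an illustration of the simplest scheduling algorithm mentioned above, here is a minimal round-robin sketch in Python. The backend names are hypothetical placeholders, and the code only models the selection step, not the forwarding of real requests.

```python
from itertools import cycle

def make_round_robin(backends):
    """Return a function that hands out the backends in turn, forever."""
    pool = cycle(backends)
    return lambda: next(pool)

# Hypothetical backend names, used only for illustration.
pick = make_round_robin(["backend-a", "backend-b", "backend-c"])

# Six consecutive requests walk through the list twice.
assignments = [pick() for _ in range(6)]
print(assignments)
```

Each backend receives every third request, regardless of how busy it is; the weighted methods described later refine this idea.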
Load balancers can be implemented using dedicated hardware or ad-hoc software.
In the remainder of this tutorial we deal with configuration and features of the Apache web server’s mod_proxy_balancer, the Apache module developed to provide load balancing over a set of web servers.
The tutorial covers basic installation and configuration under any Linux distribution.
What is mod_proxy_balancer?
mod_proxy_balancer is an Apache module available since Apache 2.1. It turns an Apache installation into a load balancer that retrieves requested pages from two or more backend web servers and delivers them to the client.
One important feature of mod_proxy_balancer is that it can keep track of sessions, so that a given user is always served by the same backend web server (sticky sessions).
Requirements and installation
The module requires:
• an Apache HTTP Server installation version 2.1 or later (at the time of writing, 2.2.15 is the latest version available);
• the mod_proxy module.
In order to install Apache and the required extensions, download the sources from the Apache HTTP Server website, untar the archive and run the following commands from the main directory:
• ./configure --enable-proxy --enable-proxy-balancer [run ./configure -h to list all the available options]
• make
• make install
Basic configuration
We need three servers to run an example: a load balancer (lb.seco.com) and two worker nodes (wn1.seco.com and wn2.seco.com).
From now on we refer to the Apache HTTP Server installation folder on the load balancer host as $APACHE2_HOME (the default location is /usr/local/apache2).
Open the Apache HTTP Server configuration file ($APACHE2_HOME/conf/httpd.conf) and add the following directive:
Include conf/extra/httpd-proxy-balancer.conf
The directive includes an external configuration file (the path is relative to the Apache installation folder) that will hold the balancer configuration.
Now create the httpd-proxy-balancer.conf file in the $APACHE2_HOME/conf/extra folder and add the following lines:

<Proxy balancer://mycluster>
BalancerMember http://wn1.seco.com
BalancerMember http://wn2.seco.com
</Proxy>

ProxyPass /test balancer://mycluster
A cluster named mycluster is created with two members.
The /test URL of the load balancer (i.e. http://lb.seco.com/test) is mapped to the two members.
Requests to the load balancer are forwarded to the two workers alternately.
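To make the URL mapping concrete, the following Python sketch shows how a path under /test could be rewritten to a URL on the chosen member. It is a simplified model written for this tutorial, not Apache's implementation, and the function name map_to_backend is hypothetical.

```python
def map_to_backend(path, prefix, member):
    """Rewrite a request path under `prefix` to a URL on `member`.

    Returns None when the path is not under the prefix, meaning the
    request would not be handled by the balancer mapping at all.
    """
    if path != prefix and not path.startswith(prefix + "/"):
        return None
    # Drop the prefix and append the remainder to the member URL.
    return member + path[len(prefix):]

mapped = map_to_backend("/test/index.html", "/test", "http://wn1.seco.com")
unmapped = map_to_backend("/other/index.html", "/test", "http://wn1.seco.com")
print(mapped)
print(unmapped)
```

In this model a request for http://lb.seco.com/test/index.html ends up as a request for /index.html on one of the workers.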
Load balancer methods
Three load balancing methods are currently available:
• byrequests: weighted request count balancing;
• bytraffic: weighted traffic byte count balancing;
• bybusyness: pending request balancing.
In order to choose the desired method, add a ProxySet line to the balancer definition (i.e. the Proxy tag):

<Proxy balancer://mycluster>
...
ProxySet lbmethod=method
</Proxy>

where method is one of the three listed above. The default is byrequests.
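The idea behind pending request balancing (bybusyness) can be sketched in a few lines of Python. This is a simplified model under stated assumptions: the pending counts are made up, and ties simply go to the member listed first, which may differ from how Apache breaks ties.

```python
def pick_least_busy(pending):
    """Return the member with the fewest in-flight requests.

    `pending` maps a member URL to its number of pending requests.
    Ties go to the member listed first (a simplification).
    """
    return min(pending, key=pending.get)

# Made-up snapshot of in-flight requests per member.
pending = {"http://wn1.seco.com": 4, "http://wn2.seco.com": 1}

chosen = pick_least_busy(pending)
pending[chosen] += 1  # the new request is now in flight on that member
print(chosen, pending)
```

Unlike byrequests, this method reacts to slow backends: a member that is stuck on long-running requests accumulates pending work and is chosen less often.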
A load factor can be applied to each member of the cluster in order to define its share of the load.
In the following example 30% of the requests will be forwarded to the first member, whereas 70% will be forwarded to the second one. The load factor is an integer ranging from 1 to 100.

<Proxy balancer://mycluster>
BalancerMember http://wn1.seco.com loadfactor=3
BalancerMember http://wn2.seco.com loadfactor=7
ProxySet lbmethod=byrequests
</Proxy>
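The 30/70 split can be checked with a small simulation. The sketch below is modelled on the request counting pseudocode in the Apache documentation for mod_proxy_balancer (each worker's status grows by its load factor, the worker with the highest status is chosen, and the total of the factors is then subtracted from it). It is a simplified Python model, not Apache's source, and the names wn1 and wn2 are just labels for the two members.

```python
from collections import Counter

def pick_by_requests(workers):
    """Pick one worker; `workers` maps name -> {"lbfactor": int, "lbstatus": int}."""
    total_factor = 0
    candidate = None
    for name, w in workers.items():
        # Every worker earns credit proportional to its load factor.
        w["lbstatus"] += w["lbfactor"]
        total_factor += w["lbfactor"]
        if candidate is None or w["lbstatus"] > workers[candidate]["lbstatus"]:
            candidate = name
    # The chosen worker pays for the request it is about to serve.
    workers[candidate]["lbstatus"] -= total_factor
    return candidate

workers = {
    "wn1": {"lbfactor": 3, "lbstatus": 0},
    "wn2": {"lbfactor": 7, "lbstatus": 0},
}

counts = Counter(pick_by_requests(workers) for _ in range(1000))
print(counts)
```

Over 1000 simulated requests the first member is chosen 300 times and the second 700 times, matching the 3:7 ratio of the load factors.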

Session management
While balancing the load of a web application, it is possible to implement sticky sessions: all requests from the same user are forwarded to the same member of the cluster.
Balancer members are tagged with a route value as follows:

<Proxy balancer://mycluster>
BalancerMember http://wn1.seco.com loadfactor=3 route=seco1
BalancerMember http://wn2.seco.com loadfactor=7 route=seco2
</Proxy>

Session identifiers are then defined at the application level as the concatenation of a value (independent of the member the client has been assigned to), a dot, and the route value:
SESSION_ID=<value>.<route>
Finally, the cluster URL mapping must be declared as follows:
ProxyPass /test balancer://mycluster stickysession=SESSION_ID
where SESSION_ID is the name of the variable at the application level storing the session identifier.
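The lookup that sticky sessions rely on can be sketched as follows. This Python model assumes that the route is whatever follows the first dot in the session identifier and that the value part contains no dots; the function names and the sample identifier a1b2c3.seco2 are hypothetical.

```python
def route_of(session_id):
    """Return the route suffix of a session identifier, or None if absent."""
    _, dot, route = session_id.partition(".")
    return route if dot and route else None

# Route values taken from the BalancerMember lines of this tutorial.
ROUTES = {
    "seco1": "http://wn1.seco.com",
    "seco2": "http://wn2.seco.com",
}

def member_for(session_id):
    """Return the sticky member for a session, or None to let the balancer choose."""
    return ROUTES.get(route_of(session_id))

sticky = member_for("a1b2c3.seco2")
fresh = member_for("a1b2c3")
print(sticky)
print(fresh)
```

A session identifier without a route, or with an unknown one, yields None, in which case the request would fall back to the normal balancing method.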
Balancer manager
The balancer manager enables dynamic update of balancer members and their load factors. The mod_status module is required.
To enable the manager, add the following lines to the httpd-proxy-balancer.conf file:

<Location /balancer-manager>
SetHandler balancer-manager
Order Deny,Allow
Allow from all
</Location>

These directives enable the manager, which is accessible via browser at http://lb.seco.com/balancer-manager.
The page displays statistics and configuration details, and the settings can be edited there.
