Saturday, December 06, 2008

Load Balance Clustered Ejabberd Servers

I recently completed setting up our XMPP infrastructure. After spending some time reviewing the current capabilities of jabberd2, openfire, djabberd, and ejabberd, I decided that ejabberd had the best combination of features for our needs: virtual hosting, LDAP integration, clustering support, shared rosters, and reasonably good documentation!

So after setting up the first ejabberd node (im1), with a test virtual host and working LDAP integration, I set up our second ejabberd node (im2) by copying /etc/ejabberd/ejabberd.cfg to the second node, then running through the following steps:

  • First, launch an Erlang shell as the ejabberd user with erl -sname ejabberd@im2 -mnesia extra_db_nodes "['ejabberd@im1']" -s mnesia

  • Then, to replicate all ejabberd tables in my configuration, I ran:
    mnesia:change_table_copy_type(schema, node(), disc_copies).
    mnesia:add_table_copy(offline_msg, node(), disc_only_copies).
    mnesia:add_table_copy(privacy, node(), disc_copies).
    mnesia:add_table_copy(sr_group, node(), disc_copies).
    mnesia:add_table_copy(sr_user, node(), disc_copies).
    mnesia:add_table_copy(roster, node(), disc_copies).
    mnesia:add_table_copy(last_activity, node(), disc_copies).
    mnesia:add_table_copy(disco_publish, node(), disc_only_copies).
    mnesia:add_table_copy(pubsub_node, node(), disc_copies).
    mnesia:add_table_copy(pubsub_state, node(), disc_copies).
    mnesia:add_table_copy(pubsub_item, node(), disc_only_copies).
    mnesia:add_table_copy(session, node(), ram_copies).
    mnesia:add_table_copy(s2s, node(), ram_copies).
    mnesia:add_table_copy(route, node(), ram_copies).
    mnesia:add_table_copy(iq_response, node(), ram_copies).
    mnesia:add_table_copy(caps_features, node(), ram_copies).
    mnesia:add_table_copy(motd_users, node(), disc_copies).
    mnesia:add_table_copy(motd, node(), disc_copies).
    mnesia:add_table_copy(acl, node(), disc_copies).
    mnesia:add_table_copy(config, node(), disc_copies).

    After you quit the shell, you'll most likely need to move the resulting Mnesia database files to the ejabberd user's $HOME folder.
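
Typing that many near-identical Mnesia commands by hand is error prone, so here is a minimal Python sketch that generates them from a table-to-storage-type mapping. The mapping mirrors the tables listed above; the names TABLES and mnesia_commands are my own for illustration and are not part of ejabberd. Your table list may differ depending on which modules you have enabled.

```python
# Table name -> Mnesia storage type, copied from the commands listed above.
TABLES = {
    "offline_msg": "disc_only_copies",
    "privacy": "disc_copies",
    "sr_group": "disc_copies",
    "sr_user": "disc_copies",
    "roster": "disc_copies",
    "last_activity": "disc_copies",
    "disco_publish": "disc_only_copies",
    "pubsub_node": "disc_copies",
    "pubsub_state": "disc_copies",
    "pubsub_item": "disc_only_copies",
    "session": "ram_copies",
    "s2s": "ram_copies",
    "route": "ram_copies",
    "iq_response": "ram_copies",
    "caps_features": "ram_copies",
    "motd_users": "disc_copies",
    "motd": "disc_copies",
    "acl": "disc_copies",
    "config": "disc_copies",
}


def mnesia_commands(tables):
    """Return the Erlang shell commands that copy each table to the local node."""
    # The schema has to become a disc copy first so the node keeps its own schema.
    cmds = ["mnesia:change_table_copy_type(schema, node(), disc_copies)."]
    for name, storage in tables.items():
        cmds.append(f"mnesia:add_table_copy({name}, node(), {storage}).")
    return cmds


if __name__ == "__main__":
    # Print one command per line, ready to paste into the erl shell on im2.
    print("\n".join(mnesia_commands(TABLES)))
```

The output can be pasted into the Erlang shell started in the first step.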

    Once both nodes were working correctly, I set up an LVS-DR load balancer with ldirectord. This proved to be rather straightforward.

    First, the realservers (each ejabberd instance, im1 and im2) had to be configured with a local interface that listens on the load balancer's VIP (virtual IP). The most reliable way I found to set this up was with a simple
    ip addr add 172.16.254.60/32 brd + dev lo label lo:vip
    in /etc/rc.local.

    Then I set up /etc/sysctl.d/60-ipvs-arp-rules.conf with
    net.ipv4.conf.eth0.arp_ignore = 1
    net.ipv4.conf.eth0.arp_announce = 2
    net.ipv4.conf.all.arp_ignore = 1
    net.ipv4.conf.all.arp_announce = 2
    On Ubuntu (and I think Debian as well), you must also tweak /etc/sysctl.d/10-network-security.conf to disable source address validation:
    net.ipv4.conf.default.rp_filter=0
    net.ipv4.conf.all.rp_filter=0
    That's pretty much it for the realservers.
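
Since a single wrong ARP or rp_filter value quietly breaks direct routing, it can help to verify the settings on each realserver. Below is a rough Python sketch that parses sysctl-style "key = value" text (for example the output of sysctl -a, or the contents of the two files above) and reports anything that differs from the values in this post. The helper names parse_sysctl and mismatches are hypothetical, and the eth0 keys assume the same interface name used above.

```python
# Values this post sets on each realserver (interface name eth0 is assumed).
EXPECTED = {
    "net.ipv4.conf.eth0.arp_ignore": "1",
    "net.ipv4.conf.eth0.arp_announce": "2",
    "net.ipv4.conf.all.arp_ignore": "1",
    "net.ipv4.conf.all.arp_announce": "2",
    "net.ipv4.conf.default.rp_filter": "0",
    "net.ipv4.conf.all.rp_filter": "0",
}


def parse_sysctl(text):
    """Parse 'key = value' lines, skipping blank lines and # or ; comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


def mismatches(actual, expected=EXPECTED):
    """Return {key: (expected, actual_or_None)} for every setting that differs."""
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }


if __name__ == "__main__":
    import sys

    # Usage: sysctl -a | python check_sysctl.py   (script name is arbitrary)
    for key, (want, got) in sorted(mismatches(parse_sysctl(sys.stdin.read())).items()):
        print(f"{key}: expected {want}, found {got}")
```

An empty result means every value matches; a found value of None means the key was not present in the input at all.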

    Setting up the loadbalancer involves setting up the VIP in /etc/network/interfaces
    auto eth0:vip0
    iface eth0:vip0 inet static
    address 172.16.254.60
    broadcast 172.16.254.60
    netmask 255.255.255.255
    Then setting up ldirectord (apt-get install ldirectord) in /etc/ldirectord.cf with
    # Global Directives
    checktimeout=3
    checkinterval=15
    autoreload=yes
    logfile="/var/log/ldirectord.log"
    logfile="local0"
    emailalert="joel.reed@nvizn.com"
    emailalertfreq=3600
    emailalertstatus=all
    quiescent=yes

    virtual=172.16.254.60:5222
        real=172.16.254.70:5222 gate
        real=172.16.254.72:5222 gate
        scheduler=wlc
        protocol=tcp
        checktype=negotiate
        service=simpletcp
        request="junk"
        receive="jabber.org"
    It'd be really cool if there were some kind of built-in healthcheck call you could make on an ejabberd node, but alas there isn't, so I just send it a string of garbage ("junk" to be exact) and look for the jabber.org string in the XMPP response. Seems to be working OK thus far...
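
To try the same check by hand before trusting ldirectord with it, here is a small Python sketch of what the negotiate check amounts to: connect, send the request string, and look for the expected string in whatever comes back. This is only an approximation of what ldirectord does internally, and the function name xmpp_port_answers and its defaults are my own. The match most likely works because the XMPP stream namespace, http://etherx.jabber.org/streams, appears in the server's reply.

```python
import socket


def xmpp_port_answers(host, port=5222, request=b"junk",
                      expect=b"jabber.org", timeout=3.0):
    """Send `request` to host:port and report whether `expect` is in the reply."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            reply = b""
            # Read until the marker shows up, the peer closes, or 4 KiB is read.
            while expect not in reply and len(reply) < 4096:
                chunk = sock.recv(1024)
                if not chunk:
                    break
                reply += chunk
            return expect in reply
    except OSError:
        # Refused connections and timeouts both count as a failed check.
        return False


if __name__ == "__main__":
    # Realserver addresses taken from the ldirectord.cf above.
    for realserver in ("172.16.254.70", "172.16.254.72"):
        state = "up" if xmpp_port_answers(realserver) else "down"
        print(f"{realserver}: {state}")
```

The 3 second default timeout mirrors checktimeout=3 from the config.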