Americans are Ambitious Wolves

In which a wolf strives toward the moon.

Posted on Wednesday, December 21st, 2005 at 2:49. About Diary, Development Log, Work, Free World.

Someone Asked Me How to Build a Scalable Cluster

Someone asked me how to build a scalable cluster using commodity parts. Commodity clusters are easy if you break them into independently scalable layers.

Cheap scalability means load balancing over commodity components that you can add to a set quickly for roughly linear scaling. The first challenge is where the client traffic comes in the door. If you can’t get them in, you can’t serve them. Every commodity component you add lowers the MTTF of the system as a whole, so your configuration needs to do dynamic failover and rebalancing.
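To make "dynamic failover and rebalancing" concrete, here is a minimal sketch in Python. It is only a model of the bookkeeping, not anything you would put on a real front door, and the `Pool` class and its method names are made up for illustration. The idea is that traffic is spread round-robin over whichever identical nodes are currently healthy, so dropping a dead node or adding a new clone rebalances automatically.

```python
class Pool:
    """Round-robin over whichever identical nodes are currently healthy.

    Hypothetical helper for illustration; a health checker (for example a
    heartbeat daemon) is assumed to call mark_down / mark_up.
    """

    def __init__(self, nodes):
        self.healthy = list(nodes)
        self._next = 0

    def mark_down(self, node):
        # Failover: stop sending traffic to a node that failed its check.
        if node in self.healthy:
            self.healthy.remove(node)

    def mark_up(self, node):
        # Scaling out, or a repaired node rejoining the set.
        if node not in self.healthy:
            self.healthy.append(node)

    def pick(self):
        # Rebalancing falls out of taking the modulus over the live set.
        if not self.healthy:
            raise RuntimeError("no healthy nodes")
        node = self.healthy[self._next % len(self.healthy)]
        self._next += 1
        return node
```

After `mark_down("b")` on a pool of `a`, `b`, `c`, every later `pick()` returns only `a` or `c`, which is the whole point: no node is special, so losing one costs capacity but not service.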

The best way I know to scale your front door is to start with two netfilter firewalls sharing a MAC address, and to balance load between them with MAC-layer filtering rules. It’s pretty easy to plug in additional firewall transit capacity and to script in failover using a heartbeat daemon. You can do firewalls in failover pairs more quickly and easily than you can do odd-numbered rings, but both are quite doable with relatively straightforward scripting and configuration.
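The shared-MAC trick is easier to see as a decision rule than as firewall configuration. The sketch below is a Python model of that rule under my own assumptions, not netfilter syntax: every firewall sees every frame, hashes the source address into a bucket, and accepts the frame only if it currently owns that bucket. The function names, the CRC32 hash, and the bucket count are all illustrative choices.

```python
import zlib


def bucket(src_ip, n_buckets):
    """Map a source address to a stable bucket number."""
    return zlib.crc32(src_ip.encode()) % n_buckets


def assign(n_buckets, live_firewalls):
    """Deal the buckets out round-robin to the firewalls that are alive."""
    return {b: live_firewalls[b % len(live_firewalls)]
            for b in range(n_buckets)}


def accepts(me, src_ip, owners):
    """True if firewall `me` should handle traffic from src_ip."""
    return owners[bucket(src_ip, len(owners))] == me


# Normal operation: two firewalls split the buckets between them.
owners = assign(4, ["fw1", "fw2"])

# Failover: the heartbeat script notices fw2 is gone and reassigns,
# so fw1 now accepts every bucket.
owners_after_failure = assign(4, ["fw1"])
```

Because exactly one firewall owns each bucket, each client is handled once, and failover is just a matter of recomputing the ownership table on the survivors.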

I strongly recommend against breaking your traffic into categories, like static pages, etc., and balancing load by moving different categories to different servers. If you do that, you end up with way too much hardware underloaded, way too much hardware overloaded, and either no failover provisioning or a very complex failover configuration. Instead, make the individual servers identical and cheap. Just add more clones to the pack as needed, and keep the traffic balanced.
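A bit of back-of-the-envelope arithmetic shows why. The Python sketch below counts servers for the two approaches, assuming one spare per group for failover. The load figures are invented for the example (each is a percentage of what one server can handle), and the function names are mine.

```python
import math

# Made-up loads, as a percentage of one server's capacity.
loads = {"static": 40, "dynamic": 230, "images": 120}


def servers_partitioned(loads, spares_per_group=1):
    """Each category gets its own servers, rounded up, plus its own spare."""
    return sum(math.ceil(pct / 100) + spares_per_group
               for pct in loads.values())


def servers_pooled(loads, spares=1):
    """Identical clones share all the traffic and share one spare."""
    return math.ceil(sum(loads.values()) / 100) + spares
```

With these numbers the partitioned layout needs 9 servers and the pooled layout needs 5. The rounding loss and the spare are paid once per category in the first case and once overall in the second.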

By this time you’re starting to see my basic approach to scalable commodity ’nix clusters. See this lame ASCII art for detail. It amounts to a series of independently scalable layers: firewalling, app serving, DB caching, and DB serving. One advantage of this scheme is that each functional layer is scaled independently, so you don’t need to operate excess capacity in any of the functional areas. Another advantage is that each pair of adjacent layers can communicate on its own network, which separates the traffic and prevents DB traffic from interfering with app-server (AS) traffic, for example. The networks can be scaled independently as well, and performance adjusted to the requirements of the application, even when those requirements change, without allocating resources to any area where they are not needed.

The memcached layer is indicated if you have a lot of read-only DB traffic. These nodes are cheap and don’t even really need hard drives. You could boot them off of CD or off the network, diskless. Give them as much RAM as they can hold. The number of MC servers required depends strongly on how much RAM each can hold, but the amount of RAM required per DB node depends on the characteristics of your application’s DB traffic.
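Here is a rough Python sketch of what the caching layer does for read-heavy traffic, plus the sizing arithmetic. It does not use a real memcached client: a plain dict stands in for the cache, and `ReadThroughCache`, `load_from_db`, and `mc_nodes_needed` are names I made up for the example.

```python
import math


class ReadThroughCache:
    """Serve reads from the cache and fall back to the DB on a miss."""

    def __init__(self, cache, load_from_db):
        self.cache = cache                # dict standing in for memcached
        self.load_from_db = load_from_db  # callable: key -> value
        self.db_reads = 0                 # how often we had to hit the DB

    def get(self, key):
        if key in self.cache:
            return self.cache[key]
        self.db_reads += 1
        value = self.load_from_db(key)
        self.cache[key] = value
        return value

    def invalidate(self, key):
        # Call this after a write so the next read refetches from the DB.
        self.cache.pop(key, None)


def mc_nodes_needed(hot_data_gb, ram_per_node_gb):
    """How many cache nodes it takes to hold the hot data set in RAM."""
    return math.ceil(hot_data_gb / ram_per_node_gb)
```

Reading the same key twice costs one DB read, not two, and that ratio is what decides whether the layer is worth having. For sizing, 10 GB of hot data on nodes with 4 GB of usable RAM each works out to 3 nodes; how big the hot set is depends entirely on your application.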

I’d rather install a memcached server and keep a hot DB spare than try to maintain transparent failover on a DB cluster. Coherence requirements complicate the performance curves when you have multiple DBs accepting write operations, which can lead to unpleasant surprises. Delay scaling your DB cluster as long as you can.

Better still is using C-JDBC to architecturally separate the DB layer. It offers more flexibility in the DB backend, and places less onerous software development demands on the AS layer. You don’t have to use a Java AS in order to use C-JDBC, either.
