Scaling my file server with my audience

Colin Dermott colindermott at
Wed Jan 17 18:28:42 CST 2007

I have been running an HTTP download site for free software on a single
PowerEdge server for the past 13 months and it has now grown to the
stage where I need to scale up, but I can't actually scale this
machine any further (vertically).

My first instinct was to find another machine of similar specs, get
another RAID array and mirror my system entirely, then balance the two
with round-robin DNS.  This is affordable but I'm not convinced it is
the most cost-effective solution, nor an efficient use of the resources.  It
would also become increasingly difficult to manage with the addition
of each node.

Right now I deliver files directly to the client from direct-attached
RAID storage.  My file set is large and growing and I have plenty of
room to scale there by simply adding more PowerVaults.  What I am
yet to settle on is a method of scaling my *delivery capacity* as my
audience expands.

What are some common ways of going about this?  Whilst my file set is
quite large, the hot set is relatively small but will also grow.

I am imagining I would place a cluster of reverse-proxy servers in
front of my current server, effectively making it the "origin" server.

The proxy servers would each have a fraction of the disk space of my
current server, just enough to handle my hot set of files.  Perhaps a
couple of small 15KRPM SAS disks in RAID-0.
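
As a rough illustration of what "just enough to handle my hot set" could
mean on each proxy, here is a minimal Python sketch of a size-bounded,
least-recently-used file cache.  The class name HotSetCache, the
capacity figure and the fetch_from_origin callback are all hypothetical,
and a real proxy would keep files on disk rather than in memory; this
only shows the eviction idea, not any particular product's behaviour.

```python
from collections import OrderedDict


class HotSetCache:
    """Hypothetical LRU cache bounded by total bytes, not by file count."""

    def __init__(self, capacity_bytes, fetch_from_origin):
        self.capacity = capacity_bytes
        self.fetch = fetch_from_origin   # assumed callable: path -> bytes
        self.files = OrderedDict()       # path -> data, least recent first
        self.used = 0                    # bytes currently cached

    def get(self, path):
        if path in self.files:
            # Cache hit: mark as most recently used, serve locally.
            self.files.move_to_end(path)
            return self.files[path]
        # Cache miss: go back to the origin server.
        data = self.fetch(path)
        if len(data) <= self.capacity:
            self.files[path] = data
            self.used += len(data)
            # Evict the coldest files until we fit again.
            while self.used > self.capacity:
                _, old = self.files.popitem(last=False)
                self.used -= len(old)
        return data
```

With a cache like this, files outside the hot set simply pass through to
the origin, while popular files settle on the proxies' local disks.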

Each new request should be forwarded to the proxy server bearing the
least load at the time of the request (using something like
mod_backhand for Apache 1.3).  On a cache miss, the proxy would fetch
the file from the origin server, cache it, and deliver it to the client.
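
The dispatch rule described above can be sketched in a few lines of
Python.  This is only an illustration under my own assumptions: "load"
is taken to mean the number of requests in flight, the proxy names are
made up, and this is not a description of how mod_backhand actually
chooses a backend.

```python
from dataclasses import dataclass


@dataclass
class Proxy:
    name: str        # hypothetical host name
    active: int = 0  # requests currently in flight (assumed load metric)


def pick_least_loaded(proxies):
    """Return the proxy with the fewest in-flight requests.

    Ties go to whichever proxy appears first in the list.
    """
    return min(proxies, key=lambda p: p.active)


def dispatch(proxies):
    """Choose a proxy for a new request and count the request against it."""
    proxy = pick_least_loaded(proxies)
    proxy.active += 1
    return proxy


def finish(proxy):
    """Call when a request completes so the proxy's load drops again."""
    proxy.active -= 1


proxies = [Proxy("proxy1", 3), Proxy("proxy2", 1), Proxy("proxy3", 2)]
chosen = dispatch(proxies)
print(chosen.name)
```

A real balancer would likely weigh other signals as well (CPU, disk or
network saturation), but the selection step has the same shape.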

If anyone can offer me some advice, or at least can tell me if I'm on
the right track, I would really appreciate it.



More information about the Linux-PowerEdge mailing list