Tuesday, November 1, 2011

Building a SAN for backup and remote replication, part 1

I've often said that any idiot can build a computer and a lot of idiots do. Likewise, it is remarkably easy to build a SAN from off-the-shelf parts and open-source software, but it’s much harder to build one that works well. This series documents what I learned – and the mistakes I made – while designing and building an inexpensive iSCSI SAN solution for backup and remote replication.

A client wanted to create a disaster recovery backup system and replicate it offsite. The client had too much data for an ad-hoc solution but was too small to afford the often breathtaking prices of replication solutions from vendors like EMC. They were already replicating some data using a pair of StoreVault S500s, but those units were flaky and difficult to manage.

I designed a pair of SANs that met the following requirements:

1) Least cost. I needed the lowest possible cost while meeting the system’s functional requirements.

2) Replication. I needed to replicate data from a local device to a remote device, over a slow and insecure Internet connection.

3) Data integrity. Loss of data should be extremely unlikely.

Explicitly absent from my list of requirements were:

1) High performance. This was a backup target used by a single computer. It did not need to be fast.

2) Maximum uptime. I actually ended up with a system that has good uptime, but it wasn’t something I focused on.

Some terminology

If you’re new to iSCSI, you’ll need to know some terms. An iSCSI target is the computer that holds the actual storage. This is the SAN. An iSCSI initiator is the computer that accesses the storage. The initiator pretends to have a SCSI controller card, the network pretends to be a SCSI cable, the target pretends to be one or more SCSI drives, and everything works great until the network fails.

Picking the software, or why OpenFiler sucks

Microsoft offers an iSCSI target, free with the purchase of Windows Server. Windows Server 2008 starts at around $800.

Linux and BSD offer iSCSI targets, free with the download of your favorite distribution. There are even a few distributions that ship with an iSCSI target built in and ready to run. One such distribution is OpenFiler.

It has been about a year since I evaluated OpenFiler, so maybe things have changed since then. OpenFiler is a general-purpose Linux-based file server distribution. It has a web-based GUI. I found that it had issues:

  • The port for the GUI is 446, instead of the standard port 443. Why? This is a single-purpose server; I can’t imagine what other website they would expect to serve. Fortunately, a scholar and a gentleman by the name of Justin J. Novak published some simple commands to switch the GUI to port 443.
  • Whoever(s) assembled the GUI focused on functional groupings rather than use-case scenarios. To set up iSCSI I had to construct a RAID array, allocate an LVM volume, create an iSCSI target volume, set up initiator authentication, and set up target volume authentication – all on different tabs, sub-tabs, and sections. Oh, yes, this is how things were arranged in OpenFiler – main tabs along the top of the page, sub-tabs below them, and sections, not below the sub-tabs as sub-sub-tabs, but as menus along the left-hand side of the page. And sometimes little popup windows demonstrating some web developer’s l33t coding skillz. Navigating OpenFiler’s GUI was an exercise in confusion.
  • Fortunately, if you wanted to do anything the least bit out of the ordinary with OpenFiler, you needed to resort to the command line. This included replication, which was touted as a feature of OpenFiler but was completely unsupported by the GUI. In fact, to get replication to work, you needed to hack the Linux boot script (for Windows users, this is the glorified equivalent of autoexec.bat). I had to go even further and manually alter the order in which daemons loaded, since LVM kept taking control of my replication volume.
  • It was difficult to add packages to OpenFiler. I tried and failed to install various VPN packages, finally concluding that OpenFiler and/or rPath Linux (the base distribution) were overtly hostile to customization.

Eventually I entirely abandoned the GUI, doing everything from the command line. At that point I realized that there was no point in using OpenFiler at all.

Instead, I fell back on my favorite server distribution: CentOS. To be fair, it’s the only server distribution I use, but it works great. On top of CentOS, I installed iSCSI Enterprise Target (iet), Distributed Replicated Block Device (DRBD), and the not-so-free replication helper drbd-proxy. I’ll get to those, but in the next article I want to talk about hardware, which brings us back to the beginning.
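
To give a rough idea of what the command-line route involves, here is a minimal Python sketch that renders one target stanza in the style of IET's ietd.conf file. Everything specific in it is an assumption for illustration and not taken from the actual build: the IQN, the LVM device path, the user name, the secret, and the helper function render_iet_target itself are all hypothetical.

```python
def render_iet_target(iqn, device, user=None, secret=None, lun=0, io_type="blockio"):
    """Return one ietd.conf-style stanza exporting `device` as a LUN of target `iqn`.

    All argument values are supplied by the caller; nothing here talks to a
    real iSCSI target. The function only builds the text of the stanza.
    """
    lines = [f"Target {iqn}"]
    # Initiator authentication (CHAP) is optional; emit it only if both parts are given.
    if user and secret:
        lines.append(f"    IncomingUser {user} {secret}")
    # One LUN backed by a block device, e.g. an LVM logical volume.
    lines.append(f"    Lun {lun} Path={device},Type={io_type}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(
        render_iet_target(
            iqn="iqn.2011-11.com.example:backup",
            device="/dev/vg0/backup",
            user="backupuser",
            secret="notarealsecret",
        ),
        end="",
    )
```

The point of the sketch is that the five separate GUI steps collapse into a few lines of text: one target name, one optional credential line, and one line mapping a LUN to a block device.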
