
When sites go bad

Posted: 1st October 1999 | Filed under: Press Articles, Technical
Author: Paul Ockenden
First Appeared in PC Pro 1999

As we write this column, news has just come in that the giant US-based on-line auction site, eBay (www.ebay.com), was out of action for around 22 hours, allegedly due to technical difficulties. The outage wiped around $3.7 billion (£2.3bn) off the company's market capitalisation, and we'd hate to guess at the knock-on effects suffered by the multitude of companies that have come to rely on eBay as their main sales outlet. Once upon a time, such news might have been reported in the specialist IT press, or perhaps one of the financial papers, but that was then and this is now. The Internet has gone mainstream, and the eBay mishap was deemed so newsworthy that it actually made it into the pages of The Sun, although the paper missed a fine opportunity to headline it 'Net Clanga Costs Lottsa Wonga'. How long before we see The World's Worst Web Disasters: 2 on Sky One?

As so often happens when disaster strikes, forums like Cix and the Web-related newsgroups instantly lit up with armchair 'experts' telling us all how stupid the eBay systems people must be, and how, if they'd been running things, the site would never have gone off air in the first place. Many of these experts would also, apparently, have been quite capable of getting eBay up and running again within seconds. Yeah, right.

These are the same bunch of people who endlessly moan about every popular software package, but who, despite their obvious 'expertise', never seem to get around to writing anything better themselves. The same people who pontificate about the rules of global corporate finance and yet despite their financial acumen never seem to get rich quickly - or even slowly, for that matter. And these are the same people who'll criticise almost any form of marketing, but whose only experience of selling things involves asking people, 'Do you want fries with that?'

The problem with IT disasters is that they're almost always unforeseen. No matter how good your backups are, how multiply redundant your hardware is, or how rigorously tested your software, there'll always be something unanticipated that can and does go wrong. It's important not to get complacent: by accepting that there's always a chance your Web site will go wrong, you can start planning for what to do when that day comes.

While you can obviously try to minimise the chances of something nasty happening, you should also remember your boy scout days and 'be prepared'. You need to have tools, systems and procedures in place to get things up and running as quickly as possible.

Backups are obviously essential: in fact they're the key to getting your site, erm, back up. You'll want a combination of rapid-access, on-site archives - most useful for those 'oops, I've just deleted the most important file in the system' moments - as well as secure off-site backups, which are there for those 'oops, I've just burned down the building' moments. For on-site backup the easiest thing to do is to take regular copies of your important files, doing a full directory copy on to a spare drive. This is easy to do with most Unix-based systems, but NT users might have more difficulty automating things, because there's no immediately obvious way to get a date into a directory name. The following commands should do the trick with NT 4:


for /f "tokens=2" %i in ('date /t') do set dte=%i
xcopy /c/e/i d:\inetpub\wwwroot e:\backups\www%dte:~6,4%%dte:~3,2%%dte:~0,2%

This code first extracts the date into an environment variable, and then pulls out the various components of the date to make a new directory name - you can't just use the variable dte directly because the date will contain slash characters, which would confuse xcopy into thinking that they're command switches.
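If the %dte:~x,y% syntax looks cryptic, it's NT's substring expansion: the first number is the position to start from (counting from zero) and the second is how many characters to take. As a worked example, suppose date /t has returned something like 'Fri 01/10/1999' - the 1st of October 1999 in dd/mm/yyyy format - so that dte holds 01/10/1999. The pieces then expand as follows:

%dte:~6,4%   gives  1999  (four characters from position 6: the year)
%dte:~3,2%   gives  10    (two characters from position 3: the month)
%dte:~0,2%   gives  01    (two characters from position 0: the day)

giving a destination directory of e:\backups\www19991001. Building the name year-first like this also means the backup directories sort neatly into date order, which is handy when you're hunting for a particular day's copy.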

Of the genuine switches in this listing, the /c tells xcopy to continue even if it finds an error (you're bound to hit the odd locked file when doing this), /e forces it to copy the complete directory structure, and /i gets rid of the 'is the destination a file or a directory?' message. A couple of other points to note: first, this code expects to see the date in dd/mm/yyyy format, so you'll need to edit it slightly if you have a different format in your Control Panel settings. Second, when you come to run this script from a batch file, you'll have to replace each %i with %%i. Thanks to the resourceful John Rennie (jrennie@cix.co.uk) for the ideas that inspired this backup technique.
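Put into a batch file ready for scheduling - with the %i doubled up as just described, and still assuming the dd/mm/yyyy date format and the d: and e: paths used above - the whole thing might look something like this:

@echo off
rem Copy the web root into a new directory named after today's date
for /f "tokens=2" %%i in ('date /t') do set dte=%%i
xcopy /c/e/i d:\inetpub\wwwroot e:\backups\www%dte:~6,4%%dte:~3,2%%dte:~0,2%

Point the AT command (and the Schedule service) at it to run in the small hours each night and you'll build up a rolling set of dated snapshots without having to remember to do anything.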

For off-site backups to play a useful part in your disaster recovery plans, don't forget that you'll probably need to be able to restore your tapes on to someone else's machine - usually one belonging to an ISP or hosting service - in order to get your site back up as quickly as possible. This will probably determine your choice of backup software, as there's no point in using some super-dooper new backup product if it generates tapes that no-one else can read.

For Linux or other systems with a Unix flavour, your best option is probably to write TAR archives directly to your tape drive, so you can be sure that just about anyone will be able to read them. There are software tools available that'll automate this, our current favourite being Amanda (www.cs.umd.edu/projects/amanda/).
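As a rough sketch of the manual version - the web root and tape device here are just examples, so substitute your own - it really is as simple as:

tar cvf /dev/st0 /home/httpd/html

and then, on whichever machine you end up restoring to:

tar tvf /dev/st0
tar xvf /dev/st0

The first command lists the archive contents so you can check you've got the right tape, and the second extracts them. Because it's plain tar, just about any Unix box with a compatible tape drive can do the job.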

With NT the obvious candidate is the standard NT backup software, although this can be limiting when it comes to backing up SQL servers or other machines over a network. For these, you'll probably have to buy a software product, and obviously it makes sense to pick the one that's most likely to be in use at your disaster partner's site. In our experience this will probably be Backup Exec, formerly produced by Seagate Software but now under the wing of Veritas (www.veritas.com/products/bent/index.html). Although there are a number of other backup products available, we're starting to see Backup Exec emerge as almost an industry standard.
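For a straightforward local backup to tape, though, the bundled NT Backup will do the job from the command line. From memory the invocation is something along these lines - the path, description and switches are just an illustration, so do check them against the online help before trusting your site to them:

ntbackup backup d:\inetpub\wwwroot /v /d "Web site files" /b

where /v verifies the backup after writing, /d attaches a description and /b throws in a copy of the local Registry for good measure.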