DOS Site Monitoring

Posted: 1st January 2000 | Filed under: Press Articles, Technical
Author: Paul Ockenden
First Appeared in PC Pro 2000

All of our clients are lovely, but sometimes they can be difficult, and this month one lot in particular has been more difficult than most. It wanted a machine set up on its LAN to monitor its intranet - making sure the server was up, checking for script errors and so on. This is the kind of thing we've done many times before, and normally we'd just set up a Linux box with some appropriate scripts. But oh no, this client didn't want Linux on its site - it had to be NT.

Now you know that's silly, we know that's silly, and even the client knows it's silly deep down, but it has its procedures and these say NT only. What's more, because of this client's 'controlled IT environment' (its words) we weren't allowed to install any software on the machine. We went through the obvious discussion to the effect of 'what's the point of having a computer if you can't install software on it', but its IT people wouldn't budge. Actually, it did allow us to copy EXE files into a single directory on the machine and run those, but it wouldn't allow anything that touched the Registry, or made any changes to the %systemroot% directory, or installed any services. The more this conversation progressed, the more it seemed that this IT department was testing us, or setting an impossible task to make us look stupid in front of its directors.

Challenge accepted!

The first problem was that the client had decided that it wanted its site checked every five minutes. Now remember that we weren't allowed to install any services, so there was no way that we could install a third-party scheduler, and we were stuck with NT's default 'AT' task-scheduling service. If you've ever played with this you'll know that it has a user interface from hell - unless you have IE 5 installed on the machine. This particular machine was running IE 4. The normal mode of operation for the standard task scheduler is via the DOS prompt, and although there's a GUI front end to it in the NT Resource Kit, it has exactly the same limitations. The fundamental problem is that this scheduler assumes that you'll be setting up one-off events, and there's no way to tell it to schedule a task every five minutes; to achieve such a periodicity you need to add 288 separate events to cover a whole day.

To the rescue in fulfilling this nightmare task comes the humble DOS prompt and the FOR command. In truth, the FOR command is anything but humble - it can pull off some very clever tricks (such as that date manipulation we used in a backup routine in issue 60). In this case, we make use of the /L switch and use a pair of nested FOR statements. Type the following code, all on one line, and save it as chron.bat:

FOR /L %%x IN (0,1,23) DO FOR /L %%y IN (0,5,55) DO AT %%x:%%y /EVERY:monday,tuesday,wednesday,thursday,friday,saturday,sunday "c:\monitor\monitor.bat"

Now when you execute chron.bat it will add all 288 events required to execute C:\MONITOR\MONITOR.BAT every five minutes.
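Before letting chron.bat loose on the real scheduler, it may be worth doing a dry run. The sketch below is the same pair of nested loops with ECHO placed in front of AT, so each command is printed rather than executed. The file name dryrun.bat is our own invention for illustration.

```batch
@echo off
rem dryrun.bat (hypothetical name): print the AT commands without running them.
rem The outer loop counts hours 0 to 23, the inner loop minutes 0 to 55 in steps of 5.
rem Remove the ECHO only once the printed commands look right.
FOR /L %%x IN (0,1,23) DO FOR /L %%y IN (0,5,55) DO ECHO AT %%x:%%y /EVERY:monday,tuesday,wednesday,thursday,friday,saturday,sunday "c:\monitor\monitor.bat"
```

Running dryrun.bat | find /c "AT " should report 288, which is 24 hours multiplied by 12 five-minute slots. The printed list also shows the exact time strings that will be handed to AT (0:0, 0:5 and so on). After the real run, typing AT on its own lists the scheduled jobs, and AT /DELETE /YES clears them all if you need to start again.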

We're getting there, as we now have the hook to hang our site monitoring system from. The next question is, how are we going to check that the site is actually working? The first thing we need to do is to try and grab a page, and in an ideal world we'd be able to say:

copy http://www.mysite.com/default.asp fred.txt

and the file system would take care of things for us. Unfortunately in the real world, life isn't quite as easy. Luckily, help was at hand. If you thumbed through your back issues when we mentioned issue 60 earlier, you'll have seen that the date-manipulation routine was inspired by some original work by John Rennie, and it's John who came to the rescue again this time. Take a look at his utilities site at www.winsrc.freeserve.co.uk and click through to the index, where you'll see a number of very useful tools and hacks. The one that we need now is called wwwget, an incredibly useful 36Kb chunk of code that simply grabs a file from a Web server and dumps it onto your local disk. Of course, because it's grabbing that file via HTTP, any script within the page will be executed as well.

The intranet site that we were to monitor had been built using standard ASP on an NT box, and as anyone who develops sites using this environment will know, IIS can get a bit temperamental after a certain number of script errors. There seems to be no way to predict what this number is, but you can be sure that if there are any VBScript errors within your site, then after a while you'll get the dreaded 'ASP 0115' error and your site will stop working. So that's the first thing to check for. We'll also check that the server is actually up and running, and finally we'll look for some text that we know should be on the page. Failure notification for any of these events has to be via an email message, delivered through SMS to a mobile phone.

Under normal circumstances we'd have used something written using the Windows Scripting Host, and using the CDO object installed with the NT 4 Option Pack to send the email. In this instance we were restricted to using simple batch files, but with a bit of inelegant coding, standard batch and DOS commands are more than adequate for the searching task. What they can't do is send the email, but remember how the client said that we could use standalone EXE files? Well, luckily our old favourite Blat (http://gepasi.dbs.aber.ac.uk/softw/Blat.html) fitted the bill there.

Inside monitor.bat

So what does this wonderful batch file look like then? First, we make sure that we're sitting in the right directory:

c:
cd \monitor

We then delete any existing content file:

del content.txt

and use wwwget to grab a copy of the main content page from the intranet:

wwwget -a http://www.intranet.site/main.asp content.txt myusername mypassword

Now we use the standard FIND command to check the content. First we just check for a space character. If the intranet had been unreachable then content.txt would be an empty file and this find would fail:

find " " content.txt
goto d%ERRORLEVEL%

The GOTO command sends the script off to :d1 if the space wasn't found, where we use blat to send an email, or on to :d0 if it was:

:d1
echo ERROR FOUND
blat content.txt -server mail.isp.co.uk -t user@sms.genie.co.uk -s content_blank
goto g0
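The trick of gluing %ERRORLEVEL% onto a label name is easier to see in isolation. The following is a minimal, self-contained sketch of the same dispatch. The file sample.txt and the labels r0, r1 and done are made up for this illustration, and it assumes FIND returns 0 when the text is found and 1 when it isn't.

```batch
@echo off
rem Create a small throwaway file to search (sample.txt is a hypothetical name).
echo hello world> sample.txt

rem FIND sets ERRORLEVEL; its normal output is discarded.
find "hello" sample.txt > nul

rem The exit code is appended to "r", so this jumps to :r0 or :r1.
goto r%ERRORLEVEL%

:r0
echo found
goto done

:r1
echo not found
goto done

:done
del sample.txt
```

As written this prints 'found'; change "hello" to a word that isn't in the file and it prints 'not found' instead. One thing to bear in mind with this style is that there has to be a label for every exit code that can turn up. If the command ever returned a code with no matching label, the GOTO would fail and the batch file would stop.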

If the space was found, the next thing we search for is the word 'error'. We know that this wouldn't normally appear in the main content page, except when something has caused the dreaded 'ASP 0115' problem to occur:

:d0
find "error" content.txt
goto f%ERRORLEVEL%
:f0
echo ERROR FOUND
blat content.txt -server mail.isp.co.uk -t user@sms.genie.co.uk -s content_error

Then finally we check for the word 'optimised'. This is because we know that the very last thing, in tiny letters at the bottom of this content page, is a line saying 'This design is optimised for 800 × 600 screens'. We know that if the word 'optimised' isn't found then there has probably been an error somewhere on that page:

:f1
find "optimised" content.txt
goto g%ERRORLEVEL%
:g1
echo ERROR FOUND
blat content.txt -server mail.isp.co.uk -t user@sms.genie.co.uk -s content_incomplete
:g0
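For comparison, the same three checks can also be written with IF ERRORLEVEL instead of computed labels. This is only an alternative sketch, not what we installed at the client. It assumes content.txt has already been fetched into the current directory, it echoes a message where the real script calls blat, and the labels blank and end are our own.

```batch
@echo off
rem "IF ERRORLEVEL 1" is true for any exit code of 1 or higher,
rem so a failed search is caught whatever non-zero code comes back.

rem Check 1: no space character means an empty (or missing) page.
find " " content.txt > nul
if errorlevel 1 goto blank

rem Check 2: the word "error" should not appear on a healthy page.
find "error" content.txt > nul
if not errorlevel 1 echo ERROR FOUND: page contains the word error

rem Check 3: the word "optimised" should appear at the foot of the page.
find "optimised" content.txt > nul
if errorlevel 1 echo ERROR FOUND: page looks incomplete
goto end

:blank
echo ERROR FOUND: content.txt is empty or missing

:end
```

The searches here are case-sensitive, just as they are in monitor.bat. Adding the /I switch to FIND makes a search ignore case, which might be worth considering if the text you're looking for could be capitalised differently.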

So there you have it. One of the most inelegant solutions to site monitoring that you'll ever see, but one that we're proud of because despite all the obstacles thrown in our path by our client's IT people, we still managed to produce a solution that worked. We've since extended this monitoring system to check for free disk space on the intranet server, and again it's all done using DOS commands. If you want to know more about how we did this, drop Paul an email.