What the heck is an outage, anyway?

by David Young
Published: July 15, 2011

Don’t look at me like that. The question gets sort of interesting, if you’ll just stare at it (instead of me) long enough.

Still not interesting? Well, keep trying.

…I swear, some people just have no attention span at all.

All right, I admit it: I didn’t think there was much to it, way back when we decided to mess with this stuff. So, what are we talking about?

If my website goes down, I want to know about it. I might need to take some immediate action, or I might just want to record how often it happens to inform some future decision. So, being a software company after all, we got together with some potential customers and created an online app that’ll tell us what we need to know.

Like hell.

Scarecrow gives it a good shot. It checks fairly regularly to see whether a site is responding. It tries three times before it decides there may be a problem. If something’s going on, it checks the site from several different locations. It stores information about outages in a database.
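For the curious, the "try three times before you worry" idea can be sketched in a few lines of Python. This is a minimal illustration, not Scarecrow's actual code: the names `confirm_outage` and `http_probe`, the 10-second timeout, and the rule that any exception counts as a failed attempt are all assumptions made up for the example.

```python
import time
import urllib.request
from typing import Callable


def http_probe(url: str, timeout: float = 10.0) -> bool:
    """Hypothetical probe: True if the URL answers with a non-error status.

    urlopen raises for 4xx/5xx responses and for network failures, so any
    exception propagates to the caller, which treats it as a failed attempt.
    """
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return 200 <= resp.status < 400


def confirm_outage(probe: Callable[[], bool],
                   attempts: int = 3,
                   delay_seconds: float = 0.0) -> bool:
    """Return True only if every one of `attempts` probes fails."""
    for attempt in range(attempts):
        try:
            if probe():
                return False  # one success is enough to call the site up
        except Exception:
            pass  # assumption: any probe error counts as a failed attempt
        if attempt < attempts - 1 and delay_seconds > 0:
            time.sleep(delay_seconds)  # pause before retrying
    return True
```

A caller would pass something like `lambda: http_probe("https://example.com")` as the probe. Only when `confirm_outage` returns True would it make sense to ask agents in other locations for a second opinion.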

We combined a couple of apps, actually. The first just checked to be sure a website was running. The second used SFTP/FTPS/FTP to look at actual files on a web server, and notified customers if something changed (backups/restores came later). Then, as it became more common for content to live in a database, we started checking for specific words at specific URLs. Then, even later, we improved support for backing up databases. And restoring them. Right from Scarecrow’s Dashboard.

Sounds pretty good. So how come I’m still not happy?

Because there are things we call “disputed” outages. Suppose a Scarecrow agent in Los Angeles says a site’s down. Say Chicago agrees. But Dallas says it’s up. What’s that all about?

Simple: the web server itself is running (at least intermittently…oh, let’s assume the geographical stuff is real just for a minute). I therefore conclude that some internet problem that doesn’t affect Scarecrow’s agent in Dallas is getting in the way.
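Here's roughly what that reasoning looks like as code. It's a sketch under assumptions: the function name `classify_reports`, the three labels, and the idea that each agent reports a plain True/False are invented for illustration and say nothing about how Scarecrow stores its results.

```python
from typing import Mapping


def classify_reports(reports: Mapping[str, bool]) -> str:
    """Combine per-location results into 'up', 'down', or 'disputed'.

    `reports` maps an agent location to True if that agent reached the site.
    """
    if not reports:
        raise ValueError("need at least one agent report")
    reachable = sum(1 for ok in reports.values() if ok)
    if reachable == len(reports):
        return "up"
    if reachable == 0:
        return "down"
    # Some agents got through and some did not: the server is probably
    # running, and something between it and the failing agents is not.
    return "disputed"


print(classify_reports({"Los Angeles": False, "Chicago": False, "Dallas": True}))
```

With Los Angeles and Chicago failing and Dallas succeeding, the result is "disputed" rather than "down".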

So, no problem, right? The server’s up, and that’s what I (as a business owner) am paying for, and the rest is kinda like mosquito bites in Alaska: part of the price of doing business in the digital age (if you’re writing an article from an off-the-grid cabin via laptop, tethered iPhone & two-stroke generator, that is).

But. What if the mysterious routing problem is caused by my internet hosting provider? Or if not caused by them, what if it’s caused by their provider? What if it all appears fine to me, but it’s not fine for people using another ISP? What if the problem is actually DNS-related, so customers can’t find my site even though it’s up and running? What if it happens a lot?

So a lot of disputed outages can mean something. After staring at the data we’ve gathered, and talking to customers about what they need, we decided we needed to make specific error messages (rather than binary up/down info) available when customers download lists of their outages.
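To make "specific error messages rather than binary up/down" a little more concrete, here is one way a checker might sort failures into buckets. The bucket names and the function `classify_error` are hypothetical, and the mapping covers only a handful of standard-library exceptions; real error lists would surely be longer.

```python
import socket
import urllib.error


def classify_error(exc: BaseException) -> str:
    """Map an exception from a failed check to a coarse, hypothetical label."""
    # HTTPError subclasses URLError, so it has to be tested first.
    if isinstance(exc, urllib.error.HTTPError):
        return f"http_{exc.code}"  # the server answered, but with an error
    if isinstance(exc, urllib.error.URLError):
        reason = exc.reason
        if not isinstance(reason, BaseException):
            return "unreachable"  # reason was only a text message
        exc = reason  # unwrap the underlying socket-level error
    if isinstance(exc, socket.gaierror):
        return "dns"  # the hostname could not be resolved
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if isinstance(exc, ConnectionRefusedError):
        return "connection_refused"
    return "unknown"
```

A "dns" label on an outage tells a very different story from "http_503", even though both would show up as "down" in a binary report.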

That helped, some. Scarecrow also gives a best-guess suggestion about what to do with various types of outage.

But what do we really want? Lots more data! We want to tie outage information from Scarecrow’s agents to a map of the internet, so our customers can figure out whether an outage was limited to their server, their hosting provider, or the state of Kansas.

We want, eventually, to publish information about hosting providers and what they really offer their customers. I’ll bet some of them wouldn’t like it. Others might be glad somebody finally noticed the differences between them and their competitors.

Scarecrow in general is about improving transparency, making commercial website development and hosting less of a black box to business owners. So it fits the mission, and all, but we’re nowhere near ready to publish this kind of information. (Nor do we think it’s okay to do so without asking our customers’ permission.)

It’s more of a long-term goal than a specific plan, at this point. But I thought I’d give you folks an idea of where we’re going with this.

Got some related ideas? Think I’m full of it? Let us know.