I'm with Richard here on getting in touch with reality.
I always like to quote a report written by the Yankee Group. Yeah, I know
reports like these, written by professional 'opinionators', are
questionable at best, but one of the things they quantify which is
interesting to me is a 31% attribution of downtime to human error. I'll
give them the benefit of the doubt.
The reason that is significant is that the only way you'll get over it
is institutional diversity. Change the associated humans and you'll
avoid a 31% contributor to your potential future downtime. Not to
mention all the other causes.
I've had this argument before.
Another point of reference I have is a conversation back in '99 with
someone involved with APNIC (previously with Telstra) whose view boils
down to this:
"Why does a telco waste money on making their network 5 nines, when
they'll never practically realise that? Why does a customer spend money
on that? Why doesn't a customer spend money on parallel sourcing 2 x '3
nines' networks and combining that together with the knowledge that
they'll actually truly realise a 5 nines solution?"
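For anyone who wants to check that arithmetic, here is a rough back-of-envelope sketch. It assumes the two networks fail completely independently, which is a big assumption: shared ducts, power or upstreams would break it. The function names are mine, purely for illustration.

```python
# Back-of-envelope availability arithmetic.
# ASSUMPTION: the parallel networks fail independently of each other.
# Real networks sharing ducts, power or upstreams will do worse than this.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600; ignores leap years


def nines(n: int) -> float:
    """Availability for n nines, e.g. nines(3) is 0.999."""
    return 1.0 - 10.0 ** -n


def parallel(*availabilities: float) -> float:
    """Availability of redundant paths: down only when every path is down."""
    unavailable = 1.0
    for a in availabilities:
        unavailable *= 1.0 - a
    return 1.0 - unavailable


def downtime_minutes_per_year(availability: float) -> float:
    """Expected minutes of outage per year at the given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


if __name__ == "__main__":
    single = nines(3)
    combined = parallel(single, single)
    rows = [
        ("one 3-nines network", single),
        ("two 3-nines in parallel", combined),
        ("5 nines target", nines(5)),
    ]
    for label, a in rows:
        mins = downtime_minutes_per_year(a)
        print(f"{label}: {a:.6%} available, ~{mins:.1f} min/year down")
```

Under that independence assumption the combined unavailability is 0.001 x 0.001, so on paper two 3-nines networks come out at about six nines, comfortably past the 5 nines target.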
Now of course the customer being able to achieve that is dependent on
their own internal reliability, but you're all at 5 nines availability
yourselves right guys?
Course you are. Silly me. That's why there's been little noise.
Now, back on topic. The failure over the weekend was I presume not due
to operator error. But if you've developed a strategy to deal with
operator error, you've probably also dealt with an outage that is a
little bit longer than normal. And in that case you're probably feeling
cool right now. Otherwise you're a little hot under the collar.
Never mind, lesson learnt.
From: Richard Naylor [mailto:email@example.com]
Sent: Monday, 30 June 2008 6:34 p.m.
To: Pshem Kowalczyk; Chris Hodgetts
Subject: Re: [nznog] Vector, did you try turning it off and then on
At 01:15 p.m. 30/06/2008, Pshem Kowalczyk wrote:
Updates 4h into the problem saying "no ETA", and about 2h later "it
will be fixed tomorrow", are just not enough. The fact that it was
virtually impossible to get hold of anyone that actually had a clue is
not something that an average business is willing to accept.
If a network connection is that important to the "average" business,
then it should do something about it.
What worries me is that people on this list think that a network
operator should be infallible. If the Internet is that important to your
business, then have more than one provider. If you have clue, you'll use
one with a different layer 3, 2, 1, or maybe even 0 route topology. It's
far simpler to have multiple providers and networks than to pay a
fortune to have some undeliverable level of reliability. And besides
*you* have control over it.
My business relies heavily on the Internet. We use 5 providers for our
network and around 5 for the services. We mix dsl with wireless
(3 forms), we have 4 satellite links, fiber and are known to run our own
cable when we can't get what we want where we want it. A Km or two is no
If a company uses a UPS for its computer, why doesn't it invest in a
second link? Even crappy dsl will at least keep something going.
NZNOG mailing list