On [Various Times], [Many People Wrote] wrote:
A 9 hour outage seems long, but it's hardly absurd. Though we haven't
seen a cause report from Vector yet, if it was spade fade, 9 hours is
I wasn't on call when it broke, but my colleague who was tells me he
was told "Hardware Failure" was the problem. If something like a 6509
plane died, I can see how you might hit 9 hrs:
The thing faults, alarms go off, customers call, the on-duty tech at
Vector has to do some basic diagnosis, wave his hands in the air and
run around like a monkey for a bit, then call the on-call engineer.
Then that engineer has to wake up and/or finish his beer and get home
from the pub, then do some more diagnostics, see the dead module and
then do that thing where you put your hands flat on your forehead and
drag them down your face while going "Gahhhhh". Then he gets to go
pull a spare from stores, or call Cisco for a part and then THEY get
to kick off THEIR internal process to get the thing to you. Then
there's truck time to the site, waiting for an elevator to L48 of the
sky tower that isn't already full of tourists in orange jumpsuits
ready to jump off and/or carts full of crab canapes, swapping out the
dead unit, sanity checking the restored services and making config
changes if required, etc. And all that is just if everything does go
to plan, and you don't find out that your spare hardware is in the
lab, and the lab is locked and the guy who has the key has gone
fishing, or that Cisco have already given their only spare WS-X6516 to
someone else, and so on and so on.
So, anyway, thing is, 9hrs, sure.
However it really shouldn't matter that much. The ISP network I am
currently fussing over, for example, has Vector connectivity to the
Sky Tower, which carries APE and some other stuff. This all broke when
Vector went down. Domestic traffic, however, simply switched over to
other peering links, as it should because it is, you know, The
Internet. Some Vector-only stuff broke, of course, but core services
just failed over and carried on.
If your network connection is absolutely critical for your business,
and it's wholly dependent on one vendor, you should perhaps rethink
your approach. Talk to your ISP, explain that you need redundancy in
your connection. Chuck in something like a DSL connection beside that
Vector link. Ask your ISP about sourcing you a router that can connect
to both, and setting up BGP or MPLS-based failover to your secondary
link in the event that your Vector link fails. Now your connection is
vendor independent (if not ISP-independent - if they fail, you fail.
Try to pick an ISP that Does Not Fail Much). If you can't get DSL, ask
your ISP about Wireless or IPStar or something similar. If you can
fail from optical to satellite, that's pretty good diversity. And it's
not complicated to do. You may not get the same performance, of
course, but you'll have _a_ connection, which is better than _no_
connection, especially if it's only for a short time.
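For anyone who wants the failover idea spelled out, here is a rough sketch of the decision the router is making, written as plain Python rather than real router config. The link names, the priority numbers and the `up` flag are all made up for illustration; in practice the "is it up" signal would come from BGP session state or whatever health check your ISP sets up, not from a hand-set boolean.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Link:
    name: str      # hypothetical label, e.g. "vector-fibre"
    priority: int  # lower number = more preferred
    up: bool       # stand-in for a real health check result


def pick_active(links: List[Link]) -> Optional[Link]:
    """Return the most preferred link that is currently up, or None."""
    healthy = [link for link in links if link.up]
    if not healthy:
        return None
    return min(healthy, key=lambda link: link.priority)


if __name__ == "__main__":
    # Assumed setup: fibre as primary, DSL as the cheap backup.
    links = [Link("vector-fibre", 1, True), Link("dsl-backup", 2, True)]
    print(pick_active(links).name)  # primary is preferred while it is up

    links[0].up = False             # simulate the fibre link failing
    print(pick_active(links).name)  # traffic moves to the backup
```

The point is only that the backup does not need to be fast, it needs to be there: as long as one link in the list is up, you have _a_ connection.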
This setup won't, of course, solve the problem of a local power cut
killing your Vector link. I'm not even certain why we're discussing
that. Talk to Vector and your electrician to get a cable run from your
generator and/or UPS-backed distribution board to wherever your
building Vector switch is. Plug the switch into it. And you're done.
It's a one-off cost, and likely not a large one. Even without a UPS,
the worst that can happen is that the switch powers down when the cut
takes place, then boots back up when your generator starts. A few
minutes, tops. If you _don't have_ a generator, and your building
power is out, I guess you'll be sitting in the dark, looking at your
blank screen, and won't care if your internet connection is down. This
is a perfect opportunity to go to the pub, and have a drink with the
Vector engineer. Assuming he's awake.
John S Russell
Big Geek. Doing Geek Stuff.