Mon Jun 17 09:01:34 PDT 2002 — There was a widespread Internet outage this morning. We are still investigating why this happened and will post an update as soon as we have more information. It appears to have cleared up, and everything is working well now. -Steve
Update: The trouble affected some T1- and T3-connected customers, plus dialup customers calling our remote POP numbers. Basically, customers and sites connected to our network via mega.sonic.net were impacted because Mega lost its ability to peer at a BGP level with its upstream routers.
The problem was caused by changes to the edge router structure in preparation for our moves in the coming weeks. Mega had an incorrect upstream route that depended upon one of the two edge routers for BGP peering adjacency. As a result, customers downstream from Mega could not communicate with the Internet, but were able to reach local systems without any trouble.
From the perspective of our internal monitoring systems, which page operations staff in case of trouble, everything looked fine. This delayed our response: from inside the network, everything registered as normal, both toward the remote sites via Mega and toward the Internet itself.
As we complete our datacenter move over the next few weeks, we’ll exercise care and planning to ensure that problems like this do not affect our customers, and that we’re monitoring well in case of any unexpected trouble. To this end, Eli is working with a remote partner to deploy end-to-end testing of equipment in our network, so that a problem like this won’t catch us unprepared.
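For the curious, here is a rough sketch of the kind of check that would have caught this failure. It is an illustration only, not the system being deployed: the function names and result labels are made up for this example, and it assumes the probe runs from a vantage point downstream of the router in question (or outside the network entirely), testing local and Internet targets separately.

```python
import socket
from typing import Callable, Iterable, Tuple

Target = Tuple[str, int]  # (hostname, TCP port)


def tcp_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures.
        return False


def assess(
    local_targets: Iterable[Target],
    external_targets: Iterable[Target],
    probe: Callable[[str, int], bool] = tcp_reachable,
) -> str:
    """Classify reachability as seen from this vantage point.

    The labels are hypothetical. "upstream-unreachable" is the case
    described above: local systems answer, the Internet does not.
    """
    local_ok = any(probe(host, port) for host, port in local_targets)
    external_ok = any(probe(host, port) for host, port in external_targets)
    if local_ok and external_ok:
        return "ok"
    if local_ok:
        return "upstream-unreachable"
    if external_ok:
        return "local-unreachable"
    return "all-unreachable"
```

Run from a customer-side location with one list of local hosts and one list of well-known outside hosts, anything other than "ok" would be a reason to page someone. The important design point is where the check runs, not how: the same probe run from inside the core would have reported "ok" this morning.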
The MOTD is likely to be pretty busy in the next few weeks, as we’ll be posting information about all moves and network changes here in advance, though many will be transparent to most customers. -Dane, Kelsey, Nathan, Scott, Eli