Fri Nov 1 16:00:38 PST 2002 — Ultra, one of our five load-balanced mail servers, entered an unusual failure mode in which it could no longer resolve the IP address of our outbound SMTP server cluster. 797 outbound email messages were returned to their local senders before we were made aware of the problem and removed ultra from the mail server pool. This seems to have affected only outbound email delivery from ultra. We apologize for the problem and are working to make sure that it does not occur again. -Kelsey, Eli and Scott
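For readers curious how this kind of failure might be caught automatically, here is a minimal sketch of a health check that a mail server could run on itself before accepting outbound mail. It is not what Sonic.net runs. The hostname `smtp-out.example.net` and the function name `can_resolve` are hypothetical, chosen only for illustration.

```python
import socket


def can_resolve(hostname, resolver=socket.getaddrinfo):
    """Return True if hostname resolves to at least one address.

    The resolver argument exists so the check can be exercised
    without touching the network; by default it uses the system
    resolver through socket.getaddrinfo.
    """
    try:
        return len(resolver(hostname, 25)) > 0
    except OSError:
        # socket.gaierror (name resolution failure) is a subclass of OSError.
        return False


if __name__ == "__main__":
    # Hypothetical name for an outbound SMTP cluster.
    target = "smtp-out.example.net"
    if can_resolve(target):
        print("resolver OK, stay in the pool")
    else:
        print("cannot resolve " + target + ", leave the pool")
```

A load balancer could poll the result of a check like this and pull a server from the pool on failure, rather than waiting for bounced mail to reveal the problem.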
Month: November 2002
Fri Nov 1 14:42:28 PST 2002 — On Monday night at midnight we will be making a full test of our power generation facilities. While we certainly don’t have any reason to expect an interruption of power during this transition, it is possible. We plan to run the ISP on diesel for about 30 minutes during this test.
This will be the first full-load test of the generator. Two previous partial-load tests and periodic no-load tests have gone well. Once the system is proven at full load, we will be doing periodic full-load runs at least once per month during the daytime.
Sonic.net’s power generation system is a 24-liter V-12 twin-turbocharged Detroit Diesel, which generates 1024 horsepower and 750,000 watts of power. This is enough electricity to power a small town of about 750 homes, or one rather large ISP. A huge Liebert UPS array keeps us online during generator startup.
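As a quick sanity check on the figures above, the arithmetic can be sketched in a few lines. The conversion factor of roughly 745.7 watts per mechanical horsepower is a standard value, not something from the generator's datasheet, and the per-home number is simply the quoted output divided by the quoted number of homes.

```python
# Approximate watts per mechanical horsepower (standard conversion factor).
WATTS_PER_HP = 745.7


def hp_to_watts(hp):
    """Convert mechanical horsepower to watts."""
    return hp * WATTS_PER_HP


engine_watts = hp_to_watts(1024)            # about 763,597 W at the engine
electrical_watts = 750_000                  # generator output quoted above
homes = 750                                 # homes quoted above

watts_per_home = electrical_watts / homes   # 1000 W per home on average
ratio = electrical_watts / engine_watts     # about 0.98

print(round(engine_watts), watts_per_home, round(ratio, 2))
```

The engine rating works out slightly above the electrical rating, so the two quoted figures are consistent with each other, and the "750 homes" comparison assumes an average draw of about one kilowatt per home.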
Tue Nov 5 10:32:16 PST 2002: Update; due to a scheduling difficulty, this test has been delayed until Tuesday at midnight.
Wed Nov 6 00:35:06 PST 2002: Update; as I write this, Sonic.net is running entirely on diesel power. The full transition test went smoothly, and all power generation and transfer systems operated as expected.
Our power generation plant can keep Sonic.net running indefinitely in the event of a utility failure. We have enough diesel on site currently to run for a week, and a fueling truck is scheduled to visit as often as we need. -Dane
Fri Nov 1 14:23:47 PST 2002 — BroadLink had a scheduled power outage at one of their tower sites this morning, but the UPS system failed. They replaced the equipment quickly to get customers back online. -Dane
Update, Fri Nov 1 17:32:08 PST 2002: A second outage occurred, and has been resolved. We expect at least one more once PG&E completes their work. As it’s both informative and funny, I’ll include an excerpt from the internal Sonic.net/BroadLink staff discussion list that explains the trouble. The following was written by BroadLink’s wonderful Jason Kane:
Regarding what happened:
As noted in the previous message, PG&E was putting up a new power pole across the street from the tower site. As a result everyone in that area lost power for the day. It’s my understanding that wire line power will be restored in a few hours.
We originally believed that the scheduled power outage would not affect our customers since we have battery backup and a generator to re-charge it. The UPS failed immediately and the tower went dark. That was this morning.
To fix the problem we replaced the UPS with a mostly-charged unit, gassed up the generator, plugged our hardware into the UPS and plugged the UPS into the generator. Everything came back up and we figured our only problem was making sure the generator had plenty of gas. But the universe decided that today would be a good day for tweaking with the otherwise idyllic lives of BL and its faithful customers. As you may be aware, the tower took a nose-dive about twenty minutes ago. We had Tim nearby so he checked it out. The assumption was (of course) that the generator ran out of gas. But lo and behold the generator was still cranking along without a hitch (a questionable metaphor but you get the idea). The UPS on the other hand was in a world of its own. And in that world, restarting every few seconds is some sort of imperative.
When we unplug the UPS from the generator it emerges from this malady and runs the tower off battery. But as every little gelfling knows you can’t run on batteries forever, even if you’re a pink bunny. So we rework the setup and plug everything into the generator. Fingers crossed the switches are flipped and the information age continues.
It is my belief that the power from the generator was not smooth/clean enough for the UPS. While it’s rather strange to have a UPS that’s more picky about clean juice than the switches, radios, management units and imported dancing hula girl lamps that are stuffed into the tower we’re forced by irrevocable circumstance to continue living without an adequate answer to such questions.
Here’s the basic sequence of events:
8:20am – PG&E disconnects power to tower
8:23am – broadlink battery backup system fails … outage
9:33am – battery system replaced, generator added
2:43pm – power supplied by generator kills replacement battery system … outage
3:33pm – tower rewired to run directly off generator, service restored
PG&E is supposed to finish their work today so we can rewire to run off primary power once again. We’ll test the battery system and replace it if needed to prevent a similar problem from recurring. We also now know that our generator can’t be used to re-charge our battery backups.
-Jason (BroadLink)
I hope that you found this as amusing as I did. -Dane