Month: November 2008

Emergency Router Maintenance

One of the interface cards in a customer-serving router in Santa Rosa has suffered a software glitch this afternoon. We will be performing invasive maintenance on the affected card tonight at midnight. All T1 customers served by this card will be down for 5-10 minutes during this maintenance. There is a chance that the card maintenance will require the entire router to be rebooted. If this occurs, all T1 and wireless customers served by this router will be offline for 5-10 minutes.

Update: The router linecard maintenance has been completed without issue. The affected card is now functioning properly, and there was no need to reboot the entire router.

-Jared and Matt

Mail Service Interruption

One of the four NFS filers that support our backend mail spool storage suffered a broken FCAL loop and went offline at 23:24. Although the other three remained in service, the mail cluster doesn’t take kindly to this situation, and users may have noticed delays, timeouts or other errors while checking their mail. All services were fully restored and operational by 23:56. The loop was most likely broken by a disk that had failed earlier this evening and was awaiting removal from the system in the morning. This is one of the rare failure scenarios that our clustered filers are not able to handle without manual intervention. -Kelsey and Don

Broadlink WDSL Outage

Broadlink Wireless DSL customers served off one of their main towers in the Santa Rosa area are currently offline, due to a power failure in the area. Broadlink is en route to the tower site with generators, and we expect service to be restored shortly.

MySQL5 support added.

Our Database Member-Tool now offers you the choice between MySQL4, MySQL5 or PostgreSQL.

We encourage you to try it out, and even to try migrating your data from your MySQL4 instance to MySQL5. Doing so enables all the features and performance improvements in MySQL5 and lands you on a brand new piece of hardware specifically designed to host customer SQL data.

For future flexibility, we have provided each customer with a DNS entry in the form of [database name].db.sonic.net. This lets you configure your programs once with a single host name and switch between database back-ends simply by pointing DNS at the new server.
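As a rough illustration of the "configure once" idea, here is a minimal Python sketch that derives the host name from the database name and keeps it in one settings dictionary. The function names, the dictionary keys, and the example database name "exampledb" are hypothetical and only for illustration; the only detail taken from this announcement is the [database name].db.sonic.net pattern.

```python
# Minimal sketch: keep the per-database DNS name in one place so that a
# move between back-ends needs no change in application code.
# The helper names and dictionary keys below are illustrative assumptions.

DB_DOMAIN = "db.sonic.net"  # suffix taken from the announcement


def db_hostname(database_name: str) -> str:
    """Build the host name in the form [database name].db.sonic.net."""
    return f"{database_name}.{DB_DOMAIN}"


def connection_settings(database_name: str, user: str, password: str) -> dict:
    """Collect connection parameters; only the host depends on DNS."""
    return {
        "host": db_hostname(database_name),
        "database": database_name,
        "user": user,
        "password": password,
    }


# Hypothetical example values, not real credentials.
settings = connection_settings("exampledb", "exampleuser", "secret")
print(settings["host"])
```

Because the application only ever refers to the host name, repointing the DNS entry at a different server should not require touching these settings.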

Please let us know what you think by sending feedback and comments to support@sonic.net.

Datacenter was on diesel

Our Santa Rosa datacenter facility is currently running on our backup generator due to problems with the automatic power transfer system. While we do not anticipate this will cause any impact for customers, it’s certainly a failure of a sort. Normally, the transfer switch starts the generator during a utility failure, but today it has triggered without cause. We are working to get back onto primary utility now.

Update: We’re back on utility, but managed to cause a fault in our air handlers in the process of going back and forth multiple times. This bumped the temperature in the datacenter from the typical 69 degrees up to over 90. All systems are back online now, and temperatures are close to normal. There was no customer impact during this partial failure. -Dane, Russ, Kelsey, Don, Nathan and Jen

Update: ATS Fault Analysis and Repair. After reviewing the situation with a GE/Zenith support tech last night, we concluded that the modular timer responsible for automatic genset exercising and transfer tests was triggering the erroneous transfers to emergency power. This timer was ‘off’ when it failed and, indeed, had never been used, as we prefer to manually initiate our weekly genset exercise tests. The faulty timer has been removed and we have every confidence that our ATS will work as expected from now on. -Kelsey, Russ and Nathan