While troubleshooting our new ATM OC-12 circuit in San Francisco, an AT&T tech inadvertently removed the wrong fibers from a multiplexer card in the SNFCCA17 central office. This caused two of our other ATM OC-12 circuits to go off-line. These links serve DSL, Business-T and Frame Relay customers. The AT&T tech quickly realized his error and fixed the problem. Total customer downtime was approximately 75 seconds, lasting from 1:29:05pm to 1:30:20pm. We’re currently in touch with management at AT&T to ensure that this does not happen again. -Nathan and Jared
Customer SQL v4 system maintenance
Tonight I will be performing maintenance on custsql.sonic.net, our MySQL 4.x host. Just after 12:01AM I will be performing a necessary reboot of the system. Estimated customer impact is 10 minutes at most. This does not affect MySQL v5 databases. -Don
Update: Maintenance completed. Customer impact was just under three minutes.
Squirrelmail config database maintenance
This morning at 12:01AM I performed some needed maintenance on our Squirrelmail configuration database. Some customers may have been unable to update their settings in Squirrelmail during this time. Customer impact was approximately 3 minutes. During this time, I also upgraded our Squirrelmail installation to the latest stable production version. -Don
Sebastopol DSL Outage
A hardware failure at the Sebastopol Central Office has caused DSL customers to lose connectivity. We are working with AT&T to resolve the issue, and hope to have an estimated time of repair shortly.
-Adam, John and Steve
Update: As of 5:00PM, service appears to be restored to all affected customers.
DSL Aggregation Router Reboot
The DSL aggregation router that serves the Chico area rebooted itself approximately 20 minutes ago, causing about 5 minutes of downtime for all DSL customers in that area. Traffic levels and customer connectivity currently look normal. We will continue to monitor the router and investigate the cause of the spontaneous reboot.
We apologize for any inconvenience this outage may have caused.
-Jared
Trouble with the Sonic.net Website.
Fri Oct 10 10:05:08 PDT 2008: We are currently experiencing trouble with our corp.sonic.net web server, which serves part of the Sonic.net home page and all of our blogs. We are working on the problem and will have it back soon. -Augie, William, and Kelsey.
Update: We have restored services and determined that the problem was a hardware failure. We apologize for the inconvenience this may have caused. -Don and William
Mail Storage Maintenance.
Tonight at midnight we will be performing maintenance on the NetApp Filers that store Customer E-Mail. No downtime is expected, and the work should take less than an hour to complete.
During this period we will be adding additional capacity to these Filers; this capacity will allow for future growth of Customer E-Mail storage and improved performance for all of Sonic.net’s E-Mail Customers.
–Augie, Don, William, Sal, and Kelsey.
Update: maintenance has been completed; no problems were encountered (other than starting later than scheduled); capacity was nearly doubled on our E-Mail Storage System. –Augie
Webmail IMAP performance problems solved.
Separately from our earlier post about slow imap.sonic.net performance (http://corp.sonic.net/status/2008/09/26/imap-performance-problems-solved/), we have also received reports of slow Webmail IMAP performance and timeouts from Customers using the Webmail clients on http://webmail.sonic.net.
We believe we have isolated the problem, which was a bug in our IMAP Proxy software. We implemented a fix at the beginning of the week and have not received any reports of new problems since then.
If you see timeouts when using http://webmail.sonic.net, please contact Technical Support (support@sonic.net or 1.707.547.3400) immediately, and provide the error message you receive and the time at which the problem occurred.
Webmail Web Site Time-Warp.
A misguided attempt to update some software on our Webmail Cluster inadvertently took the software, associated web pages, and server configuration back to January of this year.
As a result Customers would have seen inconsistent or broken behavior while trying to access the website from around 2:30am to 8:00am, at which point the data was restored from backups.
We apologize for any inconvenience this caused to our Customers; we will be reviewing our documented procedures so that this type of mistake does not occur in the future.
–Augie
Emergency Router Maintenance.
At 3:40PM today we will be performing an emergency reload of one of our ATM customer aggregation routers. All connected Business-T and FRATM customers will experience approximately 5 minutes of downtime during the reload. -Tim, Nathan, Matt and Jared