CLEC Intrusive Maintenance

This evening, between 12:01AM and 6:00AM, we will be performing intrusive maintenance on equipment serving Fusion and FlexLink ADSL2+ customers in the Pacific Heights, Richmond District, and Inner Sunset areas of San Francisco. Affected customers should experience less than 15 minutes of downtime.

-Tim and Juston

Non-Intrusive Backbone Maintenance

Tonight between 10 and 11PM, we will be replacing a core switch at our San Jose POP. We will be moving all traffic off that switch via redundant paths before the maintenance begins, so there should be no customer impact. This switch replacement is one of the first steps in our network backbone upgrade.

-Jared and Nathan

Update: This maintenance was completed at approximately 1AM. The switch replacement was completed without incident.

ATM Customer Aggregation Router Reload

This Tuesday, March 23 at 12:01 AM, we will be performing a maintenance reload on our ATM customer aggregation routers. This will result in 5-10 minutes of downtime for Business-T and FRATM customers.

-Jared

Update: The maintenance reload has been completed without incident. All affected customers are back online at this time.

San Francisco ATM switch failure

At approximately 10:40AM, we had a hardware failure on an ATM switch in San Francisco. We are presently rebooting it. Approximate downtime should be 5-7 minutes. -Sonic NOC

Update 11:10AM: The ATM switch reload is complete and traffic appears to be returning to normal. If you continue to have DSL sync-no-surf connectivity issues, please contact our tech support.

San Francisco datacenter running on generator

Currently, PG&E utility power is offline at the 200 Paul San Francisco facility, which is running normally on generator power. Automatic transfer switching worked as designed, and all is well.

After the failure last week of a transfer switch in this San Francisco datacenter, it’s good to see that repairs to that redundant power system worked and that the repaired transfer switch did its job.

-Dane & Nathan

2N CRAC Redundancy Pays Dividends

At 7:11PM tonight, one of our two Core4 CRAC (Computer Room Air Conditioning) units unexpectedly shut itself down. Nothing instills fear more than receiving pages titled “Sys A Enable Switch Turned Off / Service Now” and “High Discharge Air Temperature” in rapid succession. After the initial panic passed, it was clear that all redundant systems were operating correctly and that the second system had ramped up to handle the total cooling load. Once we were on site, there was no outward indication of why System A had shut down, as the system enable switch was correctly in the “On” position and the unit had supply power. However, upon further investigation, it was apparent that the enable switch was water-logged, oxidized, and shorted out, signaling the system to shut down. The switch has been removed from service and both systems are 100% operational again.

Although it is disappointing to see a system failure caused by something as simple as an improperly weatherproofed mechanical room and control panel, it is rewarding to see our commitment to and investment in redundancy pay off and, ultimately, to see the prototype Core4 CRAC system behave as expected.

Special thanks go out to Jimmy and Kent of Bell Products, who interrupted their dinners to come out and verify that all systems were functioning correctly.

-Kelsey, Nathan and Russ

Non-Impacting Transport Issue

One of our backbone network transport links began having issues this morning. We have removed that link from service and are routing traffic internally around the problem as we work with the transport circuit provider to diagnose the intermittent problems. These problems did not cause any customer impact, but as we route around the problem, customers may notice sub-optimal paths inside Sonic’s network (e.g., from San Francisco to San Jose via Santa Rosa).

We are keeping a close eye on the situation and will restore normal routing once we are certain that the problem with the transport circuit is fully resolved.

-Jared and Nathan

Secure Server Service maintenance

This Wednesday morning (Feb. 24) at 12:01 AM, the Secure Server Services will be unavailable for approximately 30 minutes. This affects all customers with services on ssl.sonic.net or secure web hosting services with us.

After the maintenance, these customers will see a substantial performance boost.

-Augie

Large Outage in San Francisco

We’re currently experiencing a rather large outage in San Francisco, presumably caused by a power failure to some of our colocation space there. All available resources are being brought to bear, and we’ll have more shortly. This outage is likely impacting a large portion of our DSL customers as well as some Dial, Biz-T, FRATM, and other services. -Nathan, Jared, and the rest of the NOC

Update 8:25AM: We have confirmed that the issue is caused by a power failure. Apparently the facility at 200 Paul Avenue is currently without utility power. The vast majority of the site’s generators, ATSes, and UPS systems worked properly, but the UPSes feeding suite 502 (the location of some of our equipment at that facility) did not transfer correctly. More to follow.

Update 9:19AM: Power has been restored to our equipment, and we’re ensuring that everything returns to service cleanly.

Update 9:32AM: Service to the vast majority of our customers has been restored. There appears to be one DSL aggregation device that is still having trouble — we’re taking a look at that now.

Update 10:22AM: All services have been restored. Please let us know if you still have any outstanding issues. We’ll be having a serious chat with our colocation provider about this event, as this outage occurred despite a massive investment in fully redundant A+B power — both of which died simultaneously.

Changes for former Humboldt Internet customers

Earlier today we migrated the remaining humboldt1.com Internet services over to Sonic.net.

This will provide those customers with the improved service and reliability that all Sonic.net customers enjoy.

If you are a former Humboldt Internet customer and you are having trouble checking your e-mail or your website, please contact our Support team:

http://www.sonic.net/contactus.shtml
1 (707) 547-3400
support@sonic.net

-Augie