Author: admin

LATA1 ADSL1 Outage

Tonight shortly before 7:30pm we lost connectivity on one of the backhaul circuits serving ADSL1 customers in the Bay Area and North Coast. As a result, many customers are experiencing a sync-no-surf outage, where the DSL modem indicates it has a connection to the local DSLAM but no traffic passes through. Our network engineers have been working on the problem since it began, but we have no ETR yet. Fusion, wireless, and dialup service, as well as ADSL1 service outside the Bay Area and North Coast, are unaffected. -John F and the NOC staff

Update: Our network engineers have been working with their counterparts at AT&T and have narrowed the problem down to what appears to be a faulty switch on AT&T's end. Still no ETR. -John F and the NOC staff

Update: As of 11:03 PM tonight, AT&T has restored the ATM backhaul circuit to service. All affected customers should now be back online. If you are still experiencing a problem with your DSL connection, please reboot your DSL modem.

-Jared

Internal MySQL Server Failure

One of our core internal MySQL servers has experienced a failure of the SSD-based storage system that it was recently migrated to. We’re restoring from backups now, but it is too early to estimate when things will be back up and running. This does not directly impact any of our access or hosting services, but it does disable access to our member tools, signup forms, and many other internal systems.

Update: We’ve successfully migrated back to the old spinning-disk RAID, and while performance isn’t as good, it is a stable config that has operated for several years without issue. At this time all affected services are back up and running. Back to the drawing board for our next-generation SQL server platform. -Kelsey

DSL Aggregation Router Failure

One of our Redback DSL aggregation routers failed this morning at approximately 11:39 AM. We are in the process of migrating all customers off that Redback to a hot spare device. We expect that all affected DSL subscribers will be restored to service within 5 minutes.

-Jared

Non-Intrusive CLEC Maintenance

This Friday, Jan 14th at 12:01 AM we will be performing non-intrusive power maintenance on our equipment in the downtown Santa Rosa CO. We do not anticipate any customer impact during this work.

-Nathan and Tim

Update: This maintenance has been pushed to Friday, Jan 14th at 12:01 AM.

Unexpected DSL Aggregation Router Reboot

At 8:03PM this evening one of the DSL aggregation routers that serves legacy DSL customers in LATA 1 rebooted unexpectedly. Service for affected customers was restored by 8:05PM. We are investigating the cause of this reboot and will monitor closely for any further issues.

-Tim and Jared

DSL Outage in San Francisco

As of 3:30pm PST, DSL customers served out of many San Francisco wire centers are without internet service due to an AT&T equipment failure. We estimate that repair will take several hours. We are working with AT&T to fix the problem, and we hope to update our MOTD with a more exact repair time shortly.

Fusion customers in San Francisco are not affected by this outage.

Update: At 4:20pm, most of our customers appear to be surfing again, and AT&T expects the remaining customers to be fixed shortly.

Los Angeles DDoS Attack

Starting at about 9:57 AM today, a distributed denial-of-service (DDoS) attack was launched against one of our DSL subscribers in Los Angeles. This resulted in reachability issues for many of our customers served out of LA. We blocked the attack at our edge within 10 minutes, restoring service to the affected customers.

-Nathan

Customer Service Closing Early

Sonic.net Support and Sales will close at 9:30pm on Friday, December 17th, for our employee movie night. Support and Sales will open at the regular time Saturday morning. Programs are required to carry their data disks at all times.

-John F. and the Sonic.net staff

DSL Aggregation Router Failure

This morning at approximately 7:15AM one of our legacy DSL aggregation routers failed. We have re-routed all customers served by that router to one of our hot spares, and are investigating the cause of the initial router failure. All customers affected by this outage should now be back online.

-Jared

Update: During the migration, some of the affected customers were not built out properly on the new Redback. This has been fixed, and those customers should now be up and running.

Corrupted Email Messages

From approximately noon yesterday to 11:00AM this morning, some specific kinds of messages with attachments over 512k may have been partially corrupted by changes introduced into our spam filtering systems that enable customers’ blacklists to be applied to all inbound email. While these changes passed our testing and behaved as expected in our lab, some clients appear to format messages in a way that causes a single carriage return in the message source to be lost. This typically results in an email client not displaying the message body correctly while it still displays the attachments. We are about to start an automated repair process, but it is expected to take several hours to finish scanning and repairing any affected email in all customers’ email spools. We’re sorry for any inconvenience this has caused. -Kelsey and William

Update: As of 14:45, all corrupted messages have been repaired.