DSL Aggregation Router Reload

This Saturday, May 8 at 12:01 AM, we will be performing maintenance reloads of two Redbacks that terminate traditional DSL service. This will affect some of our Los Angeles and Sacramento DSL subscribers. Expected downtime is 5 minutes.

-Jared

The scheduled reloads have been completed without incident. All affected customers are back online after an outage of about 5 minutes.

CLEC Intrusive Maintenance

This evening, beginning at 12:01AM, we will be performing maintenance on equipment serving FlexLink Ethernet customers in Napa, Oakland, and Windsor. Affected customers should experience less than 10 minutes of downtime.

-Tim

CLEC Intrusive Maintenance

This evening, beginning at 12:01AM, we will be performing maintenance on equipment serving FlexLink Ethernet customers in Rohnert Park and Petaluma. Affected customers should experience less than 10 minutes of downtime.

-Tim

Backbone Maintenance

Tonight at 12:01 AM, we will be performing backbone maintenance at our San Jose POP. This maintenance will be moving our existing transit and transport load to our new T320 backbone router. We should be able to perform this work without disruption to customer traffic, but work of this magnitude always carries a risk of impact.

-Jared and Nathan

Update: All appears well. We’re calling it a night, without any observable customer impact.

-Nathan and Jared

CLEC Intrusive Maintenance

This evening, beginning at 12:01AM, we will be performing maintenance on equipment serving FlexLink Ethernet customers in Healdsburg. Affected customers should experience less than 10 minutes of downtime.

-Tim

CLEC DSLAM Maintenance

This evening, at 10:00PM, we will be performing a software upgrade on the DSLAM that serves Fusion customers in the Rincon Valley area of Santa Rosa. Expected downtime is less than 15 minutes while the DSLAM is rebooted onto the new software release.

-Tim and Nathan

Update:

Things did not go as planned. The upgrade failed to import a wide variety of important customer settings, causing us to attempt our pre-scripted roll-back procedure to undo the software upgrade. That process went even worse, and caused our DSLAM to forget a large chunk of even more important stuff. What was left was severely corrupted.

We keep a full library of historical device configurations, so the logical course of action was to re-program the device from one of those saved copies. This operation didn’t work. We thought it was a version mismatch problem between the saved copies (they’re in binary — PLEASE, device vendors, don’t keep your configurations in binary!) and the exact software load we were attempting to restore on. We tried 4 or 5 different combinations. Nothing worked.

Typically, our devices are provisioned by automated systems. Due to changes wedged into this code, our automated systems don’t quite know how to talk to the new version properly, so the automation was next to useless. In the end, we re-configured the whole device by hand on the code we were attempting to upgrade to.

Despite the saga above, this particular issue affected fewer than 20 of our customers. Our sincere apologies to those folks, who experienced an outage from around 10:30pm until 1:30am or so. We’ll be hammering out these issues with our equipment vendor to ensure this doesn’t happen again.

-Nathan + Matt and Jared for moral support

CLEC Intrusive Maintenance

This evening, beginning at 12:01AM, we will be performing maintenance on equipment serving FlexLink Ethernet customers in Forestville and Sebastopol. Affected customers should experience less than 10 minutes of downtime.

-Tim

DSL Service Disruption

We are currently tracking a network event that is disrupting network connectivity for several of our DSL aggregation routers. We are working to identify and resolve this event as quickly as possible.

-Jared and Nathan

Update: We believe the issue is a backplane problem on the ATM switch serving many of our DSL customers. A reload of the affected device should resolve the trouble; we’ll let you folks know how it goes. -Nathan and Jared

Update: Things are looking much better after a reload of the ATM switch. We’re still working to ensure that all services are up and functional, and will be working with the equipment vendor to diagnose the trouble that we’re having. Sorry to all affected! -Nathan and Jared

ATM OC-12 outage

One of our ATM OC-12s suffered a five-minute outage. We are investigating, and a further update will follow.

Update: The ATM OC-12 has remained stable, and we have tickets open with the circuit provider to ensure that the circuit will not have further problems. The outage from 16:55 to 17:00 would have affected some of our DSL, Business-T and FRATM customers in the Bay Area.

-Jared

Internal database failure.

Update: As of 5:20pm, all services have been restored.

We suffered an internal database failure around 4:30pm today; the data is currently being restored, and the repair time is estimated to be 20 minutes.

You may see errors when trying to use some of our Member Tools because of this.

Updates will be published as they happen.