Intrusive Maintenance FTTN/Hosted Voice

Tonight at midnight I will be updating firmware on Grandstream ATAs, affecting FTTN and FlexLink Hosted Voice customers. Downtime should be very brief: each device will pull its new firmware on the hour and load it in about 5 minutes.

-Brandon

 

This is now complete.

-Brandon

Intrusive FlexLink/Fusion Maintenance

Beginning tonight at midnight I will be performing intrusive maintenance on equipment serving Fusion and FlexLink customers in San Francisco, Albany, and Oakland.  Expected downtime is ~30 minutes.

-Brandon

 

This is taking a little longer than expected; estimated completion is 1:00 am.

-Brandon

Maintenance is now complete

-Brandon

System Maintenance Tonight

Update: Maintenance complete

Tonight starting at 11:59pm, SOC will be updating software on some of our core systems. The following services may experience brief interruptions:

  • Website hosting
  • IPv6 tunnels
  • Incoming and outgoing mail

We will also be upgrading the SSL certificate for imap.sonic.net from SHA1 to SHA256. This is the last of our SSL certificates that we need to upgrade, so we don’t expect most clients to have problems, but very old mail clients may not support the new certificate.

 

-Grant, Joe, and SOC

Fusion VDSL2 Intrusive Maintenance – Forestville

Update: This maintenance is complete.

Beginning tonight at midnight I will be performing intrusive maintenance on equipment serving a small portion of Fusion customers in the Forestville area. Expected downtime is around 15 minutes.

– Robbie

Credit card processor down

UPDATE: Our vendor got back to us and we now have the problem resolved.

Currently our credit card processor is down and we are unable to process new payments. We have already contacted our vendor but unfortunately we do not expect to have a resolution until early tomorrow morning.

-William

Intrusive Network Maintenance – Brentwood

Tonight beginning at 11:59PM PDT we will be performing a software upgrade of equipment serving the Brentwood/Pittsburg/Antioch/Concord areas. This maintenance is expected to last 30-45 minutes and may be service-impacting for the duration.

-Tim J.

Network Maintenance – Legacy DSL

Update (2:22AM): This maintenance is now complete.

Beginning tonight at midnight, I will be performing maintenance on equipment that serves legacy DSL customers in northern California. Although the majority of the equipment I will be working with is redundant, a small portion of customers may experience some downtime.

– Robbie

UPS Failure Redux

First, we’d like to clarify the extent of the problems caused by the UPS failure and the subsequent dropping of load in the datacenter. This had no impact on any residential or enterprise connectivity services, including Legacy DSL, Fusion and Fusion FTTN. The UPS that failed was the smallest of the three UPSes in Santa Rosa, and we had been working to migrate load from it. As such, fewer than 20 customers in total lost some or all of their power circuits, some of which may have been part of redundant A/B circuits. Some colo customers lost connectivity because several distribution switches lost power. Most Sonic services, including POP, IMAP and webmail, were either not affected or saw only a brief outage as single-PSU equipment rebooted and/or clusters converged as load shifted to unaffected systems. The only public service that had lingering issues was our webhosting cluster, which required a little manual attention to come back online.

The outage was ultimately caused by a physical failure of the maintenance bypass switch in the bypass cabinet for the PDU we were moving: one of the phases in the switch stuck and/or didn’t close correctly. In hindsight, it is unfortunate that we chose to operate the switch in the first place, as it wasn’t strictly the simplest way to migrate the load. The last power failure in the datacenter was in Oct ’04, when the same UPS failed.

We will schedule migration off of the temporary feeds put in place in the coming weeks.  This final move is significantly easier to execute and has an exceedingly low likelihood of causing any service interruptions.

-Kelsey, Russ, and the rest of System and Network Operations.

 

Intermittent DNS Failure

Between 5:00 pm yesterday and 9:00 am today, customers may have experienced intermittent DNS failures or slower than normal name resolution. At 9:00 am this morning we noticed a configuration failure on one of our name server clusters. We immediately disabled the cluster, which allowed traffic to fail over to our other redundant cluster. We have since addressed the issue and restored the cluster to service. We are currently reviewing our monitoring procedures to identify why this issue wasn’t detected earlier and to make sure it doesn’t happen again. We apologize for any inconvenience this may have caused.
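For illustration only, here is a minimal sketch of the kind of check that could flag a name server cluster that is answering intermittently or slowly. It is not the monitoring actually in use: the cluster names, the Probe record, and both thresholds are hypothetical, and the probe results are assumed to come from some separate lookup tool.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Probe:
    ok: bool           # did the test lookup return an answer?
    latency_ms: float  # how long the lookup took (ignored when ok is False)


def unhealthy_clusters(
    probes: Dict[str, List[Probe]],
    max_failure_rate: float = 0.2,  # hypothetical threshold
    max_latency_ms: float = 500.0,  # hypothetical threshold
) -> List[str]:
    """Return the names of clusters whose probes look intermittent or slow."""
    flagged = []
    for name, results in probes.items():
        if not results:
            # No probe data at all is treated as unhealthy.
            flagged.append(name)
            continue
        failures = sum(1 for p in results if not p.ok)
        failure_rate = failures / len(results)
        answered = [p.latency_ms for p in results if p.ok]
        # If nothing answered, treat the average latency as unbounded.
        avg_latency = sum(answered) / len(answered) if answered else float("inf")
        if failure_rate > max_failure_rate or avg_latency > max_latency_ms:
            flagged.append(name)
    return flagged


if __name__ == "__main__":
    sample = {
        "cluster-a": [Probe(True, 20.0), Probe(False, 0.0),
                      Probe(False, 0.0), Probe(True, 30.0)],
        "cluster-b": [Probe(True, 15.0)] * 4,
    }
    print(unhealthy_clusters(sample))
```

Averaging over several probes per cluster, rather than alerting on a single failed lookup, is what lets a check like this catch an intermittent fault without paging on every dropped packet.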

– William & Kelsey