FlexLink Long Range Emergency Maintenance

Within the next 30 minutes, we will be performing emergency maintenance on equipment serving a subset of FlexLink Long Range customers in the Santa Rosa area. We expect minimal, if any, impact but there is a chance of up to 5 minutes of downtime for affected customers.

-Tim, Matt, and Robbie

Intrusive CLEC Maintenance

This evening, beginning at 12:01AM, we will be performing maintenance on equipment serving Fusion and FlexLink customers in one portion of San Francisco. Affected customers should only experience 5 minutes of downtime.

-Jared

Emergency CRAC Maintenance

The primary compressor on the B-side of our Core4 CRAC (Computer Room Air Conditioning) system that cools our Santa Rosa datacenter is being taken out of service and electrically swapped with its backup compressor.  The primary compressor experienced a loss of oil pressure, and while it is apparently still functional, it has clearly suffered significant internal damage.  Once the primary compressor has been replaced, the backup compressor will be returned to its normal standby operation.

The Core4 CRAC system is 2N redundant, so we can lose the entire A or B side of the system and still satisfy the cooling requirements of our datacenter.  Additionally, the primary compressor on each side is paired with a backup compressor, so had this damaged compressor actually ceased to function, the backup compressor would have taken over automatically to meet the cooling demand.

The swap was completed in the time it took to draft this notice.  Both sides of the CRAC are currently fully operational.

ATM Customer Aggregation Router Reload

This Thursday, February 24 at 12:01 AM, we will be performing a maintenance reload on our ATM customer aggregation routers. This will result in 5-10 minutes of downtime for Business-T and FRATM customers.

-Jared

Emergency Router Maintenance

At 9:40AM this morning we will be performing an emergency router reload on one of our ATM customer aggregation routers. All connected Business-T and FRATM customers will experience approximately 5 minutes of downtime during the reload.

-Tim and the NOC

Santa Rosa ADSL1 outage

One of the remote terminals serving a large number of customers in Santa Rosa is currently offline, resulting in a sync-no-surf situation. AT&T reports that they have dispatched a technician to repair it, but we do not have an estimated time of repair.

CLEC Intrusive Maintenance

This evening, beginning at 12:01AM, we will be performing maintenance on equipment serving Fusion and FlexLink customers in Windsor. Affected customers may experience up to 30 minutes of downtime while the work is performed.

-Nathan, Jared, and Tim

Colocation UPS Maintenance

This Friday, starting at 9AM, one of the three UPSes that serve our Santa Rosa datacenter will be undergoing its annual preventative maintenance.  The maintenance is fully scripted with our vendor and no interruption of services is expected.  During the maintenance window our standby genset will be kept running in case our PG&E service fails.  -Kelsey and Russ

Update: Fri Feb 18 19:49:15 PST 2011, the scheduled maintenance has been completed with one complication.  One of the bypass contactors jammed, preventing the UPS from going back online.  While the contactor eventually actuated and the UPS was brought online, its replacement will be scheduled as soon as possible.  -Kelsey and Russ.

Internal SQL Server Failure

One of our core internal MySQL servers has experienced a failure of the SSD-based storage system that it was recently migrated to.  We’re restoring from backups now, but it is too early to estimate when things will be back up and running.  This does not directly impact any of our access or hosting services but does disable access to our member tools, signup forms and many other internal systems.

Update:  We’ve successfully migrated back to the old spinning-disk RAID, and while performance isn’t as good, it is a stable config that has operated for several years without issue.  At this time all affected services are back up and running.  Back to the drawing board for our next-generation SQL server platform.  -Kelsey

DSL Aggregation Router Failure

One of our Redback DSL aggregation routers failed this morning at approximately 11:39 AM. We are in the process of migrating all customers off that Redback to a hot spare device. We expect that all affected DSL subscribers will be restored to service within 5 minutes.

-Jared