Month: October 2003

Graton Rooftop outage.

Fri Oct 31 14:26:20 PST 2003 — Graton Rooftop outage. A switch failure has taken down our wireless backhaul to the Graton Rooftop customer deployments. We are working to repair the problem, but we don’t have an ETA at this time. – Bryan, Eli

Update Fri Oct 31 14:45:45 PST 2003 — Services have been restored. The switch that serves our head-end deployment failed and was replaced with an on-site spare. The cause of the failure is not known. – Bryan, Eli

Local number calling problem.

Thu Oct 30 13:27:34 PST 2003 — Local number calling problem. We are experiencing problems with calls from Santa Rosa SBC phones to our main office number and our non-SBC dialup numbers. We believe this to be a problem with LNP, the local number portability system that the telephone companies use to direct calls from one carrier to another. We are opening trouble tickets with all the carriers involved. If you need to reach our office, you can use our Focal numbers: (707) 237-9616 Sales and Accounting; (707) 237-9617 Technical Support. -John and Russ

Update Thu Oct 30 16:13:30 PST 2003 — Our carriers report the problem solved. SBC shut down one of their switch routing servers (SS7 SCP) that was corrupted. The other three answer correctly and can carry the load. -John

Mail Delays: Freezer, the NetApp filer that…

Thu Oct 23 09:43:39 PDT 2003 — Mail Delays: Freezer, the NetApp filer that handles, among other things, ‘/home’, had a disk failure this morning. The filer behaved properly: it failed the disk and began to rebuild onto a spare. However, the failed disk continued to cause problems for the filer by triggering repeated bus resets. Meanwhile, the SpamAssassin cluster, which relies heavily on ‘/home’, suffered terribly. Now that the failed disk has been removed from Freezer, things are starting to get better, but it will still take some time before mail delivery returns to normal and all queued mail is delivered. At this time, we have disabled SpamAssassin in order to allow mail delivery to resume. -Kelsey

Update: 12:15:32 PDT — All services have been restored: Freezer has finished rebuilding its RAID, the SpamAssassin servers are back online, and all of the backlogged mail queues have been processed. -Kelsey, Nathan, Jared and Russ

Circuit Outage: Our 100mbit link from Santa…

Wed Oct 22 14:17:27 PDT 2003 — Circuit Outage: Our 100mbit link from Santa Rosa to Equinix, San Jose, has been shut down due to excessive errors and packet loss. We’re currently working with the vendor and hope to have it resolved shortly. At this time, reachability to the Internet in general should be fine, but users may experience some performance impact. This is especially true as one of our Cisco 7200VXR routers in San Francisco just (as I’m writing this MOTD) took the opportunity to crash with a red zone violation, further impacting the network. -Kelsey, Nathan and John.

Update 15:14:27 PDT — All service has returned to normal. We’ll continue to work with our vendors to resolve both issues encountered. -Kelsey

Mail Delay: Some routine work on one of our…

Tue Oct 21 12:22:29 PDT 2003 — Mail Delay: Some routine work on one of our NetApp NFS filers ended up becoming invasive and impacting performance. The SpamAssassin servers were affected the most, causing some mail to pass unfiltered. In order to restore the system, we shut down mail delivery for a few minutes to let the systems settle down. This only affected inbound email delivery, not outbound mail or users’ ability to check their mail. -Kelsey

Multicast and IPv6 outage.

Tue Oct 21 10:28:46 PDT 2003 — Multicast and IPv6 outage. The router that handles multicast and IPv6 traffic locked up this morning and required a power cycle to restore. This event caused a multi-hour outage for these services. -John and Nathan

E-Mail Anti-Virus Progress: Our new…

Thu Oct 16 15:24:25 PDT 2003 — E-Mail Anti-Virus Progress: Our new Anti-Virus filtering solution is working quite well at this time. Initially we encountered some stability and configuration issues, which we’ve been able to resolve. The filters blocked 27,964 infected E-Mails from reaching our customers yesterday. Many of these viruses would not have been caught by our old, hand-maintained filters.

In response to the good results, we’ve moved the new system onto the outbound mail cluster that handles ‘’ so our outbound mail flows from customers will also be cleaned. -Kelsey

Issues with SpamAssassin Cluster: Two of the…

Sun Oct 12 17:13:21 PDT 2003 — Issues with SpamAssassin Cluster: Two of the four dedicated SpamAssassin processing servers started to fail open at 6:00 PM last night because a debugging log reached the 2 GB file size limit on both boxes nearly simultaneously. The verbose debug log has been disabled on all of the servers, and the two servers have been returned to service. The failure was not detected by our monitoring because the servers were accepting connections but were unable to complete the requests. -Kelsey
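
The monitoring gap described above is the difference between checking that a port accepts connections and checking that a request actually completes. The following is a minimal Python sketch of the second kind of check, offered as an illustration rather than as the monitoring in use here. The function name `deep_check` and the request and reply bytes are placeholders; the real probe would depend on the protocol the service speaks.

```python
import socket


def deep_check(host, port, request, expected, timeout=5.0):
    """Return True only if the service answers `request` with a reply
    that starts with `expected`.

    `timeout` applies to each socket operation (connect, send, recv),
    so a server that accepts the connection but never replies is
    reported as down instead of up.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(request)
            reply = b""
            # Keep reading until we have enough bytes to compare,
            # or the peer closes the connection.
            while len(reply) < len(expected):
                chunk = sock.recv(4096)
                if not chunk:
                    break
                reply += chunk
            return reply.startswith(expected)
    except OSError:
        # Covers refused connections, resets, and timeouts.
        return False
```

A plain connect test would have passed against the two stuck servers; a check like this one fails as soon as the reply does not arrive within the timeout.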

Issues with new Anti-Virus Filters: Our new…

Thu Oct 9 17:26:41 PDT 2003 — Issues with new Anti-Virus Filters: Our new anti-virus filter may have caused some MIME multipart/alternative messages to lose their text/html part, but only when a text/plain part was present. We resolved this earlier this afternoon after it was brought to our attention by a couple of users. We are sorry for any trouble or confusion this may have caused. So far, our beta results have been very promising. The new filters have caught over 24,000 viruses so far today, many of which were most likely not caught by our old virus filters. -Kelsey
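
For readers wondering what a lost text/html part looks like, here is a small Python sketch using the standard `email` package. It lists the leaf parts of a message before and after filtering and reports any content type that went missing. The helper names `part_types` and `lost_parts` are made up for this illustration, and the "after" message is built by hand to stand in for a filter's output; it is not taken from the actual filter.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def part_types(raw):
    """Return the content types of all non-container parts, in order."""
    msg = message_from_bytes(raw, policy=policy.default)
    return [p.get_content_type() for p in msg.walk() if not p.is_multipart()]


def lost_parts(raw_before, raw_after):
    """Return content types present before filtering but absent after."""
    after = part_types(raw_after)
    return [t for t in part_types(raw_before) if t not in after]


# A multipart/alternative message with both a plain and an HTML body.
original = EmailMessage()
original["Subject"] = "test"
original.set_content("plain body")
original.add_alternative("<p>html body</p>", subtype="html")

# Stand-in for a filtered copy that kept only the text/plain part.
filtered = EmailMessage()
filtered["Subject"] = "test"
filtered.set_content("plain body")

missing = lost_parts(original.as_bytes(), filtered.as_bytes())
```

With these two sample messages, `missing` contains only `text/html`, which matches the symptom described above.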

New National Access dialup provider.

Wed Oct 8 15:46:27 PDT 2003 — New National Access dialup provider. We are pleased to announce that we are switching National Access providers. By the end of October we will discontinue using MegaPOP. We are now displaying GlobalPOPs’ phone numbers on our POP finder page. They have over twice as many access numbers as our old provider. If you use this service, please check the web page and find the new access numbers. -John and Kelsey