In the past 10 minutes, we’ve had a number of…

Thu Jul 18 19:08:38 PDT 2002 — In the past 10 minutes, we’ve had a number of reports from DSL customers in Sebastopol that they are offline. We have opened trouble tickets with ASI and will follow up to determine the cause (likely a down DSLAM) and the estimated time to resolution. -Dane, Zeke and Jared

Update: It appears that the DSLAM is back up and operational. -Kavan and Scott

Problems on our 522-1003 dial group.

Thu Jul 18 17:37:35 PDT 2002 — Problems on our 522-1003 dial group. To reduce downtime in our xxx-9811 Focal dial group, we moved equipment out of our 522-1003 dial group so it could be relocated to San Francisco. For this migration to work properly, the gear had to be reconfigured on both ends. The new xxx-9811 gear was properly transitioned to its new configuration, but the 522-1003 gear was not. Its MPIP server address wasn’t changed, which prevented MLPPP (Multilink PPP) from negotiating properly. As a result, all attempted MLPPP sessions to 522-1003 failed to negotiate multiple channels until this was resolved this afternoon. Overall, this affected only a small number of customers, primarily those with ISDN. -Steve

Brief busy signals on dial-up numbers ending…

Wed Jul 17 08:20:45 PDT 2002 — Brief busy signals on dial-up numbers ending with 9811. We discovered a misconfigured card in the middle of our dial group, which caused the 9811 numbers to give false busy signals for about 5 minutes. The problem has been corrected, and we will monitor this POP closely all day. -Matt and Steve

Traffic to and from the Focal xxx-9811 dial…

Wed Jul 17 18:32:48 PDT 2002 — Some traffic to and from the Focal xxx-9811 dial-up servers was prevented from reaching parts of the Internet by one of our old edge routers, delta. A slight misconfiguration in an ACL used to prevent DoS attacks was catching some of the traffic sent by people using these numbers. This may have left some users unable to reach some destinations outside our network. Oddly enough, our other border router had the same incorrect ACL installed, but due to what appears to be a Cisco IOS bug, it wasn’t affecting traffic. We will be consulting with TAC to aid in resolving the apparent bug in the other router. -Kelsey, Nathan and Chris B.
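For readers curious how a small ACL slip can drop legitimate traffic, here is a minimal Python sketch. It is an illustration only, not our router configuration. The addresses are reserved documentation ranges, the `first_match` helper is hypothetical, and the too-short prefix is an assumed example of a slip, not the specific mistake found on delta. The sketch relies only on two properties of Cisco IOS ACLs: entries are checked in order with the first match winning, and anything unmatched is denied.

```python
import ipaddress

def first_match(acl, source):
    """Return the action of the first ACL entry whose prefix contains source.

    acl is a list of (action, prefix) pairs checked in order. Traffic that
    matches no entry is denied, mirroring the implicit deny at the end of
    an IOS access list.
    """
    addr = ipaddress.ip_address(source)
    for action, prefix in acl:
        if addr in ipaddress.ip_network(prefix):
            return action
    return "deny"

# Hypothetical intent: block a small attacking range, permit everything else.
intended_acl = [("deny", "198.51.100.0/28"), ("permit", "0.0.0.0/0")]

# Hypothetical slip: the prefix length is too short, so the deny entry
# now covers the whole /24, including addresses that were never a threat.
slipped_acl = [("deny", "198.51.100.0/24"), ("permit", "0.0.0.0/0")]

dial_user = "198.51.100.200"  # assumed address of a legitimate dial-up user
attacker = "198.51.100.5"     # assumed address inside the range meant to be blocked

print(first_match(intended_acl, dial_user))
print(first_match(slipped_acl, dial_user))
```

With the intended list the dial-up address falls through to the permit entry. With the slipped list the same address is caught by the deny entry first, which is the kind of collateral drop described above.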

Night Operations: Tonight at 1AM we will be…

Wed Jul 17 17:00:26 PDT 2002 — Night Operations: Tonight at 1AM we will be moving our news servers to our new data center. We do not anticipate more than one hour of downtime as we migrate to the new location. During this time, we encourage our customers to use our alternate news service, supernews.sonic.net, which will not be affected by the migration. -Kelsey

DNS server issues.

Tue Jul 16 23:51:50 PDT 2002 — DNS server issues. We experienced hardware problems with our secondary DNS server, which caused mail servers to refuse connections for about 5 minutes. Our mail servers use this secondary as their first-choice DNS server because it is more available, and they fail over to the primary nameserver when the secondary fails. However, the health check that the mail servers perform to determine whether DNS is available didn’t identify this error. We are reworking the health check algorithm to avoid this problem in the future. The server has been taken offline and the nature of the problem is being investigated.
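As a rough illustration of the kind of check being discussed, the Python sketch below treats a nameserver as healthy only if it returns a well-formed, successful answer to a real query, rather than merely being reachable. This is not our actual health check, and it does not claim to show why the existing check missed the fault. The function names, the test name `example.com`, the fixed query ID and the server addresses (documentation range) are all assumptions made for the example.

```python
import socket
import struct

def build_query(name, query_id):
    """Build a minimal DNS query for an A record (RFC 1035 wire format)."""
    # Flags 0x0100 = recursion desired; one question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = name.rstrip(".").split(".")
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN

def response_is_healthy(response, query_id):
    """Accept only a matching response with RCODE 0 and at least one answer."""
    if len(response) < 12:
        return False
    rid, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", response[:12])
    is_response = bool(flags & 0x8000)
    rcode = flags & 0x000F
    return rid == query_id and is_response and rcode == 0 and ancount > 0

def probe(server, name="example.com", timeout=2.0):
    """Send one UDP query to server and validate the reply."""
    query_id = 0x1234  # fixed for brevity; a real check would randomize this
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(build_query(name, query_id), (server, 53))
            response, _ = sock.recvfrom(512)
    except OSError:  # covers timeouts and unreachable hosts
        return False
    return response_is_healthy(response, query_id)

def pick_resolver(servers, check=probe):
    """Return the first server in preference order that passes the check."""
    for server in servers:
        if check(server):
            return server
    return None
```

A mail server following the arrangement described above would list the secondary first and the primary second, for example `pick_resolver(["192.0.2.54", "192.0.2.53"])` with hypothetical addresses, and use whichever one the call returns.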

We also experienced a DNS configuration error this morning which caused some mail delivery to be delayed. The problem was resolved as soon as it was discovered. -Matt, Nathan, Dane and Kelsey

On Tuesday during the day, we’ll be turning…

Mon Jul 15 22:54:14 PDT 2002 — On Tuesday during the day, we’ll be turning up our new 100 Mbps fiber link to Equinix and deploying a Cisco 7206 and other equipment there. If all goes well, we’ll be moving our Cable & Wireless link from downtown Santa Rosa to San Jose shortly afterwards.

On Wednesday morning at around 3am, we will be moving Focal (9811) dialup lines from Santa Rosa to San Francisco. Expected downtime is ten to fifteen minutes. No, we don’t drive that fast; we’ll have extra equipment in place in San Francisco to allow for a quick transition. Matt, Steve and Russ will be doing this migration for us. -Dane

Update: Wed Jul 17 07:31:10 PDT 2002 — Night Operations complete. The Focal dial-up lines were successfully moved to our San Francisco POP. There was less than 5 minutes of downtime during the move.

Update: Wed Jul 17 16:22:53 PDT 2002 — As part of the move, we renumbered the xxx-9811 dial pool’s NAS servers. A minority of ISDN routers may need their gateway address changed in order to function properly. -Matt, Nathan, Russ and Steve

ssl.sonic.net, our shared secure web server…

Mon Jul 15 00:18:40 PDT 2002 — ssl.sonic.net, our shared secure web server, has had a system disk failure. We are working on the system now and should have services fully restored in a few hours. Only the core OS is affected by this disk failure; all customer data is stored on our redundant NetApp cluster. -Kelsey and Nathan

Update: ssl.sonic.net has been completely restored on a new disk and its data has been verified by our backup software. It’s been up on the new hardware for over an hour and appears to be working properly.