Month: June 2003

Maintenance on Stockton DSL 6/17/03.

Fri Jun 13 09:42:02 PDT 2003 — Maintenance on Stockton DSL 6/17/03. Sonic.net and Inreach will be making a minor network change on Tuesday at 5 am. SBC DSL customers in the Stockton service area will see an outage of less than 5 minutes. -Nathan, Andy and John

Update: Maintenance is complete. Downtime was approximately 2 minutes. -Nathan and Andy

SpamAssassin server failure.

Wed Jun 11 10:53:48 PDT 2003 — SpamAssassin server failure. This morning we had several of our SpamAssassin servers lock up, causing the others to become overloaded. Rebooting the servers restored service. While they were down, you may have received unfiltered spam email. Our email system is designed to “fail open” so that no email is lost. -Kelsey, Matt and John
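For readers curious what "fail open" means in practice, here is a minimal sketch of the idea. It is not our actual mail pipeline: the function and parameter names (`deliver_with_fail_open`, `spam_filter`, `deliver`) are hypothetical, and the filter is assumed to raise an exception when the filtering server is down or overloaded.

```python
from typing import Callable


def deliver_with_fail_open(
    message: str,
    spam_filter: Callable[[str], bool],
    deliver: Callable[[str, str], None],
) -> str:
    """Route a message, treating any filter failure as 'not spam'.

    All names here are hypothetical. spam_filter returns True for spam
    and may raise if the filtering server is unavailable.
    """
    try:
        folder = "spam" if spam_filter(message) else "inbox"
    except Exception:
        # Fail open: the filter is unavailable, so deliver the message
        # unfiltered rather than rejecting or dropping it.
        folder = "inbox"
    deliver(message, folder)
    return folder
```

The trade-off is the one described above: during a filter outage some spam gets through, but no legitimate mail is lost.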

DNS server upgrades completed.

Fri Jun 6 16:24:19 PDT 2003 — DNS server upgrades completed. All three of our name servers have been updated with faster CPUs and additional RAM. These new machines should provide better performance and should also be more reliable than the old name servers. This upgrade may not resolve some of the periodic DNS resolution problems that stem from large inbound zone transfers; we are still seeking a long-term resolution to this problem. We are also working on other DNS server upgrades. We hope to have 4 authoritative non-recursive servers deployed at geographically diverse locations soon and will also be deploying caching name servers at our access POPs. -Kelsey

Shell server bugfix.

Thu Jun 5 11:27:34 PDT 2003 — Shell server bugfix. We corrected a minor configuration issue that was preventing /tmp on the shell server from getting cleaned on a regular basis. Files left in /tmp will now be removed 7 days after their last modification time. -Kelsey and Nathan
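If you want to see which of your own files would fall under a 7-day rule like this, the following sketch lists files whose last modification time is more than 7 days old. It only illustrates the age test and is an assumption about how such a check could be written, not the cleanup job that runs on the shell server; the function name `find_stale_files` is hypothetical.

```python
import os
import time
from typing import List, Optional

MAX_AGE_SECONDS = 7 * 24 * 60 * 60  # 7 days since last modification


def find_stale_files(root: str, now: Optional[float] = None) -> List[str]:
    """Return paths of files under root not modified in the last 7 days."""
    now = time.time() if now is None else now
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat so a symlink is judged by its own timestamp
                mtime = os.lstat(path).st_mtime
            except OSError:
                continue  # file vanished between listing and stat
            if now - mtime > MAX_AGE_SECONDS:
                stale.append(path)
    return stale
```

The sketch only reports paths and deletes nothing, so it is safe to run against a directory such as `/tmp` to see what is at risk.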

DNS server hiccup.

Thu Jun 5 22:17:15 PDT 2003 — DNS server hiccup. Our primary DNS server, ns1.sonic.net, stopped answering queries for approximately two minutes. However, since most users have two DNS servers configured, impact was negligible. If you saw name resolution failures, you may want to check your DNS configuration. Information on DNS setup can be found at:

www.sonic.net/support/index.shtml

under the “Quick Reference Guide” and machine-specific “Setup Guide” links. -Nathan
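On a Unix-like machine, one quick way to confirm that two DNS servers are configured is to count the `nameserver` lines in `/etc/resolv.conf`. The sketch below assumes that file format and uses hypothetical function names; the addresses in the usage note are documentation placeholders, not our name servers. Windows and Mac OS users should follow the setup guides instead.

```python
from typing import List


def configured_nameservers(resolv_conf_text: str) -> List[str]:
    """Extract nameserver addresses from resolv.conf-style text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        line = line.strip()
        # Skip blank lines and comments ('#' or ';' at line start).
        if not line or line[0] in "#;":
            continue
        fields = line.split()
        if fields[0] == "nameserver" and len(fields) >= 2:
            servers.append(fields[1])
    return servers


def has_fallback(resolv_conf_text: str) -> bool:
    """True if at least two distinct name servers are listed."""
    return len(set(configured_nameservers(resolv_conf_text))) >= 2
```

For example, `has_fallback("nameserver 192.0.2.1\nnameserver 192.0.2.2\n")` is true, while a file with a single `nameserver` line leaves no fallback if that one server stops answering.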

Routing issues at Sebastopol POP.

Wed Jun 4 21:16:10 PDT 2003 — Routing issues at Sebastopol POP. The last portmaster in the dial group that serves Sebastopol had an apparent misconfiguration that was just uncovered. It has been fixed, and service for customers using this box is now fully functional. -Kelsey, Nathan and Russ

Broadlink head-end router reboot.

Tue Jun 3 17:41:15 PDT 2003 — Broadlink head-end router reboot. The Broadlink router which connects their subscribers to the Sonic network rebooted ten minutes ago. At this time, all subscribers should be up and functional. -John and Nathan

News server lockup.

Tue Jun 3 15:58:35 PDT 2003 — News server lockup. Our news server, news.sonic.net, locked up a few minutes ago. We are working to restore service, but in the meantime, altnews.sonic.net and supernews.sonic.net remain available. -Kelsey and Nathan

Update: The server hardware for news.sonic.net appears to be having some difficulty: the box has rebooted itself a few times. We are investigating our options for replacing the hardware. At this time it is back online... no wait, it’s down. -Kelsey and Nathan

Update: News.sonic.net is back online. One of its CPUs had a failed cooling fan, which has been replaced. -Kelsey, Kevan, Zeke and Nathan

Name server issue.

Tue Jun 3 00:30:36 PDT 2003 — Name server issue. We experienced an issue with one of our primary name servers where an old configuration directive caused the server to shut down. The problem has been fixed. Sonic.net has multiple name servers and customers should not have noticed the issue. -Matt and Kelsey

Sundry Updates to DNS, Authentication and…

Mon Jun 2 16:52:13 PDT 2003 — Sundry Updates to DNS, Authentication and SpamAssassin. We were able to track down the sources of the periodic problems with our DNS servers and made some changes that appear to have resolved the problems. The DNS server failures were also causing cascading failures on our mail and authentication servers. We’ve taken additional measures to insulate these services from failures of our core DNS servers. We’ll still be going ahead with the DNS server upgrades and replacement.

We also did some profiling and performance tuning on our SpamAssassin cluster and were able to drive the average time to inspect a message down from between 2 and 6 seconds to about half a second. This marked increase in performance should result in snappier email delivery and less load on our core mail servers. -Kelsey and Nathan
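To put those numbers in perspective, here is the back-of-the-envelope arithmetic as a small sketch. The only inputs are the inspection times quoted above; the function name `speedup` is hypothetical.

```python
def speedup(old_seconds: float, new_seconds: float) -> float:
    """How many times faster the new inspection time is than the old one."""
    return old_seconds / new_seconds


NEW_TIME = 0.5                    # seconds per message after tuning
OLD_BEST, OLD_WORST = 2.0, 6.0    # seconds per message before tuning

low = speedup(OLD_BEST, NEW_TIME)    # 4.0
high = speedup(OLD_WORST, NEW_TIME)  # 12.0
print(f"Inspection is roughly {low:.0f}x to {high:.0f}x faster per message")
```

In other words, each message now spends roughly 4 to 12 times less time in the filter than it did before the tuning.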