We’ve been seeing what appears to be denial…

Fri Apr 20 16:49:56 PDT 2001 — We’ve been seeing what appears to be denial of service (DoS) attacks in our statistics today, which caused sluggish performance for a few brief intervals. We’re applying some filters to our outbound links to prevent us from becoming a source of spoofed IP attacks, and if we see additional traffic, we’ll try to isolate the source and nail this down. -Dane, Nathan, Scott and Kelsey

Public MySQL Server: A kernel that we…

Wed Apr 18 17:42:29 PDT 2001 — Public MySQL Server: A kernel that we installed a few nights ago showed some signs of instability, so we took the box down to replace it with the old stable kernel. Just to be safe we also verified the integrity of all of the SQL databases on it, which delayed its return to service. It was offline for about 15 minutes. Only a few tools at Sonic depend on this server, twig and the pop finder being two of them. All of the customer-hosted MySQL databases are also hosted on this server. No data was lost. We will be investigating the problem with the new kernel and, after fixing it, will upgrade to it during the next maintenance window. -Kelsey

We just experienced an odd set of…

Tue Apr 17 11:51:41 PDT 2001 — We just experienced an odd set of circumstances which caused outbound email from customers to be delayed. If you were trying to send email and found that it timed out, please do a send/receive again to dump the messages in your out-box. No email was lost. Downtime for actual transmission of email was about fourteen minutes.

Our four primary mail servers, which handle SMTP (outbound) and POP (inbound) mail, sit behind a load-balancing switch designed to prevent single points of failure from impacting end-users. However, due to the reboot of a secondary nameserver, we found a set of conditions that could trigger a failure. Each mail server uses at least two nameservers, but the primary one on all four mail servers was rns2.sonic.net, 208.201.224.33. When this system was undergoing maintenance, all four mail servers fell back to their secondary for reverse DNS lookups, rns1.sonic.net, 208.201.224.11, but each new connection took an extra 30 seconds to fall back. This caused the Alteon load balancing switch to mark each mail server as unresponsive. With all four running slow due to the DNS server being down, the Alteon effectively shut down SMTP services.

To prevent this possibility in the future, the four email servers now use different primary and secondary DNS search orders. We’ve also asked Alteon to change the health monitoring so that, if all servers for a particular service are slated to be removed from service, it performs more lenient health checks on them to see if they’re just running slow.
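The two changes described above can be sketched roughly as follows. This is only an illustration, not the actual mail server configuration or anything Alteon ships: the server names, both timeout values and all function names are assumptions made up for the example. Only the two nameserver addresses and the 30 second fallback delay come from the post.

```python
# Nameserver addresses are from the post; everything else is hypothetical.
RNS1 = "208.201.224.11"  # rns1.sonic.net
RNS2 = "208.201.224.33"  # rns2.sonic.net


def staggered_resolver_orders(servers, nameservers):
    """Give each server a rotated copy of the nameserver list, so the
    servers do not all share the same primary."""
    orders = {}
    for i, server in enumerate(servers):
        shift = i % len(nameservers)
        orders[server] = nameservers[shift:] + nameservers[:shift]
    return orders


def connection_delay(order, down, fallback_delay=30.0):
    """Seconds lost per new connection: every unreachable nameserver
    ahead of the first working one costs one fallback delay."""
    delay = 0.0
    for ns in order:
        if ns not in down:
            return delay
        delay += fallback_delay
    return float("inf")  # no nameserver reachable at all


def healthy_servers(delays, strict_timeout=5.0, lenient_timeout=60.0):
    """Apply the strict check first. If it would remove every server,
    re-check with the lenient timeout instead (both values assumed)."""
    healthy = [s for s, d in delays.items() if d <= strict_timeout]
    if healthy:
        return healthy
    return [s for s, d in delays.items() if d <= lenient_timeout]


servers = ["mail1", "mail2", "mail3", "mail4"]  # hypothetical names
orders = staggered_resolver_orders(servers, [RNS2, RNS1])
delays = {s: connection_delay(o, down={RNS2}) for s, o in orders.items()}
print(healthy_servers(delays))
```

With staggered orders, taking rns2 down slows only the servers that list it first, so the others keep passing the strict check. If every server did list rns2 first, the lenient pass would keep all four in service, slow but working, rather than shutting the service down.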

-Dane, Russ, Kelsey, Scott and Eli

Some web sites were experiencing cgi program…

Tue Apr 17 09:55:16 PDT 2001 — Some web sites were experiencing cgi program issues this morning. One of our 3 web servers, thunder.sonic.net, had lost its bind to the yp server, which prevented cgi-wrapped programs from executing. After a quick reboot the yp server came back up. This only affected cgi-wrapped scripts and lasted a very short period of time. -Steve

Catastrophic core switch failure.

Tue Apr 17 16:34:19 PDT 2001 — Catastrophic core switch failure. During routine maintenance, our core Extreme Networks Black Diamond 6800 switch failed. This $120,000 bit of equipment transports almost all network traffic, and without it, we’re totally dead in the water. Its redundant core management switch module did not successfully take over for some reason, and we’ll be meeting with Extreme to ask them to explain exactly what that $22,000.00 investment was worth to us.

Downtime began at 3:28pm, and lasted 47 tense minutes. During this time, all network services were unavailable. Seven operations team members frantically dove into the guts of the switch, and in the end, a factory default boot with minimal configuration was used to bring over a recently stored config from our main admin server. This config was brought online and the rest of the network was booted. Meanwhile, back office staff pitched in with technical support, and hold times were kept under a minute.

We apologize for the service interruption of 47 minutes, and we’ll be doing a post-mortem shortly to determine what changes we can make to prevent this from ever happening again. We’ll be hauling Extreme in to answer for their equipment. I will post an update myself here when we’ve taken final steps to assure that this can’t happen again.

Sonic.net has made large investments in network redundancy, but it’s been difficult for us to isolate all potential failures. Our operations group will work hard to assure that we nail them all down, and we will “fire drill” our network with simulated failures in order to prove to ourselves that it will not break.

Thank you for your understanding and patience. -Dane, Scott, Kelsey, Eli, Nathan, Scooter, Russ, Steve, Chris, and the entire tech staff.

Alcatel ADSL Modem vulnerability.

Fri Apr 13 16:52:25 PDT 2001 — Alcatel ADSL Modem vulnerability. While there is currently no known exploit of the Alcatel ADSL Modem vulnerability, we have implemented safeguards to prevent compromise of customer Alcatel ADSL Modems. These include rejecting access to UDP port 7, as well as denying transit of packets with a source address of all 1’s. Please note that the UDP echo service (port 7) has nothing to do with ping, which uses ICMP echo packets. More information about the vulnerability can be found at the San Diego Supercomputer Center:

security.sdsc.edu/self-help/alcatel/
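For the curious, the two safeguards described above amount to a packet-matching rule along these lines. This is a rough sketch in Python, not our actual router configuration: the function name and the packet fields are made up for the example, and it assumes "access to UDP port 7" means packets whose destination port is 7.

```python
import ipaddress

# "A source address of all 1's" is 255.255.255.255 in IPv4.
ALL_ONES = ipaddress.ip_address("255.255.255.255")
UDP_ECHO_PORT = 7  # UDP echo service; unrelated to ICMP echo (ping)


def should_drop(protocol, src, dst_port=None):
    """Return True if a packet matches either safeguard (hypothetical helper)."""
    if ipaddress.ip_address(src) == ALL_ONES:
        return True  # deny transit for all-1's source addresses
    if protocol == "udp" and dst_port == UDP_ECHO_PORT:
        return True  # reject access to UDP port 7
    return False
```

Note that an ICMP echo request, which is what ping sends, matches neither rule, so ping keeps working.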

-Scott and Dane

Cable and Wireless isolated the trouble to a…

Mon Apr 9 19:52:54 PDT 2001 — Cable and Wireless isolated the trouble to a bad VIP card in the Cisco Router on the far side, and their NOC staff has replaced the card. The circuit has been up and stable for over 15 minutes now, and both Sonic.net and C&W staff will continue to monitor the link over the next few hours. – Eli, Russ, Dane

We’re currently seeing some instability in…

Mon Apr 9 19:02:54 PDT 2001 — We’re currently seeing some instability in Cable & Wireless’s network or in our link to them. While our UUNet T3 is bearing the additional load, we’re seeing some sluggish performance to some sites or intermittent un-reachability due to the time required for route convergence. We’re working with C&W to isolate the trouble now. -Dane and Russ

Sonic.net was featured in today’s Press…

Thu Apr 5 16:38:39 PDT 2001 — Sonic.net was featured in today’s Press Democrat story about Northpoint’s shutdown of their DSL network. Sonic.net does not partner with Northpoint, a DSL provider whose bankruptcy has turned into a nightmare for their connected customers. We’ve been able to step in and provide quick access solutions to end-users stranded without access by Northpoint’s sudden demise. For the full story, see:

216.167.95.20/business/news/05dsl_e1.html

Sonic.net is offering free DSL equipment…

Wed Apr 4 17:40:52 PDT 2001 — Sonic.net is offering free DSL equipment while supplies last! Pacific Bell today increased the price of DSL equipment to $253, up from $199. In response to this, Sonic.net has obtained 130 DSL modems, and we’re going to give them away to the next 130 viable orders for DSL service that we receive. This is first come, first served, so if you’ve been considering moving to DSL service, now is the time to do it! We expect that we will run out of these units very fast, so order quickly. The order form and sales page now have a count-down object which will show you how many units are left; look for the number of units still available in red.

www.sonic.net/sales/dsl/pacbell/

Note that equipment is not reserved until the order form has been submitted completely. If you visit and find that there are units available, please fill out the form quickly to avoid someone else getting them before you! There are only 130 units, and it’s strictly first come, first served; once they’re gone, they’re gone. -Dane, David, JenD, Lori, Jen and Steve

Update – as of now, Mon Apr 9 17:23:17 PDT 2001, there are 44 units left. If you’re interested in getting the free DSL equipment, place your order quickly. -Dane