Excessive disk usage on /home.

Sun Sep 10 13:08:36 PDT 2000 — Excessive disk usage on /home. Between 10am this morning and 12:59 this afternoon, the “/home” filesystem was filled up — shell users saw this as a “no space left on device” error. Impact: shell users couldn’t write data to the filesystem. This includes procmail processes running on a user’s behalf that filter mail into individual files under their /home directory. (If you use procmail in this manner, your mail may have been delayed.)

After executing a “quota resize” on our Network Appliance (NetApp) to clear the condition, we found the reason for the excessive disk usage on /home:

-rw------- 1 culprit user 1402852697 Sep 10 12:59 ErrorLog

While 1.4 GB files are not normally allowed for members, investigation of the culprit’s shell environment revealed that he had selected options that, as a side effect, bypassed the standard file size checks for user files. This will be corrected shortly.
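For reference, a sweep like the one that turned up the culprit file can be sketched in Python. This is an illustrative sketch, not the actual check we run; the default 1 GB threshold is an assumption:

```python
import os

def find_large_files(root, threshold=1 << 30):
    """Walk `root` and return (size, path) pairs for regular files
    larger than `threshold` bytes, biggest first."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if size > threshold:
                hits.append((size, path))
    return sorted(hits, reverse=True)
```

Running something like `find_large_files("/home")` would have surfaced the 1.4 GB ErrorLog immediately.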

Additionally (and much to our chagrin), the monitoring tool that pages us when free space on NetApp filesystems drops too low didn’t detect the problem. So, though the tool detects physical space problems, the part that checks free space for NetApp quota trees isn’t working properly — and we didn’t know that was the case because we’d never had a quota tree fill up while using the tool. This, too, will be corrected shortly.
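The missing check amounts to comparing each quota tree’s remaining space against a threshold. A minimal sketch, assuming the quota and usage figures (in bytes) have already been collected from the filer; the 5% threshold and the data shape are illustrative assumptions, not our tool’s actual interface:

```python
def quota_tree_alerts(trees, min_free_fraction=0.05):
    """trees: mapping of tree name -> (quota_bytes, used_bytes).
    Return the names of trees whose free space is below
    min_free_fraction of their quota, in sorted order."""
    alerts = []
    for name, (quota, used) in trees.items():
        free = quota - used
        if quota > 0 and free / quota < min_free_fraction:
            alerts.append(name)
    return sorted(alerts)
```

A check along these lines, run per quota tree rather than per physical volume, would have paged us before /home hit “no space left on device.”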

Finally, we apologize to anyone who noticed the problem, most especially those who contacted tech support and got a wrong answer. We will review our processes — both computer and human — to ensure this doesn’t happen again. -Scott, Eli, and Dane

Our 1003 dial group was returning a reorder…

Sat Sep 9 18:36:31 PDT 2000 — Our 1003 dial group was returning a reorder tone for about 15 minutes; it seems that one of the T1 PRI cards lost its tiny mind. I’ve swapped it with the very last card in the group, and a reboot seems to have fixed it. -Dane

Fiber Cut in Bay Area.

Fri Sep 8 11:34:57 PDT 2000 — Fiber Cut in Bay Area. Global Crossing (gblx.net) has experienced what has been termed a “catastrophic” fiber cut. Multiple circuits are affected, and multiple ISPs may be affected as well. Sonic.net members may experience high latency and packet loss to gblx-connected sites while gblx’s network is repaired. According to our “Internet weather” monitoring, neither UUNet nor Cable & Wireless is affected, nor is Sonic.net. -Scott

BroadLink will be doing radio work that may…

Fri Sep 8 16:50:42 PDT 2000 — BroadLink will be doing radio work that may affect customer links this evening. Customers may experience a brief interruption or degraded service levels at 11 p.m. tonight while equipment is exchanged. Downtime is expected to be approximately fifteen minutes. If you are experiencing degraded service levels outside this time frame, please contact support@broadlink.com. -Dane and Anna

BroadLink found that a network switch had…

Thu Sep 7 14:19:32 PDT 2000 — BroadLink found that a network switch had locked up, and a reboot fixed the trouble. Total downtime for BroadLink customers was 15 minutes. They will be working with the manufacturer to see if a cause can be pinned down. There has been no previous trouble with this equipment, and it’s connected to a remote power management unit, so it can be power-cycled from BroadLink’s offices. -Scott, Dane, Kelsey, Eli and Shane (the doughnut guy)

BroadLink’s backhaul circuit down.

Thu Sep 7 14:09:12 PDT 2000 — BroadLink’s backhaul circuit down. BroadLink’s optical circuit to their distribution hub is currently down. BroadLink engineers are scrambling to correct the problem, and Sonic.net is assisting wherever possible. -Scott, Dane, Kelsey, Eli

A few minutes ago, scurrilous ruffians…

Mon Sep 4 17:35:07 PDT 2000 — A few minutes ago, scurrilous ruffians launched a denial-of-service attack on a Sonic.net customer’s network. The attack stopped before we could catch the perpetrators. The attack lasted about 10 minutes, and caused intermittent loss of connectivity. -Scott

Earthquake.

Sun Sep 3 01:43:54 PDT 2000 — Earthquake. That was a 5.2 earthquake epicentered 3 miles WSW of Yountville. More information can be found at the USGS: quake.wr.usgs.gov/recenteqs/Maps/123-39.html

Aftershock prediction information for this quake can be found at: quake.wr.usgs.gov/recenteqs/QuakeAddons/nc51101203.afterwarn.html

And finally, you can file a “Did You Feel It?” report with the Community Internet intensity map here: pasadena.wr.usgs.gov/shake/STORE/X51101203/ciim_form.html (You may want to wait a bit on that last link, as it appears to be impacted by other earthquake-feelers.) -Scott, Nathan, and Mitch

SOLVED: High latency and packet loss on UUNet

Sat Sep 2 16:59:19 PDT 2000 — SOLVED: High latency and packet loss on UUNet in San Francisco. As of 16:43 PDT, the high latency and packet loss on UUNet’s network in San Francisco has cleared. Our new Internet monitoring tool — “see” — detected the problem while monitoring the path from here to www.ora.com. Since most of our traffic to the Internet did not travel over the affected path, we didn’t expect customers to notice the problem (unless they were visiting www.ora.com, of course) — and from talking with Sonic.net tech support, it appears we received no calls about the problem. Here “see” has fulfilled its role of notifying us about potential problems before you, our customer, notice them. -Scott
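The internals of “see” aren’t described here, but the core test (flagging a path whose packet loss or average round-trip latency crosses a threshold) can be sketched as follows. The thresholds and probe figures are illustrative assumptions, not “see”’s actual configuration:

```python
def summarize_probes(rtts_ms, sent):
    """rtts_ms: round-trip times (ms) for the probes that got replies;
    sent: total probes sent. Returns (loss_fraction, avg_latency_ms),
    where avg_latency_ms is None if no replies came back."""
    received = len(rtts_ms)
    loss = 1.0 - received / sent if sent else 0.0
    avg = sum(rtts_ms) / received if received else None
    return loss, avg

def path_is_degraded(rtts_ms, sent, max_loss=0.05, max_latency_ms=250.0):
    """Flag a path when loss or average latency exceeds its threshold."""
    loss, avg = summarize_probes(rtts_ms, sent)
    return loss > max_loss or (avg is not None and avg > max_latency_ms)
```

In this sketch, a run of probes toward a site like www.ora.com that loses one packet in four would trip the loss threshold and trigger a page, regardless of how healthy the surviving round-trip times look.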