Sun Sep 10 13:08:36 PDT 2000: Excessive disk usage on /home

Between 10am and 12:59pm today, the /home filesystem filled up. Shell users saw this as a “no space left on device” error. Impact: shell users couldn’t write data to the filesystem. This includes procmail processes running on a user’s behalf that filter mail into individual files under their /home directory. (If you use procmail in this manner, your mail may have been delayed.)
After executing a “quota resize” on our Network Appliance (NetApp) to clear the condition, we found the reason for the excessive disk usage on /home:

    -rw-------  1 culprit  user  1402852697 Sep 10 12:59 ErrorLog

While 1.4 GB files are not normally allowed for members, investigation of the culprit’s shell environment revealed that he had selected options that, as a side effect, did not apply the standard file size checks for user files. This will be corrected shortly.
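For readers who want to look for oversized files in their own directories, here is a minimal sketch in Python. It is not how we located the file above; the function name `find_large_files` and the 1 GiB threshold are illustrative assumptions.

```python
import os
import stat


def find_large_files(root, min_bytes):
    """Return (size, path) pairs for regular files under root that are
    at least min_bytes long, largest first. (Hypothetical helper.)"""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat: do not follow symlinks
            except OSError:
                continue  # file vanished or is unreadable; skip it
            if stat.S_ISREG(st.st_mode) and st.st_size >= min_bytes:
                hits.append((st.st_size, path))
    hits.sort(reverse=True)
    return hits


if __name__ == "__main__":
    # Assumed threshold of 1 GiB, scanning the current user's home directory.
    for size, path in find_large_files(os.path.expanduser("~"), 1 << 30):
        print(f"{size:>14d}  {path}")
```

Running it against your own home directory lists any file of 1 GiB or more, which would have included a 1402852697-byte ErrorLog.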
Additionally (and much to our chagrin), the monitoring tool that pages us when free space on NetApp filesystems shrinks too low didn’t detect the problem. The tool does detect physical space problems, but the part that checks free space for NetApp quota trees isn’t working properly. We didn’t know that was the case because we’ve never had a quota tree fill up while using the tool. This, too, will be corrected shortly.
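As a rough illustration of this kind of free-space check (not our actual monitoring tool), the sketch below keeps the paging decision in a small function that can be exercised with made-up numbers, so the check itself can be tested before a filesystem ever fills. The names `needs_page` and `check_path` and the 5% threshold are assumptions, and whether the figures the operating system reports for a given mount reflect a NetApp quota tree limit rather than the physical volume is not something this sketch verifies.

```python
import shutil


def free_fraction(total_bytes, free_bytes):
    """Fraction of the filesystem (or quota) that is still free."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return free_bytes / total_bytes


def needs_page(total_bytes, free_bytes, min_free_fraction=0.05):
    """True when free space has dropped below the assumed threshold."""
    return free_fraction(total_bytes, free_bytes) < min_free_fraction


def check_path(path, min_free_fraction=0.05):
    """Apply the threshold to whatever the OS reports for this path."""
    usage = shutil.disk_usage(path)
    return needs_page(usage.total, usage.free, min_free_fraction)


if __name__ == "__main__":
    # Simulate a full quota tree instead of waiting for a real one.
    print(needs_page(total_bytes=10_000, free_bytes=0))
    print(check_path("."))
```

Feeding the decision function a simulated "full" reading is one way to confirm that a page would actually fire.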
Finally, we apologize to anyone who noticed the problem, most especially those who contacted tech support and got a wrong answer. We will review our processes, both computer and human, to ensure this doesn’t happen again.

-Scott, Eli, and Dane