Sun Jul 14 02:53:44 PDT 2002 — There seems to have been a power outage in northern Santa Rosa, and we’ve noticed that a number of T1 and DSL customers went offline. The T1 and ATG DSL customers now seem to be back online, but we have not been able to verify that the PacBell DSL customers are back up. Please let us know ASAP if you have any problems with your connection due to this event. -Dane, Matt and Russell (Xponentia)
Our move of primary core storage and servers…
Sun Jul 14 02:47:37 PDT 2002 — Our move of primary core storage and servers has been delayed due to problems getting the tape backup completed, indexed and verified. The full backup started at 5am Saturday, and as of 2am Sunday it is complete but not yet fully indexed, so it cannot be tested. For this reason, we’ve postponed the move of the storage array and all associated servers until the same time next weekend.
Instead, we’ll be doing a number of clean-up activities this evening, many of which are slightly service-affecting. Steve will be doing a bit of tidying and rearranging of the 1003 dial pool, and customers will be disconnected as he moves equipment around. Nathan is planning to swap a T3 port adapter in one of the Cisco 7507s here for an HSSI, and expects about 30 seconds of inability to reach sites connected via Cable & Wireless. -Dane, Matt, Steve, Nathan and Kelsey
News Server Updates: We’ve just completed a…
Sat Jul 13 04:09:42 PDT 2002 — News Server Updates: We’ve just completed a software upgrade to bring us up to the latest stable version of Typhoon, the news server software we use on our news reader box. We’ll be keeping a close eye on the news server over the next few days to see if this resolves the problems that it has been having this past week. -Kelsey
Update: The problems appear to have been resolved by the new version of Typhoon. It has run problem-free for more than 12 hours. -Kelsey
Night Op – We’re planning to move our core…
Fri Jul 12 01:24:53 PDT 2002 — Night Op – We’re planning to move our core disk storage architecture on Sunday morning, with downtime beginning at about 1am.
Our Network Appliance F740 network file system cluster is the basis of Sonic.net’s storage solution. All user data resides on the two NetApp filers, which are set up as a fully redundant pair. The drives sit on dual-channel Fibre Channel arbitrated loops and are served by redundant processor heads. The units run RAID level 4 with WAFL filesystems, and include redundant power and cabling to all disks, plus redundant Gigabit connections to the network itself.
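For readers curious about what RAID level 4 buys us, here is a minimal sketch of the general idea: one dedicated parity disk holds the XOR of the data disks, so any single lost block can be rebuilt from the survivors. This illustrates the textbook technique only, not how the filers implement it internally; the function names and the sample blocks are made up for the example.

```python
from functools import reduce


def parity_block(data_blocks):
    """XOR equal-length data blocks together, byte by byte, into one parity block."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*data_blocks))


def rebuild_block(surviving_blocks, parity):
    """Recover a single lost data block from the surviving blocks plus parity."""
    # XOR is its own inverse, so XORing the survivors with the parity
    # block leaves exactly the missing block.
    return parity_block(list(surviving_blocks) + [parity])


# Hypothetical stripe: one 4-byte block per data disk.
stripe = [b"mail", b"web!", b"home"]
parity = parity_block(stripe)

# Pretend the second disk failed and rebuild its block.
recovered = rebuild_block([stripe[0], stripe[2]], parity)
assert recovered == stripe[1]
```

The same XOR that produces the parity also performs the rebuild, which is why a single dedicated parity disk is enough to survive one disk failure.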
The move of the disk shelves themselves will be service-affecting: while they are in transit, local web and ftp hosting, email and shell will be unavailable. Dialup, DSL and web browsing will be unaffected. We expect the downtime to be between an hour and an hour and a half, beginning just after 1am on Sunday morning. Actual server moves of redundant systems will begin at 11pm Saturday night, but these changes should be transparent.
The move to the new datacenter is nearing completion for Sonic.net equipment, and it’s been a good opportunity to redesign a number of network elements. Downtime has been very brief, and we appreciate your patience with any interruptions noted in the MOTD. -Dane
Night Op – Redback SMS and DSL customer move.
Wed Jul 10 18:58:53 PDT 2002 — Night Op – Redback SMS and DSL customer move. Tonight we will be moving the ATM DS3 which terminates Pac Bell DSL and FRATM customers. It is scheduled to take place at 3am and should last about 30-40 minutes as we relocate the equipment to the new data center.
Update: The move of the RedBack SMS DSL router and PacBell DSL customers is complete. Downtime was about twenty minutes. Pacific Bell did a great job of moving this circuit quickly and efficiently, and we had a very smooth transition of the equipment. -Matt, Dane, Eli, Kelsey, Nathan and Mike(2) from PacBell
News Server Issues: news.sonic.net, our NNTP…
Wed Jul 10 16:41:12 PDT 2002 — News Server Issues: news.sonic.net, our NNTP reader server, has been experiencing stability problems for the past few days. This instability results in periodic refused connections as the server process reinitializes. We have been unable to find the cause and are working with the software vendor to resolve this as soon as possible. -Kelsey
Update: We are still experiencing trouble with the news server. The vendor has recommended a version upgrade and is currently analyzing our cores. We will attempt the upgrade as soon as we reach an appropriate maintenance window. -Kelsey
Web performance impacted.
Tue Jul 9 11:11:43 PDT 2002 — Web performance impacted. A denial-of-service attack is affecting the performance of one of our web servers; this may result in slower response times when loading pages hosted at Sonic.net. Our operations crew is resolving the issue and web performance should return to normal shortly. We maintain a pool of redundant, load-balanced web servers, which greatly reduces the severity of problems of this type. -Eli and Russ
Tonight we will be moving BroadLink wireless…
Tue Jul 9 18:27:50 PDT 2002 — Tonight we will be moving BroadLink wireless customers from one ATM router to another in order to prep for the migration of Pacific Bell customers tomorrow night. In order to have as much flexibility as possible, we’re deploying a second RedBack SMS, which means we can keep BroadLink and PacBell customers separate and move them separately. This will reduce downtime and potential for problems during migration for both groups of customers. BroadLink customers can expect about ten minutes of downtime tonight at around 2am.
We will also be swapping one of our Cisco 7206 edge routers for a loaner unit (thanks, John Harkin!) to enable final deployment of our equipment at Equinix next week. This moves us closer to completion of our new network design, which includes four Cisco 7507 RSP4 routers in Santa Rosa (two edge, two customer attach) and two Cisco 7206 VXR-400 routers at Equinix and Focal in San Jose and San Francisco respectively. The loaner 7206 will be used to handle our connection to Cable & Wireless until that’s moved over to the equipment deployed at Equinix next week. -Dane, Kelsey, Nathan, Chris (BroadLink) and John Harkin
Update – We’ve moved all BroadLink customers to the new RedBack router in downtown Santa Rosa in prep for the move of our large SMS serving Pacific Bell customers tomorrow night. We have also replaced delta, the current C&W edge. All appears to be well. In a few minutes, we’ll be making a change to the sr2 link to address some performance issues that have recently affected the 1001 dial group. -Dane, Nathan and Kelsey
This evening at 3am, we will be moving dialup
Tue Jul 9 00:30:18 PDT 2002 — This evening at 3am, we will be moving dialup equipment serving our primary Pacific Bell Santa Rosa number, “1003”. Downtime is expected to be between 15 and 30 minutes. Note that we have at least three other numbers that all Santa Rosa area customers can use. See the POP finder at the following URL to look up backup dialups for your area. Printing the list for future reference would be a great idea!
Update: The move of the 3Com/USR Total Control Enterprise Hub and T3 mux equipment is complete, and we’ve moved 736 dialup lines of capacity. Downtime was about ten minutes. If you notice any new dialup issues on 1003 on Tuesday, please let support know.
-Dane, Steve, Nathan, Kelsey, Mike (PacBell) and various helpers
Night Op Complete.
Wed Jul 3 06:19:18 PDT 2002 — Night Op Complete. We have successfully moved mega and the circuits that it terminates. The majority of the move was completed in 45 minutes with the exception of a single T1 that took longer. Thanks to Mike from Pac Bell for his good work!
-Matt, Nathan, Kelsey and Mike (Pac Bell)