UPDATE (11:25am): The issue with Sonic voice services has been resolved. Thank you for your patience.
Month: April 2024
System Maintenance
Tonight, beginning at 10PM PST, we will be applying updates to multiple systems. Any downtime for affected services should be brief, and we expect to complete the maintenance within 2 hours.
The systems include, among others:
- IPv6 tunneling service
- VPN servers
- Various public-facing applications
Sonic FTP Server Maintenance
Tonight at 10pm we will be upgrading the operating system on the Sonic customer FTP server ftp.sonic.net. We expect the service downtime to be brief and no longer than 1 hour in total.
Update: This maintenance has been completed successfully.
LA Area Network Maintenance
Update (12:55AM): This maintenance has been completed.
Tonight, starting at 11:59 PM, we will be conducting maintenance on core networking equipment in the LA area. While we anticipate no downtime for end users, there might be occasional routing instability as traffic is rerouted from the affected equipment. The maintenance window for this operation is estimated to be 4 hours.
System Maintenance
UPDATE: Maintenance complete.
Tonight, beginning at 10PM PST, we will be applying updates to multiple systems. Any downtime for affected services should be brief, and we expect to complete the maintenance within 2 hours.
The systems include, among others:
- VPN servers
- Various public-facing applications
Cascading IMAP/POP3 failures this morning
At 11am our back-end IMAP/POP3 cluster entered a critical state, which led to an interruption in those services as well as in other services that rely on our mail infrastructure. The initial cause of the failure was a routine maintenance procedure that involved dropping traffic to a portion of the cluster. While the cluster should have been able to run temporarily on the smaller group of remaining servers, that quickly turned out not to be the case. The remaining servers began to fail intermittently as they tried to shift traffic to account for the sudden increase in load. This would have caused noticeable mail client issues, and it also led to service availability interruptions on both our webmail and our voicemail platforms.
As soon as the problem was detected, we aborted the maintenance and then added resources to the cluster to prevent further disruptions. Service was fully restored as of 11:34am. We do not expect any email to have been lost as a result. As always, we will look into improving our metrics and analytics so that we can respond faster.
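For readers curious about the load shift described above, here is a rough headroom calculation in Python. It is only an illustrative sketch, not the tooling we use: the function names, node counts, capacities, and the 80% headroom threshold are all hypothetical, and it assumes load spreads evenly across the servers that remain in service.

```python
def per_node_load_after_drain(total_load, total_nodes, drained_nodes):
    """Load each remaining node carries if traffic is spread evenly.

    All inputs are hypothetical illustration values, not real cluster data.
    """
    remaining = total_nodes - drained_nodes
    if remaining <= 0:
        raise ValueError("cannot drain every node in the cluster")
    return total_load / remaining


def safe_to_drain(total_load, total_nodes, drained_nodes,
                  node_capacity, headroom=0.8):
    """Return True if the remaining nodes stay under a headroom threshold.

    headroom=0.8 is an assumed safety margin: each node should carry at
    most 80% of its rated capacity after the drain.
    """
    per_node = per_node_load_after_drain(total_load, total_nodes, drained_nodes)
    return per_node <= node_capacity * headroom


if __name__ == "__main__":
    # Hypothetical cluster: 8 nodes rated for 1000 connections each,
    # currently serving 5000 connections in total.
    for drained in (1, 2):
        ok = safe_to_drain(5000, 8, drained, node_capacity=1000)
        print(f"drain {drained} node(s): {'ok' if ok else 'too risky'}")
```

With these made-up numbers, draining one server leaves about 714 connections per remaining server, which fits under the 800-connection threshold, while draining two pushes each remaining server to about 833 and exceeds it. Real traffic rarely spreads this evenly, so a check like this is a lower bound on the risk rather than a guarantee.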