MX Server Outage

A routine configuration change on our MX cluster had unintended consequences and took inbound mail offline about 45 minutes ago. We’re accessing the servers via console now to restore service and expect to have them back up and running shortly. Because of quirks in how some of our internal systems function, this has also affected our member tools servers and is adding a substantial delay to logins and subsequent page loads.

Update: All services were restored shortly after the original post. A postmortem of the failure revealed that a new log message stream from the MX cluster caused synchronous blocking behavior, which led to excessive resource consumption and the eventual lockup of all servers in the cluster. We have also addressed delays in our initial response to the outage. -Kelsey and Grant.
