Load Balancer Issues: We’ve just uncovered…

Tue Nov 4 14:36:42 PST 2003 - Load Balancer Issues: We’ve just discovered that one of our Alteon AD3 load-balancing switches is apparently corrupting Ethernet frames on at least one of its ports with single-bit errors. These errors were going completely undetected by the servers or switches, because the corrupted frames carry correct checksum information.
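As a rough illustration of how a corrupted frame can still carry a valid checksum, here is a minimal Python sketch. It assumes the corruption happens inside a device that then regenerates the frame check sequence before sending the frame on. That mechanism is our assumption for the sake of the example, not a confirmed diagnosis. The helper names are hypothetical, and the frame layout is simplified to a payload plus a CRC-32, which uses the same polynomial as the Ethernet FCS.

```python
import zlib


def with_fcs(payload: bytes) -> bytes:
    """Append a CRC-32 of the payload as a 4-byte trailer (simplified frame)."""
    return payload + zlib.crc32(payload).to_bytes(4, "little")


def fcs_ok(frame: bytes) -> bool:
    """Recompute the CRC-32 over the payload and compare it to the trailer."""
    payload, fcs = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "little") == fcs


def flip_bit(data: bytes, index: int, bit: int) -> bytes:
    """Return a copy of data with one bit flipped in the byte at index."""
    out = bytearray(data)
    out[index] ^= 1 << bit
    return bytes(out)


original = with_fcs(b"Hello, world")

# Case 1: the bit flips on the wire, so the old checksum no longer matches.
wire_damaged = flip_bit(original, 1, 2)

# Case 2 (assumed): the bit flips inside a device, which then computes a
# fresh checksum over the already-corrupted payload.
device_damaged = with_fcs(flip_bit(original[:-4], 1, 2))

print(fcs_ok(original))        # intact frame
print(fcs_ok(wire_damaged))    # stale checksum exposes the error
print(fcs_ok(device_damaged))  # regenerated checksum hides the error
print(device_damaged[:-4])     # payload now reads "Hallo, world"
```

In the second case the receiver has no way to tell from the frame alone that anything went wrong, which is why only an end-to-end check on the data itself would catch it.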

The single-bit corruption of Ethernet frames on this switch was causing characters to be substituted in email streams sent to and from the affected servers. For example, the letter ‘A’ might have been changed to ‘C’, or ‘.’ to ‘,’. In most cases the errors would go unnoticed, since they’d look like typos. However, corrupted attachments could be rendered unusable, and it’s also possible that errors at certain points, or errors that introduced certain control characters, could have caused fatal errors.
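To make the effect concrete, the sketch below enumerates what a single flipped bit can do to one ASCII character, and counts how many bits differ between two byte strings. The function names are ours, chosen for illustration. Nothing here comes from the switch or the mail software.

```python
def single_bit_flips(ch: str) -> list[str]:
    """Return the 8 characters reachable from ch by flipping exactly one bit."""
    code = ord(ch)
    if code > 0xFF:
        raise ValueError("expected a single-byte character")
    return [chr(code ^ (1 << bit)) for bit in range(8)]


def bits_changed(a: bytes, b: bytes) -> int:
    """Count the bits that differ between two equal-length byte strings."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Show each single-bit neighbour of 'A' using repr() so that
# non-printable results stay visible.
for neighbour in single_bit_flips("A"):
    print(repr(neighbour))

# A one-bit error is enough to turn one plausible word into another.
print(bits_changed(b"Hello", b"Hallo"))
```

For ‘A’ the neighbours include ordinary letters such as ‘C’ and ‘a’, which would read as typos, and also the control character 0x01, which is the kind of byte that could upset a mail parser.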

We are in contact with our vendor to determine whether the problem is a hardware or software fault in the switch. We’ve temporarily worked around the corruption by disabling the affected servers. Once we’ve gathered sufficient debugging information, we’ll swap to the standby Alteon, which is not exhibiting the problem, and re-enable the affected servers. -Kelsey and Nathan
