Category: SocialMedia

Facebook’s “Worst Outage in Over Four Years”

Written on September 26, 2010 by Japhet Writ

One social networking site and 500 million friends later, Facebook suffered the biggest downtime of its life yesterday morning. Read on for the post-mortem report from the network's Software Engineering department.

Earlier yesterday, the network experienced downtime that began at around 11:30 am PST. According to the network's Software Engineering Director, the downtime was caused by a mishandled error condition.

The error involved an automated system designed to verify configuration values in the cache. When it went wrong, every single client saw an invalid value and attempted to fix it. Because the fix involves a query to a cluster of databases, the cluster was overwhelmed. Even worse, after the original flaw had been fixed, the stream of queries continued, because clients were still interpreting the configuration value as invalid.
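The feedback loop described above can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not Facebook's actual system: the `Database` class, the `simulate` function, the per-tick capacity model, and the rule that an overloaded cluster leaves the cached value looking invalid are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Database:
    """Hypothetical database cluster with a fixed per-tick query capacity."""
    capacity: int

    def serve(self, queries: int) -> bool:
        # The cluster answers only when the load fits within its capacity.
        return queries <= self.capacity


def simulate(clients: int, capacity: int, ticks: int) -> list[int]:
    """Return the number of database queries issued on each tick."""
    db = Database(capacity=capacity)
    cache_valid = False  # tick 0: every client sees an invalid cached value
    load = []
    for _ in range(ticks):
        if cache_valid:
            # The cache has been repaired, so nobody queries the database.
            load.append(0)
            continue
        # Every client tries to fix the value by querying the database.
        queries = clients
        load.append(queries)
        # Assumption: an overloaded cluster cannot answer, so the cached
        # value still looks invalid on the next tick and the loop repeats.
        cache_valid = db.serve(queries)
    return load
```

With 1,000 clients and a capacity of 100, `simulate(1000, 100, 5)` keeps the load at 1,000 queries on every tick, so the loop never ends on its own. With only 50 clients the cluster answers on the first tick, the cache is repaired, and the load drops to zero.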

As a result, the system that automatically corrects configuration values has been turned off while the network searches for a better way to handle such instances in the future. The director also noted that stopping the feedback loop was so painful that they had to turn off the whole site to keep traffic from getting in.

The social network giant started functioning for most users again at 3:00 pm PST yesterday.
