Facebook has blamed a “faulty configuration change” for the widespread outage that affected the social media platform, along with Instagram and WhatsApp, for several hours late on Monday 4th October.
On the evening of the outage, media outlets including The Guardian sought out and widely covered the views of Adam Leon Smith, Chair of BCS’ Software Testing specialist group.
Commenting on the internal technical issues that affected users worldwide, Adam Leon Smith said: “The outage was caused by changes made to the Facebook network infrastructure. Many of the recent high-profile outages have been caused by similar network-level events.
“It was reported by unidentified Facebook sources on Reddit that the network changes also prevented engineers from remotely connecting to resolve the issues, delaying resolution.
“Notably, many organisations now define their physical infrastructure as code, but most do not apply the same level of testing rigour when they change that code, as they would when changing their core business logic.”
He added: “It is unlikely the issues were directly caused by people working from home, however it is quite possible that it took so long to restore the service because of reduced staffing within the data centre.”