Amazon Explains Causes of Massive Amazon Web Services Outage: a "Rare" Software Bug and "Faulty" Automation
Following a major outage on October 20-21, 2025, that caused global disruptions, Amazon has explained the technical glitch behind the Amazon Web Services (AWS) downtime. The failure impacted a wide range of critical online services.

What Caused the Outage?
According to the company's official report, the incident was triggered by a conflict between two automated programs responsible for updating DNS records. As a result, the system mistakenly deleted the IP address entries for a key service endpoint in the cloud infrastructure.
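To illustrate how two record updaters can end up erasing an entry, here is a minimal Python sketch of a generic stale-write race. It is a toy model, not Amazon's actual system: the `DnsStore` class, the plan version numbers, the cleanup step, and the IP addresses are all hypothetical and chosen only to show the failure pattern and one possible guard against it.

```python
from dataclasses import dataclass, field


@dataclass
class DnsStore:
    """Toy in-memory record store; every name here is hypothetical."""
    active_version: int = 0
    plans: dict = field(default_factory=dict)  # plan version -> list of IPs

    def apply(self, version, ips):
        # Unsafe: does not check that `version` is newer than the active plan.
        self.plans[version] = ips
        self.active_version = version

    def apply_if_newer(self, version, ips):
        # Safer variant: reject any plan that is not newer than the active one.
        if version <= self.active_version:
            return False
        self.apply(version, ips)
        return True

    def cleanup_older_than(self, version):
        # Deletes every plan older than `version`, even if it is active.
        for stale in [v for v in self.plans if v < version]:
            del self.plans[stale]

    def resolve(self):
        # An empty list models a DNS name that resolves to nothing.
        return self.plans.get(self.active_version, [])


def run(safe):
    store = DnsStore()
    apply = store.apply_if_newer if safe else store.apply
    apply(2, ["10.0.0.2"])       # updater B applies the newer plan
    apply(1, ["10.0.0.1"])       # delayed updater A applies a stale plan
    store.cleanup_older_than(2)  # updater B removes plans it considers old
    return store.resolve()


print(run(safe=False))  # []
print(run(safe=True))   # ['10.0.0.2']
```

In the unsafe run the stale plan becomes active just before cleanup deletes it, so the name ends up with no addresses. The version check in `apply_if_newer` is one simple way to prevent that in this toy model.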
The initial error set off a chain reaction, affecting other AWS cloud tools and preventing external services from connecting. The outage disrupted numerous sectors, including:
- Banking and payment platforms
- AI services and neural networks
- Messaging applications
- Online games
- Smart home devices
During the recovery, engineers faced a surge in system requests, forcing them to manually restart some processes. Amazon reported that the primary issues were resolved by the afternoon of October 21.

Industry Experts Weigh In
Network and infrastructure specialists told Wired that large-scale incidents of this kind are inevitable for cloud giants such as Amazon Web Services, Microsoft Azure, and Google Cloud Platform, given their enormous complexity and scale.
In their view, cloud computing rests on an endless list of complex services and dependencies that are always one step away from breaking.
They noted that Amazon rarely suffers such "cascading" failures. At the same time, the company adds to the risk itself by drawing more and more customers onto its infrastructure.
In response to the incident, Amazon has confirmed it will implement additional system checks to prevent similar failures in the future.

