Microsoft Azure Outage Impacts Global Services
— Cloud Infrastructure Incident Report
On October 29, 2025, a major Microsoft Azure outage disrupted services for customers worldwide, including Alaska Airlines, Xbox users, and Microsoft 365 subscribers.
According to Microsoft, the disruption began around 9:00 a.m. Pacific Time, when Azure Front Door (AFD), the company's global content and application delivery service, began failing and made the services that depend on it partly unavailable. The company stated that the issue was “triggered by an unintended configuration change.”
Microsoft explained: “We are taking multiple mitigation steps: First, we have blocked all changes to the AFD service, including customer configuration changes. At the same time, we are rolling back AFD to its last known good configuration. During this rollback process, we want to ensure the faulty configuration will not restart once the service is restored.”
Impact on Customers
Alaska Airlines posted on X (formerly Twitter) at 10:33 a.m., stating that the Azure outage affected multiple systems, including its website. Passengers unable to check in online were directed to staff counters for boarding passes. “We sincerely apologize for the inconvenience and appreciate your patience as we work to resolve this issue,” the airline said.
Microsoft followed up at 10:51 a.m., saying: “We are unable to provide an ETA for rollback completion at this time, but we will update this notice within the next 30 minutes or sooner.”
By 12:22 p.m., Microsoft announced that the affected systems had been reverted to their “last known good configuration,” and customers should begin to see improvement. The company added: “We expect full mitigation within the next four hours as we continue restoring nodes. We will provide another update within two hours or sooner.”
Airline Operations Disrupted
Alaska Airlines later attributed the outage to a failure in its primary data center. The airline operates a hybrid infrastructure combining its own data centers and third-party cloud services. The incident affected more than 49,000 passengers and disrupted check-in and scheduling systems.
1. Incident Summary — What Happened?
Between 15:45 UTC on October 29 and 00:05 UTC on October 30, some customers using Azure Front Door (AFD) experienced latency, timeouts, and errors. Impacted Azure services included, but were not limited to:
- App Service
- Azure Active Directory B2C
- Azure Communication Services
- Azure Databricks
- Azure Healthcare APIs
- Azure Maps
- Azure Portal
- Azure SQL Database
- Azure Virtual Desktop
- Container Registry
- Media Services
- Microsoft Copilot for Security
- Microsoft Defender External Attack Surface Management
- Microsoft Entra ID
- Microsoft Purview
- Microsoft Sentinel
- Video Indexer
AFD configuration changes remain temporarily blocked. Microsoft will notify customers when restrictions are lifted. Although latency and error rates have returned to pre-incident levels, a small number of customers may still experience residual issues (“tail-end impact”). Further updates will be published through Azure Service Health.
2. Root Cause Analysis
The incident was caused by an unintended tenant configuration change that triggered a widespread service failure within Azure Front Door. The change introduced invalid or inconsistent configuration states, preventing many AFD nodes from loading correctly.
As unhealthy nodes dropped out of the global pool, traffic load became unbalanced, amplifying the impact. Even regions that remained healthy experienced intermittent availability issues.
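The amplification effect can be illustrated with simple arithmetic. The sketch below assumes traffic is spread evenly across whatever nodes remain healthy. The request rate, node counts, and per-node capacity are hypothetical figures chosen for illustration, not numbers reported by Microsoft.

```python
def per_node_load(total_rps: float, healthy_nodes: int) -> float:
    """Requests per second each healthy node must absorb under even spreading."""
    if healthy_nodes <= 0:
        raise ValueError("no healthy nodes left to serve traffic")
    return total_rps / healthy_nodes


def overloaded(total_rps: float, healthy_nodes: int, node_capacity_rps: float) -> bool:
    """True when the per-node share exceeds what a single node can handle."""
    return per_node_load(total_rps, healthy_nodes) > node_capacity_rps


# Hypothetical figures: 1,000,000 requests/s, each node rated for 1,500 requests/s.
TOTAL_RPS = 1_000_000
NODE_CAPACITY_RPS = 1_500

for healthy in (1000, 800, 600):
    load = per_node_load(TOTAL_RPS, healthy)
    status = "overloaded" if overloaded(TOTAL_RPS, healthy, NODE_CAPACITY_RPS) else "ok"
    print(f"{healthy} healthy nodes -> {load:.0f} requests/s per node ({status})")
```

With these assumed numbers, the pool tolerates losing 200 of 1,000 nodes, but losing 400 pushes every remaining node past its capacity, which is how a partial failure can degrade nodes that were themselves healthy.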
Microsoft immediately halted all new configuration deployments to prevent further spread and began a global rollback to the last known good configuration. The recovery process required reloading configuration data across thousands of nodes and gradually rebalancing traffic to avoid overload.
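Gradual rebalancing of this kind is often modeled as a stepped ramp that advances only while health checks pass. The following is a minimal sketch of that general idea. The function names, the number of steps, and the health-check callback are assumptions made for illustration and do not describe Microsoft's actual recovery tooling.

```python
from typing import Callable


def ramp_schedule(steps: int) -> list[float]:
    """Evenly spaced traffic shares, ending at 1.0 (full traffic)."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    return [i / steps for i in range(1, steps + 1)]


def restore_traffic(steps: int, healthy_at: Callable[[float], bool]) -> float:
    """Raise the traffic share one step at a time.

    Returns the highest share that passed the (hypothetical) health check;
    the ramp stops at the first step that fails it.
    """
    current = 0.0
    for share in ramp_schedule(steps):
        if not healthy_at(share):
            break
        current = share
    return current


print(ramp_schedule(4))                          # [0.25, 0.5, 0.75, 1.0]
print(restore_traffic(4, lambda share: True))    # 1.0
print(restore_traffic(4, lambda share: share <= 0.5))  # 0.5
```

Holding at the last healthy share, rather than jumping straight to full traffic, is what keeps freshly reloaded nodes from being overwhelmed as they rejoin the pool.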
The root cause was traced to a defect in the tenant configuration deployment pipeline, which allowed an invalid configuration to bypass safety validation. Microsoft has since reviewed and reinforced validation and rollback mechanisms to prevent recurrence.
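The two safeguards named here, validation before deployment and rollback to a last known good configuration, can be sketched in a few lines. This is an illustrative model only. The `ConfigStore` class, the `validate_routes` rule set, and the configuration shape are hypothetical and are not taken from AFD's real pipeline.

```python
from dataclasses import dataclass
from typing import Callable, Optional


def validate_routes(config: dict) -> list[str]:
    """Hypothetical rule set: at least one route, and every route names an origin."""
    routes = config.get("routes")
    if not isinstance(routes, list) or not routes:
        return ["config must contain a non-empty 'routes' list"]
    errors = []
    for i, route in enumerate(routes):
        if not isinstance(route, dict) or not route.get("origin"):
            errors.append(f"route {i} has no origin")
    return errors


@dataclass
class ConfigStore:
    validate: Callable[[dict], list[str]]
    active: Optional[dict] = None
    last_known_good: Optional[dict] = None
    frozen: bool = False

    def deploy(self, candidate: dict) -> None:
        """Activate a candidate only if changes are allowed and validation passes."""
        if self.frozen:
            raise RuntimeError("configuration changes are blocked")
        errors = self.validate(candidate)
        if errors:
            raise ValueError("; ".join(errors))
        self.active = candidate

    def confirm_healthy(self) -> None:
        """Record the active configuration as the last known good one."""
        self.last_known_good = self.active

    def rollback(self) -> None:
        """Block further changes and restore the last known good configuration."""
        if self.last_known_good is None:
            raise RuntimeError("no last known good configuration recorded")
        self.frozen = True
        self.active = self.last_known_good
```

In this model an invalid candidate never becomes active because `deploy` raises before assigning it. The incident described above corresponds to the case where such a gate exists but a defect lets a bad configuration slip past it, which is why the rollback path and the change freeze matter as a second line of defense.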
3. Timeline of Events (UTC)
- 15:45 (Oct 29) — Customer impact begins
- 16:04 — Monitoring alerts trigger internal investigation
- 16:15 — AFD internal configuration change identified
- 16:18 — First public post on Azure status page
- 16:20 — Targeted notifications sent via Azure Service Health
- 17:26 — Azure Portal rerouted from failing AFD endpoint
- 17:30 — All new customer configuration changes blocked
- 17:40 — Deployment of last known good configuration initiated
- 18:30 — Fix rollout begins globally
- 18:45 — Manual node recovery and gradual traffic rerouting
- 23:15 — PowerApps recovers after dependency removal; customers report improvement
- 00:05 (Oct 30) — Microsoft confirms full mitigation
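The intervals implied by this timeline can be worked out directly from the timestamps. The sketch below uses four of the entries above and also converts the start of impact to Pacific time, assuming Pacific Daylight Time (UTC-7) was in effect on that date. The helper names are introduced here for illustration.

```python
from datetime import datetime, timedelta, timezone


def at(day: int, hhmm: str) -> datetime:
    """Build a UTC timestamp in October 2025 from a day and an HH:MM string."""
    hour, minute = map(int, hhmm.split(":"))
    return datetime(2025, 10, day, hour, minute, tzinfo=timezone.utc)


def minutes_between(start: datetime, end: datetime) -> int:
    """Whole minutes elapsed from start to end."""
    return int((end - start).total_seconds() // 60)


IMPACT_START = at(29, "15:45")    # Customer impact begins
FIRST_ALERT = at(29, "16:04")     # Monitoring alerts trigger investigation
ROLLBACK_START = at(29, "17:40")  # Last known good deployment initiated
MITIGATED = at(30, "00:05")       # Full mitigation confirmed

# Assumption: Pacific Daylight Time (UTC-7) applied on October 29, 2025.
PDT = timezone(timedelta(hours=-7), "PDT")

print(minutes_between(IMPACT_START, FIRST_ALERT))     # 19
print(minutes_between(IMPACT_START, ROLLBACK_START))  # 115
print(minutes_between(IMPACT_START, MITIGATED))       # 500
print(IMPACT_START.astimezone(PDT).strftime("%H:%M"))  # 08:45
```

By this reckoning, detection took 19 minutes, the rollback began 1 hour 55 minutes after impact started, and the full incident lasted 8 hours 20 minutes. The 08:45 local start time is consistent with Microsoft's statement that the disruption began around 9:00 a.m. Pacific Time.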