Welcome to another edition of the Outage Roundup newsletter.

Headlines

Stripe - Federal Reserve ACH System Disruption

This is an upstream incident that Stripe surfaced on its own status page. The US Federal Reserve's FedACH Services experienced an internal systems processing outage beginning on March 3rd, temporarily taking the FedACH application offline and delaying file distribution and settlement for multiple ACH processing windows. This affected US payouts, top-ups, bank debits, and bank transfers flowing through Stripe and other processors. Stripe confirmed the issue was external and that it was resolved within 5 hours, with ACH transactions expected to settle the same day.

A similar Fed outage occurred in 2021.

GoDaddy Nameserver Issues

This DNS-related outage lasted from 20:17 to 23:09 UTC on March 16 - approximately 3 hours. The incident was classified as affecting Domain Services (DNS). GoDaddy's status page confirms it was investigated and resolved the same evening. No root cause was published.

LLM Outages Again

Apart from the usual Anthropic and OpenAI outages, Azure OpenAI experienced a platform issue between 23:20 UTC on March 9 and 19:32 UTC on March 10 - a window of approximately 20 hours. This resulted in degraded request processing, intermittent failures (including HTTP 400 and HTTP 429 errors), and reduced availability for the Azure OpenAI Service across the Australia East, Sweden Central, Central US, East US 2, Korea Central, Norway East, and UK South regions. Microsoft determined the cause was unexpected resource exhaustion that the service was unable to recover from adequately, triggered by a recent change impacting specific model workloads. Microsoft published an initial post-incident report.
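For readers who want to check the outage windows quoted in this issue, here is a minimal Python sketch of the arithmetic. The helper name `outage_duration` is hypothetical, and the year 2026 is an assumption, since the status updates quoted here give only the month and day.

```python
from datetime import datetime, timezone

def outage_duration(start: str, end: str, year: int = 2026):
    """Return the time between two UTC timestamps written like 'March 9 23:20'.

    The year is an assumption; the incident reports above state only month and day.
    """
    fmt = "%Y %B %d %H:%M"
    begin = datetime.strptime(f"{year} {start}", fmt).replace(tzinfo=timezone.utc)
    finish = datetime.strptime(f"{year} {end}", fmt).replace(tzinfo=timezone.utc)
    return finish - begin

# Azure OpenAI window reported above
azure = outage_duration("March 9 23:20", "March 10 19:32")
# GoDaddy nameserver window reported above
godaddy = outage_duration("March 16 20:17", "March 16 23:09")

print(azure)    # 20:12:00
print(godaddy)  # 2:52:00
```

The results, 20 hours 12 minutes and 2 hours 52 minutes, match the rounded figures of roughly 20 hours and 3 hours.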

OpenAI had 22 distinct outages in these 2 weeks, while Anthropic had 19.

LLM services had fewer outages than other service categories, but their perceived impact is greater because of widespread coverage in popular media.

Azure - CDN Issue Affects Multiple Services

North American users experienced a disruption in multiple Microsoft services on March 6th, including the MS 365 Admin Center, Office.com, SharePoint and Teams. Microsoft narrowed down the root cause to a CDN configuration issue and their automated recovery mechanism not taking “immediate action”.

Developer Tools

Both GitHub and GitLab experienced multiple issues, with GitHub’s March 3rd outage causing up to 43% of GitHub API requests to fail. To their credit, GitHub’s status page updates stand out as an example of how to communicate during an outage - frequent updates followed by a write-up after resolution.

This was a re-run of GitHub's February 2026 cache invalidation incident. GitHub's own post-mortem confirmed the same caching layer was involved in both events, less than a month apart.

Outages By Service Type in the Last 2 Weeks

In Case You Missed It…

Blast From The Past

10 years ago, Southwest Airlines faced a 13-hour outage affecting online and airport check-ins. The outage took out the email system as well, so customers could not be informed. More than 2,000 flights were canceled during the outage and the following days as recovery efforts continued.

Till next time,

Hrish from Outage Roundup

Visualizations and tabular data in this newsletter are derived from IncidentHub’s third-party status monitoring. IncidentHub monitors status pages of hundreds of SaaS and Cloud services.
