On February 9, 2026, the notifications service began showing degradation around 13:50 UTC, resulting in increased notification delivery delays. Our team began investigating.
Around 14:30 UTC the service started to recover while the team continued investigating. Around 15:20 UTC the degradation resurfaced, with increasing delays in notification deliveries and a small error rate (below 1%) on UI and API endpoints related to notifications.
At 16:30 UTC, we mitigated the incident by throttling workloads to reduce contention and by performing a database failover. The median delay for notification deliveries was 80 minutes at this point, and queues started emptying. By around 19:30 UTC the backlog of notifications had been processed, the service had returned to normal, and we declared the incident closed.
The incident was caused by the notifications database degrading under intense load. To reduce pressure on the database, we stopped most notifications-related asynchronous workloads, including notification deliveries. To ensure system stability, we executed a database failover. Following the failover, we applied a configuration change to improve database performance. The service started recovering after these changes.
We are reviewing the configuration of our databases to understand the performance drop and prevent similar issues from happening in the future. We are also investing in monitoring to detect and mitigate this class of incidents faster.
Posted Feb 09, 2026 - 19:29 UTC
Update
We continue to observe recovery of notifications. Notification delivery delays have been resolved.
Posted Feb 09, 2026 - 19:14 UTC
Update
We are continuing to recover from notification delivery delays. Notifications are currently being delivered with an average delay of approximately 15 minutes. We are working through the remaining backlog.
Posted Feb 09, 2026 - 18:33 UTC
Update
We are continuing to recover from notification delivery delays. Notifications are currently being delivered with an average delay of approximately 30 minutes. We are working through the remaining backlog.
Posted Feb 09, 2026 - 17:57 UTC
Update
We are seeing recovery in notification delivery. Notifications are currently being delivered with an average delay of approximately 1 hour as we work through the backlog. We continue to monitor the situation closely.
Posted Feb 09, 2026 - 17:25 UTC
Update
We continue to investigate delays in notification delivery with average delivery latency now nearing 1 hour 20 minutes. We are just now starting to see some signs of recovery.
Posted Feb 09, 2026 - 16:51 UTC
Update
We are investigating notification delivery delays with the current delay being around 50 minutes. We are working on mitigation.
Posted Feb 09, 2026 - 16:12 UTC
Investigating
We are investigating reports of impacted performance for some GitHub services.