Post Mortem: Management Console Downtime on October 17, 2024, between 5:40 PM and 6:00 PM
On October 17, 2024, from 5:40 PM to 6:00 PM (German Time), our Management Console was inaccessible to users due to an internal networking issue at our hosting provider, fly.io.
Jeremy Theocharis
October 17, 2024

While the instances themselves remained operational, the inaccessible Management Console hindered users from interacting with them. Fly.io identified and resolved the issue, and access was restored approximately 20 minutes after the first reports.
Timeline
- 5:40 PM: Users began reporting issues accessing the Management Console.
- 5:55 PM: Fly.io started investigating increased proxy errors affecting apps using Flycast internal networking.
- 5:56 PM: Fly.io continued their investigation.
- 6:00 PM: The Management Console became fully operational again.
- 6:04 PM: Fly.io reported that they had fixed the error.
Root Cause
The downtime was caused by an internal networking issue within fly.io’s infrastructure, specifically affecting applications communicating over their Flycast internal networking. This issue resulted in increased proxy errors, preventing users from accessing the Management Console.
Preventive Measures
- Multi-Cloud Strategy: While adopting a multi-cloud approach could provide failover options during such incidents, it is currently not feasible for us due to resource constraints.
- Vendor Evaluation: We will continue to monitor fly.io’s performance closely. Until now, we have had nothing negative to report about its reliability. Should issues persist, we will consider switching to a more reliable cloud vendor to enhance service stability.
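As an illustration of what monitoring vendor reliability could involve, here is a minimal sketch that turns a series of availability checks into outage windows whose length and frequency can be tracked over time. It is not our actual monitoring setup: the `Probe` type, the `outage_windows` function, and the idea of recording one pass/fail check per interval are all assumptions made for this example.

```python
from datetime import datetime
from typing import Iterable, List, Tuple

# Hypothetical record of one availability check:
# (time of the check, whether the Management Console responded).
Probe = Tuple[datetime, bool]


def outage_windows(probes: Iterable[Probe]) -> List[Tuple[datetime, datetime]]:
    """Group consecutive failed probes into (start, end) outage windows.

    start is the first failed probe of a run. end is the first successful
    probe after it, or the last failed probe if no recovery was observed.
    """
    windows: List[Tuple[datetime, datetime]] = []
    start = None
    last_failed = None
    for ts, ok in sorted(probes):
        if not ok:
            if start is None:
                start = ts  # a new outage begins
            last_failed = ts
        elif start is not None:
            windows.append((start, ts))  # recovery closes the window
            start = None
    if start is not None:
        # The series ended while checks were still failing.
        windows.append((start, last_failed))
    return windows


if __name__ == "__main__":
    # Illustrative data only, loosely modeled on the timeline above.
    day = datetime(2024, 10, 17)
    sample = [
        (day.replace(hour=17, minute=35), True),
        (day.replace(hour=17, minute=40), False),
        (day.replace(hour=17, minute=50), False),
        (day.replace(hour=18, minute=0), True),
    ]
    for begin, end in outage_windows(sample):
        print(begin, end, end - begin)
```

Summing the window lengths per month would give a simple downtime figure to compare against when deciding whether issues "persist".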
Conclusion
We sincerely apologize for the inconvenience caused by this downtime. Our team is committed to providing reliable services and is taking steps to prevent similar issues in the future. We appreciate your understanding and patience.


