This is the postmortem of the outage of Wednesday, April 20th, 2022.
What went wrong:
- At approximately 13:00 MX / 19:00 UK, FireServiceRota went down. Any attempt to access the system resulted in server errors.
- Our hardware infrastructure, located in London, experienced a failure. Our system is usually able to recover automatically when failures happen, but this time it was unable to do so.
- The system was down for approximately 20 minutes and was restored at around 13:21 MX / 19:21 UK.
- As reported by our hosting provider, the issue was a widespread problem on their own platforms. We maintained close contact with them to raise awareness of this issue and to identify mitigating actions on our end.
What actions were taken to mitigate the issue: