Homelab Post Incident Review 11/05/25
Since it’s important to practice what you preach (apparently), here’s my post incident report on a P1 homelab failure.
Timeline
09:30 - Services slow, services down
10:00 - Attempt to upgrade Ubuntu and reboot VM
10:00 - CPU spiking 100% across all 8 cores
10:15 - Increase core count to 16 and reboot VM
10:30 - Slow recovery but some services still down
16:00 - Server not on network
18:00 - Server powered on but no response
18:30 - Server disassembled and left to cool - fans cleaned a bit
19:00 - Services recovered
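
For the record, the timeline works out to roughly nine and a half hours from first symptom to recovery, and most of that is one long quiet stretch in the middle. Here is a minimal Python sketch of that arithmetic. It assumes every entry falls on the same day, and `minutes_between` and `largest_gap` are hypothetical helper names made up for this illustration, not part of any tooling used during the incident.

```python
from datetime import datetime

# Timeline entries copied from the list above; all assumed to be on the same day.
TIMELINE = [
    ("09:30", "Services slow, services down"),
    ("10:00", "Attempt to upgrade Ubuntu and reboot VM"),
    ("10:00", "CPU spiking 100% across all 8 cores"),
    ("10:15", "Increase core count to 16 and reboot VM"),
    ("10:30", "Slow recovery but some services still down"),
    ("16:00", "Server not on network"),
    ("18:00", "Server powered on but no response"),
    ("18:30", "Server disassembled and left to cool - fans cleaned a bit"),
    ("19:00", "Services recovered"),
]


def minutes_between(start: str, end: str) -> int:
    """Minutes from start to end, both 'HH:MM' on the same day (hypothetical helper)."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def largest_gap(timeline):
    """Return (minutes, earlier_entry, later_entry) for the longest gap between consecutive entries."""
    gaps = [
        (minutes_between(a[0], b[0]), a, b)
        for a, b in zip(timeline, timeline[1:])
    ]
    return max(gaps, key=lambda g: g[0])


# Total time from the first entry to the last one.
total = minutes_between(TIMELINE[0][0], TIMELINE[-1][0])
gap, before, after = largest_gap(TIMELINE)

print(f"Time to recovery: {total // 60}h {total % 60}m")
print(f"Longest gap: {gap} min, between '{before[1]}' and '{after[1]}'")
```

The longest gap it reports is the stretch between the partial recovery at 10:30 and the server dropping off the network at 16:00.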