This doesn’t fit the question exactly but I feel it’s in the same spirit, and a kind of interesting solution, I think.
Back in the early days of scryptcoin mining, I had a few GPU mining rigs running Linux. Occasionally they would hard lock and I’d have to power cycle them.
What I ended up doing was getting some USB-to-serial adapters and writing a Python script that ran on startup and sent a character over serial at a set interval in a loop. That was hooked up, if I recall correctly, to an ATtiny85 using SoftwareSerial and some TTL-to-RS-232 conversion. It would listen over serial, and if it didn’t receive anything within a reasonable time frame it’d flip a relay that cut mains power to the PC, then flip it back. A dead man’s switch, of a sort. It worked great!
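For anyone curious what the two halves of that look like, here is a rough sketch rather than the original script. The heartbeat byte, the 5-second interval, the 30-second timeout, and the names `send_heartbeats` and `DeadmanTimer` are all made up for illustration. The real watchdog ran on a microcontroller, so the timer class only models its logic in Python.

```python
import time


def send_heartbeats(port, interval_s=5.0, count=None, beat=b"."):
    """Write one heartbeat byte to `port` every `interval_s` seconds.

    `port` is any object with write() and flush(), for example an open
    serial port (assumption: something like a pySerial port object).
    count=None loops forever, which is what a startup script would do.
    """
    sent = 0
    while count is None or sent < count:
        port.write(beat)
        port.flush()
        sent += 1
        if count is None or sent < count:
            time.sleep(interval_s)
    return sent


class DeadmanTimer:
    """Model of the listener side: power cycle if the heartbeats stop."""

    def __init__(self, timeout_s, now=0.0):
        self.timeout_s = timeout_s
        self.last_beat = now

    def feed(self, now):
        # Call whenever a heartbeat byte arrives.
        self.last_beat = now

    def should_power_cycle(self, now):
        # True once more than timeout_s has passed since the last heartbeat.
        return now - self.last_beat > self.timeout_s
```

The timeout wants to be several heartbeat intervals long, so one delayed byte doesn’t cut power to a machine that is still alive.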
I remember a story about someone who did something similar with a server that kept hanging. They rigged up a second computer to ping it over the local network and if there was no response for a certain amount of time, the computer would eject its CD-ROM tray which had been lined up neatly with the reset button on the server.
Since it couldn’t eject fully, it then retracted, having rebooted the server.
I assume that was a temporary fix… and it was probably a Windows server tbh.
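The ping-and-eject idea is simple enough to sketch. This is a guess at the shape of it, not what that person actually ran: it assumes a Linux box with the `ping` and `eject` command-line tools, and the function names, the three-failure threshold, and the 10-second interval are invented for the example.

```python
import subprocess
import time


def host_is_up(host, timeout_s=2):
    """Send one ICMP echo with the Linux `ping` CLI; True if it got a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def press_reset_with_tray():
    """Open the CD-ROM tray so it bumps whatever is lined up in front of it."""
    subprocess.run(["eject"], check=False)
    time.sleep(2)
    # In the story the blocked tray retracted by itself; closing it
    # explicitly is an extra step added here.
    subprocess.run(["eject", "-t"], check=False)


def watch(probe, on_dead, max_failures=3, interval_s=10.0, rounds=None):
    """Call on_dead() after max_failures consecutive failed probes.

    rounds=None loops forever. Returns how many times on_dead() fired.
    """
    failures = 0
    triggered = 0
    done = 0
    while rounds is None or done < rounds:
        if probe():
            failures = 0
        else:
            failures += 1
            if failures >= max_failures:
                on_dead()
                triggered += 1
                failures = 0  # give the server time to come back up
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval_s)
    return triggered
```

It would be started with something like `watch(lambda: host_is_up("192.0.2.10"), press_reset_with_tray)`, where the address is a placeholder. Requiring several misses in a row keeps a single dropped packet from rebooting the server.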
The closest I’ve done is having a job run every 12 hours checking if a process was over a certain memory usage (memory leak) and restarting it if it was. That was also Windows, but the same thing on Linux wouldn’t have been difficult… not that the Linux servers ever had that problem.
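On Linux the same check could look roughly like this. It is a sketch under assumptions, not the Windows job described above: it reads resident memory from `/proc/<pid>/status`, and the function names, the limit, and the `restart` callback are hypothetical.

```python
def rss_kib(status_text):
    """Extract VmRSS (resident memory in KiB) from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            return int(line.split()[1])
    return None  # kernel threads have no VmRSS line


def read_rss_kib(pid):
    """Read the current resident memory of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return rss_kib(f.read())


def needs_restart(used_kib, limit_kib):
    """True when the process is using more memory than the allowed limit."""
    return used_kib is not None and used_kib > limit_kib


def check_and_restart(pid, limit_kib, restart):
    """Call restart() if the process is over the limit; report whether it fired."""
    if needs_restart(read_rss_kib(pid), limit_kib):
        restart()
        return True
    return False
```

A cron entry with the schedule `0 */12 * * *` would run a script built on this every 12 hours, with `restart` wrapping whatever restarts the service on that machine.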
if ping() <> 0 then drinktray.exe
Holy jank, Batman!
But hey, if it’s stupid and it works, it’s not stupid.