RE: Lessons learned from failing as a witness
I've implemented something like this for my trading platform, tymoraPRO. It raises alerts and warnings whenever a network line, service, or datafeed goes down or fails to check in within its designated time. When that happens, my server-monitor app sends an alert to Prowl, which immediately pops up on my mobile device (better than SMS, since you can include more information if necessary, and there are no potential SMS fees). Of course, you could easily link Twilio as well, instead of or in addition to Prowl; as I recall, the Twilio API is pretty similar and straightforward.
https://www.prowlapp.com/
API: https://www.prowlapp.com/api.php
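For anyone who wants a starting point, here is a rough Python sketch of the check-in idea described above. It is not the tymoraPRO code. The function names, the `server-monitor` application label, and the timeout handling are all made up for illustration, and the endpoint and field names (`apikey`, `application`, `event`, `description`, `priority`) are what I remember of the Prowl API, so verify them against the API page before relying on this.

```python
import time
import urllib.parse
import urllib.request

# Assumed endpoint; confirm it against the Prowl API page linked above.
PROWL_ADD_URL = "https://api.prowlapp.com/publicapi/add"


def find_stale(last_seen, timeout_s, now=None):
    """Return the names whose last check-in is older than timeout_s seconds.

    last_seen maps a service name to the Unix time of its latest check-in.
    """
    now = time.time() if now is None else now
    return sorted(name for name, ts in last_seen.items() if now - ts > timeout_s)


def build_prowl_request(api_key, event, description,
                        application="server-monitor", priority=2):
    """Build (but do not send) a POST request for one Prowl notification."""
    fields = {
        "apikey": api_key,           # your own Prowl API key
        "application": application,  # hypothetical label shown in the alert
        "event": event,
        "description": description,
        "priority": str(priority),   # assumed range is -2 (low) to 2 (emergency)
    }
    data = urllib.parse.urlencode(fields).encode("utf-8")
    return urllib.request.Request(PROWL_ADD_URL, data=data, method="POST")


def alert_stale(api_key, last_seen, timeout_s):
    """Send one Prowl alert per stale service and return the stale names."""
    stale = find_stale(last_seen, timeout_s)
    for name in stale:
        req = build_prowl_request(
            api_key,
            "Check-in missed",
            f"{name} has not checked in for over {timeout_s} seconds",
        )
        # This is the only step that touches the network.
        with urllib.request.urlopen(req, timeout=10) as resp:
            resp.read()
    return stale
```

You would call `alert_stale(your_key, last_seen, 60)` from a cron job or a loop, updating `last_seen` each time a service reports in. Keeping the staleness check separate from the HTTP call means you can test the logic without sending real alerts.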
If you really want to go all the way with this, here are a few open-source projects that may already do most of the work for you and offer potentially much more robust (albeit more heavyweight) solutions:
libraries:
https://github.com/uniqush/uniqush-push
https://github.com/jreese/znc-push
complete systems:
https://github.com/huginn/huginn
https://github.com/Netflix/Hystrix
https://github.com/OpenNMS/opennms
If you need any help setting something up, let me know. I'd be happy to assist where I can.
This is a great idea, and some of the witnesses might want to implement it. The idea I had in mind had zero tech overhead for their servers, as it would rely on other humans triggering the alert through the Steem blockchain.
That aspect could be incorporated relatively easily as well, though you're still talking about some tech overhead: you still need the monitor app or script that triggers the SMS, even if humans are the ones triggering it. But you do bring up an interesting point. Technically, the same app could monitor any number of witnesses who'd like their servers watched for discrepancies, outages, and so on.
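To make that last point concrete, here is a small Python sketch of one monitor covering several witnesses, where an alert can come either from a missed check-in or from human reports. Everything in it is hypothetical: the `WitnessWatch` class, the idea of requiring two distinct reporters before alerting, and the witness names. It also leaves out how the reports would actually be read from the Steem blockchain, which would have to be fed in by whatever script watches the chain.

```python
import time
from dataclasses import dataclass, field


@dataclass
class WitnessWatch:
    """Tracks check-ins and human reports for several witnesses (hypothetical)."""

    timeout_s: float
    report_threshold: int = 2  # assumed: distinct reporters needed for an alert
    last_seen: dict = field(default_factory=dict)  # witness -> last check-in time
    reports: dict = field(default_factory=dict)    # witness -> set of reporters

    def check_in(self, witness, now=None):
        """Record a check-in and clear any pending human reports."""
        self.last_seen[witness] = time.time() if now is None else now
        self.reports.pop(witness, None)

    def report(self, witness, reporter):
        """Record that a human flagged this witness as down."""
        self.reports.setdefault(witness, set()).add(reporter)

    def needing_alert(self, now=None):
        """Return witnesses that are overdue or have enough human reports."""
        now = time.time() if now is None else now
        overdue = {w for w, ts in self.last_seen.items()
                   if now - ts > self.timeout_s}
        reported = {w for w, who in self.reports.items()
                    if len(who) >= self.report_threshold}
        return sorted(overdue | reported)
```

The names returned by `needing_alert()` would then be passed to whatever sends the notification (SMS, push, and so on). Counting distinct reporters is just one way to keep a single mistaken report from paging a witness.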