Ugly Bash script does the job

in #witness-update, 7 years ago (edited)

As a witness, I use many different checks to monitor the reliability of my nodes. After a few issues over time, I now have a good understanding of which scripts can really catch a problem at an early stage and which ones are completely useless.

When I started my journey, I was running conductor in kill-switch mode to track missed blocks and Zabbix to monitor the hardware, software and uptime of the steemd process. For notifications, I configured email and push messages through Pushover so that alerts reach me immediately, but this setup didn't satisfy me. My biggest concern was how to catch an issue before I start missing blocks...

...and this is what I have figured out so far:

  • connect to the local RPC endpoint (requires rpc-endpoint = 127.0.0.1:8090 in config.ini)
  • get data from database_api get_dynamic_global_properties
  • check the delta between server time and the time of the last synced block

Simple but powerful, because it checks that the steemd process is up and running and also measures how far the node lags behind in syncing the blockchain. Time is crucial when it comes to producing blocks, because a delay of more than 3 seconds will cause your node to start missing blocks.

I wrote a Bash script to gather the data and feed it to Zabbix, which triggers actions if the delay is >4 seconds. The script can be used to check all kinds of nodes.

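#!/bin/bash
# Print the delay (in seconds) between the local clock and the node's head block time.

HOST=$1
# Query the node and extract the head block time (reported in UTC).
TIME_WITNESS_RAW=$(curl --silent --connect-timeout 5 "$HOST" --data '{"id":1,"method":"call","jsonrpc":"2.0","params":["database_api","get_dynamic_global_properties",[]]}' | jq -r '.result.time')
# Convert both timestamps to epoch seconds (GNU date).
TIME_WITNESS=$(date -u -d "$TIME_WITNESS_RAW" +%s)
TIME_CURRENT=$(date -u "+%s")

DIFF=$((TIME_CURRENT - TIME_WITNESS))
echo "$DIFF"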

 
(*) The script uses jq to parse the JSON data, but jq can be replaced with grep.
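
For nodes where jq is not installed, the extraction step could look roughly like the sketch below. The function name extract_time is made up for illustration, and the pattern assumes the reply contains a single "time" key whose value is a plain quoted string, as in the get_dynamic_global_properties result.

```shell
#!/bin/bash
# Sketch only: pull the "time" field out of a JSON reply without jq.
# extract_time is a hypothetical helper name; it reads the reply on stdin.
# Assumes one "time" key with a plain quoted string value.
extract_time() {
    # grep -o prints only the matching fragment, e.g. "time":"2018-01-15T12:00:03"
    # cut then splits on double quotes and keeps the 4th field (the value).
    grep -o '"time" *: *"[^"]*"' | head -n 1 | cut -d'"' -f4
}
```

In the script it would take the place of the jq call: pipe the curl output into extract_time instead of jq -r '.result.time'. This is more fragile than jq, because it matches text rather than parsing JSON.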

Example usage:

# witness node
$ ./steemd_delay.sh http://127.0.0.1:8090
1

 

# seed node
$ ./steemd_delay.sh http://seed.jamzed.pl:8090
2

 

# full RPC node
$ ./steemd_delay.sh https://api.steemit.com/
1
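
The number printed by the script still has to be compared against a threshold somewhere. In my setup Zabbix does that, but as a rough sketch the same check could be done in Bash. The function name delay_status and the OK / LAGGING / UNKNOWN labels are made up for illustration. MAX_DELAY defaults to the 4 seconds mentioned above.

```shell
#!/bin/bash
# Sketch only: classify a delay value (in seconds) against a threshold.
# delay_status and the status labels are hypothetical, not part of steemd or Zabbix.
MAX_DELAY=${MAX_DELAY:-4}

delay_status() {
    # Anything that is not an integer (empty output, "null", error text)
    # is treated as a failed query. A leading minus sign is allowed,
    # since clock skew can produce a negative delay.
    case ${1#-} in
        ''|*[!0-9]*)
            echo "UNKNOWN"
            return 2
            ;;
    esac
    if [ "$1" -gt "$MAX_DELAY" ]; then
        echo "LAGGING"
        return 1
    fi
    echo "OK"
}
```

It could be called as delay_status "$(./steemd_delay.sh http://127.0.0.1:8090)", with the exit code (0, 1 or 2) driving a cron mail or any other alerting hook.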

 
I know... Python is sexier, but Bash also does the job... ;-)


If you think I will be a good witness, please vote for me.
Thank you!


Would you mind if I post my Flask service that does the same job? I'll clearly say where I got the idea from. :-)

That would be great ;-) We’re all here to improve Steem ;-)

Ha, you beat me to it. I have a similar script and wanted to write about it too. :-)

This is probably the simplest solution that gives relatively high confidence that the node is working properly. ;-)

There is nothing like a quick and dirty bash script that gets the job done!
:)

Agreed, if it's stupid but it works, it isn't stupid ;->

Yes, all my Steem projects start like that... and this is the result:

$ crontab -l|wc -l
64

:)))

"check the delta between server time and the time of the last synced block"

So the assumption is that if a block is not synced within 3 seconds, something is wrong?

That’s correct. In my understanding, if block_log is delayed by >3 seconds, something is wrong and I need to check it. It’s a very simple check, but I was able to detect a few serious Steem blockchain issues and take action before losing a block. ;-)

Makes sense. I was looking at "triggers" for the failover and also talking to @petertag about having a proper failover tool. So this is one of the conditions that can help. Right now I am manually doing the failover between the primary and the secondary.
