[GUIDE] Optimise your RPC server for better performance
What is an RPC server?
An RPC server is a type of Steem server usually run by witnesses like myself (@someguy123). They are used by third party applications such as MinnowBooster, voting bots, interfaces such as Busy and ChainBB, and AnonSteem.
They provide an API (Application Programming Interface) to the Steem network, allowing developers to create applications which read data from the Steem network such as posts or transfers, and to allow them to broadcast transactions such as votes and comments.
These servers are extremely expensive, as they currently require 512GB of RAM to operate, and that requirement grows every day. The majority are operated by top 20 witnesses due to their high cost of operation.
It can be difficult to find a 512GB server, but Privex Inc. sells them for just $600/mo (DISCLAIMER: I am the CEO of Privex), and accepts cryptocurrency such as Bitcoin, Litecoin, STEEM, and SBD.
What causes RPC servers to run slowly?
There is a mix of issues at hand:
- steemd is single threaded while resyncing, and does not make good use of cores even after syncing. Single core performance is extremely important.
- RPC nodes require a huge amount of RAM to operate at good speeds. Running from NVME or SSD instead of RAM will cause a node to perform very poorly. RAM speed may also influence performance
- SSDs are necessary for the blockchain, if not NVME. RAID 0 is strongly recommended for increased performance
- Public nodes run into various networking problems from the high load they suffer.
Hardware
The first and foremost thing is to obtain good hardware. You want a CPU with good single core performance, rather than a CPU with tens of cores. This cuts replay time and improves performance.
Storing the shared memory in RAM massively reduces replay time and improves stability. Alternatively you will want several NVME drives in RAID 0 dedicated to the shared memory file. Due to the heavy reads and writes, it is advisable to use a high performance filesystem such as XFS on the NVME drives, disable access times, and have it write the journal to a different disk (e.g. SATA SSDs).
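As a rough illustration of the NVME alternative, the sketch below prints the commands for a RAID 0 array formatted as XFS with access times disabled and the journal on a separate SSD. The device names, the mount point, and the run and setup_shm_storage helpers are all assumptions made for this example, not part of the original setup. It only prints the commands unless DRY_RUN=0 is set.

```shell
# All device names and the mount point are placeholders (assumptions).
NVME_DEVICES="/dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1"
JOURNAL_DEV="/dev/sda3"        # spare partition on a SATA SSD for the XFS journal
MOUNT_POINT="/mnt/steem-shm"   # where the shared memory file would live

# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

setup_shm_storage() {
    # Word-splitting is intended here: one argument per device.
    set -- $NVME_DEVICES
    # Stripe the NVME drives together (RAID 0 has no redundancy).
    run mdadm --create /dev/md0 --level=0 --raid-devices="$#" "$@"
    # Create XFS with its journal (log) on the separate SSD partition.
    run mkfs.xfs -l logdev="$JOURNAL_DEV" /dev/md0
    run mkdir -p "$MOUNT_POINT"
    # noatime disables access-time updates; logdev must match the mkfs step.
    run mount -t xfs -o noatime,logdev="$JOURNAL_DEV" /dev/md0 "$MOUNT_POINT"
}

setup_shm_storage
```

Review the printed commands against your own disk layout before running anything for real, since mdadm and mkfs.xfs destroy existing data on the listed devices.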
Public RPC nodes can chew through massive amounts of bandwidth. The public Privex load balancer (steemd.privex.io) has gone through 20 TB in just under 3 months. Network speeds of 300 Mbps or more are recommended (Privex sells 1 Gbps servers), with at least 5 TB of bandwidth per month.
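To put those figures in perspective, the small calculation below converts a traffic volume into an average throughput. It assumes decimal terabytes and approximates "just under 3 months" as 90 days. The avg_mbps helper is a name made up for this example.

```shell
# Average throughput in Mbps for a volume in TB (decimal) moved over a number of days.
avg_mbps() {
    awk -v tb="$1" -v days="$2" \
        'BEGIN { printf "%.1f\n", tb * 1e12 * 8 / (days * 86400) / 1e6 }'
}

avg_mbps 20 90   # 20 TB in roughly 3 months -> 20.6
avg_mbps 5 30    # the 5 TB per month minimum -> 15.4
```

Both averages sit far below 300 Mbps, so the speed recommendation is presumably about headroom for traffic peaks rather than sustained load.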
Network Optimization
It's recommended to put NGINX in front of your RPC node, disable access logs, and set up rate limiting as follows:
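# ----------------------
# nginx.conf (inside the http block)
# Track clients by IP in a 10 MB shared zone named "ws", at 1 request per second each
limit_req_zone $binary_remote_addr zone=ws:10m rate=1r/s;
# ----------------------
# sites-enabled/default.conf (inside the server or location block)
# Queue up to 5 requests above the rate; anything beyond that is rejected
limit_req zone=ws burst=5;
# Skip access logging to save disk I/O under heavy load
access_log off;
keepalive_timeout 65;
keepalive_requests 100000;
sendfile on;
tcp_nopush on;
tcp_nodelay on;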
This will restrict the rate at which individual users can send requests, preventing abuse.
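One way to check that the limit is active is to fire a burst of parallel requests from a single machine and tally the HTTP status codes. This is only a sketch: the URL is a placeholder, and burst_statuses and tally_statuses are hypothetical helper names. NGINX rejects requests over the limit with status 503 unless limit_req_status is changed, so requests beyond the burst allowance should show up as 503.

```shell
# Hypothetical endpoint; replace with your own node.
RPC_URL="${RPC_URL:-https://rpc.example.com}"

# Send $1 requests in parallel and print one HTTP status code per request.
burst_statuses() {
    seq "$1" | xargs -P "$1" -I{} \
        curl -s -o /dev/null -w '%{http_code}\n' \
             --data '{"jsonrpc": "2.0", "method": "get_dynamic_global_properties", "params": [], "id": 1 }' \
             "$RPC_URL"
}

# Tally status codes read from stdin, most frequent first.
tally_statuses() {
    sort | uniq -c | sort -rn
}

# Usage: burst_statuses 10 | tally_statuses
```

If the iptables connection limit described in the next section is also active, requests above that limit may fail to connect at all and appear as 000 instead.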
Further network optimization
One problem is that some applications open an excessive number of connections, which hurts performance.
To protect against this, you can use iptables (use iptables-persistent/netfilter-persistent to keep the rules across reboots) to restrict each IP to 10 connections at a time.
iptables -A INPUT -p tcp --syn --dport 443 -m connlimit --connlimit-above 10 --connlimit-mask 32 -j REJECT --reject-with tcp-reset
iptables -A INPUT -p tcp --syn --dport 80 -m connlimit --connlimit-above 10 --connlimit-mask 32 -j REJECT --reject-with tcp-reset
Adding these iptables rules produced a massive drop in connections. This dramatically freed up connections and improved response times.
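To see which clients hold the most connections, before or after adding the rules, the output of ss can be grouped by remote IP. The sketch below assumes ss from iproute2 is installed; connections_per_ip is a name made up for this example.

```shell
# Count established TCP connections per remote IP, busiest first.
# Reads `ss -tn` output on stdin, so it also works on saved samples.
connections_per_ip() {
    awk '$1 == "ESTAB" {
             ip = $5
             sub(/:[0-9]+$/, "", ip)   # strip the trailing :port
             count[ip]++
         }
         END { for (ip in count) print count[ip], ip }' | sort -rn
}

# Usage (on the RPC server): ss -tn | connections_per_ip | head
```

Any IP that repeatedly shows far more than 10 connections is a candidate for the limit above.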
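Another issue many RPC nodes face is stale connections. This may be related to poor networking code within steemd or third party libraries for interfacing with Steem.
This can be resolved by tweaking the TIME_WAIT re-use, recycling and timeouts. Note that tcp_tw_recycle was removed in Linux 4.12 and is known to break connections from clients behind NAT, so skip that setting on newer kernels.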
echo 30 > /proc/sys/net/ipv4/tcp_fin_timeout
echo 1 > /proc/sys/net/ipv4/tcp_tw_recycle
echo 1 > /proc/sys/net/ipv4/tcp_tw_reuse
To retain these settings on boot, place the following in /etc/sysctl.conf:
(Taken from LinuxBrigade)
# Decrease TIME_WAIT seconds
net.ipv4.tcp_fin_timeout = 30
# Recycle and Reuse TIME_WAIT sockets faster
net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
Tweaking these networking flags helped reduce TIME_WAIT connections massively, further cleaning up the connection pool and improving response times.
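To measure the effect on your own node, you can count sockets by state before and after changing the flags. This sketch assumes ss from iproute2 is installed; sockets_by_state is a hypothetical helper name.

```shell
# Tally TCP sockets by state from `ss -tan` output on stdin, largest group first.
sockets_by_state() {
    awk 'NR > 1 { states[$1]++ }   # NR > 1 skips the header line
         END { for (s in states) print states[s], s }' | sort -rn
}

# Usage: ss -tan | sockets_by_state
```

A TIME-WAIT count that dwarfs ESTAB is the pattern these settings are meant to address.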
BEFORE network adjustments
curl -w "@curl-format.txt" -s --data '{"jsonrpc": "2.0", "method": "get_dynamic_global_properties", "params": [], "id": 1 }' https://direct.steemd.privex.io
time_namelookup: 0.067882
time_connect: 0.098762
time_appconnect: 0.173686
time_pretransfer: 0.173719
time_redirect: 0.000000
time_starttransfer: 0.469058
----------
time_total: 0.469133
AFTER network adjustments
curl -w "@curl-format.txt" -s --data '{"jsonrpc": "2.0", "method": "get_dynamic_global_properties", "params": [], "id": 1 }' https://direct.steemd.privex.io
time_namelookup: 0.004555
time_connect: 0.033890
time_appconnect: 0.105844
time_pretransfer: 0.105878
time_redirect: 0.000000
time_starttransfer: 0.137760
----------
time_total: 0.137781
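The curl-format.txt file used in the commands above is not shown in this post. A commonly used version that produces output in the same shape could be created as follows; the exact layout is an assumption, while the %{...} names are standard curl --write-out variables.

```shell
# Write a timing template for curl's -w option (layout is an assumption).
cat > curl-format.txt <<'EOF'
    time_namelookup:  %{time_namelookup}\n
       time_connect:  %{time_connect}\n
    time_appconnect:  %{time_appconnect}\n
   time_pretransfer:  %{time_pretransfer}\n
      time_redirect:  %{time_redirect}\n
 time_starttransfer:  %{time_starttransfer}\n
                    ----------\n
         time_total:  %{time_total}\n
EOF
```

Run the curl command from the same directory so that "@curl-format.txt" resolves to this file.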
Conclusion
As can be seen, the network fixes cut response time by a factor of roughly three to four (from 0.469s to 0.138s in the sample above). The Privex RPC server was ranging from 400ms to 600ms before the fixes were applied. After enabling connection restrictions and cleaning up TIME_WAITs, the response time was stable between 120ms and 150ms.
I hope that these tips will help you to improve your server's performance.
Do you like what I'm doing for STEEM/Steemit?
Vote for me @someguy123 to be a witness - every vote counts.
Don't forget to follow me for more like this.
Have you ever thought about being a witness yourself? Join the witness channel. We're happy to guide you! Join in shaping the STEEM economy.
Are you looking for a new server provider? My company @privex offers highly-reliable and affordable dedicated and virtual servers for STEEM, LTC, and BTC! Check out our website at https://www.privex.io
Really helpful tips for the house keepers. I will sure support you as a witness for this act of selflessness. Thanks for the heads up
What you have done is important. RPC nodes greatly simplify Steem based apps!
I've pressed the witness button for you, sir. A person like you, and your activities in the Steemit world, have a big impact on many. Thanks for your work and God bless.
I have often wondered why more applications have not been running their own dedicated server for their related service. With that kind of price tag and the other issues of RAM prices, this paints a much clearer picture of why.
Is there a reason why steemd is single threaded, or do they simply have plans to make it take advantage of multicore CPUs in the future? To a layman like myself, it seems rather odd that, with how many cores CPUs are coming out with these days, things would still be single core focused.
Thanks for including the disclosure of your relation with Privex.
Making multi-threaded applications is extremely difficult, even more so with things like replaying a blockchain since every block relies on the previous one to be correct.
Check this post for more info: https://www.quora.com/Why-is-multi-threading-so-damn-hard?share=1
I hope that the developers can improve this in some way. It's possible many of the issues are already solved in EOS, but may never make it to light in Steem.
Hello there @someguy123 thanks for sharing this. May I ask you some questions about being a witness?
Thanks for your time.
STEEM ON.
The specifications are getting more serious all the time, I am guessing fast quad channel memory would also be quite helpful.
I am curious, what is your opinion on the post from @steemitblog that it is not necessary to use a 512GB server for a full node. Specifically this part:
As @themarkymark wrote, nobody believes @steemitblog's results.
I have never successfully gotten an NVME (without /dev/shm) server to replay without crashing. It is also slow as molasses.
At @privex we're experimenting with high quality NVME drives and locating CPUs with good single core performance, to try to make it more scalable. We think it may be possible to get half decent performance on a non-RAM node with 4 to 5 NVME drives in RAID 0, using XFS as the file system, storing the blockchain on a separate SSD, boot drive on a separate SSD, and various tweaks to XFS e.g. disable access time, move the journal onto the boot SSD so that it does not impact the NVME performance.
It is a lot more difficult than using RAM, but we're quickly approaching 512GB, and the next level can triple in price...
From their publication I had the impression that the scaling issues were not as severe as they have been portrayed in other blogs. At some point I considered setting up a full node but I realized that I need to learn a lot more and the cost is now beyond my budget. I appreciate that you took the time to respond.
I think he touched on it with this:
....stale connections can eat RAM too. Having more RAM than necessary is always ideal.
I love your posts @someguy123 -- very detailed and they help people realize what they should be looking into and researching.
Wow. This is so helpful and we should be researching, it will sure help improve the server performance.
This is the first post am reading from you cos I just started following you