What is MIRA?
I’m @vandeberg, the Senior Blockchain Engineer at Steemit, and today I want to talk a little bit about MIRA, the software solution to Steem’s hardware scalability challenge. I have been developing it for the last few months, and we’ve just announced that it has begun its soft roll-out on Steemit’s nodes. You can find that announcement on @steemitblog. You can also watch me, or listen to me, talk about MIRA on the latest episode of the Steemit podcast below.
Thanks to MIRA, we are seeing dramatic reductions in the cost of running Steem nodes. Those savings come from migrating the Steem blockchain database (steemd) from expensive, high-performance hardware to low-cost, run-of-the-mill hardware like network-attached SSDs or even old-school spinning disk drives!
Database Replacement
MIRA is a complete replacement for the backend database, built on a technology called RocksDB. RocksDB was developed by Facebook to power their Feed, which has to load data very rapidly in order to provide a pleasant user experience. Leveraging RocksDB allows us to run the Steem blockchain much more cost effectively, and put our hardware scaling challenges to rest once and for all.
Commodity Hardware
When Steem first launched, the database was stored on more affordable hardware. But as the database grew in size, those storage media had difficulty keeping pace with the level of engagement that was being demanded of the protocol. To address those issues we developed innovative software solutions that migrated the database to higher-performance storage media in order to ensure a consistent user experience.
This was a good temporary solution, but as Steem continued to grow, the cost of running the blockchain on state-of-the-art storage media started to become financially burdensome for those who run Steem nodes, like app developers and Witnesses (i.e. block producers). MIRA resets the hardware requirements for Steem to what they were when Steem first launched, without negatively affecting performance.
Costs Decreasing Over Time
Thanks to MIRA, the technology is now in place to allow Steem to scale much more efficiently into the future, ideally on a course that will see hardware outpacing the requirements of Steem. That means that despite the fact that Steem will continue to grow, it might actually get cheaper to run Steem over time, as long as hardware improvements continue at their current rate. This is important because the largest cost for blockchains (especially blockchains that produce blocks very rapidly) is going to be disk space, but enabling the data to be stored on a slow disk should neutralize that cost.
What is RocksDB?
RocksDB is a fork of LevelDB, a key-value store originally developed at Google. Many blockchains use LevelDB to run on disk, but that’s because they aren’t nearly as fast as Steem, which has 3 second block times. RocksDB adds many more layers of caching that make it a lot more efficient than LevelDB, while also providing interfaces that made it much easier for us to integrate it into the existing Steem codebase. All of that is important for ensuring that such a high performance blockchain as Steem can be run on cheap hardware and not just exotic hardware like NVMe drives.
But this is not a problem that will be unique to Steem. As other blockchains like Ethereum seek to shrink their block times, the requirements for quick data access will become all the more important, at which point they will need to look at better scaling solutions for their backend database. Luckily for those teams, since MIRA will be Open Source, as much as 90% of their work will already be done for them.
Why an Adapter?
Projects like Hyperledger use RocksDB directly, which raises the question of why we did so much work to build yet another piece of software. One of the things the adapter (i.e. MIRA) accomplishes is reducing programmer error. We have interfaces that are well defined, that work, and that are really easy to develop on. A big part of Steem’s short, 3 month initial development time can be attributed to those interfaces: they have really good C++ bindings that abstract away all the database management and allow us to develop code very quickly. If we were to build directly on RocksDB, all of the Steem code would have to be rewritten. That would have been very error prone, and a refactor of that scale, including the testing needed to ensure there are no bugs, would have taken months to years.
Non-Steem Use
One of the great things about MIRA is that it will be fully Open Source and blockchain agnostic, so if another team wanted to use MIRA, they would only need to do a little work to integrate it into their solution. It can be used for many different applications. I know that if I were developing an app in C++ and I needed the type of database scaling solution that MIRA provides, I would not hesitate to use it. Doing so would probably save me 90% of my development time, in part because we put a lot of effort into building good interfaces that make it easy to work with MIRA.
The point of MIRA was to wrap RocksDB in an interface that matches the one provided by Boost multi-index containers, which are part of a much more widely used library, as Boost is standard in C++ applications. That’s because the people who maintain Boost are some of the most talented C++ developers in the world. Without their amazing work, MIRA wouldn’t even be possible. Most of our work was digging into the Boost multi-index containers, looking at their implementation, and swapping out code where needed so that it interfaces with RocksDB instead of the in-memory object structures that Boost multi-index containers normally use. So if any project wants to use Boost to migrate a database from memory to disk, MIRA can help.
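To make the adapter idea a bit more concrete, here is a minimal sketch of the general pattern, not MIRA's actual code. Every name in it (`Account`, `ByteStore`, `AccountIndex`, the key prefixes) is hypothetical. `ByteStore` is an in-memory stand-in for an on-disk key-value store such as RocksDB, so the example runs with only the standard library. The method names `emplace` and `modify` loosely echo Boost multi-index containers, but the signatures are simplified assumptions.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>

// Hypothetical record type, used only for illustration.
struct Account {
    std::uint64_t id;
    std::string name;
    std::int64_t balance;
};

// Stand-in for a disk-backed key-value store. A real backend would persist
// the bytes; an ordered map keeps this sketch self-contained.
class ByteStore {
public:
    void put(const std::string& key, const std::string& value) { data_[key] = value; }
    bool get(const std::string& key, std::string& out) const {
        std::map<std::string, std::string>::const_iterator it = data_.find(key);
        if (it == data_.end()) return false;
        out = it->second;
        return true;
    }
private:
    std::map<std::string, std::string> data_;
};

// The adapter: callers work with typed objects and named indices, while
// serialization and key layout stay hidden behind the interface.
class AccountIndex {
public:
    explicit AccountIndex(ByteStore& store) : store_(store) {}

    // Inserts a record; fails if the id or the name is already taken.
    bool emplace(const Account& a) {
        std::string unused;
        if (store_.get(id_key(a.id), unused) || store_.get(name_key(a.name), unused)) return false;
        store_.put(id_key(a.id), serialize(a));
        store_.put(name_key(a.name), std::to_string(a.id));  // secondary index points at the primary key
        return true;
    }

    bool find_by_id(std::uint64_t id, Account& out) const {
        std::string bytes;
        if (!store_.get(id_key(id), bytes)) return false;
        out = deserialize(bytes);
        return true;
    }

    bool find_by_name(const std::string& name, Account& out) const {
        std::string id;
        if (!store_.get(name_key(name), id)) return false;
        return find_by_id(std::stoull(id), out);
    }

    // Read-modify-write. The indexed fields (id, name) are kept fixed so the
    // secondary index never goes stale in this simplified version.
    template <typename F>
    bool modify(std::uint64_t id, F f) {
        Account current{};
        if (!find_by_id(id, current)) return false;
        Account updated = current;
        f(updated);
        updated.id = current.id;
        updated.name = current.name;
        store_.put(id_key(id), serialize(updated));
        return true;
    }

private:
    static std::string id_key(std::uint64_t id) { return "id/" + std::to_string(id); }
    static std::string name_key(const std::string& name) { return "name/" + name; }

    static std::string serialize(const Account& a) {
        std::ostringstream out;
        out << a.id << ' ' << a.balance << ' ' << a.name;
        return out.str();
    }
    static Account deserialize(const std::string& bytes) {
        std::istringstream in(bytes);
        Account a{};
        in >> a.id >> a.balance;
        in.get();                   // skip the separator before the name
        std::getline(in, a.name);   // the name is the rest of the record
        return a;
    }

    ByteStore& store_;
};

// Example usage: the calling code never touches keys or raw bytes.
inline void example_usage() {
    ByteStore store;
    AccountIndex accounts(store);
    assert(accounts.emplace(Account{1, "alice", 100}));
    Account found{};
    assert(accounts.find_by_name("alice", found));
    assert(found.balance == 100);
}
```

The design point is that code written against `AccountIndex` would not need to change if `ByteStore` were swapped for a real disk-backed store. That is the same reason an adapter avoids rewriting an existing codebase.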
The Joy of Open Source Development
The joy of working in Open Source is that we get to borrow great ideas from great engineers, add on our own ideas, and let the rest of the world use all of it. The code is just sitting there, waiting for people to use it. It would be awesome if other projects use MIRA, but even if they don’t, it fills a critical need within our ecosystem. If the Steem source code is the only code that uses MIRA, that will mean that everyone who runs a Steem node will be using MIRA to make their lives easier and reduce their costs. And that’s still a pretty big deal.
Congratulations to all involved in the development of MIRA.
This is indeed a red letter day and an amazing transformation in the effectiveness of Steemit Inc in delivering what the Steem community really needed from them.
I posted about how important the ability to run a Steem node on commodity hardware was immediately after Steemit Inc announced the 70% cuts.
You have really turned things around. Well done.
I really don't care to educate myself on what is what. All I care about is one word:
and how @andrarchy can take that word and mold it into a 100 different versions, comparisons with other blockchains to show twitter, facebook, Reddit how STEEM compares.
Doesn't matter how it works. All that matters is that it does work and that the shill info is so good that the receiving end doesn't even mind, because it's just so beautiful.
First of all, this is a monumental moment in Steem's history. The costs of running a 3 year old blockchain will be cheaper than ever, and I don't think many people understand this. Every blockchain that gets bigger over time is more costly to maintain, but with MIRA, Steem can be run on almost any PC configuration.
I have one question though: are there any bottlenecks left that would prevent Steem from handling a massive influx of people, something that could potentially become another 1 year development cycle problem?
I've always wondered about the whole 'logistics' of steem/blockchains; of course modern computer processing speeds are hard to even wrap your head around, I can't help but sometimes wonder how 'long term' feasible block chain technology is... I mean every three seconds there's more data to be processed. Admittedly I'm not a computer scientist and I'll leave this speculation up to people that are more informed on the subject. But I like what you're saying!
Wake me up when we have communities :)
Couldn't the government narrow down who's running a node in DPoS? I was reading how centralized nodes aren't very decentralized.
dPOS is centralised to a permissioned set of block producers, right. When they speak about "decentralisation" they mean that they have lowered the economic hurdles for you to take part in this political process of becoming a blockproducer. So technically it does not decentralize anything. But dPOS is obviously not relying on topological decentralization, it relies on delegating power to representatives of which 21 will produce the blocks.
Since voting in Steem is plutocratic (dependent on your money), it is not democratic and is heavily biased towards creating circles of power and influence. The "government" does not want to discuss this aspect... obviously they are not interested in "power to the users".
Well played! This sounds like a huge milestone accomplished.
You're very eloquent @vandeberg! Congrats on this remarkable work and I hope we'll get to hear you more in the future!
That is great work, and I hope there will be some updates once things settle so we can look at the numbers on performance and savings.