A discussion of Dan's "People Rank" proposal: complex, potential flaws, is it really necessary?

in #steem8 years ago (edited)

This discussion follows @dantheman's People Rank proposal and its follow up post.

Doubt

Delegated Voting via Account Rank
If you view each account as having 1 web page per unit of Steem Power controlled by that account, and you let each account link to other accounts that they trust to allocate funds and curate content then the result is a massively recursive delegated voting system

But how does "massively recursive" equate to "good" or "suitable" for Steem? I agree that this would be like doing Page Rank in Steem, but Page Rank was designed for the specific purpose of determining value in an environment where there is nothing of intrinsic value other than relationships, and where everything can be faked easily except massive interconnected-ness between webpages scattered across many domains and IP subnets. In the case of Steem, possession of Steem Power cannot be faked and is definitely a metric of value, so an approach like PageRank may be overkill, and I really don't think that the recursive part is necessary.

Discussion of some apparent weak points

Negative weights would allow the network to quickly remove voting influence from accounts that earn a reputation for bad behavior. This is something the current Steem algorithms require a voting bot war that generates unwanted collateral damage.

There is no need of a recursive algorithm to do that: other users can simply delegate partially their voting power directly to the misbehaving whale in such a way as to cancel her power instead of increasing it, which can be easily expressed by adding a boolean flag parameter to the power delegation function call. If enough users delegate enough negative power to neutralize entirely a whale's influence, the latter may be more keen on negotiating and committing to fix her behavior in exchange of having her power reinstated, resulting in a win-win-win where the community benefits from the whale fixing her toxic behavior, the other whales benefit from recovering their delegated voting power, and the misbehaving whale benefits from getting a second chance. This is simple, effective, and has almost no attack surface. Of course, there is the risk that a bad whale could cancel arbitrarily smaller accounts voting power, but the tactical threat of being given the same fate after the abuse was escalated to public attention should suffice to keep whales in check, in the same way that fear of lethal retaliation has so far been very effective at keeping nuclear powers from nuking not only each other, but also smaller nations that are not nuclear powers.

From an implementation perspective, if voting power received through proxying is accounted for separately from native voting power, like it's already the case for proxying of witnesses voting, there is no need to propagate it when the recipient of voting power himself delegates voting power.

The page rank algorithm is a computationally intensive iterative process that is normally performed by large clusters of computers using massively parallel map-reduce algorithms. A blockchain is required to reach consensus quickly and is ultimately single threaded because every transaction has the potential to impact the consensus state relevant to every transaction after it.

There will be cycles in the graph of power delegation that must be eliminated, but how to eliminate them without making an arbitrary ruling about where to break the cycle by removing a link, knowing that the destination vertex of the link is short changed since he is the only one who will give power to the cycle but receive nothing back. This inherently makes a Page Rank like algorithm unfair, because there is no way of eliminating cycles without picking an arbitrary cycle opening point and cheating a user off one inbound link.

This algorithm has many of the same scalability considerations as Page Rank. This means that each account would be limited in the number of accounts it could delegate to. Furthermore, there would need to be a minimal delegated amount.

Even with a limited number of power delegations per account, there is still a risk of denial-of-service. The algorithm to detect and eliminate cycles in the graph and propagate voting power will be at the very least O(n logn) if not O(n^2) or more. While that's still manageable in theory, it grows superlinearly (resp. quadratically) wrt to the number of accounts. If no global network-wide maximum number of eligible vertices or edges is set, but simply a limit by account as you propose, then this opens a new sybil attack vector where a user can create many accounts and use maximum arity for each node, in such a way as to create a maximum number of cycles, and lead to a combinatorial explosion. The cost of computing the power graph and eliminating cycles will increase superlinearly (resp. quadratically) with the number of accounts, making the attack increasingly costly to handle for the witnesses as more accounts are added, while the cost of doing the attack increases only linearly for the attacker. Adding the constraint of a minimum quotity to the delegated amount can help if the quotity is chosen to be large enough so that even if all the Steem Power available in the whole network was to be delegated entirely by lots of exactly the minimum quotity, the result would still be computable fast enough. This equivs to setting a maximum limit to the number of eligible vertices and edges without quantifying it as such.

Once we have limited the number of links it is simply a matter of spreading the calculation over time and prioritizing calculations that will effect the biggest changes. So long as the rate at which links can change is slower than the rate at which the algorithm can reach equilibrium then on average the network will remain close enough to equilibrium to accomplish the desired goal.

Unless you deactivate the delegated voting power for longer than it takes for the network to rebalance the power graph, rebalancing would have to occur at every block. Letting the delegated power be active after delegation or reactivated too fast would create a new attack vector where the attacker could take advantage of the delay in updating the graph of power to vote multiple times with the same active voting power delegated in short sequence to multiple sockpuppet accounts, leading to some voting equivalent of the double spend attack. Here is how it would work.

  1. Attacker A delegates all his voting power to his first socket puppet account B just before the rebalancing
  2. Rebalancing happens, network now thinks B has got all the voting power of A
  3. Attacker A removes voting power from B and delegates it to C instead
  4. B exhausts the voting power he got from A by making many votes and keeps using it to exhaustion until the next rebalancing happens. This is validated because the network still thinks that B has got all the voting power of A due to the delayed rebalancing.
  5. Rebalancing happens, network now think that C has got all the voting power of A and that the whole voting power is active. Network can now tell that B votes were in fact invalid, but since it's been already quite a few blocks since it happened, it has now become irreversible without doing a rollback / fork. The network also knows that B has exhausted the voting power, but since this was invalid, C should still be given the power in the same state as it was when reallocated by A, that is to say in active state.
  6. Now that the rebalancing just happened, Attacker A removes voting power from C and delegates it to D instead
  7. C exhausts the voting power ..
    Etc,.. Rinse, Repeat, Profit

A solution could be to do the rebalancing at every block but the shortcoming of this approach is that witnesses would have to do that computationally intensive operation every single block which makes the potential DoS attack exposed above even more impacting.

And even then, it would still be necessary to track what power has already been used for voting and what power has not been used so that the recipient of voting power that has already been used may not be allowed to use it another time. This would create a situation where delegated Steem Power has to be accounted separately as "active" or "used" and is therefore not anymore fungible. Worse, there isn't even such a clearly defined thing as "active" or "used" power since amount of power used by an account is tracked using a percentage variable that only indicates a proportion, and doesn't identify specific units of voting power.

The solution could be to always use active power to delegate, meaning that a user wouldn't be able to delegate more than voting_power % of his total non-delegated power. To avoid "double spend" type of attack, the delegated voting power, although it was active on sender side, should be counted as used upon delivery to the recipient, so that the recipient may not use this power to vote before the normal reactivation delay has elapsed. In that case, fungibility issue is avoided by delegating solely active voting power, and double spend avoided by making sure that the voting power cannot be used earlier than the time required to rebalance the graph of power.

Conclusion

The above is just a list of some random points picked in first reading. All of these points can be addressed and I attempted to do so with more or less success in the reply above. There are probably many other potential weak points that can also probably all be addressed one way or another. And I'm sure that @dantheman had already thought about all of the above and much more, but just speared us the details for the sake of readability.

Then, why this post?

The point I'm trying to make here is that Page Rank style distributed reputation is a complex problem that takes Google dozens of researchers to maintain and is always being gamed, adjusted, gamed again, adjusted again etc. I'm sure Dan would be able to pull something like that, but then Steem would suddenly move from being a simple, transparent, humanly tractable system to a complex dynamic system in constant motion, and requiring complex computation to converge periodically to a solution. It would be near impossible for anyone to verify that his reputation and voting power are what they should be due to the sheer complexity of the calculation. And it would make replaying the blockchain more expensive. Not to mention that complexity increases the attack surface and exposes the system to a higher risk of chain splits and vulnerability exploits. Attacks would also be more difficult to detect, since their effect would be difficult to isolate in the constant ebb and flow of delegated power in the graph.

This leads me to asking again the question asked in introduction: is it really worth bothering with a recursive delegation algorithm when the cost of owning Steem Power is sufficient to protect Steem against Sybil attacks and a fairer, more diverse, more multi-lingual, more multi-cultural Steem can be achieved easily with simple quantified proxying of voting power (see earlier proposal) to a well diversified team of curators from all walks of life, and negative power delegation to neutralize evil whales?

Now I'm sure there is much more to Dan's new idea than meets the eye in his two latest posts. So let's see what comes out of this discussion.

Sort:  

Argh you had to post this on a day I'm in with a client @recursive. What's funny is this post is going to go stratospheric before I have time to toss my hat in the ring :(

Anyways, well thought out post. I have an idea that would solve both problems but I'm probably going to need to blog about it tonight.

Ha! It's the village attack game scenario all over again! The village attack game is isomorphic to People Rank! Actually, it very well could be. I'm really curious of what input you can bring to the discussion. I don't think that discussion is going to settle any time soon given the ambitious goal so you have plenty of time to post on the topic. Don't forget to drop the link here so that we can follow!

Well, you are definitely living up to your name @recursive =P

The scope of this is a bit beyond me, but it is interesting to hear the potential consequences of a page-rank system. Steemit is already working to protect us from malicious users trying to game the system, and if page-ranking just adds another more complex layer of gaming, then maybe we should reconsider it? I am all for KISS (keep it simple stupid)

Haha, I didn't think about it. How ironic that the guy called @recursive is the one who opposes the use of a recursive algorithm :p. I am, and shall remain, the only recursive in that place!

I was thinking about the same thing (excluding the math).. It seems Steem Power ranking may be sufficient, and any incremental benefits from the Page Rank system may not be worth the risks. There may be easier ways to delegate voting power. ..

I would like to invite you to http://steemspeak.com Radio 24/7 steemit and crypto-talk. You have some really good pointers. Just drop in. You know how to install teamspeak 3, yes?

Very well thought and planned out discussion, you have really picked the brain here. I am curious to see what will happen after Dans discussion. Thanks for sharing this ;) Alla x

I am confused....

The context is too complex for me to understand.

Solution: Rebalance only the accounts of people who transferred their STEEMPOWER.

Really interesting ideas. I'm also not convinced that this is worth the effort. By the way, I recently did an analysis trying to rank users by their curation "ability", and @recursive2 came up as number 1 on the list, @recursive3 was number 5, and @recursive was number 7. Seems like you really know what you're doing! lol, you can check out all the results here if you want: https://steemit.com/bots/@trogdor/building-better-bots-ranking-the-top-curators

Coin Marketplace

STEEM 0.22
TRX 0.25
JST 0.039
BTC 105647.40
ETH 3331.47
SBD 4.08