Steem - Comment Data and Automated Accounts

in #stats, 7 years ago

Is it possible to use the posting activity of comments to isolate which accounts are bots?

My last post, in this deep dive into account activity, focused on Posting data. We got some insights into what posting activity looked like for the last 3 months of 2017, and we also got a feel for what the mix of automated posting and manual posting was.
There were fewer automated posts than I was expecting, but I am sure this is something we will see grow over time and is one to watch. I suspected that most automated posting takes the form of comments, so today I will take a look at the breakdown of these.



Image Source: pixabay.com

In this post I am going to focus my analysis on Comments (excluding Posts) to see what trends emerge.

Distribution of Posting Frequency

If we slice the data by month and look at the distributions of posting activity by account we get the following boxplot.

  • The pink rectangles show that the majority of accounts make 30 or fewer comments per month (i.e. about 1 a day).
  • The blue shaded region is where some accounts make up to 8 comments per day.
    8 comments per day was my average during this period. I know people that are much more active than me but I will use this as a benchmark.

So how many times did the most active accounts comment?

Posts Oct - Dec

This graph shows the distribution of number of posts from the 100 most active individual accounts over this period.

  • Around 45 accounts post more than 5,000 times in this period.
  • There are no surprises in the top few, with the exception of @steemitboard, which has posted more than 200k times in this 3 month period. There are obviously a lot of people making achievements.

If there is one, what is the number of comments a day that we could use to identify automated accounts?

I would propose anything higher than 15 comments per day (on average) uses some sort of automation. This is almost twice the rate that I posted at during the period so it's an indicator for some sort of automated commenting. This number may not be correct but it's a starting point for our analysis.

Using this criterion (15 comments per day on average) I next look at the split of posts per day between automated and manual. Accounts classed as automated have posted more than 450 times in a month (15 per day over 30 days). This produces a list of 1,175 accounts that regularly post more than 15 times a day.
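As a rough illustration of the criterion above, here is a minimal Python sketch. It is not the query used for the graphs in this post. The function names, the input shape (pairs of author and date string) and the sample data are all assumptions made for the example; only the threshold of 15 comments per day, taken as 450 per month, comes from the text.

```python
from collections import Counter

# Threshold from the post: 15 comments/day on average,
# approximated here as 450 comments in a 30-day month (assumption).
DAILY_THRESHOLD = 15
DAYS_PER_MONTH = 30
MONTHLY_THRESHOLD = DAILY_THRESHOLD * DAYS_PER_MONTH  # 450


def monthly_counts(comments):
    """Count comments per (author, 'YYYY-MM') from (author, 'YYYY-MM-DD') pairs."""
    counts = Counter()
    for author, date in comments:
        counts[(author, date[:7])] += 1
    return counts


def classify(counts, threshold=MONTHLY_THRESHOLD):
    """Label each (author, month) as 'automated' or 'manual'."""
    return {
        key: "automated" if n > threshold else "manual"
        for key, n in counts.items()
    }


# Made-up sample data: one very busy account and one ordinary account.
sample = [("busybot", "2017-10-01")] * 451 + [("alice", "2017-10-02")] * 240
labels = classify(monthly_counts(sample))
print(labels)
```

An account is labelled per month here, so the same account could be "automated" in one month and "manual" in the next. Whether to require the threshold in every month or in any month is a separate choice.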

Comments Per Day

This next series of graphs shows the split of number of comments per day, using this criterion to separate manual and automated comments.

Number of Comments

  • automated > 15 posts per day (on average)
    (From 1,175 Accounts)
  • manual everything else

Number of Comments

The last graph shows the total post count and visualises the split, but we can also plot the individual components separately to identify trends.

  • I was expecting to see growth in automated comments over the period, but this has been relatively flat, and we see an increase in manual posting activity. This is very encouraging!

Have you come across any interesting ways to gauge the number of users on Steemit or Steem?

With this series I am analysing trends in the Steemit Account Activity to see what the most useful metrics are for identifying growth and activity on the platform. I have come across some interesting trends in the data which I hope to analyse regularly and which the community may find useful. There are a few more items I will look at in the coming days related to accounts but please let me know if there is anything in particular you would like to see. Thanks for reading.

Related Posts

I am taking a deep dive into the Accounts of Users in this series of posts. You may also be interested in:



Thank you for reading this. I write on Steemit about Blockchain, Cryptocurrency, Travel and lots of random topics.


I can tell you a small fact but I don't know if it helps with your question.

By 'users', I am thinking you mean people who are active.

At the time I did this exercise, I was posting 5 articles a day that seemed good quality to me but was getting hardly any comments and votes. I had 504 followers. I expected a better response rate so I looked into how many were still active users. By 'active user', I meant someone whose account showed activity within the last week.

I visited every follower of my account, both on Steemit and on Busy if they were on there. It took 3 days because it was a soul-destroying task. At the time there was a drive to recruit new users by visiting places like universities, and posts saying how successful it was were frequent.

Out of 504, only 26 were active users. 202 had posted between 5 and 10 times in 2 weeks of joining, then ceased activity.

The rest had posted between 0 and 5 times, as soon as they joined and then never again; 136 of them had never posted.

I was shocked. I understood why I was in the position of having 504 followers but a tiny response rate. My 504 was really 26.

It's such a small sample that no overall judgement can be made about Steemit but it was relevant to me. In my judgement, none of the 26 active Steemians were bots.

Could such an exercise be carried out universally? Maybe the bot plague could help somehow. An account like Guillaume Cardinal's would be a better sample.

I apologise in advance if this reply is a time waster for you.

This is exactly the reason why I am doing this exercise. There are often numbers quoted which, in my opinion, don't give a very meaningful picture of how many people are on Steemit. How many real users are there (as opposed to bots)? How many active users are there? How many people have joined and never posted? What group of people is posting regularly: new users, old users, or the same people all the time?

There is a site called Dead Followers that you might check out, although it doesn't seem to be working for me this morning.

It lets you see the stats for any account, including how many of its followers are active. What would be really nice to see is a comparison of accounts showing which have the most active followers. I'll try to visualise this in the coming days.

Thanks for the contribution and ideas.

Thank you for replying. Your investigation is very interesting.

I'm a student studying algorithms, so this is what I think.

To identify a bot account, maybe you should look at the time between each comment. I think bot accounts are programmed to leave the same amount of time between comments so that the bandwidth the account uses can regenerate. If the bandwidth is not allowed to regenerate, the account will no longer be able to comment on other users' posts.

If I were the developer of the bot, I would make sure the account never reaches the point where its bandwidth is empty.

That is a great suggestion. I'll have a look into it. The more I think about it, the more sense it makes to look at a few things and generate a list based on several criteria.

Thanks so much.
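One way the "same time between comments" idea could be measured is sketched below in Python. This is only an illustration: the function name, the use of the coefficient of variation as the measure, and the sample timestamps are assumptions for the example, not something taken from the Steem data. A value near 0 means the comments are almost evenly spaced.

```python
from datetime import datetime
from statistics import mean, pstdev


def interval_cv(timestamps):
    """Coefficient of variation of the gaps between consecutive comments.

    Returns None when there are fewer than three timestamps,
    because at least two gaps are needed to compare them.
    """
    times = sorted(timestamps)
    if len(times) < 3:
        return None
    gaps = [(b - a).total_seconds() for a, b in zip(times, times[1:])]
    avg = mean(gaps)
    if avg == 0:
        return 0.0
    return pstdev(gaps) / avg


# Made-up data: one comment exactly every 10 minutes.
regular = [datetime(2017, 12, 1, 0, m) for m in range(0, 60, 10)]

# Made-up data: uneven gaps spread over a day.
irregular = [
    datetime(2017, 12, 1, h, m)
    for h, m in [(8, 3), (8, 9), (12, 40), (19, 5), (19, 7)]
]

print(interval_cv(regular))
print(interval_cv(irregular))
```

A bot that adds random delays would not score near 0, so this would work better as one criterion among several than on its own.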

No problem, I'm happy that I somehow helped you with your research :)

I think a better way to identify bot accounts would be to measure the time between comments from a particular account. I don't know if that's even possible, but I think it would be more accurate than using the number of posts per day, because I'm sure there are some people on here who love commenting and would do it more than 15 times in a day.

Looking at the times will be interesting. I can check accounts that never sleep for more than 5 hours.
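A check like this could be sketched as follows in Python. It is a minimal example under stated assumptions: the function names and sample timestamps are made up, and "never sleeps for more than 5 hours" is interpreted as "the longest gap between two consecutive comments in the observed window is at most 5 hours".

```python
from datetime import datetime, timedelta

MAX_REST = timedelta(hours=5)  # the 5 hour figure mentioned above


def longest_gap(timestamps):
    """Longest pause between consecutive comments, or None if fewer than two."""
    times = sorted(timestamps)
    if len(times) < 2:
        return None
    return max(b - a for a, b in zip(times, times[1:]))


def never_rests(timestamps, max_rest=MAX_REST):
    """True if the account never pauses for longer than max_rest."""
    gap = longest_gap(timestamps)
    return gap is not None and gap <= max_rest


start = datetime(2017, 12, 1)

# Made-up data: a comment every 2 hours, around the clock, for 3 days.
round_the_clock = [start + timedelta(hours=2 * i) for i in range(36)]

# Made-up data: comments only during the day, with an overnight break.
daytime = [
    start + timedelta(days=d, hours=h)
    for d in range(3)
    for h in (9, 13, 18, 22)
]

print(never_rests(round_the_clock))
print(never_rests(daytime))
```

The window needs to cover at least a few days for this to mean anything, since a short burst of activity would pass the check trivially.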

15 comments per day seems too low to use as a yardstick for bot commenting. I don't write many long posts, but I do love to read posts on Steemit and then engage in the comment section. On a good day I read and comment on more than 15 posts, and I am not a bot, lol.

Fair point. What level do you think makes more sense? The thing is, this is an average over a month. Even if you post a lot on one day, I think not many people would post more than 15 times every day for a whole month unless the comments are somehow automated.

There were fewer automated posts than I was expecting but I am sure this is something we will see grow over time and is one to watch.

It's really something to watch seriously.

As the power and superiority of the Steem Blockchain become evident (the speed etc.), people may use it more and more to write things into the blockchain. These will not be posts as we know them but will add to the value of Steem. It will be just another use case for it.

I did not understand exactly what you mean by automatic post or automatic comment. If you could explain it a little, we would understand.

There are several types of automated comments. I have used the term "automated" because they must be automated in some way to generate such a high frequency of posts.

I understand.

This post is very nice. Thanks for the discussion. Best of luck.

Nice post, beautifully presented and explained. Detail oriented with nice information. Thank you for sharing @eroche 😍😍😍😍

Each of your posts is very interesting!
Hopefully the next post will be even more interesting and useful.

Interesting post. Give me more info. Thanks a lot, friend.

You're welcome, thanks for reading.
