Analysis of programming language tags used on Steem

in #utopian-io, 7 years ago (edited)

The purpose of the following analysis is to investigate the popularity of programming tags and their development over time.

Outline

  • Most popular programming tags
  • Development over time
  • Average post payout

Scope of Analysis

The data was collected from programming posts, from the very beginning of the Steem blockchain up to the present.

Tools

  • SteemSQL
  • Python 3.6 (with matplotlib library)

Results

Most popular programming tags

The first step was to delete all posts that are not related to programming and to keep only those columns that would be useful in further analysis:

  • creation date
  • url
  • payout
  • tags
2017-06-30  /programming/@profitgenerator/learn-basic-python-programming-ep-4-let-s-build-a-calculator  5.9480  programming,howto,education,python,tutorial
2017-07-01  /programming/@qed/installing-haskell-idris-and-atom-idris-2 0.0560  programming,technology,math,video,steemit
2017-06-25  /code/@cosmobug/what-can-you-do-with-code   0.0000  code,programming,python,csharp,visualstudio
...
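
Before any counting, each row of the filtered file can be read into named fields. The sketch below assumes the four columns are tab-separated in the order listed above; the `Post` type and the `parse_row` helper are hypothetical names introduced for illustration, not part of the original script.

```python
from datetime import date, datetime
from typing import List, NamedTuple


class Post(NamedTuple):
    created: date
    url: str
    payout: float
    tags: List[str]


def parse_row(line: str) -> Post:
    """Parse one tab-separated row: creation date, url, payout, tags."""
    created, url, payout, tags = line.rstrip('\n').split('\t')
    return Post(
        created=datetime.strptime(created, '%Y-%m-%d').date(),
        url=url,
        payout=float(payout),
        tags=tags.split(','),
    )


# One of the sample rows shown above, with tabs assumed as separators.
row = ('2017-07-01\t'
       '/programming/@qed/installing-haskell-idris-and-atom-idris-2\t'
       '0.0560\t'
       'programming,technology,math,video,steemit')
post = parse_row(row)
print(post.created, post.payout, post.tags)
```

Converting the date and payout once at parse time keeps the later steps (grouping by month, averaging payouts) free of string handling.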

Using a simple script, I checked the occurrence of tags related to programming.

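from collections import Counter
from operator import itemgetter

# Count how often each tag occurs across all posts.
counter = Counter()
with open('programming.tsv', 'r', encoding='utf8') as f:
    for line in f:
        # Columns: creation date, url, payout, tags (comma-separated).
        columns = line.strip().split('\t')
        tags = columns[3].split(',')
        for tag in tags:
            counter[tag] += 1
# Print tags from most to least frequent.
for tag, c in sorted(counter.items(), key=itemgetter(1), reverse=True):
    print(tag, c)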

Some tags such as python, rust, c, r do not necessarily concern programming, so I started by finding general programming tags so that I could use them for filtering.

Tag                Occurrence
programming        10200
technology         2503
coding             1774
utopian-io         1217
tutorial           913
steemdev           544
development        385
computer           353
code               280
opensource         260
software           256
tech               241
learning           225
tutorials          217
html               193
linux              188
ai                 159
web                158
design             147
dev                143
security           139
android            135
developer          130
steemiteducation   125
learn              114
computers          110
hacking            95
css                91
machine-learning   91
deep-learning      85

The next step was to count programming language tags. Some tags refer to the same language (rust and rust-lang; go and golang; cpp, cplusplus and c-plusplus), so this has to be taken into account.

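from collections import Counter
from operator import itemgetter

# General programming tags used as a filter. The post does not list the exact
# set, so this selection from the table above is an assumption.
common_tags = {'programming', 'coding', 'code', 'development', 'software'}

counter = Counter()
with open('programming.tsv', 'r', encoding='utf8') as f:
    for line in f:
        columns = line.strip().split('\t')
        # Merge aliases of the same language before splitting into tags.
        tags = (columns[3]
            .replace('rust-lang', 'rust')
            .replace('golang', 'go')
            .replace('cplusplus', 'cpp')
            .replace('c-plusplus', 'cpp')
            .split(','))
        tags = set(tags)  # count each tag at most once per post
        # Keep only posts that carry at least one general programming tag.
        if tags & common_tags:
            for tag in tags:
                counter[tag] += 1
for tag, c in sorted(counter.items(), key=itemgetter(1), reverse=True):
    if tag not in common_tags:
        print(tag, c)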
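
One caveat of replacing aliases in the whole tag string is that it also rewrites longer tags that merely contain an alias. A possible alternative, sketched here under the assumption that aliases should only match whole tags, is a per-tag lookup. The names `ALIASES`, `normalize` and `count_tags`, as well as the sample posts, are hypothetical.

```python
from collections import Counter

# Alias -> canonical tag, using the aliases named in the text.
ALIASES = {
    'rust-lang': 'rust',
    'golang': 'go',
    'cplusplus': 'cpp',
    'c-plusplus': 'cpp',
}


def normalize(tags):
    """Map each tag to its canonical name; the set removes duplicates within a post."""
    return {ALIASES.get(tag, tag) for tag in tags}


def count_tags(posts_tags, common_tags):
    """Count each tag once per post, for posts with at least one general tag."""
    counter = Counter()
    for tags in posts_tags:
        tags = normalize(tags)
        if tags & common_tags:
            counter.update(tags)
    return counter


# Hypothetical posts, for illustration only.
posts = [
    ['programming', 'golang', 'go'],   # both aliases collapse into one 'go'
    ['coding', 'golang-tutorial'],     # longer tag is left untouched
    ['art', 'rust'],                   # no general programming tag: skipped
]
counts = count_tags(posts, common_tags={'programming', 'coding'})
print(counts['go'], counts['golang-tutorial'], counts['rust'])
```

With a whole-tag lookup, a tag such as `golang-tutorial` stays as it is instead of becoming `go-tutorial`.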

The result looks as follows.

Rank  Tag         Occurrence
1     python      1013
2     java        595
3     javascript  552
4     php         193
5     c           114
6     cpp         97
7     csharp      52
8     go          50
9     solidity    49
10    ruby        41
11    rust        40
12    kotlin      37
13    r           28
14    scratch     19
15    mysql       18
16    assembly    17
17    elixir      17
18    swift       15
19    lua         11
20    bash        10

I also checked the 20 most popular tags from https://stackoverflow.com/tags to be able to compare the results.

Rank  Tag          Occurrence
1     javascript   1553961
2     java         1370330
3     c#           1177662
4     php          1165762
5     python       891863
6     c++          553889
7     mysql        504295
8     objective-c  282490
9     c            270118
10    r            221396
11    ruby         191670
12    swift        180026
13    vb.net       115839
14    bash         95125
15    vba          93824
16    postgresql   80043
17    matlab       76946
18    scala        75314
19    perl         58632
20    delphi       40852

Languages such as python, java, javascript and php occupy top positions in both tables. csharp ranks relatively low in the Steem table (7th, versus 3rd on Stack Overflow).

Below is the list of languages included only in the Steem ranking:

  • go
  • solidity
  • rust
  • kotlin
  • scratch
  • assembly
  • elixir
  • lua

And the languages included only in the Stack Overflow ranking:

  • objective-c
  • vb.net
  • vba
  • postgresql
  • matlab
  • scala
  • perl
  • delphi
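
The two lists above can be reproduced from the rankings with a short comparison. This is a minimal sketch: the tag lists are copied from the two tables, and the `SO_TO_STEEM` mapping (c# to csharp, c++ to cpp) is an assumption about which tags denote the same language.

```python
# Top-20 tags in rank order, copied from the two tables above.
steem = ['python', 'java', 'javascript', 'php', 'c', 'cpp', 'csharp', 'go',
         'solidity', 'ruby', 'rust', 'kotlin', 'r', 'scratch', 'mysql',
         'assembly', 'elixir', 'swift', 'lua', 'bash']
stackoverflow = ['javascript', 'java', 'c#', 'php', 'python', 'c++', 'mysql',
                 'objective-c', 'c', 'r', 'ruby', 'swift', 'vb.net', 'bash',
                 'vba', 'postgresql', 'matlab', 'scala', 'perl', 'delphi']

# Assumed spelling differences between the two sites.
SO_TO_STEEM = {'c#': 'csharp', 'c++': 'cpp'}
so_normalized = [SO_TO_STEEM.get(tag, tag) for tag in stackoverflow]

# Tags that appear in only one of the rankings (rank order preserved).
steem_only = [tag for tag in steem if tag not in so_normalized]
so_only = [tag for tag in so_normalized if tag not in steem]

# For shared tags: (rank on Steem, rank on Stack Overflow), 1-based.
ranks = {tag: (steem.index(tag) + 1, so_normalized.index(tag) + 1)
         for tag in steem if tag in so_normalized}

print('Steem only:', steem_only)
print('Stack Overflow only:', so_only)
print('csharp ranks:', ranks['csharp'])
```

The `ranks` dictionary also makes it easy to see which shared languages differ most in position between the two sites.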

The first factor that may explain these differences is that the Steem blockchain has existed for a much shorter time than Stack Overflow, and some of these languages were popular in the past: objective-c, which is being replaced by swift; perl, which is being replaced by, for example, python; and delphi, which is now very rarely used. The Steem ranking also includes neither language from the Visual Basic family: Visual Basic .NET (vb.net) and Visual Basic for Applications (vba). The low popularity of the postgresql tag is probably because the most popular Steem blockchain databases use other solutions: SteemSQL uses mssql, sbds uses mysql, and SteemData uses mongodb. Personally, I am surprised by the low popularity of the scala tag on Steem.

The languages rust, kotlin and elixir rank fairly high on Steem compared to Stack Overflow, probably because they are relatively new technologies. Scratch is a visual programming language aimed mainly at children and young people, so it is not surprising that it is not popular on Stack Overflow, which is aimed at professional programmers. Solidity is a programming language for the Ethereum platform, hence its high position in the Steem ranking.

Other languages used by platforms similar to Ethereum:

  • Lisk: javascript
  • Cardano: haskell
  • NEO: csharp, vb.net, fsharp, java, kotlin, python
  • EOS: wren
  • Stratis: csharp

Let's also look at the popularity of tags on the pie chart.

image.png
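
A pie chart like this can be derived from the occurrence counts in the Steem ranking. The sketch below is not the original plotting code: the `shares` and `plot_pie` helpers and the output file name are hypothetical, and the shares are relative to the total of the 20 listed tags, not to the number of posts.

```python
# Occurrence counts copied from the Steem ranking above.
counts = {
    'python': 1013, 'java': 595, 'javascript': 552, 'php': 193, 'c': 114,
    'cpp': 97, 'csharp': 52, 'go': 50, 'solidity': 49, 'ruby': 41,
    'rust': 40, 'kotlin': 37, 'r': 28, 'scratch': 19, 'mysql': 18,
    'assembly': 17, 'elixir': 17, 'swift': 15, 'lua': 11, 'bash': 10,
}


def shares(counts):
    """Return each tag's percentage of the total tag occurrences."""
    total = sum(counts.values())
    return {tag: 100.0 * c / total for tag, c in counts.items()}


def plot_pie(counts, path='languages_pie.png'):
    """Save a pie chart of the counts; matplotlib is imported only here."""
    import matplotlib
    matplotlib.use('Agg')  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 8))
    ax.pie(list(counts.values()), labels=list(counts.keys()),
           autopct='%1.1f%%')
    ax.axis('equal')  # keep the pie circular
    fig.savefig(path)
    plt.close(fig)


pct = shares(counts)
for tag in ('python', 'java', 'javascript'):
    print(tag, round(pct[tag], 1))
```

Calling `plot_pie(counts)` writes the chart to disk; the share computation itself runs without matplotlib installed.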

Development over time

The following chart shows the number of posts per month for each of the top 10 most popular programming languages.
Which tags will grow fastest? I think it will be the top 4 (python, javascript, java, php), because they are generally very popular and are used by Steem-related libraries.

I think solidity can also count on growth thanks to the development of the Ethereum platform.

image.png
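
The monthly series behind such a chart can be built by grouping posts by the year and month of their creation date. This is a minimal sketch assuming the tab-separated layout described earlier; the `monthly_counts` helper is a hypothetical name, and the rows are the three sample posts shown at the beginning of the analysis.

```python
from collections import Counter, defaultdict


def monthly_counts(lines, languages):
    """Return {language: Counter({'YYYY-MM': number of posts})}."""
    result = defaultdict(Counter)
    for line in lines:
        created, _url, _payout, tags = line.rstrip('\n').split('\t')
        month = created[:7]  # '2017-06-30' -> '2017-06'
        # A set ensures a post counts once per language.
        for tag in set(tags.split(',')) & languages:
            result[tag][month] += 1
    return result


rows = [
    '2017-06-30\t/programming/@profitgenerator/learn-basic-python-programming-ep-4-let-s-build-a-calculator\t5.9480\tprogramming,howto,education,python,tutorial',
    '2017-07-01\t/programming/@qed/installing-haskell-idris-and-atom-idris-2\t0.0560\tprogramming,technology,math,video,steemit',
    '2017-06-25\t/code/@cosmobug/what-can-you-do-with-code\t0.0000\tcode,programming,python,csharp,visualstudio',
]
series = monthly_counts(rows, languages={'python', 'csharp'})
for lang in sorted(series):
    print(lang, sorted(series[lang].items()))
```

Each language's counter can then be sorted by month and passed to a matplotlib line plot.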

Below is the chart for the remaining tags. The data was split into two charts to improve readability. Among the languages in the second ten, the relatively new ones (rust, kotlin, elixir) have the greatest potential for growth, and probably r as well because of the popularity of machine learning.

image.png

Average post payout

The average payout varies considerably by tag. More popular tags generate higher payouts, probably because they reach a wider audience. Note also that the results for less common tags may be biased, because the averages were computed from very small samples.

image.png
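
The averages can be computed together with the sample size, which makes it easier to spot tags whose result rests on only a few posts. The sketch below assumes the tab-separated layout described earlier; `average_payouts` and its `min_posts` threshold are hypothetical, and the rows are two of the sample posts from the beginning of the analysis.

```python
from collections import defaultdict


def average_payouts(lines, languages, min_posts=1):
    """Return {language: (average payout, number of posts)}.

    Languages with fewer than min_posts posts are left out, since their
    average would rest on too small a sample.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for line in lines:
        _created, _url, payout, tags = line.rstrip('\n').split('\t')
        for tag in set(tags.split(',')) & languages:
            totals[tag] += float(payout)
            counts[tag] += 1
    return {tag: (totals[tag] / counts[tag], counts[tag])
            for tag in counts if counts[tag] >= min_posts}


rows = [
    '2017-06-30\t/programming/@profitgenerator/learn-basic-python-programming-ep-4-let-s-build-a-calculator\t5.9480\tprogramming,howto,education,python,tutorial',
    '2017-06-25\t/code/@cosmobug/what-can-you-do-with-code\t0.0000\tcode,programming,python,csharp,visualstudio',
]
all_results = average_payouts(rows, languages={'python', 'csharp'})
filtered = average_payouts(rows, languages={'python', 'csharp'}, min_posts=2)
print(all_results)
print(filtered)
```

Raising `min_posts` drops the tags with the least reliable averages, here csharp with a single post.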



Posted on Utopian.io - Rewarding Open Source Contributors

info about wren is out of date

EOS: wren

EOS: WebAssembly

Your contribution cannot be approved because it is not as informative as other contributions. See the Utopian Rules. Contributions need to be informative and descriptive in order to help readers and developers understand them.

Dear @jacek-w, great to see a new contributor to utopian analysis posts! But I'm sorry, your contribution is not as detailed as other contributions. The "analysis" aspect is a bit on the short end, you mainly present data visualizations without explaining, interpreting or extrapolating the results.

You can contact us on Discord.
[utopian-moderator]

Ok I understand. I've updated my contribution.

Hi @jacek-w, I'm sorry, it is not foreseen to have several review iterations.

There is a surprisingly high share of Java in the programming languages group... The shares of JS and Python don't bother me, since their use cases and high popularity make them shine, but why Java?
