Analysis of programming language tags used on Steem
The purpose of the following analysis is to investigate the popularity of programming tags and their development over time.
Outline
- Most popular programming tags
- Development over time
- Average post payout
Scope of Analysis
The data covers programming posts from the very beginning of the Steem blockchain up to the present.
Tools
- SteemSQL
- Python 3.6 (with the matplotlib library)
Results
Most popular programming tags
The first step was to delete all posts that are not related to programming and to keep only those columns that will be useful in further analysis:
- creation date
- url
- payout
- tags
2017-06-30 /programming/@profitgenerator/learn-basic-python-programming-ep-4-let-s-build-a-calculator 5.9480 programming,howto,education,python,tutorial
2017-07-01 /programming/@qed/installing-haskell-idris-and-atom-idris-2 0.0560 programming,technology,math,video,steemit
2017-06-25 /code/@cosmobug/what-can-you-do-with-code 0.0000 code,programming,python,csharp,visualstudio
...
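Using a simple script I checked the occurrence of tags related to programming.
from collections import Counter
from operator import itemgetter

# Count how often each tag occurs across all programming posts.
counter = Counter()
with open('programming.tsv', 'r', encoding='utf8') as f:
    for line in f:
        columns = line.strip().split('\t')
        tags = columns[3].split(',')  # fourth column: comma-separated tags
        for tag in tags:
            counter[tag] += 1

# Print tags from most to least frequent.
for tag, c in sorted(counter.items(), key=itemgetter(1), reverse=True):
    print(tag, c)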
Some tags such as python, rust, c, and r do not necessarily concern programming, so I started by finding general programming tags that I could use for filtering.
Tag | Occurrence |
---|---|
programming | 10200 |
technology | 2503 |
coding | 1774 |
utopian-io | 1217 |
tutorial | 913 |
steemdev | 544 |
development | 385 |
computer | 353 |
code | 280 |
opensource | 260 |
software | 256 |
tech | 241 |
learning | 225 |
tutorials | 217 |
html | 193 |
linux | 188 |
ai | 159 |
web | 158 |
design | 147 |
dev | 143 |
security | 139 |
android | 135 |
developer | 130 |
steemiteducation | 125 |
learn | 114 |
computers | 110 |
hacking | 95 |
css | 91 |
machine-learning | 91 |
deep-learning | 85 |
The next step was to count programming language tags. Some tags refer to the same language (rust and rust-lang, go and golang, cpp and cplusplus and c-plusplus), so this has to be taken into account.
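from collections import Counter
from operator import itemgetter

# common_tags is assumed to hold the general programming tags from the table above.
counter = Counter()
with open('programming.tsv', 'r', encoding='utf8') as f:
    for line in f:
        columns = line.strip().split('\t')
        # Merge aliases of the same language before splitting into tags.
        tags = (columns[3]
                .replace('rust-lang', 'rust')
                .replace('golang', 'go')
                .replace('cplusplus', 'cpp')
                .replace('c-plusplus', 'cpp')
                .split(','))
        tags = set(tags)
        # Count only posts that carry at least one general programming tag.
        if tags & set(common_tags):
            for tag in tags:
                counter[tag] += 1

# Print the remaining tags, skipping the general ones used for filtering.
for tag, c in sorted(counter.items(), key=itemgetter(1), reverse=True):
    if tag not in common_tags:
        print(tag, c)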
The result looks as follows.
Rank | Tag | Occurrence |
---|---|---|
1 | python | 1013 |
2 | java | 595 |
3 | javascript | 552 |
4 | php | 193 |
5 | c | 114 |
6 | cpp | 97 |
7 | csharp | 52 |
8 | go | 50 |
9 | solidity | 49 |
10 | ruby | 41 |
11 | rust | 40 |
12 | kotlin | 37 |
13 | r | 28 |
14 | scratch | 19 |
15 | mysql | 18 |
16 | assembly | 17 |
17 | elixir | 17 |
18 | swift | 15 |
19 | lua | 11 |
20 | bash | 10 |
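The chained `.replace()` calls in the script above operate on the whole comma-separated string, so they would also rewrite any longer tag that merely contains an alias (a hypothetical `golang-tutorial` tag would become `go-tutorial`). Below is a minimal sketch of a per-tag alternative. The `ALIASES` mapping and the `normalize_tags` helper are illustrative names, not part of the original script.

```python
# Hypothetical alias table: maps alternative spellings to one canonical tag.
ALIASES = {
    'rust-lang': 'rust',
    'golang': 'go',
    'cplusplus': 'cpp',
    'c-plusplus': 'cpp',
}


def normalize_tags(raw_tags):
    """Split a comma-separated tag string and map aliases to canonical tags.

    Only whole tags are replaced, and duplicates collapse because a set
    is returned.
    """
    return {ALIASES.get(tag, tag) for tag in raw_tags.split(',')}


# 'golang' and 'go' merge into one tag; 'golang-tutorial' is left untouched.
print(sorted(normalize_tags('programming,golang,go,cplusplus,golang-tutorial')))
# -> ['cpp', 'go', 'golang-tutorial', 'programming']
```

Returning a set also means a post tagged with both `go` and `golang` is counted once for that language, which matches the `set(tags)` step in the script.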
I also checked the 20 most popular tags from https://stackoverflow.com/tags to be able to compare the results.
Rank | Tag | Occurrence |
---|---|---|
1 | javascript | 1553961 |
2 | java | 1370330 |
3 | c# | 1177662 |
4 | php | 1165762 |
5 | python | 891863 |
6 | c++ | 553889 |
7 | mysql | 504295 |
8 | objective-c | 282490 |
9 | c | 270118 |
10 | r | 221396 |
11 | ruby | 191670 |
12 | swift | 180026 |
13 | vb.net | 115839 |
14 | bash | 95125 |
15 | vba | 93824 |
16 | postgresql | 80043 |
17 | matlab | 76946 |
18 | scala | 75314 |
19 | perl | 58632 |
20 | delphi | 40852 |
Languages such as python, java, javascript, and php occupy top positions in both tables. csharp ranks relatively low in the Steem table: 7th, compared to 3rd on Stack Overflow.
Below is the list of languages that appear only in the Steem ranking:
go
solidity
rust
kotlin
scratch
assembly
elixir
lua
And the languages that appear only in the Stack Overflow ranking:
objective-c
vb.net
vba
postgresql
matlab
scala
perl
delphi
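The two lists above can be reproduced from the ranking tables with a short comparison. This is only a sketch: the tag lists are copied from the tables, and the `SO_TO_STEEM` spelling map is an assumption added so that `c#`/`csharp` and `c++`/`cpp` compare as the same language.

```python
# Top-20 tags copied from the two ranking tables above.
steem_top = [
    'python', 'java', 'javascript', 'php', 'c', 'cpp', 'csharp', 'go',
    'solidity', 'ruby', 'rust', 'kotlin', 'r', 'scratch', 'mysql',
    'assembly', 'elixir', 'swift', 'lua', 'bash',
]
stackoverflow_top = [
    'javascript', 'java', 'c#', 'php', 'python', 'c++', 'mysql',
    'objective-c', 'c', 'r', 'ruby', 'swift', 'vb.net', 'bash', 'vba',
    'postgresql', 'matlab', 'scala', 'perl', 'delphi',
]

# Assumed spelling map so the same language compares equal in both lists.
SO_TO_STEEM = {'c#': 'csharp', 'c++': 'cpp'}
so_normalized = [SO_TO_STEEM.get(tag, tag) for tag in stackoverflow_top]

# List comprehensions (rather than set difference) keep the ranking order.
only_steem = [tag for tag in steem_top if tag not in so_normalized]
only_so = [tag for tag in so_normalized if tag not in steem_top]

print(only_steem)
print(only_so)
```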
The first factor that can explain these differences is that the Steem blockchain has been running for a much shorter time, and some of the languages were popular some time ago: objective-c, which is being replaced by swift; perl, which is being replaced by, for example, python; and delphi, which is now very rarely used. The Steem ranking also does not include either language from the Visual Basic family: Visual Basic .NET (vb.net) and Visual Basic for Applications (vba). The low popularity of the postgresql tag is probably because the most popular Steem blockchain databases use other solutions: SteemSQL uses mssql, sbds uses mysql, and SteemData uses mongodb. Personally, I am surprised by the low popularity of the scala tag on Steem.
The reason why rust, kotlin, and elixir rank quite high on Steem compared to Stack Overflow is that they are relatively new technologies. Scratch is a visual programming language aimed mainly at children and young people, so it's not surprising that it's not popular on Stack Overflow, which is designed for professional programmers. Solidity is a programming language for the Ethereum platform, hence its high position in the Steem ranking.
Other languages used by platforms similar to Ethereum:
- Lisk: javascript
- Cardano: haskell
- NEO: csharp, vb.net, fsharp, java, kotlin, python
- EOS: wren
- Stratis: csharp
Let's also look at the popularity of tags in a pie chart.
Development over time
The following chart shows the number of posts per month for each of the 10 most popular programming languages.
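A minimal sketch of how such monthly counts could be derived from the same tab-separated file, assuming the first column is the creation date in `YYYY-MM-DD` form and the fourth column holds the tags. The `monthly_tag_counts` helper and the sample rows are made up for illustration; they are not the script used to produce the chart.

```python
from collections import Counter, defaultdict


def monthly_tag_counts(lines, languages):
    """Count posts per (month, language) from tab-separated rows.

    Each row is assumed to be: date (YYYY-MM-DD), url, payout, tags.
    """
    counts = defaultdict(Counter)
    for line in lines:
        columns = line.rstrip('\n').split('\t')
        month = columns[0][:7]  # keep only 'YYYY-MM'
        tags = set(columns[3].split(','))
        # Only count the languages we are interested in.
        for tag in tags & languages:
            counts[month][tag] += 1
    return counts


# Made-up sample rows in the same layout as programming.tsv.
sample = [
    '2017-06-30\t/programming/@a/post-1\t5.9480\tprogramming,python,tutorial',
    '2017-06-25\t/code/@b/post-2\t0.0000\tcode,programming,python,csharp',
    '2017-07-01\t/programming/@c/post-3\t0.0560\tprogramming,java',
]
counts = monthly_tag_counts(sample, {'python', 'java', 'csharp'})
for month in sorted(counts):
    print(month, dict(counts[month]))
```

Because `YYYY-MM` strings sort chronologically, `sorted(counts)` gives the months in the order needed for a time axis, with one series per language.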
Which tags will grow fastest? I think it will be the top 4: python, javascript, java, and php, because they are generally very popular and are used by Steem-related libraries.
I think that solidity can also count on growth due to the development of the Ethereum platform.
Below is the chart for the remaining tags. The data was split into two charts to improve readability. Among the languages in the second ten, the greatest growth potential belongs to the relatively new ones, i.e. rust, kotlin, and elixir, and probably r because of the popularity of machine learning.
Average post payout
The average payout varies considerably by tag. More popular tags generate higher payouts, probably because they reach a wider audience. Note that the result for less common tags may be biased because it was computed from a small sample.
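The small-sample caveat can be made explicit by reporting the number of posts next to each average. The following is a sketch under the assumption that the third column of the file is the payout; the `average_payout` helper, its `min_posts` parameter, and the sample rows are hypothetical and not taken from the original analysis.

```python
from collections import defaultdict


def average_payout(lines, min_posts=1):
    """Return {tag: (mean payout, number of posts)} from tab-separated rows.

    Each row is assumed to be: date, url, payout, tags.
    Tags with fewer than `min_posts` posts are dropped, since their
    averages rest on too few samples to be reliable.
    """
    totals = defaultdict(float)
    posts = defaultdict(int)
    for line in lines:
        columns = line.rstrip('\n').split('\t')
        payout = float(columns[2])
        for tag in set(columns[3].split(',')):
            totals[tag] += payout
            posts[tag] += 1
    return {tag: (totals[tag] / posts[tag], posts[tag])
            for tag in totals if posts[tag] >= min_posts}


# Made-up sample rows in the same layout as programming.tsv.
sample = [
    '2017-06-30\t/x/@a/p1\t6.0\tprogramming,python',
    '2017-06-25\t/x/@b/p2\t2.0\tprogramming,python',
    '2017-07-01\t/x/@c/p3\t1.0\tprogramming,rust',
]
result = average_payout(sample, min_posts=2)
for tag, (mean, n) in sorted(result.items()):
    print(tag, round(mean, 2), n)
```

Here `rust` is dropped because it appears in only one sample post; the right threshold for real data is a judgment call.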
Posted on Utopian.io - Rewarding Open Source Contributors
info about wren is out of date
EOS: WebAssembly
Your contribution cannot be approved because it is not as informative as other contributions. See the Utopian Rules. Contributions need to be informative and descriptive in order to help readers and developers understand them.
Dear @jacek-w, great to see a new contributor to utopian analysis posts! But I'm sorry, your contribution is not as detailed as other contributions. The "analysis" aspect is a bit on the short end, you mainly present data visualizations without explaining, interpreting or extrapolating the results.
You can contact us on Discord.
[utopian-moderator]
Ok I understand. I've updated my contribution.
Hi @jacek-w, I'm sorry, it is not foreseen to have several review iterations.
There is a surprisingly high share of Java in the programming languages group... The shares of JS and Python don't bother me: their use cases and high popularity make them shine. But why Java?