Cryptocurrency: Growth Prediction Model

in #startup, 7 years ago

We live in a sea of cryptocurrencies, and sometimes it feels like there are 100 new ones every day, which makes it seem impossible to stay ahead. Fortunately, there are some truly fantastic researchers in this space:

ICODrops https://icodrops.com/

WalrusCap https://docs.google.com/spreadsheets/d/1wJ-g_lXTBgOZ9HDALphjmTJv8DB3-hCCTvX3OtN5b3w/edit#gid=1917438206

Crypto Projects https://docs.google.com/spreadsheets/d/1hdpKwgJRdvNIslVvN91tEcsoHtUgShpHZmxdlxf_FTo/htmlview#

Lendex0 https://docs.google.com/spreadsheets/d/e/2PACX-1vScEz0nAEoBr9a7aJ5qKEsdIzORuycAzWjDpSwCa-jHnNLpyXyrzwv2_1l69gfLEutLIY4XVLmQklMV/pubhtml#

Mandy’s https://docs.google.com/spreadsheets/d/1MjhiUslFV9bnKSWG7ElCA11B2xxFEbJRzMZ7wmhJrTE/edit#gid=303074765

DiddyCarterICO https://docs.google.com/spreadsheets/d/16GSqCaJwGtQs68w6hep5iKQ8LtGFrEqllDu4ruaVVIU/edit#gid=1587433441

TheGobOne https://docs.google.com/spreadsheets/d/1N8mI7JNIl1ZpAFebr0EkURoV0w1UQbDvB-FZ2qdISgs/htmlview?sle=true#gid=526711915

Ian Balina https://docs.google.com/spreadsheets/d/1qvCCS6lwEH9nOa8KwQGTVhtQ3VXPzed3rXUqksDQkT0/edit#gid=228882474

MoNoICO https://docs.google.com/spreadsheets/d/1xVD9itOY_elQK6Gmknk_GD6siF4IQAxL1JOxC3UBslc/edit#gid=0

NN https://docs.google.com/spreadsheets/d/11EuOOMRePHVTls1UYjd_iakVrgOoIVg0annNp2oB30E/edit#gid=0

MyICO https://docs.google.com/spreadsheets/d/1sJ36185NirHnbr6q_LGTuZwt09Jaov5SOBGrB8mBd5Y/edit#gid=1544587875

CryptoMoon https://docs.google.com/spreadsheets/d/1js-N4uFteHPAYMAZJRPajDhOkhVCE-iwHdPnxPtuftU/edit#gid=877350162

Top 7 ICO https://docs.google.com/spreadsheets/d/12bJmLfCf02VIrBg4DPrTeR_hqGE6V5dorgMglPUOwSA/edit#gid=0

The Kript Keeper https://docs.google.com/spreadsheets/d/1F60DrtAbBlBE_NHqbJNnHKN2cP830dB9sf3Obbj0n5k/edit#gid=587531212

CryptoPros https://docs.google.com/spreadsheets/d/e/2PACX-1vSyCDYkbttPU1Cjw8vsOHzFXKhfbZEPfT53arhVPssYRPqssFkBo9bMxP5mGs8SoVSFvuMRA0LjAzIT/pubhtml?gid=680834704&single=true

But as you can see, that is still an insane amount of data to work through, and each researcher has their own bias towards a specific sector or product. Some dislike lending, others AI, and it’s difficult to form a neutral opinion.

So based on this I decided to create a neutral prediction model, one that could at least help me filter these projects and see which ones I should investigate in detail, specifically which ones to do a code review for.

Something that I can give raw data and that gives me clean output. So first off, I want to say thank you to all of the above for their incredible research and for making it publicly available. I used all of your research as the base for my training data.

So the setup was nice and easy: collect the aggregated data from all of them. But now, what metric do you use to define success? Unfortunately there is only one metric right now, and that’s ROI (Return on Investment), since the rest can’t really be quantified yet. So ROI was the output we wanted the model to predict, and we set up our training data accordingly (past ICOs with their metrics and their current ROI performance). 50% went into training, 50% went into testing.
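The 50/50 split described above could look roughly like this. It is a minimal sketch, not the code I actually ran: the record layout (`name`, `roi`) and the helper name `split_half` are made up for illustration.

```python
import random


def split_half(rows, seed=0):
    """Shuffle rows reproducibly and split them 50/50 into train and test."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    mid = len(shuffled) // 2
    return shuffled[:mid], shuffled[mid:]


# Hypothetical records: one dict per past ICO, with its metrics and ROI.
icos = [{"name": f"ico_{i}", "roi": i * 0.1} for i in range(10)]
train, test = split_half(icos)
```

Shuffling before splitting matters here, because the spreadsheets are usually sorted by date or rating and a straight cut down the middle would give the two halves very different projects.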

As anyone who has trained models knows, there was a lot of back and forth here. In my very first batch I accidentally left the ROI % in as one of the inputs and got a 90%+ confidence model. I was super excited, then I realized it was using ROI to predict ROI, so needless to say, a very newbie move on my part.

But after a few iterations I arrived at something consistent that I liked. It used the following data:
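One way to guard against that mistake is to strip the target column from the inputs in a single place and assert that it is gone. This is a small sketch under assumptions: the column name `roi_pct` and the row fields are hypothetical.

```python
TARGET = "roi_pct"  # hypothetical name for the ROI % column


def split_features_and_target(row, target=TARGET):
    """Return (features, label) with the target column removed from the features."""
    features = {key: value for key, value in row.items() if key != target}
    return features, row[target]


# Hypothetical training row; the field names are illustrative only.
row = {"twitter_followers": 12000, "kyc": True, "roi_pct": 140.0}
features, label = split_features_and_target(row)
assert TARGET not in features  # guard against using ROI to predict ROI
```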

Social activity (numeric inputs)

LinkedIn followers
Subreddit followers, posts per week
Twitter followers, tweets per week
Telegram followers
Alexa rank
Google Trends rank

Development Stage (One of)

None, Whitepaper, Mockup, Proof of Concept, Alpha, Beta, Production

GitHub (numeric inputs)

Number of repos
Commits in most active repo
Contributors in most active repo
Forks for most active repo
Watchers for most active repo

Category (One of)

(It’s a long list)
MarketCap of biggest competitor in this field

Purchasable by (One of)

ETH, BTC, NEO, STRAT, Other

Token (Actual values)

Price (calculable from Raised or MarketCap/Supply)
Total
Supply/Circulating
MarketCap/Raised

Maximum contribution capped (true/false)

KYC (true/false)

Website (true/false)

Own Token (true/false)

Own Wallet (true/false)

Team (C level and technical team members only)

Per member; highest skill recommendation amount, total recommendations

Advisors (all)

Per member; highest skill recommendation amount, total recommendations

Idea (High, Medium, Low)* This one still has a slight bias: the answers can vary wildly depending on the knowledge of whoever captures the data, and I have yet to find another way to quantify it.

Sector disruption
Adoption potential
Market Saturation
Competitors
Innovation
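To make the input types above concrete, here is a rough sketch of how one project record might be flattened into numbers. The field names, the 0/1/2 mapping for Low/Medium/High, and the choice of one-hot encoding for the "one of" fields are all assumptions for illustration, and only a handful of the inputs are shown.

```python
DEV_STAGES = ["None", "Whitepaper", "Mockup", "Proof of Concept",
              "Alpha", "Beta", "Production"]
PURCHASABLE_BY = ["ETH", "BTC", "NEO", "STRAT", "Other"]
IDEA_LEVELS = {"Low": 0.0, "Medium": 1.0, "High": 2.0}  # assumed mapping


def one_hot(value, choices):
    """Encode a 'one of' field as a list with a single 1.0."""
    if value not in choices:
        raise ValueError(f"unknown value: {value!r}")
    return [1.0 if value == choice else 0.0 for choice in choices]


def token_price(market_cap, supply):
    """Derive the token price as MarketCap / Supply."""
    return market_cap / supply


def encode(project):
    """Flatten one project record into a numeric feature vector."""
    vector = [float(project["twitter_followers"])]      # numeric input
    vector += one_hot(project["dev_stage"], DEV_STAGES)  # one of
    vector += one_hot(project["purchasable_by"], PURCHASABLE_BY)
    vector.append(1.0 if project["kyc"] else 0.0)        # true/false
    vector.append(IDEA_LEVELS[project["innovation"]])    # High/Medium/Low
    vector.append(token_price(project["market_cap"], project["supply"]))
    return vector


# Hypothetical project record.
example = {
    "twitter_followers": 1200,
    "dev_stage": "Alpha",
    "purchasable_by": "NEO",
    "kyc": True,
    "innovation": "High",
    "market_cap": 5_000_000,
    "supply": 10_000_000,
}
vector = encode(example)
```

The long Category list would be handled the same way as the other "one of" fields, just with more choices.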

The model was great: anything above 80% confidence warranted further investigation. My problem, however, was that researching this data every single time is incredibly tedious and time consuming given the number of cryptos and ICOs out there, so I needed something better.
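The 80% cut-off is just a filter over the model output. A minimal sketch, assuming the model returns a confidence between 0 and 1 per project (the names and scores below are made up):

```python
CONFIDENCE_THRESHOLD = 0.80


def shortlist(scores, threshold=CONFIDENCE_THRESHOLD):
    """Return project names scoring above the threshold, highest first."""
    selected = [name for name, score in scores.items() if score > threshold]
    return sorted(selected, key=lambda name: scores[name], reverse=True)


# Hypothetical model output: project name -> confidence.
scores = {"alpha_coin": 0.91, "beta_token": 0.62, "gamma_chain": 0.84}
to_review = shortlist(scores)
```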

Step 1: Website scraping

So the start was to scrape the project’s main website. The inconsistency between different websites made this frustrating at best; however, with fair consistency I could at least extract the external links, for example Telegram, Twitter, Reddit, GitHub, and LinkedIn. So this was a good start.
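The link extraction can be sketched with nothing but the standard library. This is an illustration rather than my actual scraper: the domain table and the names `SocialLinkParser` and `extract_social_links` are assumptions, and it only keeps the first link it finds per network.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse

# Assumed mapping from link host to network name.
SOCIAL_DOMAINS = {
    "t.me": "telegram",
    "telegram.me": "telegram",
    "twitter.com": "twitter",
    "reddit.com": "reddit",
    "github.com": "github",
    "linkedin.com": "linkedin",
}


class SocialLinkParser(HTMLParser):
    """Collect the first link found for each known social network."""

    def __init__(self):
        super().__init__()
        self.links = {}

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        host = (urlparse(href).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        network = SOCIAL_DOMAINS.get(host)
        if network and network not in self.links:
            self.links[network] = href


def extract_social_links(html):
    """Return {network: url} for the social links in an HTML page."""
    parser = SocialLinkParser()
    parser.feed(html)
    return parser.links


sample_html = """
<a href="/about">About</a>
<a href="https://t.me/examplecoin">Telegram</a>
<a href="https://www.twitter.com/examplecoin">Twitter</a>
<a href="https://github.com/examplecoin">GitHub</a>
"""
links = extract_social_links(sample_html)
```

Matching on the link host rather than on page layout is what makes this reasonably consistent across otherwise very different websites.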

Step 2: Token

The primary source here for ICOs is ICODrops. They collect the data in the most efficient way, so I could pull it by scraping their site. However, I don’t just want to focus on ICOs; I also want to apply my model to existing tokens, so next up was pulling from CoinMarketCap. Very straightforward, no scraping required, they have a fantastic API, as long as I had the symbol. But often you can’t find the symbol, so I could still scrape the actual site, do a search, grab the symbol and then do the lookup. At this point, I figured just adding the symbol wasn’t that much effort, so I left the symbol as an input.
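The symbol fallback boils down to a lookup by symbol or by name. The sketch below works on a locally held listing; the listing shape (`name`, `symbol`) and the function `resolve_symbol` are hypothetical and do not reflect the actual CoinMarketCap API response.

```python
def resolve_symbol(query, listing):
    """Return the ticker symbol for a symbol or project name, else None."""
    wanted = query.strip().lower()
    for coin in listing:
        if wanted in (coin["symbol"].lower(), coin["name"].lower()):
            return coin["symbol"]
    return None


# Hypothetical listing, e.g. cached from an earlier lookup.
listing = [
    {"name": "Bitcoin", "symbol": "BTC"},
    {"name": "Ethereum", "symbol": "ETH"},
    {"name": "Stratis", "symbol": "STRAT"},
]
symbol = resolve_symbol("stratis", listing)
```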

Step 3: Social

Twitter has an easy API for getting the followers, Reddit has an easy API for getting the subscribers, and Telegram has getChatMembersCount. It needs a chat_id, but it turns out this is simply @channelusername, so that’s not a problem.

LinkedIn… LinkedIn does not like developers. There is no API that gives this data; to get member data, the members have to authenticate via your app. Scraping is against their terms of service, and even if it wasn’t, you would have to scrape with an account. OK, that’s annoying, but I guess LinkedIn will remain manual input for now. Thanks, LinkedIn.
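As a rough sketch of the Telegram and Reddit lookups: the helpers below only build the request URLs and parse the response bodies, so the HTTP call itself is left to whatever client you prefer. The bot token is a placeholder, and the Reddit `about.json` URL and its `data.subscribers` field are written from memory and should be checked against the current docs. Newer versions of the Bot API also call this method getChatMemberCount.

```python
import json
from urllib.parse import quote, urlencode


def telegram_member_count_url(bot_token, channel_username):
    """Build the Bot API URL for getChatMembersCount on a public channel."""
    chat_id = "@" + channel_username.lstrip("@")
    return (f"https://api.telegram.org/bot{bot_token}/getChatMembersCount?"
            + urlencode({"chat_id": chat_id}))


def parse_telegram_member_count(body):
    """Extract the member count from a Bot API response body (JSON text)."""
    payload = json.loads(body)
    if not payload.get("ok"):
        raise ValueError(payload.get("description", "Telegram request failed"))
    return int(payload["result"])


def reddit_about_url(subreddit):
    """Build the assumed public 'about' URL for a subreddit."""
    return f"https://www.reddit.com/r/{quote(subreddit)}/about.json"


def parse_reddit_subscribers(body):
    """Extract the subscriber count from an assumed about.json body."""
    return int(json.loads(body)["data"]["subscribers"])


# "TOKEN" and "examplecoin" are placeholders, not real credentials or channels.
url = telegram_member_count_url("TOKEN", "examplecoin")
members = parse_telegram_member_count('{"ok": true, "result": 4321}')
```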

Step 4: GitHub

Great APIs, easy to get all of the data.
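Once the repository data is fetched, turning it into the GitHub inputs is a small aggregation. The sketch below assumes repo objects shaped like the GitHub REST API's repository listing (`name`, `forks_count`, `watchers_count`) and assumes commit and contributor counts were collected separately per repo. Treating "most active" as "most commits" is also my assumption here.

```python
def summarise_github(repos, commit_counts, contributor_counts):
    """Reduce fetched GitHub data to the numeric inputs for one project."""
    if not repos:
        return {"repo_count": 0}
    # Assumption: the most active repo is the one with the most commits.
    most_active = max(repos, key=lambda repo: commit_counts.get(repo["name"], 0))
    name = most_active["name"]
    return {
        "repo_count": len(repos),
        "most_active_repo": name,
        "commits": commit_counts.get(name, 0),
        "contributors": contributor_counts.get(name, 0),
        "forks": most_active["forks_count"],
        "watchers": most_active["watchers_count"],
    }


# Hypothetical data for an imaginary organisation.
repos = [
    {"name": "core", "forks_count": 40, "watchers_count": 310},
    {"name": "wallet", "forks_count": 12, "watchers_count": 55},
]
commit_counts = {"core": 1800, "wallet": 240}
contributor_counts = {"core": 14, "wallet": 3}
github_inputs = summarise_github(repos, commit_counts, contributor_counts)
```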

Step 5: The rest

Fortunately most of the rest are fairly easy inputs, so I ended up just leaving them as manual inputs. It’s still a bit of data capturing, but it is a lot faster to do.

Next steps: I’m currently building the model into an accessible API and will release it for anyone to use. You can then collect this data yourself, input it into the site, and it will do the heavy lifting for you. I am currently busy with this project and hope to complete it fairly soon.

I will post another article as soon as it is available.
