R programming for quantitative investment: part 2 - fetch crypto market data with API and forecast expected future price
In this post we'll fetch crypto market pricing data for analysis and forecasting. Unlike the previous post on crypto data mining with R, we'll use an API instead of web scraping. We'll use R to both fetch and analyze the data. R data can be exported to almost all major data formats, including those of Excel, SPSS, and SAS, so collecting data this way is helpful even if you use other software or another language for analysis.
For illustration, we'll collect historical (daily) pricing of BTC from CryptoCompare (CC from here on) using their API. You can collect any other crypto data and do much more; see the CC help for available options. There are good reasons for using CC's data, including:
- CC APIs are free to use under a Creative Commons Attribution-NonCommercial 3.0 Unported (CC BY-NC 3.0) license.
- Some great services use the CC pricing API, including EtherChain, EtherScan, Exodus, DAppWallet, nanopool, GasTracker, https://explorer.zcha.in/, https://moon.cryptothis.com/, Ethereum Stats App, and Ethereum Classic Stats.
So we get reliable data for free, as in freedom. But please do not abuse the service: one request every 10 seconds should be more than enough. Please also make sure you credit CC with a link if you use their data on your website or app.
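If you fetch several URLs in one session, a short pause between calls keeps you within that budget. The sketch below is one possible way to do it. The name fetch_politely and its arguments are hypothetical (they are not part of CC's API or of any package), and the fetching function is passed in as an argument so you can plug in jsonlite::fromJSON or anything else.

```r
# Hypothetical helper (not part of any package): apply `fetch` to each URL,
# sleeping `pause_sec` seconds between consecutive requests.
fetch_politely <- function(urls, fetch, pause_sec = 10) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    if (i > 1) Sys.sleep(pause_sec)          # no pause before the first request
    results[i] <- list(fetch(urls[[i]]))     # list() keeps NULL results in place
  }
  names(results) <- urls
  results
}
```

With jsonlite loaded, a call such as fetch_politely(my_urls, fromJSON) would return a list of parsed responses named by URL, where my_urls is your own character vector of request URLs.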
If you haven't already, install R and RStudio, a great open-source R IDE developed by RStudio (the company now known as Posit). The RStudio website has a lot of learning materials to get you started and take you further. Now let's get to the actual work. First, load the required packages: jsonlite for fetching JSON data via CC's API, forecast for time-series forecasting, and ggplot2 for visualizing data:
R> library(jsonlite)
R> library(ggplot2)
R> library(forecast)
Let's do an API request:
R> cc <- fromJSON("https://min-api.cryptocompare.com/")
We'd like to know what our request fetched from CC, i.e., what data the variable cc contains. The first thing we can do is run str(variable_name). str() is a very useful function for examining the data structure of a variable. To learn more about the function, issue ?str in the R console or see here.
R> str(cc)
List of 3
$ Called : chr "/"
$ Message : chr "Min API Options, works with all symbols, for more options see https://www.cryptocompare.com/api/. If you are requesting signed "| __truncated__
$ AvailableCalls:List of 1
..$ Price:List of 13
... ... output truncated
str() returns long output for complex data structures. Here we have a list, which can be a confusing data structure, especially for beginners. See more about accessing elements of a list in this excellent Stack Overflow post.
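As a quick illustration of taming str() and reaching into a nested list, here is a minimal sketch on a small hand-made list. The list demo only imitates the shape of part of the response shown above; its values are made up.

```r
# A small hand-made list shaped like part of the API response (values are made up).
demo <- list(
  Called = "/",
  AvailableCalls = list(
    Price = list(
      HistoDay = list(Info = list(Examples = c("example-url-1", "example-url-2")))
    )
  )
)

str(demo, max.level = 1)      # show only the top level of the structure
names(demo$AvailableCalls)    # element names one level down

# Three equivalent ways to reach a nested element:
a <- demo$AvailableCalls$Price$HistoDay$Info$Examples
b <- demo[["AvailableCalls"]][["Price"]][["HistoDay"]][["Info"]][["Examples"]]
c_path <- demo[[c("AvailableCalls", "Price", "HistoDay", "Info", "Examples")]]  # recursive indexing

# Single brackets keep the list wrapper; double brackets extract the element itself.
class(demo["Called"])     # "list"
class(demo[["Called"]])   # "character"
```

The max.level argument of str() is handy when the full output runs to many screens: start at level 1, then raise it or drill into one branch at a time.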
Studying the output of str(), I navigate to cc$AvailableCalls$Price$HistoDay$Info$Examples, where I can see API request examples.
R> cc$AvailableCalls$Price$HistoDay$Info$Examples
[1] "https://min-api.cryptocompare.com/data/histoday?fsym=BTC&tsym=USD&limit=30&aggregate=3&e=CCCAGG"
[2] "https://min-api.cryptocompare.com/data/histoday?fsym=ETH&tsym=USD&limit=30&aggregate=3&e=Kraken&extraParams=your_app_name"
... ... output truncated
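The example URLs all follow the same pattern: a base address followed by key=value pairs joined with "&". If you plan to request several symbols, a small function can assemble the URL for you. This is only a sketch: build_histoday_url is a hypothetical name, and the parameter names (fsym, tsym, limit, aggregate, e) are simply the ones visible in the examples above.

```r
# Hypothetical helper: assemble a histoday request URL from named parameters.
build_histoday_url <- function(fsym, tsym, ...) {
  params <- list(fsym = fsym, tsym = tsym, ...)
  # URL-encode each value, then join as key=value pairs separated by "&"
  values <- vapply(
    params,
    function(v) utils::URLencode(as.character(v), reserved = TRUE),
    character(1)
  )
  query <- paste0(names(params), "=", values, collapse = "&")
  paste0("https://min-api.cryptocompare.com/data/histoday?", query)
}

url <- build_histoday_url("BTC", "USD", limit = 30, aggregate = 3, e = "CCCAGG")
print(url)
```

The resulting string can be passed straight to fromJSON().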
Collect BTC historical (daily) data:
R> cc_histoday_btc <- fromJSON("https://min-api.cryptocompare.com/data/histoday?fsym=BTC&tsym=USD&allData=true&e=CCCAGG")
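Network requests can fail, and a failed fromJSON() call stops the script with an error. One option is to wrap the call in tryCatch() so that a failure yields NULL instead. The wrapper name safe_from_json is hypothetical, and the example uses literal JSON text with made-up values rather than a live request, so it runs offline.

```r
library(jsonlite)

# Hypothetical wrapper: return the parsed response, or NULL if fetching or parsing fails.
safe_from_json <- function(src) {
  tryCatch(
    fromJSON(src),
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Offline demonstration with literal JSON text instead of a URL (values are made up)
ok  <- safe_from_json('{"Data": [{"time": 1499990400, "close": 1}]}')
bad <- safe_from_json('{"Data": [')   # malformed on purpose

is.null(bad)     # TRUE: the failure was caught
class(ok$Data)   # "data.frame": an array of records becomes a data frame
```

In a real session you would pass the histoday URL to safe_from_json() and check for NULL before going on.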
Run str(cc_histoday_btc) to display the structure of the fetched data. We now create a time series object from the fetched data and store it in the variable btc_ts. You can choose whatever name you please, subject to R's naming rules. Run ?ts to learn more about time series objects.
R> btc_ts <- ts(cc_histoday_btc$Data$close, start = cc_histoday_btc$Data$time[1])
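One thing to keep in mind about this call: the time column appears to hold Unix timestamps (seconds since 1970-01-01 UTC), while ts() with its default frequency advances the time index by 1 per observation. The time axis of btc_ts is therefore not in calendar units, although the fitted model is unaffected. If you want real dates for tables or plots, you can convert the timestamps separately. The sketch below uses a made-up three-row data frame named histoday as a stand-in for cc_histoday_btc$Data.

```r
# Made-up sample standing in for cc_histoday_btc$Data (real data has more rows and columns).
histoday <- data.frame(
  time  = c(1499990400, 1500076800, 1500163200),  # Unix timestamps in seconds
  close = c(100, 105, 103)                         # illustrative values only
)

# Convert seconds since the epoch to calendar dates in UTC
histoday$date <- as.Date(as.POSIXct(histoday$time, origin = "1970-01-01", tz = "UTC"))
print(histoday)
```

The date column can then serve as the x-axis in ggplot2 or base graphics.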
Now we fit an ARIMA model using auto.arima() and forecast BTC's price for the next 50 days.
R> fit_arima <- auto.arima(btc_ts)
R> autoplot(forecast(fit_arima, 50))
Looks like BTC price is only going up! Let's see how accurate our model is with the accuracy(fit_arima) command:
R> accuracy(fit_arima)
                    ME     RMSE      MAE        MPE     MAPE      MASE        ACF1
Training set 0.5483334 27.41288 9.607521 0.02666922 3.533025 0.9876943 0.002047275
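None of the error measures appears too large: the MAPE, for instance, is about 3.5%. Note that these are training-set (in-sample) errors, and a MASE close to 1 means the fit is roughly on par with a naive forecast that simply repeats the previous day's price. Without complicated parameters or code, we produced some neat and useful results. We'll learn more soon!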
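Because accuracy(fit_arima) is computed on the same data the model was fitted to, a common complement is to hold out the most recent observations, fit on the rest, and compare the forecast with the held-out values. The sketch below shows the idea on a simulated random walk, not on real BTC prices. The split point (the last 50 observations) is an arbitrary choice for illustration.

```r
library(forecast)

set.seed(42)
# Simulated random walk standing in for a price series (not real market data)
y <- ts(100 + cumsum(rnorm(300)))

# Hold out the last 50 observations for testing
train <- window(y, end = 250)
test  <- window(y, start = 251)

fit <- auto.arima(train)
fc  <- forecast(fit, h = length(test))

# Rows: "Training set" (in-sample) and "Test set" (out-of-sample)
acc <- accuracy(fc, test)
print(acc[, c("RMSE", "MAE", "MAPE", "MASE")])
```

To try this on the BTC series, replace y with btc_ts and pick the split point from its own time index. Test-set errors are usually the more honest guide to how a 50-day forecast might behave.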