DEXBot Technical Design Document
See here: https://github.com/Codaone/DEXBot/wiki/DEXBot-Technical-Design-Document
See Googledoc here: https://docs.google.com/document/d/1IXnhskQfVWMD2w-hryom5D72RLWXEDDVmS1kwrrsduc/edit?usp=sharing
Purpose
Refactoring the DEXBot code architecture involves three areas:
- Architecture Refactor: future code base maintainability
- Extensibility: additional strategies and frameworks
- Speed Optimization: pybitshares, I/O-bound vs CPU-bound code
Architecture Refactor
Prior to refactoring, we analyzed the existing code to assess the current state of the project. The code reviewed in this document is the existing Strategies Infrastructure as of DEXBot release 0.10.1. The tools in this document can be used on any part of the project, and we strongly encourage using them regularly to assess what code can be improved and to fix potential “red flag” areas.
In the diagram, we see that StrategyBase is one large, bloated base strategy class that does all of the following:
- price query
- order management (including some asset conversions)
- data logging to db
- account management
- state machine
- GUI slider/profit update
For DEXBot WP2, we aim to address the following planned features:
Async queries to nodes and orderbook
CEX/DEX Arbitrage
Strategy Performance tracker
Custom Strategy Plugin infrastructure
Single Account multiple workers
Any future trading engine plugin not yet accounted for, e.g. AI/Machine Learning Engine
To deliver the features above, the current code base must be refactored into an architecture that can meet future demands. In this investigation, we turn to existing design patterns. Here is a conceptual layout of an industrial-grade algorithmic trading system:
The reporting layer (data logger), Data-source (price feed), and order processing layer are currently embedded in the StrategyBase layer class. These features should be split out into separate classes to manage. A class should have only one responsibility.
The StrategyBase and other strategies should be restricted to the intelligence layer, with some support for data pre-processing. Price queries should be extracted from StrategyBase into a Price Query Engine, and the order processing layer into an Order Engine that permits order placement and status queries on both the BitShares DEX and CEX layers.
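The split described above can be sketched as follows. This is a minimal illustration under our own assumptions, not DEXBot code: `PriceQueryEngine`, `OrderEngine`, `InMemoryPriceEngine`, `RecordingOrderEngine`, and `SimpleSpreadStrategy` are hypothetical names, and the spread rule exists only to give the strategy something to decide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Protocol, Tuple


class PriceQueryEngine(Protocol):
    """Hypothetical interface: anything that can report a center price."""

    def get_center_price(self, market: str) -> float: ...


class OrderEngine(Protocol):
    """Hypothetical interface: anything that can place a limit order."""

    def place_order(self, market: str, side: str, amount: float, price: float) -> str: ...


@dataclass
class InMemoryPriceEngine:
    """Stand-in data source; a real engine would query a node or a CEX."""

    prices: Dict[str, float]

    def get_center_price(self, market: str) -> float:
        return self.prices[market]


@dataclass
class RecordingOrderEngine:
    """Stand-in order layer that only records what it was asked to place."""

    placed: List[Tuple[str, str, float, float]] = field(default_factory=list)

    def place_order(self, market: str, side: str, amount: float, price: float) -> str:
        self.placed.append((market, side, amount, price))
        return f"order-{len(self.placed)}"


class SimpleSpreadStrategy:
    """Intelligence layer only: decides prices and delegates everything else."""

    def __init__(self, prices: PriceQueryEngine, orders: OrderEngine, spread: float):
        self.prices = prices
        self.orders = orders
        self.spread = spread

    def tick(self, market: str, amount: float) -> List[str]:
        center = self.prices.get_center_price(market)
        buy_price = center * (1 - self.spread / 2)
        sell_price = center * (1 + self.spread / 2)
        return [
            self.orders.place_order(market, "buy", amount, buy_price),
            self.orders.place_order(market, "sell", amount, sell_price),
        ]
```

Because the strategy only sees the two interfaces, the same strategy code could be pointed at a BitShares engine or a CEX engine without modification.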
Proposed UML structure
For detailed view, please visit:
https://www.lucidchart.com/documents/view/6d5d06c4-81b2-4866-82d3-88788e1e2f6b#
The links on the Lucidchart diagram connect to a Bitbucket copy of the original GitHub code. For ease of viewing, most variables have been removed from classes that need to be refactored. Classes are colored by group: green for the strategies, blue for the Bitshares price and order engines, yellow for the CEX price and order engines, and purple for storage. White classes are either referenced classes or helper utilities.
In this new structure, the StrategyBase is minimal and acts as the framework for other strategies. We should consider making StrategyBase an Abstract Class.
Its responsibilities are delegated as follows:
- Order Engine (BTS and External)
- Bitshares Price Query Engine
- External Price Query Engine (for CEX/DEX arb)
- Node Query Manager
- Async Event Logger/storage DB
- Base Strategy is an interface with minimal account management.
The above proposal is a general guideline for refactoring and is subject to change should we uncover specific needs at the code level.
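To illustrate the abstract-class idea, here is a minimal sketch using Python's `abc` module. `MinimalStrategyBase`, `EchoLikeStrategy`, and `on_market_update` are hypothetical names chosen for this example; they are not the existing DEXBot classes or methods.

```python
from abc import ABC, abstractmethod


class MinimalStrategyBase(ABC):
    """Hypothetical minimal base: holds injected engines and defines the contract."""

    def __init__(self, account_name, price_engine, order_engine):
        # Minimal account management: the base only remembers which account it serves.
        self.account_name = account_name
        # Engines are injected, so the base does no price or order work itself.
        self.price_engine = price_engine
        self.order_engine = order_engine

    @abstractmethod
    def on_market_update(self, market):
        """Each concrete strategy decides what to do on a market update."""


class EchoLikeStrategy(MinimalStrategyBase):
    """Smallest possible concrete strategy: it only reports what it saw."""

    def on_market_update(self, market):
        return f"{self.account_name} saw an update on {market}"
```

Instantiating `MinimalStrategyBase` directly raises `TypeError`, which makes the "framework only" role of the base explicit to new developers.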
Staggered Orders Refactor
Methods related to virtual orders should be pulled into the Virtual Orders class.
The Virtual Orders class should be designed so that any new strategy can use it. For example, a new AI-driven strategy may want to use the virtual orders system.
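A strategy-agnostic Virtual Orders class might look like the sketch below. It assumes a virtual order is an order tracked locally and not yet placed on the exchange; the class names, fields, and the `pop_within` method are hypothetical and not taken from the current Staggered Orders code.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class VirtualOrder:
    side: str      # "buy" or "sell"
    amount: float
    price: float


class VirtualOrders:
    """Hypothetical container for locally tracked orders, usable by any strategy."""

    def __init__(self) -> None:
        self._orders: List[VirtualOrder] = []

    def add(self, side: str, amount: float, price: float) -> VirtualOrder:
        order = VirtualOrder(side, amount, price)
        self._orders.append(order)
        return order

    def pop_within(self, center_price: float, max_distance: float) -> List[VirtualOrder]:
        """Remove and return orders priced within max_distance of center_price.

        max_distance is a fraction of center_price, e.g. 0.05 for 5%.
        """
        near = [
            o for o in self._orders
            if abs(o.price - center_price) / center_price <= max_distance
        ]
        self._orders = [o for o in self._orders if o not in near]
        return near

    def __len__(self) -> int:
        return len(self._orders)
```

A strategy would call `pop_within` on each price update and hand the returned orders to whatever order layer it uses, so the container itself never needs to know about BitShares.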
Additional Refactoring of User Facing Elements
DEXBot already implements the UI layer with MVC for the GUI. The GUI work is sufficiently abstracted and easy to understand and extend. However, the CLI still has room for package-level improvement.
Extensibility for future strategies and frameworks
Improving Readability and New Strategy Development
The base strategy should be simple and readable, like the Echo template. This will enable new developers to easily use the StrategyBase as a template for new strategies.
The current Echo template is a nice sample, but it doesn’t allow a user to test the system directly without any strategy. The StrategyBase should have unit tests that let a new user see how the base is used as a blank template. These do not currently exist, and the base code is confusing for newcomers: at least two new developers have gotten lost in the existing code base.
Any methods that can be used by more than one strategy should be put into utility classes, for example dust order management or price calculation. This way a new developer can pick commonly used methods for their strategy with less confusion.
All Strategies would implement the Base Strategy Interface and not interfere with specialist operations on the Internal or External Price or Order Engines.
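As an illustration of a shared utility, the sketch below shows an order-sorting helper that lives outside any strategy. The name and signature mirror `sort_orders_by_price(orders, sort='DESC')` from the StrategyBase method list; the body, and the assumption that each order exposes a `"price"` key, are ours and not taken from the DEXBot source.

```python
def sort_orders_by_price(orders, sort="DESC"):
    """Return a new list of orders sorted by price.

    Assumption for this sketch: every order is a mapping with a "price" key.
    """
    direction = sort.upper()
    if direction not in ("ASC", "DESC"):
        raise ValueError(f"sort must be 'ASC' or 'DESC', got {sort!r}")
    return sorted(orders, key=lambda order: order["price"], reverse=(direction == "DESC"))
```

Because the function takes its input as an argument and holds no state, any strategy can import it without inheriting from a large base class.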
Speed Optimization
Hold out on adding concurrency until you have a known performance issue and then determine which type of concurrency you need. As Donald Knuth has said, “Premature optimization is the root of all evil (or at least most of it) in programming.”
Making the Right Choice
We need to choose the right concurrency method for DEXBot. Simply adding async will not solve the speed issue if the code is CPU bound.
CPU Bound => Multi Processing
I/O Bound, Fast I/O, Limited Number of Connections => Multi Threading
I/O Bound, Slow I/O, Many connections => Asyncio
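The I/O-bound case can be demonstrated with a small, self-contained sketch. `fake_query` is a hypothetical stand-in that uses `asyncio.sleep` to simulate a slow network call; no real node is contacted and the node names are placeholders.

```python
import asyncio
import time


async def fake_query(node: str, delay: float = 0.05) -> str:
    """Stand-in for a slow network call; asyncio.sleep yields to the event loop."""
    await asyncio.sleep(delay)
    return f"ticker from {node}"


async def query_sequentially(nodes):
    # Each query waits for the previous one to finish.
    return [await fake_query(node) for node in nodes]


async def query_concurrently(nodes):
    # All queries wait at the same time; results keep the input order.
    return await asyncio.gather(*(fake_query(node) for node in nodes))


def timed(coroutine_function, nodes):
    """Run one of the coroutines above and return (results, elapsed seconds)."""
    start = time.perf_counter()
    results = asyncio.run(coroutine_function(nodes))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    nodes = [f"node-{i}" for i in range(5)]
    _, sequential_seconds = timed(query_sequentially, nodes)
    _, concurrent_seconds = timed(query_concurrently, nodes)
    print(f"sequential: {sequential_seconds:.2f} s, concurrent: {concurrent_seconds:.2f} s")
```

With five simulated queries, the sequential version takes roughly the sum of the delays, while the concurrent version takes roughly one delay. The gain exists only because the work is waiting on I/O; CPU-bound work would not speed up this way.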
Another item to consider for the refactor is running the whole application as multiprocessing from the CLI. However, we first need to investigate and demonstrate quantitatively which processes are slowing the application down. See Appendix B.
This document addresses the I/O issues only.
Currently, DEXBot staggered orders are handled as a 100% synchronous multithreaded application using the worker infrastructure class. Here are two items to discuss for speeding up trades in DEXBot:
Node Query Manager
Future of PyBitshares
I propose constructing a Node Query Manager. Currently the end user can select a node, but what if that node goes down? What if it is not the best node for handling rapid trades? There is no option to switch to another node should a node fail.
The Node Query Manager would allow for:
Switching to a new node if the existing one is not responding within a specific time frame.
Multiple configurations for node access:
- multiple nodes
- least latency
- round robin (rotate nodes)
Websocket async calls to nodes. See Appendix B for benchmark tests showing the speed difference between pybitshares and websocket for a simple ticker query.
Currently, DEXBot relies on PyBitshares for all market data and order handling.
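The failover and node-selection behaviour listed above can be sketched synchronously as follows. Everything here is an assumption made for illustration: the class layout, the `query_fn(node, request, timeout)` callable that performs the actual network call, and the `wss://*.example` URLs in the usage note.

```python
import time


class NodeQueryManager:
    """Hypothetical manager that picks a node and fails over on errors."""

    MODES = ("round_robin", "least_latency")

    def __init__(self, nodes, query_fn, timeout=2.0, mode="round_robin"):
        if not nodes:
            raise ValueError("at least one node is required")
        if mode not in self.MODES:
            raise ValueError(f"mode must be one of {self.MODES}")
        self.nodes = list(nodes)
        self.query_fn = query_fn  # assumed signature: query_fn(node, request, timeout)
        self.timeout = timeout
        self.mode = mode
        self.latency = {node: 0.0 for node in self.nodes}  # last observed seconds
        self._next = 0

    def _candidates(self):
        if self.mode == "least_latency":
            # Fastest known node first; failed nodes sort last (latency = inf).
            return sorted(self.nodes, key=self.latency.get)
        # Round robin: start at the rotating index, then wrap around.
        start = self._next
        self._next = (self._next + 1) % len(self.nodes)
        return self.nodes[start:] + self.nodes[:start]

    def query(self, request):
        """Try candidate nodes in order and return (node, result) from the first success."""
        errors = {}
        for node in self._candidates():
            started = time.perf_counter()
            try:
                result = self.query_fn(node, request, self.timeout)
            except (TimeoutError, ConnectionError) as exc:
                self.latency[node] = float("inf")
                errors[node] = exc
                continue
            self.latency[node] = time.perf_counter() - started
            return node, result
        raise ConnectionError(f"all nodes failed: {errors}")
```

For example, `NodeQueryManager(["wss://a.example", "wss://b.example"], query_fn).query("ticker")` would return the answer from the second node if the first one raises `TimeoutError`. An async version would follow the same structure with `await` around the query call.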
Advantages of pybitshares:
Easy to use interface.
Less code: ticker query is only 4 lines long
Disadvantages of pybitshares:
Queries are slow; the entire code base is synchronous.
We depend on the Bitshares development team to fix critical bugs that can impact DEXBot. Most recently, the 0.30.0 upgrade prevented DEXBot from working, and users had to downgrade until pybitshares was fixed.
Websockets - Async already built in
Advantages:
Websocket queries for pricing and orders are faster on the initial node connection. In the Appendix B benchmark, a single price query took 1.03 seconds over a websocket versus 1.87 seconds with pybitshares, about 45% less time.
Some of the code has been already written but needs to be refactored (litepresence)
Asyncio should be restricted to Network IO.
Disadvantages:
We will need to selectively refactor and adopt websocket async for price queries and have better node management
New code to manage and rewrite.
Below are methods that we may want to extract into separate classes. This list is neither comprehensive nor error-free; it serves as a guideline.
general market info -> Bitshares Price Query Engine or External Price Query Engine
GUI slider -> display logic, unless it is strategy logic
filtered market info -> Filter data utility class
order execution -> Order Engine
datalogger
config class (complete)
class StrategyBase(Storage, StateMachine, Events):
def configure(cls, return_base_config=True):
def configure_details(cls, include_default_tabs=True):
def __init__(self,
def _callbackPlaceFillOrders(self, d):
def _cancel_orders(self, orders):
def account_total_value(self, return_asset): # account info
def balance(self, asset, fee_reservation=0):
def calculate_order_data(self, order, amount, price):
def calculate_worker_value(self, unit_of_measure):
def cancel_all_orders(self):
def cancel_orders(self, orders, batch_only=False):
def count_asset(self, order_ids=None, return_asset=False):
def get_allocated_assets(self, order_ids=None, return_asset=False):
def get_market_buy_orders(self, depth=10): # general market info
def get_market_sell_orders(self, depth=10):
def get_highest_market_buy_order(self, orders=None):
def get_highest_own_buy_order(self, orders=None):
def get_lowest_market_sell_order(self, orders=None):
def get_lowest_own_sell_order(self, orders=None):
def get_external_market_center_price(self, external_price_source):
def get_market_center_price(self, base_amount=0, quote_amount=0, suppress_errors=False):
def get_market_buy_price(self, quote_amount=0, base_amount=0, exclude_own_orders=True):
def get_market_orders(self, depth=1, updated=True):
def get_orderbook_orders(self, depth=1):
def get_market_sell_price(self, quote_amount=0, base_amount=0, exclude_own_orders=True):
def get_market_spread(self, quote_amount=0, base_amount=0):
def get_order_cancellation_fee(self, fee_asset):
def get_order_creation_fee(self, fee_asset):
def filter_buy_orders(self, orders, sort=None):
def filter_sell_orders(self, orders, sort=None, invert=True):
def get_own_buy_orders(self, orders=None): # filtered market info
def get_own_sell_orders(self, orders=None):
def get_own_spread(self):
def get_updated_order(self, order_id):
def execute(self):
def is_buy_order(self, order):
def is_current_market(self, base_asset_id, quote_asset_id):
def is_sell_order(self, order):
def pause(self):
def clear_all_worker_data(self):
def place_market_buy_order(self, amount, price, return_none=False, *args, **kwargs):
def place_market_sell_order(self, amount, price, return_none=False, invert=False, *args, **kwargs):
def retry_action(self, action, *args, **kwargs):
def store_profit_estimation_data(self):
def get_profit_estimation_data(self, seconds):
def calc_profit(self):
def write_order_log(self, worker_name, order):
def account(self):
def balances(self):
def base_asset(self):
def quote_asset(self):
def all_own_orders(self, refresh=True): # filtered market info
def get_own_orders(self):
def sort_orders_by_price(orders, sort='DESC'):
def market(self):
def convert_asset(from_value, from_asset, to_asset):
def convert_fee(self, fee_amount, fee_asset):
def get_order(order_id, return_none=True):
def get_updated_limit_order(limit_order):
def purge_all_local_worker_data(worker_name):
def update_gui_slider(self):
def update_gui_profit(self):
Appendix B. Benchmark speed tests websocket(s) vs bitshares
“Python is a great language. But it can be slow compared to other languages for certain types of tasks. If applied appropriately, optimization may reduce program runtime or memory consumption considerably. But this often comes at a price. Optimization can be time consuming and the optimized program may be more complicated. This, in turn, means more maintenance effort. How do you find out if it is worthwhile to optimize your program?.... The answer is measure, measure, measure”
Here are 4 Profiling Tools to measure where the program may spend the most time:
SnakeViz
line_profiler
Pympler
memory_profiler
The websocket vs Bitshares source code, along with benchmark data, can be viewed on GitHub at: https://github.com/thehapax/pricefeed-benchmarks
For a very small price query, the code using pybitshares incurs overhead from additional calls that can slow down a new connection. For connections that are already open, query times to the same node are about even, at 0.2 seconds per query on average.
We compare the websockets, websocket-client, and pybitshares libraries.
NOTE: websocket-client and websockets are not part of the same library or group of maintainers. In our use case, websocket-client is sufficient, and it is also the library pybitshares is based on. However, for back pressure protection, websockets is the better option.
Time Summary for single price query, connection to same node:
Websocket Async : 1.03 seconds
Websockets Async: 1.33 seconds
Bitshares: 1.87 seconds
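To put the summary in relative terms, the arithmetic can be worked as below. The timings are copied from the list above; `percent_time_saved` is a helper name introduced only for this example.

```python
# Timings copied from the summary above (seconds for a single price query).
timings = {
    "Websocket Async": 1.03,
    "Websockets Async": 1.33,
    "Bitshares": 1.87,
}


def percent_time_saved(candidate: float, baseline: float) -> float:
    """Percentage of the baseline time that the candidate avoids."""
    return (baseline - candidate) / baseline * 100


baseline = timings["Bitshares"]
for name, seconds in timings.items():
    saved = percent_time_saved(seconds, baseline)
    print(f"{name}: {seconds:.2f} s, {saved:.1f}% less time than Bitshares")
```

Relative to the pybitshares result, the first websocket timing takes about 45% less time and the second about 29% less. These figures describe a single query that includes connection setup, not steady-state throughput.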
Results from the SnakeViz tool also show that the pybitshares module spends additional time in its own API, adding overhead to opening a new connection. The percall time on the right-hand side shows that a substantial amount of time is spent in the api and bitshares calls, while 0.96 seconds is spent in websocket.py:18 (connect).
One other option to investigate for speeding up async further is uvloop (2-4x speed-up). For more information: https://magic.io/blog/uvloop-blazing-fast-python-networking/ Of note: “with uvloop, it is possible to write Python networking code that can push tens of thousands of requests per second per CPU core. On multicore systems a process pool can be used to scale the performance even further.”
Appendix C. Code complexity analysis
Appendix C. How to set up SnakeViz for DEXBot profiling on Linux/macOS
Build DEXBot with the uml branch from Codaone:
$ git checkout -b uml
$ git branch --set-upstream-to=codaone/uml
$ make clean;make install-user
Install snakeviz
$ pip install snakeviz
Run Dexbot as normal from cli or gui:
$ dexbot-cli run
or
$ dexbot-gui
After the program finishes running, there should be new files that appear in the top level directory. From the command line prompt, for gui or cli respectively:
$ snakeviz gui-stats
or
$ snakeviz cli-stats
This starts a local HTTP server and opens the saved profiling statistics in your browser. You will be able to see which parts of the code take the most time and where.
For more details please visit: http://jiffyclub.github.io/snakeviz/#installation
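SnakeViz reads profile files written by Python's profiler. How the uml branch produces `gui-stats` and `cli-stats` is not shown in this document; as a minimal sketch, assuming the standard-library `cProfile` module is used, a stats file can be written and inspected like this. `busy_work`, `profile_to_file`, and the `example-stats` file name are placeholders.

```python
import cProfile
import os
import pstats
import tempfile


def busy_work(n=20000):
    """Placeholder workload standing in for a DEXBot run."""
    return sum(i * i for i in range(n))


def profile_to_file(func, path):
    """Profile one call of func and dump the raw stats to path."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func()
    finally:
        profiler.disable()
    profiler.dump_stats(path)
    return path


stats_path = profile_to_file(
    busy_work, os.path.join(tempfile.gettempdir(), "example-stats")
)

# The same file can be opened with: snakeviz <path to example-stats>
stats = pstats.Stats(stats_path)
stats.sort_stats("cumulative").print_stats(5)
```

The `pstats` summary printed at the end gives a quick text view of the same data that SnakeViz renders graphically.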
References:
SOLID object oriented design principles served as a basis for design.
UML Class and Object Diagrams Overview
Test driven development: What it is and what it is not.
https://stackoverflow.com/questions/582336/how-can-you-profile-a-python-script#582337