The code for this write-up can be found at orderbook.

Introduction

Orderbook is an experimental order-filling engine for an exchange, a place where users can swap commodities, derivatives, currency, and other assets. This project is meant as an exercise in writing tiny libraries, something I’ve been working on to practice system design skills.

I wrote this library to explore the deeper problems in engineering and maintaining an exchange. How they work, what the real bottlenecks are, what theory goes into them, the whole nine yards.

I have never built an exchange, never worked on one, and haven’t studied the source code of one. I’m going in blind, from a software engineering perspective, but they are simple enough in theory that they should be conceivable from scratch with first principles reasoning. While building this library, I read only documentation from Investopedia or similar sites that dealt in general financial concepts related to exchanges rather than studying existing exchange engine implementations.

My idea is to reason about these systems and build them from scratch while keeping an eye on the software engineering behind them. They’re not meant to be perfect, but they should be reasonable and sound. At the end, I’ll discuss some of the strengths and weaknesses of the design and different points about the design that I think should be covered. Think of this as a design document.

Market Orders

Say we want to buy an amount X of $B at price Y but we want it no matter the current market price, even if that price changes. This is called a market order. The difference between a market order and a limit order is that market orders focus on filling as fast as possible even given market price fluctuations, but limit orders focus on the price constraint. This means that limit orders might not fill before the end of the trading day, but market orders most likely will execpt in extreme circumstances.

Market orders are naturally more complex. How would you calculate how far up or down from the market price you would accept in a sell order? And if you want to sell an amount X of asset $B for price Y then you would obviously accept any price above Y since your incentive is to earn as much money. And how do you determine what the market price is?

Your seller Bob of asset $B is willing to sell at a higher price for asset $A that buyer Alice is willing to pay because they placed a market order.

Now we have defined limit and market orders for buy and sell-side orders. This gives us a little matrix we can start to fill out that describes the possible order types we can support.

        |  buy   |  sell  |
--------|--------|--------|
limit   |        |        |
--------|--------|--------|
market  |        |        |
--------------------------

With these primitives we can build more complicated orders. Users could design more complex options trades, for example. This hints at a good place for a deep module: lots of abstraction with a thin interface for interaction. We’ll save that complexity for Part II , though.

Assets

Assets are anything of value in our system that can be traded in some other underlying asset, commonly a currency like USD. Assets are represented in our system with an asset name and its underlying. The underlying is the real financial asset or security that a derivative is based on. This is attached to each order that we create. We can create any arbitrary market between two of our Assets with this scheme. I considered calling this the “opposite”, but settled on underlying. Naming is a hard problem.

// AssetInfo defines the name of an asset and the underlying 
// currency or asset that it is traded across. 
type AssetInfo struct {
	Name	   string
	Underlying string
}
// In use this might look like 
_ = &AssetInfo{Name: "ETH", Underlying: "USD"}

Edit: I spoke with an industry friend of mine after publishing this and learned that most exchanges call this the “quote” or the “quote price” as the value of the asset in the currency it’s being traded. Underlying is usually reserved for specifically derivatives. So in our case, this would be updated to look like

// AssetInfo for exchanging Ethereum for Dollars
type AssetInfo struct {
	Name 	string // ETH
	Quote	string // USD
}

Then we can create a Derivative struct that holds these types of options and still fulfills the Order interface.

type Derivative struct {
	Name 		string
	Underlying 	string
}

Orderbook Structure

In Fig. 0 we created an order type table. This presents a clean way to portray our orders - buy and sell-side. The orderbook currently maintains only one order tree, which has buy and sell orders within it, but I’m considering splitting those buy and sell orders into their own trees. This might come in a later refactor.

Trees are a very useful data structure in computer science. Trees are a type of data structure composed of multiple nodes. There are a lot of different types of trees, but our implementation uses a binary tree. A binary tree has exactly two branches off of each node in the tree, with the root node being the first and only node that has no parents. Leaf nodes are nodes in the tree that have no children. Each node has at most a left and a right branch in a binary tree. Those branches connect to other nodes in the tree. Binary trees are great for fast lookups because with each layer of the tree, you jump closer to your lookup target.

Let’s run an example of this and create a tree that organizes letters alphabetically. Each node in our tree has a “value” of a letter. If you’re looking for the letter B and your tree node starts at E, your first jump would be left of E, since B is before E. So you jump left, and the next node in the tree, C, would have two nodes itself, B and D. You would jump left again, because B comes before C in the alphabet. If you were looking for D, you would jump right instead, since D comes after C in the alphabet. This right or left jumping action is how you traverse the tree. Each jump in our alphabet brings us closer to our target.

binary_lookup-3

Fig. 2. Binary tree lookups.

I chose a tree because ordered lists are too slow to navigate, and a map would require a lot of unnecessary checks based on order placement time and type. Indeed, the tree structure worked quite nicely.

If we used an ordered slice of items, then we would have to jump through hundreds of orders before we might stumble on one that has a price that fits the order’s criteria. The average depth and thus the average hop count of a lookup operation in a binary tree comes out to O(log n), which is significantly faster than iterating through the items in even a relatively small ordered list. Using the above binary tree implementation, we would have drastically fewer jumps to get to the right price point, and once at that price node, we would have a list of orders with the correct market price to fill our order from, allowing us to simply chew through those orders until our fill order is complete.

// Insert will add an Order to the tree.
func (t *TreeNode) Insert(o Order) error {
	if t == nil {
		t = &TreeNode{val: o.Price()}
	}
    // store the order in our current node's list if 
    // the order's price matches the node's price.
	if t.val == o.Price() {
		// when we find a price match for the order,
		// insert the order into this node's order list.
		if t.orders == nil {
			t.orders = make([]Order, 0)
		}
		t.orders = append(t.orders, o)
		return nil
	}
    // navigate left if the order's price is less 
    // than the current node's price.
	if o.Price() t.val {
		if t.left == nil {
			t.left = &TreeNode{val: o.Price()}
			return nil
		}
		return t.left.Insert(o)
	}
    // navigate right if the price is greater
    // than the current node's price
	if o.Price() > t.val {
		if t.right == nil {
			t.right = &TreeNode{val: o.Price()}
			return nil
		}
		return t.right.Insert(o)
	}
	return nil
}

This inserts an order into our tree by navigating to the tree node matching the order’s market price, adds it to the end of that node’s order list, and returns. It will create a price node if no previous order has existed at that price.

We always append orders to the end (nth position) of the node’s order list, and always fill them from the start (0th position), which gives us a natural first-in-first-out queue of orders in our market that is only possible because we organize orders by price in this tree scheme. That order-preservation quality was lost when I attempted map or slice-based solutions.

Order Matching

When attempting to use the slice and map iterations to solve this problem it was a struggle. However, when I tried it with the tree implementation, the problems I was running into previously sort of solved themselves. This was a sign that the tree implementation was doing a lot of the heavy lifting for me, generally a good indicator that one has chosen the right data structure for the job.

This really manifested with order matching, one of the key problems to solve in any exchange. I applied the callback pattern in the Match function and it worked beautifully. It abstracts away the asynchronous tree traversal easily, while allowing us to immediately exit with a match. I considered a channel pattern here but it was unnecessarily complicated in practice.

// Match will iterate through the tree based on the price of the
// fillOrder and finds a bookOrder that matches its price.
func (t *TreeNode) Match(fillOrder Order, cb func(bookOrder Order)) {
	if t == nil {
		cb(nil)
		return
	}

	if fillOrder.Price() == t.val {
		// callback with first order in the list
		bookOrder := t.orders[0]
		cb(bookOrder)
		return
	}

	if fillOrder.Price() > t.val {
		if t.right != nil {
			t.right.Match(fillOrder, cb)
			return
		}
	}

	if fillOrder.Price() < t.val {
		if t.left != nil {
			t.left.Match(fillOrder, cb)
			return
		}
	}
}

It feels natural to use since you have the two matching orders next to each other on the same line, occupying a close context on the screen and in mind.

fillOrder := &MarketOrder{MarketPrice: 15.0}
root.Match(fillOrder, func(bookOrder Order) {
    log.Printf("fill order matched to book order: %v", bo)
})

Order Filling Strategies

As mentioned in Fig 0. we have different order filling strategies. I started with the simplest - limit orders. Like we defined in the First Principles section, limit orders can fill for any price up to its threshold for buying, and as low as its threshold for selling. The problem is, we might exhaust all of the orders in our orderbook that qualify for the specific price. This means a limit order might sit open in our books for an undetermined length of time. We’ll likely need some sort of expiration feature in the future, but - surprise, surprise! - that will come in Part II.

This is the approach to a super basic market order fill. It iterates over our order tree by constantly calling Match until the order reports that it’s filled, then we break out of the loop.

// Fill returns the fill algorithm for this type of order.
func (fm *market) Fill(ctx context.Context, fillOrder Order) {
	for {
		fm.OrderTrie.Match(fillOrder, func(bookOrder Order) {
			if err := fm.attemptFill(fillOrder, bookOrder); err != nil {
				log.Printf("attemptFill failed: %v", err)
			}
		})
        if fillOrder.Filled() {
        	break
        }
	}
}

It checks for the underlying to match the name of the book order, signaling a worthy candidate for our order, and then fills it based on the balance and transfers balances across accounts. If that all happens successfully, then it updates order quantities and returns.


// attemptFill attempts to fill an order as a Limit Fill order.
// * It removes the market order from the orderbook if it fully fills
// the order.
// * TODO: Add rollback functionality. An order could currently transfer a balance and then
// fail to update the order totals, resulting in mishandled money.
func (fm *market) attemptFill(fillOrder, bookOrder Order) error {
	// TODO: keep buy and sell side orders separately so as to avoid this check.
	log.Printf("fillOrder: %+v - bookOrder: %+v", fillOrder, bookOrder)

	if fillOrder.AssetInfo().Name == bookOrder.AssetInfo().Underlying {
		total := float64(fillOrder.Quantity()) * fillOrder.Price()

		if bookOrder.Quantity() < fillOrder.Quantity() {
			return fmt.Errorf("partial fills not implemented") // TODO
		}

		bookOrderOpen := bookOrder.Quantity() - fillOrder.Quantity()
		bookOrderFilled := fillOrder.Quantity() - bookOrder.Quantity()

		// attempt to transfer balance from buyer to seller
		_, err := fm.Accounts.Tx(fillOrder.Owner().UserID(), bookOrder.Owner().UserID(), total)
		if err != nil {
			return fmt.Errorf("transaction failed: %s", err)
		}

		_, err = fillOrder.Update(bookOrderFilled, bookOrderOpen)
		_, err = bookOrder.Update(bookOrderOpen, bookOrderFilled)

		return nil
	}

	return nil
}

Above is the attemptFill function. It’s fairly straight forward, but there are some important points to note. First, we don’t keep buy and sell-side orders separate, so we must compare the underlying of each Order before filling, which is costly. We could optimize this in the future. Second, we don’t have the ability to rollback order fills if anything fails. A future upgrade would allow us to snapshot the current state of both orders to allow us to rollback if anything fails. Thirdly, we see our first and only usage of the Tx interface here which we will discuss in detail here shortly. The Tx interface declares our dead simple single-function interface for updating user balances in the accounting module. We specifically want to limit this to only the Tx function so that all logic funnels through this function.

Alternative Filling Strategies

In the future, if we have multiple order filling strategies we must fulfill, the order filling interface will be clutch for a strategy pattern. We can decide different order types that fulfill the Order interface and attach a Filler to each different Order type, letting each order declare it’s own strategy for fills. And you guessed it, this will come in a refactor - Part II.

Market Price Calculation

Market price is the equilibrium between our two buy and sell side order trees. Our tree structure that we’ve created can help us determine this number, but it’s no small feat. There are a number of ways to determine the fair market price for an asset, but they all have some trade-offs. The easiest way to calculate a market price is to take a rolling average of the last set of orders over a given time frame, but there’s room for interpretation there, too.

Transactions & Accounts

Another key problem in any exchange is making sure you take from the buyer and give to the seller the proper amount exactly once each. This is one of those things that’s easy to say but getting it right 100% of the time is a finely tuned operational skill and practically impossible. In the code snippet above we used the Tx function to achieve this.

type Transaction interface {
    Tx(from string, to string, amount float64) error
}

The Transaction interface defines a simple single function interface that will error on a few conditions. The AccountManager fulfills this interface, as it has access to all of the various accounts necessary, and ensures the transaction is atomic.

The Tx function signature almost certainly has shortcomings, like no rollback support, but it will be a good enough abstraction right now for us to build around. It’s a strong primitive that we can assemble into more complex abstractions, such as rollback support and composite transactions.

// AccountManager defines a simple CRUD interface for managing accounts.
type AccountManager interface {
	Transaction

	Get(id string) (Account, error)
	Create(id string, acct Account) (Account, error)
	Update(id string, acct Account) (Account, error)
	Delete(id string) error
}

The transaction interface needs accounts to transfer balances between. We tie these accounts to a balance and a user ID, likely a user’s email. We currently don’t track transactions, that’s something we’ll cover in a future update. In fact, the entire account manager interface is just persisted in memory, as I wasn’t interested in solving the problem of backing it to persistence. Cut scope where you can, kids.

Architecture Overview

The above modules form into basically three pieces: The market, which wraps and operates on two subordinate systems, the orderbook, and the account manager. The orderbook keeps buy and sell orders organized in tiers by price. The account manager links people to an account and a bank balance in our books. The two are wrapped and bound onto a market, which could be exposed to the world.

market_architecture_overview-1

Fig. 7. Marketplace architecture overview as it currently stands.

Strengths

After implementing the tree design, this library snapped into place. We can efficiently traverse the tree for correctly priced orders and know that we’re not wasting time since we will arrive at a node already populated with other similar orders. The accounts manager and transaction interface allow us to concurrently act on the orderbook inside the Fill function, which is given a context and allows us to concurrently traverse the orderbook until an order is completed. The design is testable, and the interfaces work together with minimal friction.

Weaknesses

This design has no way to notify when an order is filled except for repeatedly asking if the order has been filled. This works fine while we’re inside the Fill function and looping over the order. However, when we are looking in from the outside, it would be annoying as a user to consistently have to ask if an order was completed. It would be better if we had a way to notify when an order was filled or canceled, likely implemented with some signal channel.

Transaction Interface

The transaction interface is also a weak point. We don’t have a way to rollback transactions or compile multiple smaller transactions, which means we will have to handle that logic manually for each case, which would likely result in a complexity-space explosion.

Partial Fills

Partial fills sre not supported by this design, either. This is related but distinct from a larger issue of ownership not being represented by anything other than Orders. One must record all valid orders in order to determine ownership. Of course, we will address this in future posts as well.

These weaknesses will likely be addressed in an upcoming post, hopefully with some performance benchmarks, too.

Conclusions

This is a fun little toy project and has a lot of potential to build in different directions. I’ll be publishing a follow-up with some more details and performance characteristics.

Part II coming soon. We will dive into a more advanced strategy-pattern approach to order filling, support for partial fills, better semantics around manipulating Order types, order expiration, order transaction logs, ownership representation, performance benchmarks, and more.

Remember, the point of this series was to write a tiny library to uncover the core problem of an asset exchange and learn more about that. I would say mission accomplished on that front.