CS and Bitcoin Mempool Transaction Selection

The Bitcoin mempool transaction selection problem has some fascinating computer science elements. This article discusses how computer science principles can be applied to solve the problem in an elegant manner. This article requires some familiarity with Bitcoin and computer science.

Bitcoin transactions consist of one or more inputs (the source of the funds) and one or more outputs (the recipient of the funds). As a simple analogy, if a transaction is a check; the input is the writer of a check and the output is recipient of the check. Bitcoin isn't account based like checks, but it is a concept most people are familiar with.

Transactions must be validated in a similar manner to a bank ensuring a check was signed by its owner and the account has enough funds for the check to clear. For the sake of this discussion, all transactions that have made it into the mempool are valid transactions.

The Bitcoin mempool is a list of pending valid transactions. It includes transactions that have been validated according to the rules of Bitcoin but have not yet been included in a block. Transactions get included in blocks, and thus recorded for all of history, when a miner chooses to include them in a block. The choice of which transactions to include and in what order is based on several factors and is the nature of the problem we are trying to solve.

Transactions have a weight associated with them. This weight is based on the size and composition of the transaction. The weight is important because each block has a finite amount of space. We can think of the weight as the cost for a miner to include the transaction in a block.

Transactions also have a miner fee associated with them. The fee is based on the difference between the transaction's inputs and outputs. The miners fee creates incentive for the miner to include the transaction in the block. The larger the fee, the more incentive to include the transaction.

The last aspect is that transactions have parent transaction. A transaction input actually references a previous transaction's output; creating a chain of ownership. These previous transactions may be in prior blocks or they may exist in the mempool as other pending transactions.

For this discussion we are only concerned with transactions that exist in the mempool. When we refer to parents, we mean transaction parents that are in the mempool. So for this discussion, a transaction that doesn't have parents would be a transaction whose inputs reference transactions already included in blocks.

We make this distinction because we are concerned about transaction selection and parents that are already in blocks have already been selected!

We are left with a problem input that looks something like this:

Tx             Size            Fees         Mempool Parents  
a              1000            2000  
b              350             900  
c              330             990          a  
d              600             1200         a,c  
e              900             1500         d,c  
f              350             1490  

Miners of Bitcoin perform transaction selection according the following rules.

  1. Given a list of unconfirmed but valid transactions, each having a weight and a fee, choose the transactions that give the most fees to the miner
  2. The combined weight of the selected transactions must fit within the maximum block weight
  3. All parent transactions of a child must be included prior to including that child

Three has two consequences:

  1. As mentioned previously, parents in prior blocks have already been included and for the sake of selection do not need to be considered
  2. A transaction with parents in the mempool cannot be added until all of its parents in the mempool are added

Now to begin.

An Optimization Problem

Asking us to optimize item picking given a size constraint is a canonical knapsack problem. There are two simple algorithms for solving knapsack:

  1. Dynamic programming
  2. Greedy approximation

Because we know there are other constraints at play, we're going to start with the greedy algorithmlink.

This algorithm sorts items by a value/weight ratio. You can think of this as including the best bang for the buck items. This algorithm simply adds items until nothing else fits.

Pseudocode of Greedy Knapsack

1. packed_weight := 0  
2. packed_txs := {}  
3. sort txs descending by fee/weight ratio  
4. while (packed_weight < max_block_weight)  
5.     add top tx to packed_txs  
6.     packed_weight += tx.weight  

But there is a wrinkle: parent transactions.

This severely complicates the simple knapsack problem outlined above. We now need to track whether a transactions parent's have been included and either not include the transaction (while simple is probably sub-optimal) or defer inclusion until all parents have been selected.

Pseudocode of Naive Solution

1. packed_weight := 0  
2. packed_txs := {}  
3. deferredTxs := {}  
4. sort txs descending by fee/weight  
5. while (packed_weight < max_block_weight)  
4.     if any parent not in packed_txs  
5.         add tx to deferred_txs  
6.     otherwise  
7.         add tx to packed_txs  
8.         iterate all deferred and check if now cleared  

From an algorithm analysis standpoint this could get ugly very quickly. The check for deferred transactions can quickly trigger a cascade of checks and rechecks of the list. There is probably a better way.

The nature of the parent child relationship looks and feels like a graph. When looking at problems, I like to spend some time understanding what I am up against. So let us explore what transactions look like when represented as graphs.

Transactions as a Graph

Bitcoin transactions create a chain where each input references the output of a prior transaction. This get a bit more complex because transactions can have multiple inputs and multiple outputs. As a result, you can end up with a variety of structures for these transaction chains.

A few scenarios of parent-child relationships in the mempool:

Transaction Chain

A child transaction can have a single parent in the mempool. If we repeat this multiple times we end up with an ancestor chain.

  • A is the first transaction
  • B spends A
  • C spends B

chain

Multiple Parents

Because a transaction can have multiple inputs, it's possible that more than one of its parents are in the mempool. This is the case of multiple parents.

  • A, B, and C are pending transaction in the mempool.
  • D tries to spend outputs from A, B, and C.

multiple parents

Multiple Children

Because a transaction can have multiple outputs, it's possible that multiple children can spend from a single parent.

  • Parent A is spent by children B and C.

multiple children

Combinations

We can then get really crazy and start to combine things.

  • A is spent by B
  • B is spent by C
  • D spends from A, B, and C.

crazy

Intuitively we are treating our transactions as vertices and the link between inputs and outputs as the edges.

The neat thing is that all of these parent-child relationships point from the child to the parent. As a result we say the graph is directed: the graph has edges that go from one direction to another.

We also observe that we never loop back. In fact, a parent can never spend a child. It's impossible for this to happen because the parent would be invalid. It would be spending a transaction that does not yet exist (the child).

From a graph theory perspective, the fact that the we do not have loops means that the graph is acyclic: it does not contain cycles.

A directed acyclic graph is know known as a DAG. This information and reasoning will be helpful for our algorithm selection.

One other wrinkle is to consider what might happen if we have a parent transaction P with a low fee/size ratio and we attempt to spend it from a child C with a high fee/size ratio.

C will not be able to be selected until P is selected. If the ratio for P is really poor, then both P and C will be stuck! What if C has a fantastically high ratio? Should this impact P? Of course!

This selection feature is called "child pays for parent" (CPFP), which allows the child to be bundled with a parent (and its ancestors) and provide fee bumping. This is an important feature for allowing a tx recipient to fee bump the tx with too low of a fee. The net result of CPFP is that transaction ancestor chains can be bundled and considered in aggregate.

Now that we've thought about the problem a bit, lets see how we might solve it. To make our algorithm simpler, let's break the problem down into two smaller discrete problems:

  1. Transaction Bundling
  2. Transaction Ordering

Making the problem smaller helps us reason about each piece better.

Transaction Bundling

As transactions arrive in the mempool, it would be helpful to know which transactions have some relationship in any way, shape, or form. We aren't concerning ourselves with ordering yet.

If we were to think about the mempool as a whole, we would see groups of transactions. Transactions that have some relationship would be linked. There would be other groups of transactions that are linked but not linked to our first group. There would be individual isolated transactions. We want to figure out how to find these groups of transactions.

As mentioned already a directed graph is a graph one where the edges have a direction (point from one vertex to another vertex). For directed graphs, there is a concept known as strongly connected components. It is defined as a maximal set of vertices where each vertex is reachable from every other vertex in the subset:

strongly connected components

The above diagram has three strongly connected components outlined in the dotted lines. Notice that in each of the SCC, every vertex is reachable from every other vertex.

Astute readers may spot a major problem: two of the components have cycles. In fact, for a strongly connected component to have more than a single vertex, it MUST contain cycles.

As we found out above, Bitcoin mempool transactions are acyclic (do not contain cycles). Therefore, the concept of strongly connected components isn't going to work for us. But let's see what a a Bitcoin mempool graph might look like:

bitcoin graph

As you can see we have no cycles. If we were to count the strongly connected components we would have eight individual nodes. Not helpful.

But what if we simplify and ask to explore whether any edge exists, regardless of direction, between two vertices? Well, this is called a weakly connected graph and it turns out we can find the weakly connected components of it.

weakly connected

So if we take our Bitcoin transaction graph, ignore edge direction and find the connected components we end up with groups of transactions. Eureka!

There is also an incredibly efficient way to find connected components in an undirected graph: the disjoint-set (aka union-find) data structure.

The purpose of this disjoint-set is to maintain independent sets of information. We do this by supporting two operations: union and find.

  • union merges two sets together
  • find returns the set that a given item belongs to

There are several ways to implement this data structure that result in various run times. A simple naive implementation use a HashMap to store a set id for each item.

  • The find operation uses the HashMap to look up the item in O(1) time and return the set id it belongs to.

  • The union operation will first look up the set ids for items A and B. If they are different, we can iterate the HashMap and change any values in B's set to A's set.

In this case the union operation is slow O(n), at least in comparison to the constant time find operation. If we perform that n times to build the full disjoint-set we end up with O(n^2). Fortunately there are vastly more efficient ways to construct a disjoint-set!

Treating elements as a tree and using a variety of techniques it is possible to get the amortized time per operation to O(m α(n)) for m operations on n elements. α is the inverse-Ackermann function, which is essentially constant for realistic input sizes link. That's a lot of time-complexity soup to arrive at the fact we can build a disjoint-set in almost linear time using a simple algorithm.

If you want to explore this data structure in more depth, I recommend starting with the Wikipedia article and this lecture.

To translate this into our problem of mempool selection we can do the following:

1. initialize the disjoint-set with each tx as its own set  
2. for each tx  
3.   for each parent of tx  
4.      union(tx, parent)  

After completing this, all transactions that are related will belong to the same set. We were able to do this in near-linear time!

With the connected components, we can easily create an aggregate fee and aggregate size by summing the values of the transactions that belong to same set. This operation would also occur in near-linear time!

Now that we know how to group transactions efficiently we can find bundles of transactions and use that as input into our knapsack optimization.

This still leaves the remaining problem of how to order those transactions.

Transaction Ordering

When we insert transactions into a block we need to be sure that parents precede children. And for those parents we need to know if their parents were already included.

In a directed acyclic graph there is a way to order vertices known as topological ordering. If we think about a DAG, there will be one or more source vertices that only have outbound edges. The topological ordering puts these source vertices first. It then traverses the DAG until vertices that only have inbound edges are reached.

Fortunately, there are a few efficient ways to find the topological ordering of a DAG. The simplest algorithm is a form of depth first search which will run in linear time!

You can start at any vertex and DFS will correctly order a DAG.

Pseudocode for DFS Topological Ordering

L := {} // list of sorted nodes

while an unmarked node n exists  
    select an unmarked node n
    visit(n)

visit(node n)  
  if n is marked return

  for each edge connecting node n to node m
     visit(m)

  mark n as complete
  add n to tail of L

This algorithm stores values in completion order, which means the parents will be stored first and the children stored last.

This is exactly what we need! Pretty cool. More information on topological sort.

Putting It Together

We can now make our complete algorithm work like this:

  1. Using a disjoint-set, add all of our transactions and union them with their parents
  2. Iterate all items in the disjoint-set and create the bundles of transactions based on the set each item belongs to
  3. Sort descending the transaction bundles by their aggregated value/weight ratio
  4. Using greedy knapsack, we pick the best transaction bundle that fits into the block until no more can fit
  5. When a transaction group fits into the block, sort the group using the topological sort DFS algorithm
  6. Profit!

Can we do better? Most certainly.

Because we've reduced the parent-child relationship problem to single-aggregate values we can go back to using a dynamic programming algorithm for knapsack. This gets us away from a greedy approximation which is not 100% optimal. This change is likely a small improvement.

We're also selecting entire sets of transactions. In reality, Bitcoin limits the total size of nested transaction set, but it is certainly possible that the full transaction set may be sub-optimal. So it may be possible to break ordered transaction sets into ordered subset. This takes us back to another combinatorial optimization problem. Fun to think about if you're inclined.

I hope you've enjoyed this article and if you have any other thoughts, please leave them in the comments!

comments powered by Disqus