Issue 2: Following Bitcoin Through the Maze

Introduction

Here is the question that drives most Bitcoin forensic investigations: where did this money come from, and where did it go? The answer seems to be readily obtainable; after all, Bitcoin's ledger is public, every transaction is permanently recorded, and the UTXO model creates explicit links between inputs and outputs. Yet anyone who has actually attempted to trace funds through even a moderately complex transaction graph knows the uncomfortable truth: following the money is technically straightforward but interpretively treacherous.

Flow tracing sits at the heart of blockchain forensics. It is the methodology that connects a known starting point (a theft, a ransomware payment, a sanctioned address) to downstream destinations where funds might be recovered or attributed. The technique underlies every major cryptocurrency investigation, from the FBI's recovery of Colonial Pipeline ransom funds to the identification of Silk Road wallets. Without flow tracing, we would have a transparent ledger with no practical means of understanding what it reveals.

But flow tracing is not a single technique. It is a family of methods, each with different assumptions, different computational requirements, and, critically, different failure modes. In this issue, I aim to outline the core approaches to tracing fund flows, explain how I apply them in practice, and be transparent about where each method yields misleading results, because the difference between good flow tracing and bad flow tracing is often the difference between actionable intelligence and confident-sounding nonsense.

The Methodology Defined

Flow tracing is the process of constructing and analyzing paths through a Bitcoin transaction graph to understand how funds moved from source addresses to destination addresses. The fundamental unit of analysis is the transaction output (a UTXO for as long as it remains unspent). Because each output can be spent only once, and only after it has been created, outputs and the transactions that spend them form a directed acyclic graph connecting every spend to its funding source.

Core concepts:

The Bitcoin transaction graph is inherently directional. Each transaction consumes inputs (i.e., previous unspent transaction outputs, or UTXOs) and creates outputs (i.e., new UTXOs). This creates a natural "flow" metaphor: funds enter through inputs and exit through outputs. Tracing forward follows this direction—where did funds go after this transaction? Tracing backward reverses it—where did these funds come from?
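A minimal sketch of this directional structure, assuming a toy in-memory model rather than real chain data. The names `Tx`, `OutPoint`, `build_spend_index`, `trace_forward`, and `trace_backward` are hypothetical and chosen for illustration; they are not TrailBit's API.

```python
from dataclasses import dataclass

# An outpoint identifies one output: (txid, output index).
OutPoint = tuple[str, int]

@dataclass
class Tx:
    txid: str
    inputs: list[OutPoint]   # outpoints this transaction spends
    outputs: list[float]     # output values in BTC, by index

def build_spend_index(txs: dict[str, Tx]) -> dict[OutPoint, str]:
    """Map each spent outpoint to the txid that spends it."""
    return {op: tx.txid for tx in txs.values() for op in tx.inputs}

def trace_backward(txs: dict[str, Tx], txid: str) -> list[str]:
    """One hop backward: transactions whose outputs fund `txid`."""
    return sorted({src for src, _ in txs[txid].inputs if src in txs})

def trace_forward(txs: dict[str, Tx], spend_index: dict[OutPoint, str],
                  txid: str) -> list[str]:
    """One hop forward: transactions that spend any output of `txid`."""
    n_outputs = len(txs[txid].outputs)
    return sorted({spend_index[(txid, i)]
                   for i in range(n_outputs) if (txid, i) in spend_index})

# Toy chain: a -> b -> c (transaction "a" has no inputs in this model).
txs = {
    "a": Tx("a", [], [3.0]),
    "b": Tx("b", [("a", 0)], [1.0, 2.0]),
    "c": Tx("c", [("b", 0)], [1.0]),
}
spend_index = build_spend_index(txs)
```

Here `trace_forward(txs, spend_index, "a")` returns `["b"]` and `trace_backward(txs, "c")` returns `["b"]`. Output 1 of `b` has no entry in the spend index, which is how an unspent output shows up in this model.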

The academic foundation traces to Reid and Harrigan's 2011 work, which introduced the "transaction network" as a directed graph where nodes represent transactions and edges represent the flow of Bitcoin between them. Ron and Shamir (2012) expanded this into a full graph analysis of the Bitcoin transaction corpus, demonstrating that network analysis techniques from social science and epidemiology could be applied to understand fund movements.

Three primary tracing paradigms exist:

1. UTXO-based tracing (deterministic): The purest form of tracing follows individual UTXOs through their complete lifecycle. When a UTXO is created as a transaction output, it remains unspent until it is consumed as an input to a subsequent transaction. This creates unambiguous, cryptographically verified links. If UTXO-A was spent in Transaction-X, there is no debate: the blockchain proves it.

2. Value-based tracing (proportional): When a transaction has multiple inputs and multiple outputs, UTXO-based tracing faces a fundamental problem: which input funded which output? Value-based approaches attempt to allocate value proportionally or heuristically. The simplest method, known as "haircut" or "taint" propagation, distributes taint from inputs to outputs in proportion to their values.

3. Entity-based tracing (cluster-aware): Rather than tracing individual transactions, entity-based tracing first clusters addresses into entities (using CIOH and other heuristics from Issue 1), then traces flows between these aggregate entities. This reduces graph complexity but carries any clustering errors into the trace.
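To make the entity-based paradigm concrete, here is a rough sketch of collapsing address-level transfers into entity-level flows. It assumes a cluster mapping already exists and that flows have already been reduced to (from, to, value) triples, which itself hides the input-output ambiguity discussed later. The function name `entity_flows` and the sample labels are hypothetical.

```python
from collections import defaultdict

def entity_flows(transfers, address_to_entity):
    """Aggregate address-level transfers into entity-level flows.

    transfers: iterable of (from_address, to_address, sats).
    address_to_entity: dict from address to entity label. Addresses with
    no label are treated as their own singleton entity.
    """
    flows = defaultdict(int)
    for src, dst, sats in transfers:
        src_entity = address_to_entity.get(src, src)
        dst_entity = address_to_entity.get(dst, dst)
        if src_entity != dst_entity:       # skip movements inside one entity
            flows[(src_entity, dst_entity)] += sats
    return dict(flows)

transfers = [
    ("addr1", "addr3", 50_000_000),
    ("addr2", "addr3", 25_000_000),
    ("addr1", "addr2", 10_000_000),   # internal to entity "X"
    ("addr3", "addr9", 70_000_000),   # addr9 is unclustered
]
clusters = {"addr1": "X", "addr2": "X", "addr3": "ExchangeY"}
flows = entity_flows(transfers, clusters)
```

Four address-level edges collapse into two entity-level edges. If `addr2` was wrongly clustered with `addr1`, the "internal" transfer that was dropped here was actually a payment, which is the kind of error that clustering carries into a trace.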

The fungibility problem:

Bitcoin's design creates an interpretive challenge. When Alice sends 1 BTC from an address holding 3 BTC (1 BTC from a legitimate source, 2 BTC from an illicit source), which Bitcoin does the recipient receive? The blockchain does not answer this question—it cannot, because Bitcoin is fungible within a transaction. This is not a bug in tracing methodology; it is a fundamental property of how Bitcoin works.

Different jurisdictions and analysts resolve this differently. First-In-First-Out (FIFO) accounting assumes older funds are spent first. Pro-rata allocation distributes the taint proportionally. Specific identification requires off-chain evidence of intent. None of these approaches is objectively correct—they are accounting conventions imposed on a system that does not inherently support them.
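The Alice example can be worked through under each convention. This is a sketch only: it assumes the clean 1 BTC arrived before the illicit 2 BTC (the text does not specify an order), and `tainted_share` is a hypothetical helper name.

```python
from fractions import Fraction

def tainted_share(lots, spend, method):
    """Return the tainted amount contained in `spend`.

    lots: list of (amount, is_tainted), oldest first.
    method: "fifo", "lifo", or "pro_rata".
    """
    if method not in ("fifo", "lifo", "pro_rata"):
        raise ValueError(f"unknown method: {method}")
    total = sum(amount for amount, _ in lots)
    if spend > total:
        raise ValueError("spend exceeds balance")
    if method == "pro_rata":
        tainted_total = sum(amount for amount, tainted in lots if tainted)
        return Fraction(spend) * tainted_total / total
    ordered = lots if method == "fifo" else list(reversed(lots))
    remaining, tainted_spent = spend, 0
    for amount, is_tainted in ordered:
        used = min(amount, remaining)
        if is_tainted:
            tainted_spent += used
        remaining -= used
        if remaining == 0:
            break
    return tainted_spent

# Alice's balance in BTC: 1 clean (assumed received first), then 2 illicit.
lots = [(1, False), (2, True)]
```

For the 1 BTC payment, FIFO says the recipient got 0 BTC of tainted funds, LIFO says 1 BTC, and pro-rata says 2/3 BTC. Same transaction, three answers, and the blockchain endorses none of them.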

How to Apply It

Here is my practical workflow for tracing Bitcoin fund flows, refined through investigations and the development of TrailBit's tracing algorithms.

Step 1: Define the investigation scope

Before touching the blockchain, I establish clear parameters:

What is the starting point? (Transaction ID, address, or set of addresses)

What direction am I tracing? (Forward from source, backward from destination, or bidirectional)

What is the objective? (Identify cash-out points, quantify exposure, map entity relationships)

What depth is appropriate? (Number of hops, time window, value threshold)

Scope definition prevents the most common tracing failure: unbounded graph expansion that produces overwhelming data without actionable insight.

Step 2: Initial graph construction

Starting from the seed transaction or address, I build the first-degree graph. Forward tracing identifies all transactions that spend outputs from the seed. Backward tracing identifies all transactions whose outputs became inputs to the seed. I also collect transaction metadata: timestamps, fee rates, input/output counts, value distributions.

In TrailBit's Transaction Graph view, this initial expansion shows the immediate neighborhood of the investigation target. I typically start with a depth of 2-3 hops to get oriented before proceeding with a deeper traversal.
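The depth-bounded expansion described here can be sketched as a breadth-first search with a hop limit. This is an illustration over a toy adjacency map, not TrailBit's implementation; `expand` and the sample graph are hypothetical.

```python
from collections import deque

def expand(graph, seed, max_hops):
    """Breadth-first expansion from `seed`, returning {txid: hop_distance}.

    graph: dict mapping txid -> list of txids that spend its outputs.
    For backward tracing, pass the reversed adjacency instead.
    """
    hops = {seed: 0}
    queue = deque([seed])
    while queue:
        current = queue.popleft()
        if hops[current] == max_hops:
            continue                      # do not expand past the depth bound
        for nxt in graph.get(current, []):
            if nxt not in hops:           # first visit gives the shortest distance
                hops[nxt] = hops[current] + 1
                queue.append(nxt)
    return hops

forward = {"seed": ["t1", "t2"], "t1": ["t3"], "t3": ["t4"], "t2": []}
```

With `max_hops=2` the result stops at `t3`; raising the bound to 3 pulls in `t4`. Recording the hop distance per node is useful later, when confidence needs to be discounted by distance from the seed.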

Step 3: Apply filtering and prioritization

Raw graph expansion is computationally expensive and analytically overwhelming. A single address can connect to thousands of transactions within a few hops. I apply filters:

Value filtering: Ignore outputs below a threshold (often 0.001 BTC) to exclude dust, and focus on economically significant flows.

Time filtering: Constrain to a relevant time window. If investigating a theft that occurred on a specific date, flows from months earlier are unlikely to be relevant.

Entity filtering: Using cluster data, identify and potentially exclude or specially mark known entities (exchanges, services, mining pools) that represent common transaction endpoints.

Pattern filtering: Flag transactions matching known patterns (peel chains from Issue 3, CoinJoin transactions, consolidations) for special handling.
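The value, time, and entity filters above might be combined roughly as follows. This is a sketch under simplifying assumptions: the `Output` record and `filter_outputs` function are hypothetical, pattern filtering is omitted, and known entities are marked as endpoints rather than discarded.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

@dataclass
class Output:
    txid: str
    sats: int
    time: datetime
    entity: Optional[str] = None    # label from cluster data, if any

DUST_THRESHOLD_SATS = 100_000       # 0.001 BTC, the threshold mentioned above

def filter_outputs(outputs, start, end, known_entities):
    """Return (to_follow, endpoints) after value, time, and entity filtering."""
    to_follow, endpoints = [], []
    for out in outputs:
        if out.sats < DUST_THRESHOLD_SATS:
            continue                          # value filter: drop dust
        if not (start <= out.time <= end):
            continue                          # time filter: outside the window
        if out.entity in known_entities:
            endpoints.append(out)             # entity filter: mark, do not expand
        else:
            to_follow.append(out)
    return to_follow, endpoints

outs = [
    Output("a", 50_000, datetime(2023, 5, 2)),                 # dust
    Output("b", 2_000_000, datetime(2023, 1, 1)),              # too early
    Output("c", 5_000_000, datetime(2023, 5, 3), "exchange"),  # known entity
    Output("d", 7_000_000, datetime(2023, 5, 4)),
]
follow, endpoints = filter_outputs(
    outs, datetime(2023, 5, 1), datetime(2023, 5, 31), {"exchange"})
```

Only `d` remains to be followed, and `c` is kept as a marked endpoint. Keeping endpoints in a separate list instead of dropping them preserves the information that the trace reached a known service.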

Step 4: Iterative expansion and analysis

Tracing is inherently iterative. After initial expansion: (1) Identify the most significant unexplored edges (largest value flows, most recent activity), (2) Expand selectively rather than exhaustively, (3) Document branch decisions and rationale, (4) Mark dead ends (spent to known exchange, entered mixer, unspent cold storage).

I maintain a "trace log" noting why I followed certain paths and ignored others. This is essential for reproducibility—another analyst should be able to follow my reasoning.
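One way to sketch selective expansion with a trace log is a priority queue keyed on flow value, with every decision recorded. The `prioritized_trace` function, the `budget` parameter, and the log wording are all hypothetical, and "largest value first" is just one of the prioritization rules mentioned above.

```python
import heapq

def prioritized_trace(edges, seed, budget, dead_ends):
    """Expand the highest-value open edge first, logging each decision.

    edges: dict node -> list of (next_node, sats).
    dead_ends: dict node -> reason, e.g. "known exchange deposit".
    Returns a log of (parent, node, sats, decision) tuples.
    """
    log, visited = [], {seed}
    frontier = [(-sats, nxt, seed) for nxt, sats in edges.get(seed, [])]
    heapq.heapify(frontier)                   # min-heap on negated value
    while frontier and budget > 0:
        neg_sats, node, parent = heapq.heappop(frontier)
        if node in visited:
            continue
        visited.add(node)
        budget -= 1
        if node in dead_ends:
            log.append((parent, node, -neg_sats, f"stopped: {dead_ends[node]}"))
            continue
        log.append((parent, node, -neg_sats, "followed: largest open flow"))
        for nxt, sats in edges.get(node, []):
            if nxt not in visited:
                heapq.heappush(frontier, (-sats, nxt, node))
    # Record what was left unexplored, so the omission is visible to a reviewer.
    for neg_sats, node, parent in sorted(frontier):
        if node not in visited:
            log.append((parent, node, -neg_sats, "not expanded: budget exhausted"))
    return log

edges = {"seed": [("a", 900), ("b", 100)], "a": [("c", 850), ("d", 50)]}
log = prioritized_trace(edges, "seed", budget=2,
                        dead_ends={"c": "known exchange deposit"})
```

The log lists `a` as followed, `c` as stopped at a known exchange deposit, and `b` and `d` as not expanded. Logging the paths that were not taken matters as much as logging the ones that were, since that is what lets another analyst challenge the branch decisions.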

Step 5: Taint propagation (when applicable)

If the investigation requires quantifying exposure (e.g., "what percentage of funds in this address originated from the theft?"), I apply taint propagation:

Simple haircut: If a transaction has a 10 BTC input tainted at 50% and creates two outputs of 6 BTC and 4 BTC, both outputs inherit the 50% taint ratio (3 BTC and 2 BTC of tainted value, respectively). This method preserves taint percentage through splits.

Pro-rata allocation: If a transaction mixes 5 BTC (100% tainted) with 5 BTC (0% tainted) into a single 10 BTC output, that output carries 50% taint. This method dilutes taint through mixing.

FIFO/LIFO allocation: Apply accounting conventions to determine which specific funds are spent. This requires maintaining a temporal ordering of fund acquisition.

The choice between methods depends on investigative purpose and jurisdictional requirements. I always document which method was applied and why.
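The two numeric examples above (the split and the mix) can be reproduced with a single value-weighted calculation. This is a minimal sketch that ignores fees, as the examples do; `haircut` is a hypothetical function name and amounts are in satoshis to avoid floating-point drift.

```python
from fractions import Fraction

BTC = 100_000_000  # satoshis per bitcoin

def haircut(inputs, outputs):
    """Value-weighted taint for one transaction.

    inputs: list of (sats, taint_fraction); outputs: list of sats.
    Every output receives the same taint ratio: tainted input value
    divided by total input value.
    Returns a list of (taint_ratio, tainted_sats), one per output.
    """
    total_in = sum(sats for sats, _ in inputs)
    tainted_in = sum(sats * Fraction(taint) for sats, taint in inputs)
    ratio = tainted_in / total_in
    return [(ratio, sats * ratio) for sats in outputs]

# Split: one 10 BTC input at 50% taint, outputs of 6 BTC and 4 BTC.
split = haircut([(10 * BTC, Fraction(1, 2))], [6 * BTC, 4 * BTC])

# Mix: 5 BTC fully tainted plus 5 BTC clean into one 10 BTC output.
mix = haircut([(5 * BTC, 1), (5 * BTC, 0)], [10 * BTC])
```

The split yields a 50% ratio on both outputs (3 BTC and 2 BTC tainted), and the mix yields a single output at 50% (5 BTC tainted). Under this rule total tainted value is conserved while the ratio is diluted whenever clean inputs are added.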

Step 6: Entity resolution and attribution

The goal of most traces is not to identify transactions, but to identify entities—exchanges, services, or, ultimately, individuals. This requires applying clustering heuristics to group addresses, cross-referencing with known entity databases, identifying service interaction points (such as exchange deposits and payment processor endpoints), and recognizing behavioral patterns consistent with specific entity types.

In TrailBit, the Entity Attribution panel shows which traced addresses belong to known clusters and provides confidence scores for attributions based on the clustering methods applied.

Step 7: Documentation and confidence scoring

Every trace conclusion should include: the path through the transaction graph (specific transaction IDs), the methods applied (UTXO-based, value-based, entity-based), the assumptions made (clustering heuristics, taint propagation method), the confidence level (high for direct UTXO links, lower for probabilistic attributions), and alternative interpretations that cannot be ruled out.

Where It Breaks

Flow tracing methodology has significant limitations that are often understated in forensic reports. Understanding these failure modes is crucial for an accurate analysis.

The mixing problem

When funds pass through a mixing protocol or service (such as a CoinJoin implementation like Wasabi or Whirlpool), the link between specific inputs and outputs is intentionally obscured. The UTXO chain continues (funds are not destroyed), but the specific input-output mapping is unknown. Post-mix tracing must rely on probabilistic methods, timing analysis, or behavioral heuristics that are fundamentally less certain than pre-mix UTXO tracing.

This is not a failure of methodology; it is a designed feature of mixing protocols. But I have seen forensic reports that trace through mixers as if the uncertainty did not exist. That is methodologically indefensible.

The exchange black box

Exchanges and custodial services break traceability in a different way. When Alice deposits to Binance and Bob withdraws, there is often no on-chain link between Alice's deposit and Bob's withdrawal. The exchange's internal ledger—off-chain and private—determines which user's balance funded which withdrawal. From a purely blockchain perspective, tracing transactions through an exchange is impossible without subpoena power or cooperation from the exchange.

Some analysts trace through exchanges by following the timing and amounts of deposits and withdrawals. This is speculative pattern matching, not deterministic tracing. The confidence level for through-exchange traces should reflect this uncertainty.
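To show why timing-and-amount matching is pattern matching rather than proof, here is a sketch that lists every withdrawal consistent with a deposit. The `candidate_withdrawals` function, the 24-hour window, and the 1% tolerance are hypothetical choices for illustration, not established thresholds.

```python
from datetime import datetime, timedelta

def candidate_withdrawals(deposit, withdrawals, window, tolerance):
    """Withdrawals that are merely consistent with a deposit.

    deposit and each withdrawal are (time, sats). A candidate occurs within
    `window` after the deposit and within `tolerance` (a fraction of the
    deposit amount) of its value.
    """
    d_time, d_sats = deposit
    return [
        (w_time, w_sats)
        for w_time, w_sats in withdrawals
        if timedelta(0) <= w_time - d_time <= window
        and abs(w_sats - d_sats) <= tolerance * d_sats
    ]

t0 = datetime(2023, 5, 1, 12, 0)
deposit = (t0, 100_000_000)
withdrawals = [
    (t0 + timedelta(hours=1), 99_500_000),    # close in time and amount
    (t0 + timedelta(hours=2), 50_000_000),    # wrong amount
    (t0 + timedelta(hours=3), 100_200_000),   # also close in time and amount
    (t0 + timedelta(days=2), 100_000_000),    # exact amount, outside window
]
matches = candidate_withdrawals(deposit, withdrawals,
                                window=timedelta(hours=24), tolerance=0.01)
```

Two withdrawals match here, and nothing on-chain distinguishes them. Loosening the window or tolerance adds more candidates, and tightening them risks excluding the true one, so any single "match" is a hypothesis to be tested against exchange records.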

The UTXO consolidation problem

When multiple UTXOs are consumed in a single transaction, the question "which input funded which output?" has no on-chain answer. Consider a transaction with 10 inputs (total 5 BTC) and 2 outputs (4.5 BTC and 0.5 BTC), ignoring the fee for simplicity. If one input was from an illicit source, does the taint flow to both outputs? To one of them? Proportionally?

This is not a question with an objective answer. Different taint propagation methods yield varying results, and there is no definitive ground truth to validate against. Analysts who present taint calculations as definitive facts are overstating their certainty.
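The divergence is easy to demonstrate on the 10-input example. The sketch below assumes ten equal inputs of 0.5 BTC (the text gives only the total) and compares three policies: "poison" (any tainted input taints every output in full), haircut, and a positional in-order assignment. Function names are hypothetical, and amounts are in satoshis.

```python
from fractions import Fraction

def poison(inputs, outputs):
    """Any tainted input fully taints every output."""
    dirty = any(tainted for _, tainted in inputs)
    return [out if dirty else 0 for out in outputs]

def haircut_amounts(inputs, outputs):
    """Each output is tainted in proportion to tainted input value."""
    tainted_in = sum(value for value, tainted in inputs if tainted)
    ratio = Fraction(tainted_in, sum(value for value, _ in inputs))
    return [out * ratio for out in outputs]

def in_order(inputs, outputs):
    """Assign input value to outputs sequentially, by position."""
    queue = [[value, tainted] for value, tainted in inputs]
    result = []
    for out in outputs:
        need, tainted_amount = out, 0
        while need > 0 and queue:
            take = min(need, queue[0][0])
            if queue[0][1]:
                tainted_amount += take
            queue[0][0] -= take
            need -= take
            if queue[0][0] == 0:
                queue.pop(0)
        result.append(tainted_amount)
    return result

# Ten inputs of 0.5 BTC each (assumed equal); only the first is tainted.
inputs = [(50_000_000, i == 0) for i in range(10)]
outputs = [450_000_000, 50_000_000]
```

Poison marks all 5 BTC as tainted, haircut marks 0.45 BTC and 0.05 BTC, and in-order marks 0.5 BTC on the first output and nothing on the second. Reversing the input order flips the in-order result to the second output, so that policy depends on something as arbitrary as how the wallet sorted its inputs.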

The dust and change ambiguity

Many tracing heuristics rely on identifying which output is "change" (returning to the sender) versus "payment" (going to the recipient). But this determination is itself heuristic—often based on value, address type, or spending patterns. When the change heuristic fails, the entire trace direction can be wrong.

I have encountered instances where following the "larger output" as the continuing fund led to an exchange cold wallet, while the "smaller output" actually contained the funds being laundered. The asymmetric value assumption from peel chain detection is useful but not universal.

The temporal gap problem

Funds can sit unspent for years. If the proceeds of a 2019 theft have not moved since, the trace terminates at an unspent UTXO. This provides useful intelligence (the address to monitor) but not the attribution investigators often want. The funds are located, but the person is not.

The cross-chain escape

Bitcoin can be bridged to other chains via wrapped tokens, atomic swaps, or centralized bridge services. Once funds leave Bitcoin, tracing requires cross-chain analysis capabilities that most investigators lack. Even unsophisticated actors can simply bridge to a chain with weaker analytics, conduct transactions there, and bridge back.

The false confidence problem

Perhaps the most dangerous failure mode is not technical but psychological. When a trace produces a clear-looking graph connecting source to destination, there is a strong temptation to present it as definitive. But every step in that trace involved assumptions—clustering accuracy, change detection correctness, taint propagation choices—that compound uncertainty.

A ten-hop trace is not merely ten times less certain than a one-hop trace; it is potentially exponentially less certain, depending on what happened at each hop. Confidence should degrade with distance, not remain constant.
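A rough illustration of how uncertainty compounds, assuming each hop's heuristic is correct independently of the others. Both the independence assumption and the 0.9 per-hop figure are hypothetical; real error rates for these heuristics are not well measured, and `path_confidence` is an illustrative name.

```python
import math

def path_confidence(hop_confidences):
    """Multiply per-hop confidences, assuming the hops are independent."""
    return math.prod(hop_confidences)

# Ten hops, each resting on a heuristic assumed to be right 90% of the time.
ten_heuristic_hops = path_confidence([0.9] * 10)

# Ten hops that are all direct UTXO links carry no heuristic uncertainty.
ten_direct_hops = path_confidence([1.0] * 10)
```

Under these assumptions the ten-hop heuristic path ends up at roughly 35% confidence, while a path made only of direct UTXO links stays at 100%. The decay depends on what happened at each hop, not on hop count alone.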

Visual Example

Let me walk through a trace scenario that illustrates both the power and limitations of flow tracing methodology.

A compliance team contacts me about a deposit to their exchange. The depositing address received funds that, several hops back, touched an OFAC-sanctioned address. They want to understand the exposure.

In TrailBit's Transaction Graph, I start from the sanctioned address and trace forward. Transaction 1 shows 50 BTC leaving the sanctioned address to two outputs: 45 BTC and 5 BTC. Using change detection heuristics (address type matching and spending pattern analysis), I assess with medium confidence that the 45 BTC output represents the ongoing fund.

The 45 BTC output moves through two more transactions:

Transaction 2: 45 BTC input, 44.9 BTC and 0.1 BTC outputs (peel chain pattern, likely automated extraction)

Transaction 3: 44.9 BTC input, split into 10 outputs ranging from 2-8 BTC (distribution pattern)

One of those Transaction 3 outputs, approximately 4.5 BTC, eventually reaches the depositor's address at the exchange after two additional hops.

The trace path is clear on-chain. But what can I actually conclude?

High confidence: The deposit address received funds that can be traced, through a specific UTXO chain, to the sanctioned address. This is deterministic—the blockchain proves it.

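Medium confidence: The change detection at Transaction 1 correctly identified the 45 BTC output as the continuing fund. If it is wrong, the 45 BTC was a payment to another party and the sender's own funds continued through the 5 BTC output, which changes who controlled everything downstream.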

Low confidence: The depositor "received sanctioned funds." The depositor received Bitcoin; whether that Bitcoin carries legal taint depends on jurisdictional interpretation, the accounting method applied, and whether the intervening transactions (especially the distribution at Transaction 3) sufficiently diluted the connection for compliance purposes.

When presenting this analysis, I provide the raw trace data, the methodological choices, and the confidence levels for each conclusion. The compliance team can then make its own risk determination rather than relying on a binary "tainted/not tainted" assessment that would obscure the underlying uncertainty.

Open Questions

The flow tracing methodology continues to evolve, and several questions warrant more rigorous research.

Standardization of taint propagation: The lack of consensus on taint calculation methods creates inconsistent results across different analysis platforms. Two forensic tools analyzing the same transaction graph can produce different taint percentages, leading to different conclusions about exposure. Should the industry converge on standard methods? Would standardization create exploitable knowledge for adversaries?

Confidence decay models: How should confidence levels degrade across hops? I have suggested that uncertainty compounds with distance, but what is the correct mathematical model for this relationship? A Bayesian framework that accounts for heuristic error rates at each step would improve the field, but it requires better empirical data on the accuracy of heuristics.

Cross-chain trace unification: As Bitcoin increasingly interoperates with other chains via bridges and atomic swaps, how do we build unified traces across ecosystems with different data structures and analysis techniques? The technical challenges are substantial, but ignoring cross-chain flows creates exploitable gaps.

Temporal analysis integration: Most traces focus on graph structure, but timing patterns often carry investigative signal. How should we formally integrate temporal analysis (inter-transaction intervals, mempool behavior, timezone patterns) into trace methodologies? What weight should timing evidence receive relative to structural evidence?

Adversarial robustness: How do we evaluate tracing methodology against adversarial actors who design transaction patterns specifically to mislead analysis? Privacy researchers and forensic researchers have competing interests in this context, and understanding the dynamics of the arms race would help calibrate confidence appropriately.

Legal standards for trace evidence: As cryptocurrency cases increasingly reach courts, what evidentiary standards should apply to flow trace analysis? The Daubert standard asks courts to consider a technique's known or potential error rate; however, as noted by Gong et al. (2022), most tracing tools cannot report error rates against ground truth. How do we bridge the gap between investigative utility and legal admissibility?

These questions matter because flow tracing is not an academic exercise. It determines who gets investigated, which funds get frozen, and what evidence reaches courts. Getting the methodology right—and being honest about its limitations—is essential for justice in both directions: catching actual criminals and protecting innocent users from false accusations.

Geo Nicolaidis

Builder, TrailBit.io
