Issue 4: Taint Analysis

When 2.0 BTC from a hacked exchange mixes with 3.0 BTC of legitimately purchased coins, how tainted is the result? 100%? 40%? It depends entirely on which model you use—and there's no industry standard.

This issue breaks down the three dominant approaches to taint calculation, demonstrates how they produce wildly different scores for identical transactions, and examines what that means for compliance decisions.

Introduction

In blockchain forensics, determining the historical flow of illicit funds through complex transaction graphs requires systematic attribution methodologies. Taint analysis provides a quantitative framework for assessing the degree to which specific funds can be attributed to known illicit sources. Unlike binary classification systems that simply label funds as "clean" or "dirty," taint analysis introduces probabilistic measures that account for the mixing and dilution effects inherent in cryptocurrency transactions.

Two events in early 2025 highlighted how much rides on taint analysis methodology. In February, North Korean hackers stole $1.5 billion from Bybit—the largest crypto heist in history—then laundered it through thousands of addresses, bridges, and mixing services in what investigators described as a "flood the zone" tactic designed to overwhelm compliance analysts. One month later, the U.S. Treasury lifted sanctions on Tornado Cash after courts ruled that OFAC couldn't sanction immutable smart contracts. The juxtaposition is telling: as state-sponsored actors deploy increasingly sophisticated laundering techniques, the legal tools available to target mixing services face new constraints. How taint is calculated—and whether it survives mixing—determines what can be recovered and what disappears.

The evolution from early poison-based models to sophisticated proportional attribution systems reflects the maturation of blockchain analysis from rudimentary tracking tools to forensically rigorous investigation frameworks employed by law enforcement, financial institutions, and compliance professionals.

Methodology Framework

Poison Taint Model

The poison taint model represents the most conservative approach to taint attribution, operating under the binary assumption that any contact with illicit funds contaminates all subsequent outputs.

Mathematical Definition:

T_poison(o) = 1 if ∃ input i where T(i) > 0
T_poison(o) = 0 otherwise

Where T(i) represents the taint level of input i, and o represents the transaction output.

Implications: This model treats taint as indivisible: a single tainted satoshi renders an entire output fully tainted. While mathematically simple, the approach inflates taint as it propagates. Every descendant of a tainted output is itself fully tainted, so a growing share of the Bitcoin supply ends up marked through transitive contamination.
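The definition above can be sketched in a few lines of Python. This is a minimal illustration, not any vendor's implementation; the function name poison_taint and the three-hop chain are assumptions made for the example.

```python
from typing import Sequence


def poison_taint(input_taints: Sequence[float]) -> float:
    """Return 1.0 if any input carries any taint at all, otherwise 0.0."""
    return 1.0 if any(t > 0 for t in input_taints) else 0.0


# Hypothetical chain: each hop mixes the previous output with clean inputs.
hop1 = poison_taint([1.0, 0.0])        # one tainted input, one clean input
hop2 = poison_taint([hop1, 0.0, 0.0])  # hop1's output plus two clean inputs
hop3 = poison_taint([hop2, 0.0])       # hop2's output plus one clean input

print(hop1, hop2, hop3)  # 1.0 1.0 1.0
```

Input values never enter the calculation, which is why the score cannot dilute no matter how much clean value is added at each hop.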

Haircut (Pro-Rata) Taint Model

The haircut model introduces proportional attribution based on the relative value contribution of tainted inputs to the total transaction value.

Mathematical Definition:

T_haircut(o) = Σ_i (V_i × T(i)) / Σ_i V_i

Where V_i represents the value of input i, T(i) its taint level (0 for clean inputs), and both sums run over all inputs of the transaction.

Rationale: This approach treats Bitcoin transactions as mixing events where taint is distributed proportionally among outputs based on input contributions. The model assumes that value attribution follows mathematical proportionality rather than specific satoshi tracking.
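As a rough sketch, the haircut score is a value-weighted average of input taint. The function name haircut_taint and the (value, taint) tuple layout are illustrative assumptions. Floats keep the example short; a real tool would more likely work in integer satoshis.

```python
from typing import Sequence, Tuple


def haircut_taint(inputs: Sequence[Tuple[float, float]]) -> float:
    """Value-weighted mean taint over (value_btc, taint) input pairs."""
    total_value = sum(value for value, _ in inputs)
    if total_value <= 0:
        raise ValueError("transaction has no input value")
    tainted_value = sum(value * taint for value, taint in inputs)
    return tainted_value / total_value


# 3.0 BTC clean mixed with 2.0 BTC fully tainted.
print(f"{haircut_taint([(3.0, 0.0), (2.0, 1.0)]):.0%}")  # 40%
```

Because the score depends only on the inputs, every output of the same transaction receives the same taint under this model.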

FIFO Taint Model

The FIFO (First-In-First-Out) model applies temporal ordering assumptions to UTXO spending, treating older inputs as preferentially consumed in spending events.

Mathematical Definition:

For spending event at time t:
Order wallet UTXOs by confirmation time (oldest first)
Consume UTXOs sequentially until spend amount is satisfied
Remaining taint = value-weighted average taint of the unconsumed UTXOs

Worked Example:

A wallet holds three UTXOs:

UTXO 1 (oldest): 1.0 BTC, taint = 100%
UTXO 2: 2.0 BTC, taint = 0%
UTXO 3 (newest): 2.0 BTC, taint = 50%

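If the wallet spends 2.5 BTC, FIFO assumes UTXO 1 (1.0 BTC) and part of UTXO 2 (1.5 BTC) are consumed, so the spent funds carry (1.0 × 100% + 1.5 × 0%) / 2.5 = 40% taint. The remaining wallet balance (2.5 BTC) consists of the unconsumed 0.5 BTC of UTXO 2 and all of UTXO 3, giving a remaining taint of (0.5 × 0% + 2.0 × 50%) / 2.5 = 40%.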
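The FIFO procedure can be sketched as follows. This is an accounting illustration under stated assumptions: fifo_spend is a hypothetical helper, UTXOs are passed oldest first as (value, taint) pairs, and partial consumption of a UTXO is allowed as a bookkeeping device even though a real transaction spends whole UTXOs and returns change.

```python
from typing import List, Tuple

Utxo = Tuple[float, float]  # (value_btc, taint), listed oldest first


def fifo_spend(utxos: List[Utxo], amount: float) -> Tuple[float, float]:
    """Return (taint of the spent funds, taint of the remaining balance)."""
    balance = sum(value for value, _ in utxos)
    if amount <= 0 or amount > balance:
        raise ValueError("spend amount must be positive and within balance")
    left_to_spend = amount
    spent_tainted = 0.0
    remaining: List[Utxo] = []
    for value, taint in utxos:
        used = min(value, left_to_spend)  # consume oldest value first
        spent_tainted += used * taint
        left_to_spend -= used
        if value - used > 0:
            remaining.append((value - used, taint))
    remaining_value = sum(value for value, _ in remaining)
    remaining_taint = (
        sum(value * taint for value, taint in remaining) / remaining_value
        if remaining_value > 0
        else 0.0
    )
    return spent_tainted / amount, remaining_taint


wallet = [(1.0, 1.0), (2.0, 0.0), (2.0, 0.5)]
print(fifo_spend(wallet, 2.5))  # (0.4, 0.4)
print(fifo_spend(wallet, 1.0))  # (1.0, 0.25)
```

The second call shows how strongly the result depends on the amount spent: a 1.0 BTC spend is treated as entirely the oldest, fully tainted UTXO, leaving the rest of the wallet at 25%.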

Application Context: Primarily employed in tax accounting, where a consistent ordering convention is needed to assign a cost basis to spent coins. Less common in forensic contexts because the Bitcoin protocol does not enforce any spending order: FIFO is an accounting convention, not a technical reality.

Empirical Analysis Example

Consider a transaction mixing clean and tainted inputs:

Transaction Parameters:

Input A: 3.0 BTC (clean, taint = 0%)
Input B: 2.0 BTC (compromised exchange withdrawal, taint = 100%)
Output: 5.0 BTC to address X

Taint Calculations:

Poison Model: T = 100% (binary contamination)
Haircut Model: T = (2.0 × 1.0) / 5.0 = 40%
FIFO Model: Dependent on UTXO age and spending order

Multi-Hop Propagation:

If the 5.0 BTC output subsequently mixes with 5.0 BTC of clean funds:

Secondary Haircut Taint: (2.0 tainted value) / 10.0 total = 20%

This demonstrates taint dilution through successive mixing events, a critical consideration for threshold-based compliance decisions.
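To make the dilution concrete, the sketch below repeats the haircut calculation hop by hop, doubling the balance with clean funds each time until the score drops below 1%. The helper mix, the equal-sized clean inflows, and the 1% stopping point are illustrative assumptions only.

```python
from typing import List, Tuple


def mix(value: float, taint: float, clean_value: float) -> Tuple[float, float]:
    """Haircut-mix a balance with clean funds; return (new_value, new_taint)."""
    new_value = value + clean_value
    return new_value, (value * taint) / new_value


# Hop 1 from the example: 2.0 BTC fully tainted mixed with 3.0 BTC clean.
value, taint = mix(2.0, 1.0, 3.0)
history: List[Tuple[float, float]] = [(value, taint)]

# Keep mixing with an equal amount of clean funds until below 1% (illustrative cutoff).
while taint >= 0.01:
    value, taint = mix(value, taint, value)
    history.append((value, taint))

for hop, (v, t) in enumerate(history, start=1):
    poison = 1.0 if t > 0 else 0.0  # the poison score never dilutes
    print(f"hop {hop}: {v:6.1f} BTC  haircut={t:.3%}  poison={poison:.0%}  "
          f"tainted value={v * t:.2f} BTC")
```

The haircut score halves at each hop (40%, 20%, 10%, and so on) and falls below 1% at the seventh hop, while the tainted value stays at 2.0 BTC throughout and the poison score stays at 100%.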

Technical Limitations and Edge Cases

Mixing Service Ambiguity

Centralized mixing services introduce attribution gaps: the link between a deposit and a withdrawal exists only in the service's internal records, so it cannot be recovered from on-chain data alone. Traditional taint models fail when:

Input-output mappings are intentionally obscured
Temporal delays separate deposits from withdrawals
Pool balances create mathematical indeterminacy

CoinJoin Complications

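Collaborative transaction protocols like CoinJoin violate the Common Input Ownership Heuristic, because the inputs of a single transaction belong to different participants. With equal-value outputs, nothing on chain indicates which input funded which output, so per-output taint attribution becomes ambiguous.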
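A small sketch shows the ambiguity, assuming a hypothetical five-party CoinJoin with 1.0 BTC inputs, one of them fully tainted, and fees ignored. The haircut formula still returns a number, but it is the same number for every output, and the count of one-to-one input-to-output assignments that look identical on chain grows factorially.

```python
from math import factorial

# Hypothetical participants: (value_btc, taint). Only the first input is tainted.
inputs = [(1.0, 1.0), (1.0, 0.0), (1.0, 0.0), (1.0, 0.0), (1.0, 0.0)]
outputs = [1.0] * len(inputs)  # equal-value outputs, fees ignored

total_in = sum(value for value, _ in inputs)
haircut = sum(value * taint for value, taint in inputs) / total_in

# Every one-to-one assignment of inputs to equal outputs is consistent with the chain data.
candidate_mappings = factorial(len(outputs))

print(f"haircut taint assigned to each output: {haircut:.0%}")   # 20%
print(f"indistinguishable one-to-one mappings: {candidate_mappings}")  # 120
```

In reality exactly one participant brought the tainted coin, so a 20% score on all five outputs overstates four of them and understates one.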

Exchange Integration Effects

High-volume exchange wallets create natural mixing environments where taint rapidly approaches statistical insignificance through dilution with legitimate trading volume.

Regulatory and Compliance Implications

Different jurisdictions and institutions apply varying taint thresholds for compliance decisions. While no universal standard exists, industry practice has converged around general tiers:

Direct Association (100%): Immediate outputs from OFAC-sanctioned addresses or confirmed illicit sources. Most institutions treat these as untouchable regardless of subsequent mixing.
High Confidence (>50%): Majority attribution to suspicious sources. Typically triggers enhanced due diligence or transaction blocking.
Elevated Risk (5-25%): Common institutional threshold range for flagging transactions for manual review. The specific cutoff varies by risk appetite—exchanges serving retail customers often use lower thresholds than OTC desks.
De Minimis (<1%): Generally accepted as commercially reasonable. At this level, taint is considered background noise from Bitcoin's natural circulation through the broader economy.

These thresholds reflect observed industry practice rather than regulatory mandate. Neither FATF guidance nor FinCEN regulations specify numerical taint thresholds, leaving institutions to develop internal policies—and creating compliance uncertainty when tools disagree.
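As a sketch of how such tiers might be encoded in an internal policy, the function below maps a score to the bands listed above. The name tier and the handling of the gaps are assumptions: the listed tiers do not cover 1-5% or 25-50%, so the sketch returns a placeholder there rather than guessing a policy.

```python
def tier(taint: float) -> str:
    """Map a taint score in [0, 1] to the illustrative tiers described above."""
    if not 0.0 <= taint <= 1.0:
        raise ValueError("taint must be between 0 and 1")
    if taint == 1.0:
        return "direct association"
    if taint > 0.50:
        return "high confidence"
    if 0.05 <= taint <= 0.25:
        return "elevated risk"
    if taint < 0.01:
        return "de minimis"
    return "unspecified"  # 1-5% and 25-50% are left to institutional policy


# Same 5.0 BTC output from the earlier example, scored by two models.
print(tier(1.00))  # poison score  -> direct association
print(tier(0.40))  # haircut score -> unspecified
print(tier(0.20))  # haircut after one more clean mix -> elevated risk
```

The same output lands in different tiers depending on the model that produced the score, which is the compliance uncertainty described above.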

Where the Field Is Heading

Two research directions are likely to reshape taint analysis in the near term:

Temporal decay models weight historical associations less heavily than recent ones. A transaction touching illicit funds five years ago arguably carries less compliance risk than one from last week. Several chain analysis firms are experimenting with time-weighted scoring, though no standard decay function has emerged.
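One way such a weighting could look is an exponential decay with a half-life. This is purely a hypothetical sketch: no standard decay function exists, and the function name decayed_taint, the exponential form, and the 365-day default half-life are all assumptions chosen for illustration.

```python
def decayed_taint(taint: float, age_days: float, half_life_days: float = 365.0) -> float:
    """Scale a taint score by 0.5 ** (age / half-life). Hypothetical scheme."""
    if age_days < 0 or half_life_days <= 0:
        raise ValueError("age must be non-negative and half-life positive")
    return taint * 0.5 ** (age_days / half_life_days)


# A 40% score from last week versus the same score from five years ago.
print(f"{decayed_taint(0.40, age_days=7):.2%}")
print(f"{decayed_taint(0.40, age_days=5 * 365):.2%}")  # 1.25%
```

The choice of half-life dominates the result, so any such parameter would need to be documented alongside the taint model itself.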

Cross-chain attribution attempts to track taint across bridge protocols, atomic swaps, and wrapped assets. As value moves across Bitcoin, Ethereum, and other chains, traditional single-chain taint models lose visibility into it. This remains an unsolved problem—bridges function as de facto mixers from an attribution standpoint.

Practical Considerations for Analysts

Given model-dependent variability, how should practitioners approach taint scores?

Document your methodology. When preparing reports for law enforcement or compliance purposes, explicitly state which taint model was applied. A 40% haircut score and a 100% poison score for the same transaction are both "correct"—the model choice is the variable.

Use multiple tools as cross-checks. When Chainalysis and Elliptic produce different scores, investigate why. The divergence often reveals assumptions about clustering or attribution that warrant examination rather than simply picking the more convenient number.

Consider the decision context. Poison models may be appropriate for sanctions compliance where any association with designated entities is disqualifying. Haircut models better align with risk-based AML frameworks, where proportionality matters. Match the model to the regulatory requirement.

Be skeptical of precision. A taint score of 23.7% implies false precision. In practice, uncertainty in address clustering propagates through taint calculations. Treat scores as approximate indicators, not exact measurements.

Conclusions

Taint analysis represents a mature but evolving discipline within blockchain forensics. The transition from binary poison models to nuanced proportional attribution reflects the field's development toward more sophisticated analytical frameworks. However, fundamental limitations remain:

Model Dependency: Identical transaction sets yield different taint scores under different models
Privacy Technology Resistance: Modern privacy protocols increasingly resist traditional attribution methods
Regulatory Heterogeneity: Lack of standardized compliance frameworks creates jurisdictional arbitrage opportunities

Future developments in privacy technology and regulatory frameworks will likely require continued evolution of taint analysis methodologies to maintain investigative efficacy while respecting legitimate privacy expectations.

References

Meiklejohn, S., et al. "A fistful of bitcoins: characterizing payments among men with no names." Communications of the ACM 59.4 (2016): 86-93.
Möser, M., et al. "An empirical analysis of traceability in the Monero blockchain." Proceedings on Privacy Enhancing Technologies 2018.3 (2018): 143-163.
Reid, F., and Harrigan, M. "An analysis of anonymity in the Bitcoin system." Security and privacy in social networks. Springer, 2013. 197-223.
Chainalysis. "The 2026 Crypto Crime Report." Chainalysis Inc., 2026.
Financial Crimes Enforcement Network. "Application of FinCEN's Regulations to Certain Business Models Involving Convertible Virtual Currencies." FIN-2019-G001, 2019.
TRM Labs. "The Bybit Hack: Following North Korea's Largest Exploit." TRM Blog, February 2025.
U.S. Department of the Treasury. "Tornado Cash Delisting." Press Release, March 21, 2025.

This analysis is for educational and research purposes. Taint analysis methodologies should be applied within the appropriate legal and regulatory framework.

Geo Nicolaidis

Builder, TrailBit.io

