Merkle Tree Construction
Methodology Version 1.0 — Effective February 2026
After each track in a certification batch has been canonicalised and hashed, the individual track hashes are assembled into a binary Merkle tree. This data structure allows a single root hash to cryptographically commit to every track in the batch, while still permitting each track to be independently verified.
What is a Merkle tree?
A Merkle tree is a binary tree in which every leaf node contains a data hash and every non-leaf node contains the hash of its two children. The single hash at the top of the tree — the Merkle root — is a compact commitment to the entire dataset. Any change to any leaf (any track's metadata) propagates upward through the tree, producing a different root hash.
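Consider a batch of four tracks, A to D. The leaf hashes of Tracks A and B combine into the intermediate node H_AB, those of Tracks C and D combine into H_CD, and H_AB and H_CD combine into the root. The Merkle root therefore commits to all four tracks. If Track B's metadata were altered, its leaf hash would change, which would change H_AB, which would change the root, so the tampering is detectable by anyone who recomputes the root and compares it with the published one.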
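The propagation described above can be sketched in a few lines. This is an illustration only: the byte strings stand in for canonicalised track metadata, the helper names `h` and `root_of_four` are hypothetical, the sort step is omitted for brevity, and child hashes are assumed to be concatenated as raw 32-byte digests.

```python
import hashlib

def h(data: bytes) -> bytes:
    """SHA-256 digest as raw bytes."""
    return hashlib.sha256(data).digest()

def root_of_four(a: bytes, b: bytes, c: bytes, d: bytes) -> str:
    """Root of a fixed four-leaf tree (placeholder leaf data, no sorting)."""
    h_ab = h(h(a) + h(b))          # parent of leaves A and B
    h_cd = h(h(c) + h(d))          # parent of leaves C and D
    return h(h_ab + h_cd).hex()    # Merkle root as 64 hex characters

original = root_of_four(b"track A", b"track B", b"track C", b"track D")
tampered = root_of_four(b"track A", b"track B (edited)", b"track C", b"track D")

# Changing one leaf changes H_AB and therefore the root.
assert original != tampered
```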
Construction process
1. Sort leaf hashes
All track hashes within the certification batch are sorted lexicographically (alphabetical order of their hexadecimal representations). This ensures that the same set of tracks always produces the same tree structure, regardless of the order in which tracks were processed.
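A minimal sketch of the sort step follows. The function name `sort_leaf_hashes` is hypothetical, and normalising to lowercase before sorting is an assumption: the methodology text does not say whether hex strings are upper or lower case, and mixed case would change the lexicographic order.

```python
import hashlib

def sort_leaf_hashes(track_hashes):
    """Return leaf hashes in lexicographic order of their hex representation."""
    # Assumption: hex strings are compared in lowercase.
    return sorted(h.lower() for h in track_hashes)

# Placeholder inputs, hashed only to obtain realistic 64-character leaves.
leaves = [hashlib.sha256(name.encode()).hexdigest()
          for name in ("track-a", "track-b", "track-c")]

# Processing order does not affect the resulting leaf layer.
assert sort_leaf_hashes(leaves) == sort_leaf_hashes(list(reversed(leaves)))
```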
2. Build the tree bottom-up
Adjacent pairs of hashes are concatenated (left + right) and hashed with SHA-256 to form parent nodes:
parent_hash = SHA-256(left_child_hash + right_child_hash)
This process repeats upward through each layer until a single root hash remains.
3. Odd-number duplication rule
If any layer contains an odd number of nodes, the last node is duplicated to form a pair with itself, so that every parent node is computed from exactly two child hashes. The duplicated node is hashed with itself:
parent_hash = SHA-256(node_hash + node_hash)
4. Root hash
The final single hash at the top of the tree is the Merkle root. This 64-character hexadecimal string is what gets anchored to the blockchain.
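The four construction steps can be sketched together as follows. This is not the reference implementation: the names `parent_hash`, `build_layers`, and `merkle_root` are hypothetical, and two details are assumptions because the text above leaves them open. First, child hashes are decoded from hex and concatenated as raw bytes (concatenating the hex strings instead would give different parent hashes). Second, a batch containing a single track uses that track's hash as the root.

```python
import hashlib

def parent_hash(left_hex: str, right_hex: str) -> str:
    # Assumption: concatenate raw 32-byte digests, not hex strings.
    data = bytes.fromhex(left_hex) + bytes.fromhex(right_hex)
    return hashlib.sha256(data).hexdigest()

def build_layers(leaf_hashes):
    """Return every layer, from the sorted leaves up to the root layer."""
    if not leaf_hashes:
        raise ValueError("a batch needs at least one track hash")
    layer = sorted(leaf_hashes)               # step 1: sort leaf hashes
    layers = [layer]
    while len(layer) > 1:                     # step 2: build bottom-up
        if len(layer) % 2 == 1:
            layer = layer + [layer[-1]]       # step 3: duplicate last node
        layer = [parent_hash(layer[i], layer[i + 1])
                 for i in range(0, len(layer), 2)]
        layers.append(layer)
    return layers

def merkle_root(leaf_hashes) -> str:
    """Step 4: the single hash in the top layer."""
    return build_layers(leaf_hashes)[-1][0]

# Placeholder leaves for illustration.
leaves = [hashlib.sha256(bytes([i])).hexdigest() for i in range(3)]
root = merkle_root(leaves)
assert len(root) == 64
assert merkle_root(list(reversed(leaves))) == root   # order-independent
```

With three leaves, the third sorted leaf is paired with itself at the leaf layer, which is the duplication rule in action.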
Inclusion proofs
A key property of Merkle trees is that any individual leaf can be verified against the root using a compact inclusion proof — a small set of sibling hashes from the leaf to the root.
For example, to prove that Track A is part of the certified batch, the proof consists of:
- Track A's own hash (the leaf)
- Track B's hash (the sibling at the leaf layer)
- H_CD (the sibling at the next layer up)
With these three values, a verifier can reconstruct the path from Track A's leaf to the Merkle root:
Step 1: SHA-256(Track_A_hash + Track_B_hash) → H_AB
Step 2: SHA-256(H_AB + H_CD) → Root
Step 3: Compare computed root with the published root
If the computed root matches the blockchain-anchored root, the track is confirmed as part of the certified batch.
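The verification steps generalise to any batch size. The sketch below generates and checks a proof under stated assumptions: the names `build_proof` and `verify_proof` are hypothetical, hashes are concatenated as raw bytes decoded from hex, and each proof entry carries a flag saying whether the sibling sits on the left. The text above does not specify a proof format, but some position information is needed because leaves are sorted and a given track is not always the left child.

```python
import hashlib

def parent_hash(left_hex: str, right_hex: str) -> str:
    # Assumption: concatenate raw 32-byte digests, not hex strings.
    data = bytes.fromhex(left_hex) + bytes.fromhex(right_hex)
    return hashlib.sha256(data).hexdigest()

def build_proof(leaf_hashes, target_hex):
    """Return (proof, root); proof is a list of (sibling_hex, sibling_is_left)."""
    layer = sorted(leaf_hashes)
    index = layer.index(target_hex)       # ValueError if not in the batch
    proof = []
    while len(layer) > 1:
        if len(layer) % 2 == 1:
            layer = layer + [layer[-1]]   # odd-number duplication rule
        sibling = index ^ 1               # partner within the pair
        proof.append((layer[sibling], sibling < index))
        layer = [parent_hash(layer[i], layer[i + 1])
                 for i in range(0, len(layer), 2)]
        index //= 2                       # position in the parent layer
    return proof, layer[0]

def verify_proof(leaf_hex, proof, published_root) -> bool:
    """Recompute the path from the leaf and compare with the published root."""
    current = leaf_hex
    for sibling, sibling_is_left in proof:
        if sibling_is_left:
            current = parent_hash(sibling, current)
        else:
            current = parent_hash(current, sibling)
    return current == published_root

# Placeholder leaves standing in for Tracks A to D.
tracks = [hashlib.sha256(n.encode()).hexdigest() for n in ("A", "B", "C", "D")]
proof, root = build_proof(tracks, tracks[0])
assert len(proof) == 2                    # leaf-layer sibling plus one layer up
assert verify_proof(tracks[0], proof, root)
```

The verifier needs only the leaf hash, the proof, and the published root; it never sees the other tracks' metadata.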
Proof size
The inclusion proof grows logarithmically with the number of tracks. For a catalogue of 10,000 tracks, each proof requires 14 sibling hashes (log2(10,000) ≈ 13.3, rounded up). At 32 bytes per SHA-256 hash, that is 448 bytes, well under 1 KB. Doubling the catalogue size adds only one more hash to each proof, so individual track verification remains efficient even for very large catalogues.
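The arithmetic can be checked with a short calculation. The function name `proof_length` is hypothetical, and the byte counts assume each sibling hash is stored as 32 raw bytes (hex encoding would double that) with no extra metadata.

```python
def proof_length(n_leaves: int) -> int:
    """Number of sibling hashes in a proof under the duplication rule."""
    steps = 0
    while n_leaves > 1:
        n_leaves = (n_leaves + 1) // 2   # odd layers are padded before pairing
        steps += 1
    return steps

for n in (10, 10_000, 1_000_000):
    siblings = proof_length(n)
    print(n, siblings, siblings * 32)
# 10 4 128
# 10000 14 448
# 1000000 20 640
```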
Why a Merkle tree?
| Benefit | Explanation |
|---|---|
| Single blockchain transaction | The entire catalogue batch — regardless of size — is represented by one root hash, requiring only one blockchain anchor |
| Individual track verification | Each track can be verified independently using its compact inclusion proof, without access to any other track's data |
| Tamper detection | Any modification to any track changes the root hash, so a single track cannot be altered without the recomputed root failing to match the anchored root |
| Scalability | A 10,000-track catalogue and a 10-track catalogue both produce one root hash and one blockchain transaction |
| Privacy | An inclusion proof reveals only the sibling hashes along the path, not the data of other tracks in the batch |
Related methodology pages
- Hashing — How individual track hashes (the Merkle leaves) are computed
- Blockchain Anchoring — How the Merkle root is anchored to the blockchain
- Independent Verification — How Merkle proofs are used in the verification process