Version: v2.0

Enrichment & Cross-Source Validation

Methodology Version 2.0 — Effective February 2026

Before a track can be evaluated against the Golden Record criteria, it must pass through TrackForge's multi-source enrichment pipeline. This pipeline consults authoritative music metadata databases in a defined priority order, detects conflicts between sources, and resolves them either algorithmically or through human review.

Source waterfall

Each track passes through a waterfall of authoritative sources. Sources are consulted in priority order, with later sources filling data gaps left by earlier ones:

Priority	Source	Data Provided
1	Spotify	Track metadata, artist information, album context, ISRC validation
2	MusicBrainz	ISWC linkages, recording/release relationships, artist credits
3	Discogs	Release metadata, label information, production credits
4	Last.fm	Artist metadata, tag-based genre classification
5	PRS for Music	Writer and publisher data, IPI numbers, work registrations
6	MLC (Mechanical Licensing Collective)	US mechanical rights data, writer share allocations
7	Additional PROs	GEMA, SACEM, JASRAC, and other societies where available

The waterfall is designed so that the most broadly available sources (streaming platform metadata) are consulted first, followed by progressively more specialised and authoritative sources (collection society databases). Writer and publisher information from the PROs at priorities 5-7 is treated as higher authority than equivalent data from streaming services.
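As a rough illustration of the gap-filling behaviour, the sketch below merges per-source results in priority order. It is not TrackForge's implementation: the source identifiers, field names, sample values, and the `merge_waterfall` function are all hypothetical, and the rule that a PRO value replaces a streaming value for writer and publisher fields is only one plausible reading of the paragraph above.

```python
from typing import Dict, List, Tuple

# Hypothetical source identifiers, listed in the waterfall's priority order.
WATERFALL: List[str] = [
    "spotify", "musicbrainz", "discogs", "lastfm", "prs", "mlc", "other_pros",
]

# Assumption: these sets model "PRO data is higher authority for writer
# and publisher information". The real field names are not published here.
PRO_SOURCES = {"prs", "mlc", "other_pros"}
RIGHTS_FIELDS = {"writers", "publishers"}


def merge_waterfall(
    results: Dict[str, Dict[str, object]],
) -> Dict[str, Tuple[object, str]]:
    """Merge per-source results into {field: (value, providing_source)}."""
    merged: Dict[str, Tuple[object, str]] = {}
    for source in WATERFALL:
        for name, value in results.get(source, {}).items():
            if value is None:
                continue  # this source had nothing for the field
            if name not in merged:
                merged[name] = (value, source)  # later source fills a gap
            elif (name in RIGHTS_FIELDS
                  and source in PRO_SOURCES
                  and merged[name][1] not in PRO_SOURCES):
                merged[name] = (value, source)  # PRO outranks non-PRO data
    return merged


# Placeholder data, not real identifiers or credits.
example = merge_waterfall({
    "spotify": {"title": "Example Song", "writers": ["A. Writer"]},
    "musicbrainz": {"iswc": "ISWC-PLACEHOLDER", "title": "Example Song (Live)"},
    "prs": {"writers": ["Alice Writer", "Bob Composer"]},
})
```

Here the title stays with the first source that supplied it, the ISWC is filled in by a later source, and the writer list ends up attributed to the PRO. A production pipeline would presumably also keep the displaced values so that the conflict resolution stage can see them.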

Conflict resolution

When sources provide contradictory information for the same field (for example, different writer splits from PRS and MLC), the conflict resolution process follows a defined sequence:

Detection: Conflicts are detected automatically by comparing field values across all sources that returned data for a track. Any divergence in a rights-critical field triggers a conflict record.
Scoring: Each source carries a reliability weight for each field type; PRO databases, for example, are weighted more heavily than streaming platform metadata for writer and publisher information. Each conflicting value receives a confidence score that reflects both the source's general reliability and the specificity of the match.
Algorithmic resolution: Where one source's confidence clearly exceeds another's (for example, a PRO's own registration data versus a streaming service's inferred credits), the higher-confidence value is selected automatically. The resolution is logged with the confidence differential and the rule applied.
Human review: Where confidence is ambiguous or the stakes are high (for example, conflicting writer share allocations with similar confidence from two PROs), the conflict is flagged for operator review and no automatic resolution is applied.
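The sequence above can be sketched in a few lines of Python. Everything numeric here is an assumption made for illustration: the reliability weights, the default weight for unknown sources, the `MARGIN` threshold, and the choice to compute confidence as weight times match specificity are not taken from TrackForge's methodology. The "high stakes" trigger for human review is not modelled.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical reliability weights per (source, field type).
RELIABILITY = {
    ("prs", "writers"): 0.95,
    ("mlc", "writers"): 0.95,
    ("spotify", "writers"): 0.50,
}
DEFAULT_WEIGHT = 0.30  # assumed weight for any pair not listed above
MARGIN = 0.20          # assumed minimum gap for automatic resolution


@dataclass(frozen=True)
class Candidate:
    source: str
    field_name: str
    value: object
    match_specificity: float  # 0..1, e.g. exact ISRC match vs fuzzy title match

    @property
    def confidence(self) -> float:
        weight = RELIABILITY.get((self.source, self.field_name), DEFAULT_WEIGHT)
        return weight * self.match_specificity


def has_conflict(candidates: List[Candidate]) -> bool:
    """Detection: any divergence between the returned values is a conflict."""
    return len({repr(c.value) for c in candidates}) > 1


def resolve(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the winning candidate, or None to flag for operator review.

    Assumes at least one candidate is supplied.
    """
    ranked = sorted(candidates, key=lambda c: c.confidence, reverse=True)
    if not has_conflict(ranked):
        return ranked[0]
    leader = ranked[0]
    # Runner-up: the best candidate whose value differs from the leader's.
    runner_up = next(c for c in ranked if repr(c.value) != repr(leader.value))
    if leader.confidence - runner_up.confidence >= MARGIN:
        return leader  # algorithmic resolution
    return None        # ambiguous: leave for human review


pro_vs_streaming = resolve([
    Candidate("spotify", "writers", ("A. Writer",), 1.0),
    Candidate("prs", "writers", ("Alice Writer",), 1.0),
])
pro_vs_pro = resolve([
    Candidate("prs", "writers", (("Alice", 50), ("Bob", 50)), 1.0),
    Candidate("mlc", "writers", (("Alice", 60), ("Bob", 40)), 1.0),
])
```

In this sketch the PRO value wins over the streaming credit automatically, while the two PROs with equal confidence produce `None`, which stands in for "flag for operator review".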

All conflicts and their resolutions are recorded in the enrichment conflict log. Each record captures:

The conflicting values from each source
The confidence scores assigned to each
The resolution chosen (algorithmic or operator)
The reasoning or rule applied
The timestamp and, for operator resolutions, the operator identity
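One way to picture such a record is as a small immutable structure that is serialised to a log line. The class name, field names, rule identifier, and JSON-lines format below are hypothetical; only the list of captured items comes from the text above.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass(frozen=True)
class ConflictRecord:
    field_name: str
    values: Dict[str, object]       # conflicting value from each source
    confidences: Dict[str, float]   # confidence score assigned to each
    resolution: str                 # "algorithmic" or "operator"
    rule: str                       # reasoning or rule applied
    timestamp: str                  # ISO 8601, UTC
    operator_id: Optional[str] = None

    def __post_init__(self) -> None:
        if self.resolution not in ("algorithmic", "operator"):
            raise ValueError("resolution must be 'algorithmic' or 'operator'")
        if self.resolution == "operator" and not self.operator_id:
            raise ValueError("operator resolutions must record the operator identity")


# Placeholder values for illustration only.
record = ConflictRecord(
    field_name="writers",
    values={"prs": ["Alice Writer"], "spotify": ["A. Writer"]},
    confidences={"prs": 0.95, "spotify": 0.50},
    resolution="algorithmic",
    rule="pro_registration_over_streaming_credit",
    timestamp=datetime.now(timezone.utc).isoformat(),
)
log_line = json.dumps(asdict(record), sort_keys=True)  # one JSON object per record
```

Validating in `__post_init__` means a record for an operator resolution cannot be created without the operator identity, which mirrors the last item in the list.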

Operator verification (B2B only)

For B2B certification engagements, enrichment is not fully automated. A trained operator performs the following verification steps before a track proceeds to Golden Record evaluation:

Completeness review: confirms that all fields expected from the enrichment waterfall have been populated or explicitly marked as unavailable
Writer/publisher cross-check: validates writer and publisher information against PRO databases, checking IPI numbers, name spellings, and role assignments
ISRC-to-ISWC linkage validation: confirms that recording-to-composition linkages are correct, catching cases where a recording has been linked to the wrong underlying work
Conflict review: reviews and approves all algorithmically resolved conflicts, with the ability to override any automatic resolution
Golden Record threshold confirmation: confirms that the track meets all qualification criteria
Certification approval: formally approves the track for certification

All operator actions are logged with timestamp, operator identity, and the action taken. This audit trail is included in the certification proof bundle and is available for review.
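The verification steps and their audit trail could be modelled roughly as follows. This is a sketch under stated assumptions: the step identifiers, the `VerificationSession` class, and the track and operator IDs are hypothetical, and enforcing the steps strictly in the listed order is an assumption rather than something the methodology states.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List

# The six steps, in the order listed above (identifiers are hypothetical).
STEPS = (
    "completeness_review",
    "writer_publisher_cross_check",
    "isrc_iswc_linkage_validation",
    "conflict_review",
    "golden_record_threshold_confirmation",
    "certification_approval",
)


@dataclass
class VerificationSession:
    track_id: str
    audit_trail: List[dict] = field(default_factory=list)

    def complete(self, step: str, operator_id: str) -> None:
        """Log a completed step; assumes steps must follow the listed order."""
        done = len(self.audit_trail)
        expected = STEPS[done] if done < len(STEPS) else None
        if step != expected:
            raise ValueError(f"expected step {expected!r}, got {step!r}")
        self.audit_trail.append({
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "operator": operator_id,
            "action": step,
        })

    @property
    def approved(self) -> bool:
        # The track proceeds only once every step has been logged.
        return len(self.audit_trail) == len(STEPS)


session = VerificationSession("track-001")
for step_name in STEPS:
    session.complete(step_name, operator_id="op-17")
```

Each entry in `audit_trail` carries the three items named above (timestamp, operator identity, action), so the list could be attached to a proof bundle as-is.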

Enrichment and rating separation

The enrichment pipeline described on this page is structurally independent from TrackForge's Rating Oracle. Enrichment modifies data (adding ISWCs, resolving writers, filling metadata gaps). Rating reads a snapshot of the resulting data and produces a deterministic, read-only assessment. There is no code path from the rating engine back to the enrichment pipeline — the data flow is one-directional. This separation is enforced at the application, database, and transaction levels. See Rating Independence for the full technical details.
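At the application level, the one-directional flow can be illustrated with a read-only snapshot and a pure rating function. This is a toy sketch, not the Rating Oracle: `take_snapshot`, `rate`, the three required fields, and the scoring formula are hypothetical, and it says nothing about the database-level or transaction-level enforcement mentioned above.

```python
import copy
import hashlib
import json
from types import MappingProxyType
from typing import Mapping


def take_snapshot(enriched: dict) -> Mapping[str, object]:
    """Deep-copy the enriched record and expose it through a read-only view.

    Note: MappingProxyType blocks top-level assignment only; nested lists
    inside the copy are still mutable Python objects.
    """
    return MappingProxyType(copy.deepcopy(enriched))


def rate(snapshot: Mapping[str, object]) -> dict:
    """Toy deterministic rating that depends only on the snapshot contents."""
    required = ("isrc", "iswc", "writers")  # hypothetical criteria
    present = sum(1 for name in required if snapshot.get(name))
    digest = hashlib.sha256(
        json.dumps(dict(snapshot), sort_keys=True).encode("utf-8")
    ).hexdigest()
    return {"score": present / len(required), "snapshot_sha256": digest}


# Placeholder values, not real identifiers.
enriched = {"isrc": "ISRC-PLACEHOLDER", "iswc": None, "writers": ["Alice Writer"]}
snapshot = take_snapshot(enriched)
first = rate(snapshot)

# Later enrichment changes the working record but not the rated snapshot.
enriched["iswc"] = "ISWC-PLACEHOLDER"
second = rate(snapshot)
```

Because `rate` receives only the snapshot and has no handle on the enrichment data, repeating the call yields the same result, and any attempt to assign into the snapshot raises a `TypeError`.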

Related methodology pages

Certification Tiers: The completeness criteria a track must meet after enrichment
Canonicalisation: How enriched metadata is transformed into a deterministic canonical form
Re-Certification: How metadata changes (including new enrichment data) trigger re-certification