Version: v2.0

Enrichment & Cross-Source Validation

Methodology Version 2.0 — Effective February 2026

Before a track can be evaluated against the Golden Record criteria, it must pass through TrackForge's multi-source enrichment pipeline. This pipeline consults authoritative music metadata databases in a defined priority order, detects conflicts between sources, and resolves them either algorithmically or through human review.

Source waterfall

Each track passes through a waterfall of authoritative sources. Sources are consulted in priority order, with later sources filling data gaps left by earlier ones:

| Priority | Source | Data Provided |
|----------|--------|---------------|
| 1 | Spotify | Track metadata, artist information, album context, ISRC validation |
| 2 | MusicBrainz | ISWC linkages, recording/release relationships, artist credits |
| 3 | Discogs | Release metadata, label information, production credits |
| 4 | Last.fm | Artist metadata, tag-based genre classification |
| 5 | PRS for Music | Writer and publisher data, IPI numbers, work registrations |
| 6 | MLC (Mechanical Licensing Collective) | US mechanical rights data, writer share allocations |
| 7 | Additional PROs | GEMA, SACEM, JASRAC, and other societies where available |

The waterfall consults the most broadly available sources first (streaming platform metadata), then progressively more specialised and authoritative ones (collection society databases). Consultation order is therefore not a ranking of authority: writer and publisher information from the PROs at priorities 5 to 7 is treated as higher-authority than equivalent data from streaming services.
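The gap-filling behaviour can be illustrated with a minimal Python sketch. It is not TrackForge's implementation: the source keys, the `fetchers` mapping, and the `run_waterfall` function are hypothetical names chosen for illustration, and each fetcher stands in for whatever client actually queries that source.

```python
from typing import Callable, Optional

# Order taken from the priority table; the keys themselves are hypothetical.
SOURCE_ORDER = ["spotify", "musicbrainz", "discogs", "lastfm", "prs", "mlc", "other_pros"]

# A fetcher takes an ISRC and returns whatever fields that source knows about.
Fetcher = Callable[[str], dict[str, Optional[str]]]


def run_waterfall(isrc: str, fetchers: dict[str, Fetcher]) -> tuple[dict, dict]:
    """Consult sources in priority order; later sources only fill gaps.

    Returns (merged, observations). `merged` holds the first non-empty value
    seen for each field. `observations` keeps every value each source
    returned, so that divergences can be detected afterwards.
    """
    merged: dict[str, str] = {}
    observations: dict[str, dict[str, str]] = {}
    for source in SOURCE_ORDER:
        fetch = fetchers.get(source)
        if fetch is None:
            continue  # source not configured or not available for this track
        for field, value in fetch(isrc).items():
            if value in (None, ""):
                continue  # an empty answer does not fill a gap
            observations.setdefault(field, {})[source] = value
            merged.setdefault(field, value)  # keep the earlier source's value
    return merged, observations
```

In this sketch the merged value is only provisional. Keeping the per-source observations alongside it is what allows a higher-authority PRO value to replace a streaming-platform value during conflict resolution.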

Conflict resolution

When sources provide contradictory information for the same field (for example, different writer splits from PRS and MLC), the conflict resolution process follows a defined sequence:

  1. Detection — Conflicts are detected automatically by comparing field values across all sources that returned data for a track. Any divergence in rights-critical fields triggers a conflict record.

  2. Scoring — Each source carries a reliability weight for each field type. PRO databases are weighted more heavily for writer and publisher information than streaming platform metadata. The confidence score reflects both the source's general reliability and the specificity of the match.

  3. Algorithmic resolution — Where one source's confidence clearly exceeds another's (for example, a PRO's own registration data versus a streaming service's inferred credits), the higher-confidence value is selected automatically. The resolution is logged with the confidence differential and the rule applied.

  4. Human review — Where confidence is ambiguous or the stakes are high (for example, conflicting writer share allocations with similar confidence from two PROs), the conflict is flagged for operator review. No automatic resolution is applied.
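The four steps above can be sketched as a single function. Everything numeric here is an assumption: the document does not publish reliability weights or the margin at which one confidence "clearly exceeds" another, so `RELIABILITY` and `AUTO_RESOLVE_MARGIN` are placeholder values. The sketch also scores on source reliability alone and leaves out match specificity and any stakes-based escalation.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder per-field reliability weights (assumed, not published values).
RELIABILITY = {
    ("prs", "writers"): 0.95,
    ("mlc", "writers"): 0.93,
    ("spotify", "writers"): 0.50,
}
AUTO_RESOLVE_MARGIN = 0.20  # assumed minimum gap for automatic resolution


@dataclass
class Resolution:
    field: str
    value: Optional[str]  # None when the conflict is left for an operator
    method: str           # "no_conflict", "algorithmic" or "human_review"
    differential: float   # confidence gap between the top two rival values


def weight(source: str, field: str) -> float:
    return RELIABILITY.get((source, field), 0.0)


def resolve(field: str, values: dict[str, str]) -> Resolution:
    # 1. Detection: any divergence between sources counts as a conflict.
    if len(set(values.values())) <= 1:
        return Resolution(field, next(iter(values.values()), None), "no_conflict", 0.0)
    # 2. Scoring: rank sources by their reliability weight for this field.
    ranked = sorted(values, key=lambda s: weight(s, field), reverse=True)
    best = ranked[0]
    # Compare against the strongest source that actually disagrees.
    rival = next(s for s in ranked[1:] if values[s] != values[best])
    gap = weight(best, field) - weight(rival, field)
    # 3. Algorithmic resolution: only when the gap is wide enough.
    if gap >= AUTO_RESOLVE_MARGIN:
        return Resolution(field, values[best], "algorithmic", gap)
    # 4. Human review: no value is chosen automatically.
    return Resolution(field, None, "human_review", gap)
```

With these placeholder weights, a PRS value that disagrees with a Spotify value is resolved automatically in favour of PRS, while a PRS value that disagrees with an MLC value is left unresolved and flagged for review.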

All conflicts and their resolutions are recorded in the enrichment conflict log. Each record captures:

  • The conflicting values from each source
  • The confidence scores assigned to each
  • The resolution chosen (algorithmic or operator)
  • The reasoning or rule applied
  • The timestamp and, for operator resolutions, the operator identity
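One possible shape for such a record is sketched below. The class name, field names, and JSON-lines serialisation are assumptions made for illustration. Only the list of captured items comes from the text above.

```python
import json
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ConflictLogRecord:
    track_id: str                  # hypothetical track identifier
    field_name: str
    values: dict[str, str]         # conflicting value reported by each source
    confidences: dict[str, float]  # confidence score assigned to each source
    resolution: str                # "algorithmic" or "operator"
    chosen_value: Optional[str]
    rule: str                      # reasoning or rule applied
    timestamp: str                 # ISO 8601, UTC
    operator_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Operator resolutions must identify who made the decision.
        if self.resolution == "operator" and not self.operator_id:
            raise ValueError("operator resolutions must record the operator identity")


def now_utc() -> str:
    return datetime.now(timezone.utc).isoformat()


def to_json_line(record: ConflictLogRecord) -> str:
    """Serialise one record as a single JSON line for an append-only log."""
    return json.dumps(asdict(record), sort_keys=True)
```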

Operator verification (B2B only)

For B2B certification engagements, enrichment is not fully automated. A trained operator performs the following verification steps before a track proceeds to Golden Record evaluation:

  1. Completeness review — confirms that all fields expected from the enrichment waterfall have been populated or explicitly marked as unavailable
  2. Writer/publisher cross-check — validates writer and publisher information against PRO databases, checking IPI numbers, name spellings, and role assignments
  3. ISRC-to-ISWC linkage validation — confirms that recording-to-composition linkages are correct, catching cases where a recording has been linked to the wrong underlying work
  4. Conflict review — reviews and approves all algorithmically-resolved conflicts, with the ability to override any automatic resolution
  5. Golden Record threshold confirmation — confirms that the track meets all qualification criteria
  6. Certification approval — formally approves the track for certification

All operator actions are logged with timestamp, operator identity, and the action taken. This audit trail is included in the certification proof bundle and is available for review.
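As a rough illustration, the verification steps and their audit trail could be modelled as follows. The step identifiers and the `VerificationAudit` class are hypothetical. The rule that certification approval is refused until every earlier step has at least one logged action is an assumption of this sketch, not a documented behaviour.

```python
from datetime import datetime, timezone

# Step identifiers derived from the numbered list above (names are hypothetical).
VERIFICATION_STEPS = (
    "completeness_review",
    "writer_publisher_cross_check",
    "isrc_iswc_linkage_validation",
    "conflict_review",
    "golden_record_threshold_confirmation",
    "certification_approval",
)


class VerificationAudit:
    """Collects operator actions for one track as an append-only list."""

    def __init__(self, track_id: str) -> None:
        self.track_id = track_id
        self.entries: list[dict] = []

    def outstanding(self) -> list[str]:
        """Steps that have no logged action yet, in their listed order."""
        done = {entry["step"] for entry in self.entries}
        return [step for step in VERIFICATION_STEPS if step not in done]

    def record(self, step: str, operator_id: str, action: str) -> None:
        if step not in VERIFICATION_STEPS:
            raise ValueError(f"unknown verification step: {step}")
        # Assumed gate: approval requires every other step to be logged first.
        if step == "certification_approval" and any(s != step for s in self.outstanding()):
            raise RuntimeError("earlier verification steps are incomplete")
        self.entries.append({
            "step": step,
            "operator_id": operator_id,
            "action": action,
            "timestamp": datetime.now(timezone.utc).isoformat(),
        })
```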

Enrichment and rating separation

The enrichment pipeline described on this page is structurally independent from TrackForge's Rating Oracle. Enrichment modifies data (adding ISWCs, resolving writers, filling metadata gaps). Rating reads a snapshot of the resulting data and produces a deterministic, read-only assessment. There is no code path from the rating engine back to the enrichment pipeline — the data flow is one-directional. This separation is enforced at the application, database, and transaction levels. See Rating Independence for the full technical details.
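The application-level part of this idea can be sketched in a few lines, under stated assumptions. `take_snapshot`, `rate`, and `REQUIRED_FIELDS` are hypothetical names, and the scoring rule is a placeholder rather than the Rating Oracle's actual criteria. The database and transaction level enforcement mentioned above is outside the scope of this sketch.

```python
import copy
from types import MappingProxyType
from typing import Mapping

# Placeholder field list; the real rating criteria are defined elsewhere.
REQUIRED_FIELDS = ("isrc", "iswc", "writers")


def take_snapshot(enriched: dict) -> Mapping:
    """Deep-copy the enriched record and wrap it in a read-only view.

    The copy isolates the snapshot from later enrichment writes. The proxy
    blocks top-level assignment only; nested containers would need their
    own immutable types for full protection.
    """
    return MappingProxyType(copy.deepcopy(enriched))


def rate(snapshot: Mapping) -> float:
    """Pure function of the snapshot: same input, same score, no writes."""
    present = sum(1 for name in REQUIRED_FIELDS if snapshot.get(name))
    return present / len(REQUIRED_FIELDS)
```

Because `rate` receives only the snapshot and holds no reference to the enrichment side, nothing it does can alter the enriched record, and enrichment that runs after the snapshot is taken cannot change an existing rating.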

  • Certification Tiers — The completeness criteria a track must meet after enrichment
  • Canonicalisation — How enriched metadata is transformed into a deterministic canonical form
  • Re-Certification — How metadata changes (including new enrichment data) trigger re-certification