Jason Capshaw · Systems · B2B Distribution

Industrial distributor · $400M · Catalog · Data · Search

Rebuilding a catalog that could actually be searched

Moved from seven fragmented supplier feeds to a single canonical catalog — without freezing the business.

SKUs normalized: 420k
Feed pipelines: 7→1
Publish latency: 18h→90s
Search conversion: +38%

Period: 2023 to 2024 · 14 months

Stack: Postgres · Python · Algolia · Custom MDM

Problem

Seven supplier feeds with incompatible taxonomies, no canonical attributes, and no single source of truth for product identity.

Merch, digital, and operations each maintained their own spreadsheets; the storefront was a lagging index of whoever last updated what.

Approach

Started with identity — made SKU canonicalization a platform primitive, not a data-cleanup project.
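
To make the idea of identity as a primitive concrete, here is a minimal sketch, not the production implementation. It assumes a product can be identified by its manufacturer plus a normalized manufacturer part number, which is an illustrative rule only. The names `IdentityRegistry`, `normalize_part_number`, and the `SKU-000001` id format are hypothetical.

```python
import re
from dataclasses import dataclass, field


def normalize_part_number(raw: str) -> str:
    """Uppercase and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", raw.upper())


@dataclass
class IdentityRegistry:
    """Maps supplier-specific SKUs onto one canonical SKU id (hypothetical model)."""

    _by_key: dict = field(default_factory=dict)    # (manufacturer, mpn) -> canonical id
    _aliases: dict = field(default_factory=dict)   # (supplier, supplier_sku) -> canonical id

    def resolve(self, supplier: str, supplier_sku: str, manufacturer: str, mpn: str) -> str:
        # Assumed identity rule: same manufacturer + same normalized part number
        # means same product, regardless of which feed it arrived on.
        key = (manufacturer.strip().upper(), normalize_part_number(mpn))
        canonical = self._by_key.setdefault(key, f"SKU-{len(self._by_key) + 1:06d}")
        self._aliases[(supplier, supplier_sku)] = canonical
        return canonical

    def lookup(self, supplier: str, supplier_sku: str):
        """Return the canonical id for a supplier SKU, or None if unseen."""
        return self._aliases.get((supplier, supplier_sku))


registry = IdentityRegistry()
first = registry.resolve("supplier_a", "A-100", "Parker", "4f-mx 12")
second = registry.resolve("supplier_b", "9981", " parker ", "4FMX12")
print(first == second)  # both feeds resolve to the same canonical SKU
```

The point of the sketch is that every downstream system asks the registry for identity instead of deriving it from its own spreadsheet.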

Rolled out changes department by department rather than in a big-bang cutover, so each team could validate against its own workflow.

Built a review queue and audit trail from day one — the data story only holds if people can trust it.
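
A review queue paired with an audit trail might look roughly like the sketch below. It is an in-memory illustration under assumed names (`ProposedChange`, `ReviewQueue`, the audit entry fields); a real system would presumably persist both in Postgres. Every decision, approved or rejected, leaves an audit entry with the old value, the proposed value, the reviewer, and a timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ProposedChange:
    sku: str
    attribute: str
    new_value: str
    source: str  # which feed or person proposed the change (hypothetical field)


class ReviewQueue:
    """Holds proposed catalog changes until a reviewer decides on them."""

    def __init__(self, catalog: dict):
        self.catalog = catalog   # sku -> {attribute: value}
        self.pending = []        # changes awaiting review
        self.audit_log = []      # append-only record of every decision

    def submit(self, change: ProposedChange) -> None:
        self.pending.append(change)

    def decide(self, change: ProposedChange, reviewer: str, approved: bool) -> None:
        self.pending.remove(change)
        old_value = self.catalog.get(change.sku, {}).get(change.attribute)
        if approved:
            self.catalog.setdefault(change.sku, {})[change.attribute] = change.new_value
        # Rejections are logged too, so the trail shows what was not applied.
        self.audit_log.append({
            "sku": change.sku,
            "attribute": change.attribute,
            "old_value": old_value,
            "proposed_value": change.new_value,
            "source": change.source,
            "reviewer": reviewer,
            "approved": approved,
            "decided_at": datetime.now(timezone.utc).isoformat(),
        })
```

Logging the old value alongside the proposed one is what lets someone later answer "who changed this, and from what?" without reconstructing history from feeds.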

Hindsight

I'd invest earlier in supplier-facing tooling. By the time the canonical model stabilized, we were still chasing upstream quality issues we could have caught at ingest.
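
As an illustration of catching upstream issues at ingest, a feed validator could split rows into accepted and rejected before anything touches the canonical model. This is a sketch under assumptions: the required field names and the price rule are hypothetical, not the project's actual feed contract.

```python
# Hypothetical minimum contract for a supplier feed row.
REQUIRED_FIELDS = ("supplier_sku", "manufacturer", "mpn", "description")


def validate_row(row: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the row is clean."""
    problems = []
    for name in REQUIRED_FIELDS:
        if not str(row.get(name) or "").strip():
            problems.append(f"missing {name}")
    price = row.get("list_price")
    if price is not None:
        try:
            if float(price) <= 0:
                problems.append("list_price must be positive")
        except (TypeError, ValueError):
            problems.append("list_price is not a number")
    return problems


def partition_feed(rows):
    """Split feed rows into (accepted, rejected), keeping the reasons for each rejection."""
    accepted, rejected = [], []
    for row in rows:
        problems = validate_row(row)
        if problems:
            rejected.append((row, problems))
        else:
            accepted.append(row)
    return accepted, rejected
```

Returning the reasons with each rejected row is the part that matters for supplier-facing tooling: the supplier can be shown exactly what to fix instead of the issue surfacing later in the catalog.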