Fifty suppliers, fifty formats
A regional grocery chain sources from local farms, national distributors, and specialty importers. A fashion retailer works with domestic manufacturers, overseas factories, and independent designers. A home goods brand buys from artisans, industrial suppliers, and white-label producers.
The common thread is not industry. It is the chaos at the front door.
One supplier sends an Excel file with columns labeled “Item #,” “Desc,” “Whsl Price,” and “Qty Avail.” Another sends a CSV with “SKU,” “Product Description,” “Cost,” and “On Hand.” A third sends a PDF catalog with product details embedded in formatted tables. A fourth has an API that returns JSON, but the field names are abbreviated codes that require a separate reference document to decode.
None of them match your internal catalog structure. All of them need to.
The manual process everyone recognizes
The typical supplier onboarding workflow looks like this:
A buyer negotiates terms with a new supplier. The supplier agrees to provide product data. Someone on the catalog or operations team sends the supplier a template spreadsheet. The supplier fills it out partially, incorrectly, or in a completely different format from the one requested. The operations team downloads the file, opens it, and starts the real work.
Column headers get renamed. Fields get rearranged. Missing data gets flagged and sent back via email. “What does ‘CS’ mean in the pack size column?” “Is this price per unit or per case?” “Your weight column has some entries in pounds and some in kilograms.” These emails go back and forth for days, sometimes weeks.
Once the data is cleaned up enough to import, someone manually copies it into the catalog system or pastes it into an import template. If the system rejects rows, the errors get triaged one by one. The whole process repeats for the next supplier.
For a retailer onboarding five or ten suppliers a year, this is tedious but survivable. For one onboarding fifty, or managing ongoing data updates from existing suppliers, it consumes a full-time headcount that never appears on the org chart.
Why the template strategy fails
The obvious solution is to standardize: give every supplier the same template and require them to use it. In theory, this eliminates format variation. In practice, it fails for three reasons.
First, suppliers do not follow templates. They fill in what they understand, leave blank what they do not, and add columns for data they think is important. A supplier who has been sending product data to retailers for twenty years has their own format. Asking them to restructure their data for your template is asking them to do unpaid work. They will do it badly or not at all.
Second, templates cannot accommodate the range of product types across suppliers. A template designed for apparel does not work for electronics. One designed for packaged food does not work for fresh produce. You either create dozens of category-specific templates (which multiplies the maintenance burden) or create a generic template (which captures nothing well).
Third, templates freeze the schema. When your internal catalog structure changes (new fields for sustainability data, updated compliance requirements, additional image specifications), every template needs updating, and every supplier needs re-educating.
The structural alternative
The problem is not that supplier data is messy. It is that the onboarding process treats each supplier’s format as an obstacle to overcome manually, rather than a mapping problem to solve once.
When a supplier sends a spreadsheet, that spreadsheet has a schema, implicit in its column headers, data types, and value patterns. “Whsl Price” is wholesale price. “Qty Avail” is quantity available. “Item #” is the supplier’s product identifier. These mappings are not ambiguous. They are just different from your field names.
A platform that can analyze an incoming file, identify the schema, and generate mappings to your catalog structure turns supplier onboarding from a manual reformatting project into a review-and-approve workflow.
datathere handles this by accepting whatever the supplier sends — CSV, Excel, JSON, XML, or PDF. The AI examines the file structure, identifies how source fields relate to your destination catalog schema, and generates mappings with confidence scores. A catalog manager reviews the proposed mappings, adjusts any the AI got wrong, and certifies. From that point forward, every file from that supplier processes through those certified mappings automatically.
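To make the idea concrete, here is a minimal sketch of header-to-field mapping with confidence scores. It is not datathere's implementation; the synonym table and field names are hypothetical, and the fallback uses plain string similarity where a real system would look at data types and value patterns as well.

```python
from difflib import SequenceMatcher

# Hypothetical synonym table: supplier header variants -> internal field names.
SYNONYMS = {
    "whsl price": "wholesale_price",
    "cost": "wholesale_price",
    "qty avail": "quantity_available",
    "on hand": "quantity_available",
    "item #": "supplier_sku",
    "sku": "supplier_sku",
    "desc": "description",
    "product description": "description",
}

INTERNAL_FIELDS = ["wholesale_price", "quantity_available",
                   "supplier_sku", "description"]

def propose_mapping(header: str) -> tuple[str, float]:
    """Map one supplier column header to an internal field, with a confidence score."""
    key = header.strip().lower()
    if key in SYNONYMS:  # known variant: full confidence
        return SYNONYMS[key], 1.0
    # Unknown header: fall back to fuzzy similarity against internal field names.
    best = max(INTERNAL_FIELDS,
               key=lambda f: SequenceMatcher(None, key, f).ratio())
    return best, round(SequenceMatcher(None, key, best).ratio(), 2)

for h in ["Whsl Price", "Qty Avail", "Item #", "Desc"]:
    field, conf = propose_mapping(h)
    print(f"{h!r} -> {field} (confidence {conf})")
```

Low-confidence proposals are exactly the ones a catalog manager would correct during review; once certified, the mapping is just a lookup applied to every subsequent file from that supplier.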
For PDF catalogs, and for suppliers who only have print-formatted product sheets, AI vision extraction pulls structured data from formatted tables, product grids, and specification blocks. The extracted data feeds into the same mapping workflow as any other format.
Quality enforcement at the front door
Getting supplier data into a consistent format solves the structure problem. It does not solve the quality problem.
Supplier data has gaps. Products missing weight. Descriptions truncated at 20 characters. Prices that look like they might be in the wrong currency. UPC codes that do not pass check-digit validation.
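The UPC problem, at least, has a mechanical test: the twelfth digit of a UPC-A code is a check digit computed from the first eleven under the standard GS1 algorithm. A short sketch of that validation:

```python
def upc_check_digit_valid(upc: str) -> bool:
    """Validate a 12-digit UPC-A code against its check digit (GS1 algorithm)."""
    if len(upc) != 12 or not upc.isdigit():
        return False
    digits = [int(c) for c in upc]
    # Digits in odd positions (1st, 3rd, ..., 11th) are weighted 3,
    # digits in even positions (2nd, ..., 10th) are weighted 1.
    total = sum(d * 3 for d in digits[0:11:2]) + sum(digits[1:11:2])
    # The total plus the check digit must be a multiple of 10.
    return (total + digits[11]) % 10 == 0

print(upc_check_digit_valid("036000291452"))  # a valid UPC-A -> True
print(upc_check_digit_valid("036000291453"))  # wrong check digit -> False
```

A failing check does not tell you which digit is wrong, only that the code cannot be genuine as entered — which is exactly the kind of error visual inspection misses.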
The manual approach catches some of these on visual inspection. It misses more than it catches, especially under time pressure. The errors surface later: a product listed online with no image, a pricing error that costs margin, a missing allergen declaration that creates liability.
Quality enforcement rules applied during the mapping process catch these problems at ingestion. A product record missing a required field can be quarantined until the supplier provides the data, flagged for review, or set to stop the job entirely, depending on the field’s criticality. The rules are defined once per destination schema and applied consistently to every supplier’s data.
This shifts the burden from the catalog team finding and fixing quality issues to the system preventing them from entering the catalog in the first place.
What changes when onboarding scales
The difference between onboarding supplier number three and supplier number fifty should be negligible. The catalog structure does not change. The quality requirements do not change. The only thing that changes is the source format, and that is exactly what a mapping platform handles.
A new supplier sends their product file. The platform generates mappings. A catalog manager reviews and certifies, correcting any misidentified fields. The supplier’s data flows into the unified catalog. Total elapsed time: hours, not weeks.
When the supplier updates their data (new products, price changes, discontinued items), the same certified mappings apply. No manual reformatting. No template re-education. The supplier sends data the way they always have, and the platform translates it into the structure your catalog requires.
For retailers growing their supplier base, this is the difference between supplier onboarding as a bottleneck and supplier onboarding as a non-event. The buying team negotiates the relationship. The operations team approves the mappings. The catalog stays clean. Nobody spends their week renaming spreadsheet columns.