datathere

Payment Processor and Fintech Partner Integration

Mert Uzunogullari

One transaction, four versions of the truth

A cardholder taps their card at a terminal. Within seconds, an authorization request travels from the point-of-sale system through an acquirer, a card network, and an issuing bank. The authorization response returns with an approval code, a reference number, and a set of metadata fields.

By the time this single transaction is represented in the acquirer’s settlement file, the card network’s clearing record, the issuer’s posting data, and the merchant’s reconciliation report, it exists in four different formats with four different field structures. The approval code is auth_code in one system, authorization_id in another, APPROVAL_CD in a third, and nested inside an XML element called <AuthResponse><Code> in the fourth.
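A normalization layer's first job is absorbing exactly this kind of variation. The sketch below is illustrative, assuming the four field shapes named above; the function and alias names are inventions for the example, not any real processor's API.

```python
# Hypothetical sketch: pulling one canonical approval code out of four
# processor-specific representations. Alias names mirror the examples above.
import xml.etree.ElementTree as ET

CANONICAL_ALIASES = {
    "auth_code": "authorization_code",
    "authorization_id": "authorization_code",
    "APPROVAL_CD": "authorization_code",
}

def extract_approval_code(record, fmt):
    """Return the approval code from a processor record, whatever its shape."""
    if fmt == "xml":
        # The fourth system nests the code inside <AuthResponse><Code>.
        return ET.fromstring(record).findtext("Code")
    for source_name in CANONICAL_ALIASES:
        if source_name in record:
            return record[source_name]
    return None

code_a = extract_approval_code({"auth_code": "A1B2C3"}, "flat")
code_b = extract_approval_code(
    "<AuthResponse><Code>X9Y8</Code></AuthResponse>", "xml")
```

One flat accessor per concept, rather than per-processor branches scattered through downstream code, is the pattern the rest of this post builds on.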

This is one transaction from one card tap. A financial institution processing transactions across multiple processors, acquirers, and networks deals with these field-level inconsistencies at a scale of millions of records per day.

The field variation problem

The inconsistencies go deeper than naming. The same conceptual data point gets represented in structurally different ways across processors.

Authorization codes appear as alphanumeric strings of varying lengths. One processor left-pads with zeros to six characters. Another uses variable-length codes up to twelve characters. A third includes the authorization source as a prefix (“E” for electronic, “V” for voice) while others store the source in a separate field.
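Reconciling those three conventions might look like the following sketch. The processor labels and the exact padding/prefix rules are assumptions drawn from the description above, not real processor specifications.

```python
# Illustrative normalization of three authorization-code conventions:
#   processor A left-pads with zeros, B uses variable-length codes,
#   C prefixes the authorization source ("E" electronic, "V" voice).
def normalize_auth_code(code: str, processor: str):
    """Return (canonical_code, auth_source) for a raw authorization code."""
    if processor == "A":                       # zero-padded to six characters
        return code.lstrip("0") or "0", None
    if processor == "B":                       # variable length, up to twelve
        return code, None
    if processor == "C":                       # source encoded as a prefix
        source = {"E": "electronic", "V": "voice"}.get(code[:1])
        return (code[1:], source) if source else (code, None)
    raise ValueError(f"unknown processor: {processor}")
```

Note that the source flag, a separate field elsewhere, is recovered here from the prefix, so the canonical record carries it consistently regardless of origin.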

Merchant attributes fragment differently across systems. One processor provides a single merchant_id field. Another separates the merchant identifier from the terminal identifier. A third nests merchant data inside a hierarchy: merchant group, merchant, location, terminal. The MCC (Merchant Category Code) might be a four-digit code in one system, a description string in another, and missing entirely from a third because they use a proprietary categorization.

Timestamps arrive in different formats and time zones. UTC with milliseconds. Local time with no timezone indicator. Unix epoch timestamps in seconds. Unix epoch timestamps in milliseconds. Date and time in separate fields. Date and time in a single field with varying delimiter characters. A settlement file from a processor on the West Coast using Pacific time needs different handling than one from a processor reporting in UTC, and the distinction may not be documented.
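Folding those variants into UTC is mechanical once the per-processor format is known; the hard part is that, as noted, the format often is not documented. A minimal sketch, with the format labels as assumptions:

```python
# Hedged sketch: converting the timestamp variants described above into
# timezone-aware UTC datetimes. The fmt labels are invented for the example.
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

def to_utc(value: str, fmt: str) -> datetime:
    if fmt == "epoch_s":                 # Unix epoch in seconds
        return datetime.fromtimestamp(int(value), tz=timezone.utc)
    if fmt == "epoch_ms":                # Unix epoch in milliseconds
        return datetime.fromtimestamp(int(value) / 1000, tz=timezone.utc)
    if fmt == "iso_utc":                 # UTC with milliseconds and offset
        return datetime.fromisoformat(value).astimezone(timezone.utc)
    if fmt == "local_pacific":           # local time, no timezone indicator
        naive = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
        return naive.replace(
            tzinfo=ZoneInfo("America/Los_Angeles")).astimezone(timezone.utc)
    raise ValueError(f"unknown timestamp format: {fmt}")
```

The `local_pacific` branch is where the undocumented-timezone problem bites: attaching the wrong zone shifts every settlement record by hours, which is why the format assignment itself is something an analyst should certify per feed.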

Transaction codes follow different encoding schemes. Processor A uses two-character codes: “SA” for sale, “RF” for refund, “RV” for reversal. Processor B uses numeric codes: “01” for purchase, “02” for refund, “03” for void. Processor C uses descriptive strings: “PURCHASE,” “CREDIT,” “VOID.” Mapping these to a unified transaction type taxonomy is foundational work that gets repeated for every new processor integration.
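That foundational mapping work can be captured as data rather than code, so adding a processor means adding a table. The taxonomy names below are assumptions for illustration; the source codes mirror the examples above.

```python
# Per-processor transaction-code tables feeding one unified taxonomy.
TXN_TYPE_MAPS = {
    "processor_a": {"SA": "sale", "RF": "refund", "RV": "reversal"},
    "processor_b": {"01": "sale", "02": "refund", "03": "void"},
    "processor_c": {"PURCHASE": "sale", "CREDIT": "refund", "VOID": "void"},
}

def unify_txn_type(processor: str, code: str) -> str:
    try:
        return TXN_TYPE_MAPS[processor][code]
    except KeyError:
        # Unmapped codes are surfaced for review rather than silently guessed.
        raise ValueError(f"unmapped transaction code {code!r} for {processor}")
```

Raising on an unmapped code, instead of defaulting, keeps new or changed processor codes visible instead of miscategorized.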

Lifecycle inconsistencies across processors

A payment transaction is not a single event. It is a lifecycle: authorization, capture, clearing, settlement, funding. Each processor represents this lifecycle differently.

Settlement structures follow different sequences. Some processors provide a single settlement file that contains all transaction types (purchases, refunds, chargebacks, fees) in one flat structure with a type indicator. Others separate these into distinct files or feeds. Others use a header-detail-trailer format where the header defines the batch, detail records contain individual transactions, and the trailer contains batch totals.
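The header-detail-trailer layout also carries a built-in integrity check: the trailer totals should reconcile against the detail records. A sketch, with the record-type prefixes and field positions invented for the example:

```python
# Illustrative parser for a header-detail-trailer settlement batch.
# The "H|", "D|", "T|" layout is hypothetical, not any real processor's format.
def parse_batch(lines):
    batch = {"header": None, "details": [], "trailer": None}
    for line in lines:
        kind, _, rest = line.partition("|")
        if kind == "H":                        # header defines the batch
            batch["header"] = {"batch_id": rest}
        elif kind == "D":                      # detail records: type, amount
            txn_type, amount = rest.split("|")
            batch["details"].append({"type": txn_type, "amount": float(amount)})
        elif kind == "T":                      # trailer carries batch totals
            batch["trailer"] = {"total": float(rest)}
    # Verify trailer totals against detail records on ingest.
    assert batch["trailer"]["total"] == sum(d["amount"] for d in batch["details"])
    return batch

sample = ["H|20240301-001", "D|SA|100.00", "D|RF|-25.00", "T|75.00"]
batch = parse_batch(sample)
```

Flat single-file formats skip the trailer check entirely, which is one more structural difference the ingestion layer has to know about per processor.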

Fee structures embed differently. One processor deducts fees from the settlement amount, providing a net figure. Another reports gross settlement and fees separately. A third breaks fees into interchange, assessment, processor markup, and per-transaction charges across multiple records. Reconciling fee data across processors requires understanding not just the field mapping but the accounting model behind each processor’s reporting.
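One way to make the three accounting models comparable is to normalize every settlement record to the same gross/fees/net triple, leaving unknowable values explicitly empty. The field names and model labels below are assumptions:

```python
# Hedged sketch: normalizing the fee-reporting models described above to a
# common (gross, fees, net) shape. Record field names are illustrative.
def normalize_settlement(record: dict, model: str) -> dict:
    if model == "net_only":            # fees pre-deducted; gross/fees unreported
        return {"gross": None, "fees": None,
                "net": record["settlement_amount"]}
    if model == "gross_plus_fees":     # gross and fees reported separately
        gross, fees = record["gross_amount"], record["fees"]
        return {"gross": gross, "fees": fees, "net": gross - fees}
    if model == "itemized":            # interchange, assessment, markup, etc.
        fees = sum(record["fee_items"].values())
        return {"gross": record["gross_amount"], "fees": fees,
                "net": record["gross_amount"] - fees}
    raise ValueError(f"unknown fee model: {model}")
```

The `net_only` branch makes the accounting-model point concrete: no field mapping can recover gross and fee figures the processor never reports, so cross-processor fee analysis has to account for what each model can and cannot provide.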

Batch formats vary in granularity. Some processors batch by day. Others batch by merchant. Others batch by merchant and terminal and day. The batch identifier format (numeric sequence, date-based, UUID) differs per processor. Settlement reconciliation depends on matching batches correctly, and batch-matching logic is processor-specific.

Dispute and chargeback data follows yet another set of conventions. Reason codes differ across card networks (Visa’s reason codes are not Mastercard’s). Processors relay this data with varying levels of detail and timeliness. Some provide chargeback records inline with settlement data. Others deliver them through a separate reporting channel.

The ISO 20022 and SWIFT challenge

Financial institutions increasingly receive payment data in structured message formats, notably ISO 20022 (particularly CAMT cash management messages) and SWIFT MT messages. These introduce their own integration complexity.

ISO 20022 CAMT messages use deeply nested XML structures. A CAMT.053 bank statement contains entries, which contain transaction details, which contain parties, which contain account identifiers, which contain proprietary codes. Extracting a flat transaction record from this hierarchy requires navigating multiple levels of nesting and handling optional elements that may or may not be present depending on the originating institution.

A single entry in a CAMT.053 might contain nested batch details with individual transaction records inside them. The amount might be at the entry level, the batch level, or the individual transaction level depending on how the originating bank structured the message. Parsing this correctly is not a mapping problem; it is a structural flattening problem that requires understanding the hierarchy.
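The flattening can be sketched as walking the `Ntry` → `NtryDtls` → `TxDtls` hierarchy and preferring the deepest amount available. The fragment below is a heavily trimmed, illustrative slice of a camt.053 message, not a complete or validated one:

```python
# Minimal sketch of flattening one CAMT.053 entry into per-transaction records.
import xml.etree.ElementTree as ET

NS = {"c": "urn:iso:std:iso:20022:tech:xsd:camt.053.001.02"}

CAMT_FRAGMENT = """\
<Ntry xmlns="urn:iso:std:iso:20022:tech:xsd:camt.053.001.02">
  <Amt Ccy="EUR">150.00</Amt>
  <NtryDtls>
    <TxDtls><Refs><EndToEndId>E2E-1</EndToEndId></Refs>
      <Amt Ccy="EUR">100.00</Amt></TxDtls>
    <TxDtls><Refs><EndToEndId>E2E-2</EndToEndId></Refs>
      <Amt Ccy="EUR">50.00</Amt></TxDtls>
  </NtryDtls>
</Ntry>
"""

def flatten_entry(xml_text):
    """Yield one flat record per transaction inside a CAMT.053 entry."""
    entry = ET.fromstring(xml_text)
    entry_amt = entry.findtext("c:Amt", namespaces=NS)
    for tx in entry.findall(".//c:TxDtls", NS):
        yield {
            "end_to_end_id": tx.findtext(".//c:EndToEndId", namespaces=NS),
            # Amount may sit at transaction or entry level; prefer the deeper.
            "amount": tx.findtext("c:Amt", namespaces=NS) or entry_amt,
        }

records = list(flatten_entry(CAMT_FRAGMENT))
```

Even this toy version has to handle the namespace, the optional nesting, and the amount-at-multiple-levels ambiguity; a production flattener additionally deals with optional elements that vary by originating institution.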

SWIFT MT messages (MT940, MT950, MT103) use a field-tagged format with numbered tags (:20:, :25:, :60F:, :61:, :86:), where each tag has specific formatting rules and sub-fields delimited by slashes or by fixed position. The :61: statement line packs the value date, entry date, debit/credit mark, amount, transaction type, reference, and supplementary details into a single field with positional parsing rules. Extracting structured data from MT messages is closer to text parsing than data mapping.
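To make "positional parsing rules" concrete, here is a simplified :61: parser covering the sub-fields named above. It is a sketch of a subset of the format (value date, optional entry date, debit/credit mark, amount, type, reference), not a complete implementation of the SWIFT field specification:

```python
# Illustrative positional parse of an MT940 :61: statement line (simplified).
import re

MT61_RE = re.compile(
    r"^(?P<value_date>\d{6})"      # value date, YYMMDD
    r"(?P<entry_date>\d{4})?"      # optional entry date, MMDD
    r"(?P<dc_mark>R?[DC])"         # debit/credit mark: D, C, RD, RC
    r"(?P<amount>\d+,\d*)"         # amount, comma as decimal separator
    r"(?P<type>[A-Z]\w{3})"        # transaction type, e.g. NTRF
    r"(?P<reference>.+)$"          # customer reference
)

def parse_61(line: str) -> dict:
    m = MT61_RE.match(line)
    if not m:
        raise ValueError(f"unparseable :61: line: {line!r}")
    d = m.groupdict()
    d["amount"] = float(d["amount"].replace(",", "."))
    return d

parsed = parse_61("2403010301D1234,56NTRFNONREF")
```

Seven logical fields packed into one undelimited string, with optional components and a comma decimal separator, is why this sits closer to text parsing than to field mapping.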

datathere processes both ISO 20022 XML and SWIFT MT formats as source types. The AI handles the structural complexity — navigating CAMT nesting to extract flat transaction records, parsing MT field tags into structured fields — and maps the extracted data to the institution’s unified transaction schema.

Turning new integrations into variations

The economics of payment processor integration follow a familiar pattern: the first integration is an engineering project, and every subsequent integration is another engineering project of similar scope. Adding a new processor means understanding their file formats, field definitions, lifecycle conventions, and edge cases. It means writing new parsing logic, new mapping code, new reconciliation rules, and new tests.

A platform approach changes this by separating the parsing from the mapping from the business logic. When a new processor is added, their file format is a source. The unified transaction structure is the destination. The AI generates field mappings (APPROVAL_CD to authorization_code, TXN_TYPE to transaction_type with value transformations), and a payments operations analyst reviews and certifies them.
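In spirit, a certified mapping is data, not code: adding a processor means adding an entry, not writing a parser. The structure below is an assumption about how such a platform might represent mappings, using the field names from the example above:

```python
# Hypothetical sketch: a certified mapping applied as data. The mapping-table
# shape is invented for illustration, not datathere's actual representation.
PROCESSOR_MAPPINGS = {
    "new_processor": {
        "fields": {"APPROVAL_CD": "authorization_code",
                   "TXN_TYPE": "transaction_type",
                   "AMT": "amount"},
        # Value transformations applied after field renaming.
        "value_transforms": {"transaction_type": {"01": "sale", "02": "refund"}},
    },
}

def apply_mapping(processor: str, record: dict) -> dict:
    spec = PROCESSOR_MAPPINGS[processor]
    out = {spec["fields"][k]: v for k, v in record.items()
           if k in spec["fields"]}
    for field, table in spec["value_transforms"].items():
        if field in out:
            out[field] = table.get(out[field], out[field])
    return out

unified = apply_mapping(
    "new_processor", {"APPROVAL_CD": "A1B2", "TXN_TYPE": "01", "AMT": "9.99"})
```

Because the mapping lives in a reviewable table rather than in code, "certify" means an analyst inspecting entries like `APPROVAL_CD → authorization_code`, not reading a parser.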

The structural work (quality enforcement rules, reconciliation logic, downstream feed generation) does not change. It was built against the unified transaction schema, not against any individual processor’s format. The new processor’s data flows through the same pipeline as every other processor once the mappings are certified.

This is the difference between a processor integration taking weeks of engineering time and taking days of operations time. The engineering team builds the pipeline infrastructure once. The operations team extends it to new processors by defining mappings.

What a unified transaction structure enables

When authorization, clearing, settlement, and funding data from all processors maps to a single canonical structure, the downstream possibilities change.

Reconciliation becomes matching records within a consistent schema rather than translating between schemas before matching. Regulatory reporting pulls from one structure instead of aggregating across processor-specific formats. Fraud detection models consume normalized transaction data instead of processor-specific encodings. Fee analysis compares like with like instead of reconciling different fee reporting models.

The financial institutions that have reached this state did not get there by convincing all their processors to standardize. They got there by building a normalization layer that absorbs the variation and produces consistency. The processors will never align; they have different systems, different histories, different incentives. The alignment has to happen inside the institution, and the question is whether it happens through custom code maintained indefinitely or through a mapping platform maintained centrally.