AML Feed Consolidation: Normalizing Transaction Monitoring and Watchlist Data

Mert Uzunogullari

The alert fatigue problem starts with data fragmentation

An AML compliance team at a mid-market bank reviews alerts from four systems every day. The transaction monitoring platform flags unusual patterns: velocity spikes, structuring behavior, round-dollar transactions below reporting thresholds. The sanctions screening engine matches customer and counterparty names against OFAC, EU, and UN watchlists. JPMorgan Access delivers correspondent banking data and payment transparency information for USD clearing. A FinCEN feed provides 314(b) information-sharing responses and regulatory advisories.

Each system generates alerts in its own format, with its own severity model, its own entity identifiers, and its own case reference structure. A single suspicious wire transfer might appear as an alert in the transaction monitoring system, a hit in the sanctions screening engine, and a flagged payment in JPMorgan’s transparency data — three separate signals about the same event, in three different formats, visible in three different interfaces.

The compliance analyst investigating this wire needs to manually correlate these signals. They log into the transaction monitoring platform, find the alert, note the account number and transaction details. They switch to the sanctions screening interface, search for the counterparty name, find the match result. They open JPMorgan Access, locate the payment by reference number, review the originator and beneficiary chain. They copy relevant details into the case management system. Then they start the actual investigation.

The data assembly takes 20 to 40 minutes per case. The investigation itself (applying regulatory knowledge, assessing risk, making a determination) takes 15 minutes for straightforward cases. The ratio is inverted. The compliance team spends more time finding and normalizing data than analyzing it.

Transaction monitoring: every vendor has its own alert model

Transaction monitoring platforms (Actimize, Verafin, SAS, Oracle Financial Crime) each generate alerts with proprietary structures.

Alert severity uses different scales. One platform scores alerts from 0 to 100, where 100 is highest risk. Another uses five tiers: Low, Medium, High, Critical, Escalate. A third uses a numeric priority (1 through 5, where 1 is most urgent). A fourth combines a risk score with a confidence score, producing a two-dimensional classification that does not map cleanly to any single-axis severity model.
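One way to put these four scales on a single axis is to map each onto a 0.0 to 1.0 range. The sketch below is illustrative only: the scale labels (`score_100`, `tier_5`, `priority_5`, `risk_confidence`), the tier ordering, and the choice to collapse the two-dimensional model by multiplying risk and confidence are assumptions, not any vendor's documented behavior.

```python
from typing import Union

# Hypothetical ordering for the five-tier scale; a real vendor may rank tiers differently.
TIERS = ["Low", "Medium", "High", "Critical", "Escalate"]


def normalize_severity(scale: str, value: Union[int, float, str, tuple]) -> float:
    """Map a vendor-specific severity onto 0.0 (lowest) to 1.0 (highest)."""
    if scale == "score_100":        # 0-100, where 100 is highest risk
        return float(value) / 100.0
    if scale == "tier_5":           # five named tiers, lowest to highest
        return TIERS.index(value) / (len(TIERS) - 1)
    if scale == "priority_5":       # 1-5, where 1 is most urgent
        return (5 - int(value)) / 4.0
    if scale == "risk_confidence":  # (risk, confidence), each assumed to be 0.0-1.0
        risk, confidence = value
        # Lossy by design: a single axis cannot carry both dimensions.
        return risk * confidence
    raise ValueError(f"unknown severity scale: {scale}")
```

Because the two-dimensional case loses information when flattened, it is usually worth storing the original vendor value next to the normalized one so an analyst can still see what the source system reported.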

Scenario identifiers follow vendor-specific taxonomies. One platform labels a structuring detection as scenario STR-001. Another calls it UNUSUAL_CASH_PATTERN. A third uses a numeric code 4200 that maps to a description only if you have access to their scenario reference document. Comparing detection rates across platforms requires normalizing these identifiers to a common taxonomy, which typically does not exist until someone builds it.
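A common taxonomy can start as nothing more than a crosswalk table. The sketch below assumes three hypothetical vendors (`vendor_a`, `vendor_b`, `vendor_c`) and a made-up common label, `STRUCTURING`; only the three scenario codes come from the examples above.

```python
# Hypothetical crosswalk from (vendor, vendor scenario code) to a common taxonomy.
# Vendor names and the common label are illustrative assumptions.
SCENARIO_CROSSWALK = {
    ("vendor_a", "STR-001"): "STRUCTURING",
    ("vendor_b", "UNUSUAL_CASH_PATTERN"): "STRUCTURING",
    ("vendor_c", "4200"): "STRUCTURING",
}


def to_common_scenario(vendor: str, code) -> str:
    """Return the common scenario label, or 'UNMAPPED' so gaps stay visible."""
    key = (vendor, str(code).strip())
    return SCENARIO_CROSSWALK.get(key, "UNMAPPED")
```

Returning an explicit `UNMAPPED` value instead of raising keeps new or undocumented vendor codes in the data, where they can be counted and triaged.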

Entity references vary in structure. One platform references the alerted entity by internal customer ID. Another uses account number. A third uses a composite key of customer ID, account number, and branch code. Joining alerts from different platforms to the same customer requires resolving these identifier differences, often with fuzzy matching when exact lookups fail.
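The exact-lookup half of that resolution can be sketched as follows. The field names (`customer_id`, `account_number`) and the `account_index` mapping from account number to customer ID are assumptions for illustration; a `None` result is the point where a fuzzy-matching fallback would take over.

```python
from typing import Dict, Optional


def resolve_customer_id(ref: dict, account_index: Dict[str, str]) -> Optional[str]:
    """Resolve an alert's entity reference to an internal customer ID, or None."""
    # Customer-ID and composite-key references already carry the ID.
    if ref.get("customer_id"):
        return ref["customer_id"]
    # Account-only references need a lookup against an account-to-customer index.
    account = ref.get("account_number")
    if account is None:
        return None
    return account_index.get(account)
```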

Supporting transaction data embeds differently. Some platforms include the triggering transactions inline within the alert record. Others reference transaction IDs that require a separate lookup. Others provide aggregate statistics (total amount, transaction count, time window) without individual transaction details. A compliance analyst building a case needs the actual transactions, and getting them requires different extraction paths depending on the source platform.
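The three extraction paths can be hidden behind one function. This is a minimal sketch: the alert field names (`transactions`, `transaction_ids`) and the `lookup` callable are hypothetical, standing in for whatever each platform actually exposes.

```python
from typing import Callable, List, Optional


def extract_transactions(alert: dict, lookup: Callable[[str], dict]) -> Optional[List[dict]]:
    """Return the triggering transactions, or None when only aggregates exist."""
    if "transactions" in alert:
        # Path 1: transactions are embedded inline in the alert record.
        return list(alert["transactions"])
    if "transaction_ids" in alert:
        # Path 2: the alert holds IDs that need a separate lookup.
        return [lookup(txn_id) for txn_id in alert["transaction_ids"]]
    # Path 3: aggregate statistics only; individual transactions must come from elsewhere.
    return None
```

Returning `None` for the aggregate-only case, instead of an empty list, distinguishes "no detail available from this source" from "the alert had no transactions".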

Sanctions screening: match results that resist comparison

Sanctions and watchlist screening is a particularly acute normalization problem because the consequences of missed matches are severe and the format variation is extreme.

Match confidence uses incompatible models. One screening engine returns a fuzzy match score from 0 to 100. Another returns a categorical match strength: Exact, Strong, Possible, Weak. A third returns a composite score that weights name similarity, date of birth match, and address proximity differently. Determining whether a “72” from one engine is equivalent to a “Strong” from another requires understanding each engine’s scoring methodology, and those methodologies are often proprietary.
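Since the underlying methodologies are not comparable number for number, one pragmatic approach is to map each engine's output into coarse review bands and calibrate the mapping per engine. The sketch below assumes three bands (`high`, `medium`, `low`); the category mapping and the numeric cut-offs of 85 and 60 are placeholders, not values derived from any real engine.

```python
# Placeholder mapping for an engine that reports categorical match strength.
CATEGORY_TO_BAND = {"Exact": "high", "Strong": "high", "Possible": "medium", "Weak": "low"}


def band_from_category(category: str) -> str:
    """Band a categorical match strength; raises KeyError on unknown categories."""
    return CATEGORY_TO_BAND[category]


def band_from_score(score: float, high: float = 85.0, medium: float = 60.0) -> str:
    """Band a 0-100 fuzzy score; the cut-offs are placeholders to calibrate per engine."""
    if not 0 <= score <= 100:
        raise ValueError(f"score out of range: {score}")
    if score >= high:
        return "high"
    if score >= medium:
        return "medium"
    return "low"
```

Under these placeholder cut-offs a score of 72 lands in `medium` while "Strong" lands in `high`, so the two would not be treated as equivalent. Different cut-offs would change that answer, which is why calibration against each engine's reviewed matches matters more than the code.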

Watchlist source identifiers differ. OFAC’s SDN list, the EU Consolidated Sanctions List, the UN Security Council Consolidated List, and national lists from HM Treasury’s Office of Financial Sanctions Implementation (OFSI), SECO, and others each identify sanctioned entities with different reference formats. One screening engine preserves the original list identifiers. Another assigns its own internal IDs and maps them to list entries. A third provides the list name but not the specific entry identifier, requiring manual lookup to verify the match.

Disposition and override records follow different workflows. When an analyst reviews a match and determines it is a false positive, the disposition needs to be recorded. One engine stores dispositions as status codes attached to the match record. Another maintains a separate disposition log linked by match ID. A third requires the disposition to be entered through its UI and does not export it in structured data. For regulatory examiners who need to see that every match was reviewed and dispositioned, the compliance team must reconstruct this audit trail from multiple systems.
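Reconstructing the audit trail comes down to checking that every match ID has a disposition in at least one source. The sketch below assumes each source has already been reduced to a mapping of match ID to disposition, including a manually transcribed mapping for the engine that does not export structured data; all names are hypothetical.

```python
from typing import Dict, Iterable, List


def missing_dispositions(match_ids: Iterable[str], *disposition_sources: Dict[str, str]) -> List[str]:
    """Return match IDs that have no disposition in any of the supplied sources."""
    recorded = set()
    for source in disposition_sources:
        recorded.update(source.keys())
    return sorted(set(match_ids) - recorded)
```

An empty result is the evidence an examiner is looking for: every match was reviewed and dispositioned somewhere.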

Correspondent banking intelligence: JPMorgan Access and SWIFT GPI

Financial institutions that clear USD through JPMorgan, a group that includes most institutions with significant dollar payment volumes, receive transaction-level data through JPMorgan Access (formerly JPM Link and J.P. Morgan Virtual). This data includes payment chain transparency, compliance screening results, and payment status information.

JPMorgan Access payment data provides originator-to-beneficiary chain details that the institution’s own transaction records do not contain. For a correspondent banking relationship, the institution sees its immediate counterparty but not the full payment chain. JPMorgan’s data fills this gap, showing intermediate banks, original ordering parties, and ultimate beneficiaries. This information is critical for AML analysis but arrives in JPMorgan’s proprietary format, which does not match the institution’s internal transaction schema.

SWIFT GPI (Global Payments Innovation) tracking data provides end-to-end payment status and fee transparency for SWIFT transactions. Each payment carries a Unique End-to-End Transaction Reference (UETR), generated when the payment is initiated, and the GPI Tracker uses it to follow the payment through each correspondent in the chain. This data is valuable for payment reconciliation and for identifying unusual delays that might indicate sanctions holds or compliance reviews at intermediary banks.
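A UETR is formatted as a version 4 UUID, which makes it one of the few identifiers in this pipeline that can be validated structurally before it is used as a join key. A minimal check using Python's standard `uuid` module (the function name is illustrative):

```python
import uuid


def is_valid_uetr(value: str) -> bool:
    """True if value is a version 4 UUID in lowercase, hyphenated form."""
    try:
        parsed = uuid.UUID(value)
    except (ValueError, AttributeError, TypeError):
        return False
    # uuid.UUID also accepts braces, 'urn:uuid:' prefixes, and uppercase input,
    # so compare against the canonical string to reject those variants.
    return parsed.version == 4 and str(parsed) == value
```

Rejecting non-canonical forms at ingestion avoids silent join misses later, when the same UETR appears once in lowercase and once in uppercase.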

Integrating these feeds into the institution’s AML workflow means mapping JPMorgan’s payment chain data and SWIFT GPI status records to the same entity and transaction references used by the transaction monitoring and sanctions screening systems. The payment that JPMorgan identifies by its clearing reference needs to join to the same payment that the transaction monitoring system flagged by its SWIFT message reference, which needs to join to the same payment the sanctions engine screened by the counterparty name.

These joins are not simple key lookups. The reference numbers may differ across systems. The entity names may be formatted differently. The amounts may reflect different stages of the payment (gross versus net of fees). datathere’s multi-source join capability handles the correlation by defining join conditions with transformation expressions: normalizing reference formats, matching entity names with fuzzy logic, and reconciling amount differences using fee data from the payment chain.
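To make the three kinds of join condition concrete, here is a plain-Python sketch. It is not datathere's expression syntax: the function names, record fields (`reference`, `beneficiary`, `amount`, `fees`), the 0.85 name threshold, and the one-cent tolerance are all assumptions chosen for illustration.

```python
import re
from decimal import Decimal
from difflib import SequenceMatcher


def normalize_reference(ref: str) -> str:
    """Uppercase and drop separators so 'ref-2024/00123' equals 'REF202400123'."""
    return re.sub(r"[^A-Z0-9]", "", ref.upper())


def _clean_name(name: str) -> str:
    # Lowercase, replace punctuation with spaces, collapse whitespace.
    return " ".join(re.sub(r"[^a-z0-9 ]", " ", name.lower()).split())


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio from 0.0 to 1.0 on cleaned names."""
    return SequenceMatcher(None, _clean_name(a), _clean_name(b)).ratio()


def amounts_reconcile(gross: Decimal, net: Decimal, fees: Decimal,
                      tolerance: Decimal = Decimal("0.01")) -> bool:
    """True if gross minus fees equals net, within a small tolerance."""
    return abs(gross - fees - net) <= tolerance


def join_conditions(clearing: dict, monitored: dict, name_threshold: float = 0.85) -> dict:
    """Evaluate each join condition separately so partial matches stay visible."""
    return {
        "reference": normalize_reference(clearing["reference"])
        == normalize_reference(monitored["reference"]),
        "name": name_similarity(clearing["beneficiary"], monitored["beneficiary"])
        >= name_threshold,
        "amount": amounts_reconcile(monitored["amount"], clearing["amount"], clearing["fees"]),
    }
```

Reporting each condition on its own, instead of a single yes or no, lets a reviewer see which part of a near-match failed, for example a reference that agrees while the amount does not.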

FinCEN and regulatory feeds

US financial institutions interact with FinCEN (Financial Crimes Enforcement Network) through several data channels, each with its own format.

BSA E-Filing is the submission channel for SARs (Suspicious Activity Reports) and CTRs (Currency Transaction Reports). The filing format is structured XML following FinCEN’s schema, which changes with regulatory updates. When FinCEN updates the SAR or CTR format (adding new fields, changing validation rules, or modifying code lists), institutions must update their filing systems. The historical data in the old format does not automatically conform to the new format, creating version management challenges for trend analysis and examiner inquiries.

314(b) information sharing allows institutions to share information about suspected money laundering or terrorist financing with other financial institutions. Responses arrive in formats that vary by the sharing mechanism: some through FinCEN’s portal in structured form, others through secure email in ad-hoc formats. Correlating 314(b) response data with internal customer records requires entity matching that accounts for name variations, address differences, and identifier mismatches.

FinCEN advisories and geographic targeting orders provide regulatory intelligence that affects AML monitoring parameters. An advisory about increased money laundering risk in a particular corridor requires updating transaction monitoring scenarios to flag relevant patterns. The advisory itself arrives as a document, not as machine-readable data that plugs directly into monitoring rules.

Integrating these regulatory feeds into a unified AML data environment means treating each as a source with its own schema and mapping it to the institution’s internal compliance data model. FinCEN’s SAR filing format maps to the case management system’s investigation record. 314(b) response data maps to the customer risk profile. Advisory content maps to scenario parameters in the transaction monitoring configuration.

Entity resolution across AML data sources

The hardest integration problem across AML feeds is entity resolution: determining that records from different systems refer to the same person, organization, or transaction.

The transaction monitoring system identifies customers by internal customer ID. The sanctions screening engine identifies matched entities by name and date of birth. JPMorgan Access identifies parties by name and account details from the SWIFT message. FinCEN 314(b) responses identify subjects by the information the institution provided in the request, which may or may not match how the customer is recorded internally.

Resolving these identities requires more than exact key matching. It requires fuzzy name matching (to handle transliteration differences, abbreviations, and name order variations), address normalization (to match “123 Main St” with “123 Main Street, Suite 4”), and date handling (to match “March 7, 1985” with “07/03/1985” where the day/month order is ambiguous).
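For the ambiguous-date case, one defensible approach is to keep every valid reading of a value and treat two values as compatible if any readings agree. The sketch below uses only the standard library; the list of accepted formats is an assumption, and `%B` relies on English month names under the default locale.

```python
from datetime import date, datetime
from typing import Set

# Formats to try, in no particular order. Both day-first and month-first are
# included on purpose, because the source system's convention is unknown.
DATE_FORMATS = ["%d/%m/%Y", "%m/%d/%Y", "%B %d, %Y", "%Y-%m-%d"]


def candidate_dates(text: str) -> Set[date]:
    """Return every calendar date the text could represent."""
    candidates = set()
    for fmt in DATE_FORMATS:
        try:
            candidates.add(datetime.strptime(text.strip(), fmt).date())
        except ValueError:
            continue  # this format does not fit; try the next one
    return candidates


def dates_may_match(a: str, b: str) -> bool:
    """True if at least one reading of a equals one reading of b."""
    return bool(candidate_dates(a) & candidate_dates(b))
```

Here “07/03/1985” yields two candidates (7 March and 3 July), so it is compatible with “March 7, 1985” but should lower the confidence of the match compared with an unambiguous value such as “25/12/1985”.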

datathere’s multi-source join supports this through transformation conditions on join fields. Name fields can be normalized (lowercased, stripped of titles and suffixes, transliterated) before comparison. Address fields can be parsed into components for field-level matching. Date fields can be parsed from multiple formats into a canonical representation. The join confidence score reflects the quality of the match, flagging cases where the entity resolution is uncertain for analyst review.
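The name normalization steps can be illustrated in plain Python. This is a sketch of the general technique, not datathere's implementation: the title and suffix lists are hypothetical and incomplete, and accent folding is used as a rough stand-in for transliteration (names in non-Latin scripts would need a dedicated transliteration library).

```python
import re
import unicodedata

# Illustrative, incomplete lists of tokens to drop before comparison.
TITLES = {"mr", "mrs", "ms", "dr", "prof"}
SUFFIXES = {"jr", "sr", "ii", "iii"}


def normalize_name(name: str) -> str:
    """Lowercase, fold accents, drop titles and suffixes, and sort the tokens."""
    # Decompose accented characters, then discard the combining marks.
    folded = unicodedata.normalize("NFKD", name)
    ascii_only = folded.encode("ascii", "ignore").decode("ascii")
    tokens = re.sub(r"[^a-z ]", " ", ascii_only.lower()).split()
    kept = [t for t in tokens if t not in TITLES | SUFFIXES]
    # Sorting makes "Garcia, Jose" and "Jose Garcia" compare equal.
    return " ".join(sorted(kept))
```

With these assumptions, "Dr. José García Jr." and "GARCIA, Jose" both normalize to the same string. Sorting tokens trades precision for recall, so it suits candidate generation followed by analyst review better than automatic disposition.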

Building a unified AML intelligence view

The end state is an AML data environment where compliance analysts see a consolidated view for each entity and each investigation. The transaction monitoring alert, the sanctions screening result, the payment chain data from JPMorgan, and any relevant FinCEN data all appear in context, linked to the same entity, without manual correlation.

This requires four integration pipelines converging on a common data model:

  1. Transaction monitoring alerts normalized to a common severity scale and scenario taxonomy, with supporting transaction details extracted and linked.

  2. Sanctions screening results normalized to a common match confidence model, with watchlist source identifiers mapped to a unified reference and disposition records included.

  3. Correspondent banking data from JPMorgan Access and SWIFT GPI mapped to internal transaction references, with payment chain details available for investigation.

  4. Regulatory feed data from FinCEN and other sources mapped to internal entity and case records, with version management for format changes.

Each pipeline uses datathere’s AI-generated mappings to handle the field-level translation between the source format and the common AML data model. Quality enforcement catches data issues before they affect investigations: an alert with no entity reference gets flagged, a sanctions match with an unrecognized list identifier gets quarantined, and a payment chain record with an implausible amount gets held for review.
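The three quality rules named above can be written as a small routing function. This is a plain-Python sketch under stated assumptions, not datathere's rule syntax: the record fields (`kind`, `entity_ref`, `list_id`, `amount`), the list labels, and the plausibility ceiling are all hypothetical.

```python
from decimal import Decimal, InvalidOperation

# Hypothetical labels for recognized watchlists and an arbitrary plausibility ceiling.
KNOWN_LISTS = {"OFAC_SDN", "EU_CONSOLIDATED", "UN_SC_CONSOLIDATED"}
MAX_PLAUSIBLE_AMOUNT = Decimal("10000000000")


def quality_action(record: dict) -> str:
    """Return 'flag', 'quarantine', 'hold', or 'pass' for a normalized record."""
    kind = record.get("kind")
    if kind == "alert" and not record.get("entity_ref"):
        return "flag"        # alert with no entity reference
    if kind == "sanctions_match" and record.get("list_id") not in KNOWN_LISTS:
        return "quarantine"  # unrecognized list identifier
    if kind == "payment_chain":
        try:
            amount = Decimal(str(record.get("amount")))
        except InvalidOperation:
            return "hold"    # missing or unparseable amount
        if not amount.is_finite() or amount <= 0 or amount > MAX_PLAUSIBLE_AMOUNT:
            return "hold"    # implausible amount
    return "pass"
```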

The certification workflow ensures that mappings are reviewed and validated before they process production data. When a transaction monitoring vendor updates their alert format, or when JPMorgan changes a field in their payment data, the affected mappings are updated, re-certified, and deployed without affecting the other pipelines.

The regulatory examination advantage

Regulatory examiners assess AML program effectiveness by reviewing how the institution detects, investigates, and reports suspicious activity. A fragmented data environment makes this assessment harder for both the institution and the examiner.

When an examiner asks “show me all the information you had about this customer at the time of this transaction,” a fragmented environment requires pulling data from multiple systems, assembling it manually, and hoping nothing was missed. A unified AML data environment answers the question from a single query — every alert, every screening result, every payment chain detail, every regulatory data point, linked to the customer and timestamped.
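The examiner's question is a point-in-time query: everything linked to one customer with a timestamp at or before the transaction. The sketch below uses an in-memory SQLite table to show the shape of that query; the table name `aml_events`, its columns, and the sample rows are invented for illustration, and timestamps are assumed to be UTC ISO 8601 strings in one consistent format so that string comparison orders them correctly.

```python
import sqlite3


def known_at(conn: sqlite3.Connection, customer_id: str, as_of: str) -> list:
    """Return every event recorded for the customer at or before as_of."""
    return conn.execute(
        "SELECT source, detail, recorded_at FROM aml_events "
        "WHERE customer_id = ? AND recorded_at <= ? "
        "ORDER BY recorded_at",
        (customer_id, as_of),
    ).fetchall()


# Demo data: hypothetical events from the four source systems.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE aml_events (customer_id TEXT, source TEXT, detail TEXT, recorded_at TEXT)"
)
conn.executemany(
    "INSERT INTO aml_events VALUES (?, ?, ?, ?)",
    [
        ("C-1001", "transaction_monitoring", "velocity alert", "2024-03-01T09:15:00Z"),
        ("C-1001", "sanctions_screening", "possible name match", "2024-03-01T09:20:00Z"),
        ("C-1001", "correspondent_banking", "payment chain received", "2024-03-03T11:00:00Z"),
        ("C-2002", "transaction_monitoring", "structuring alert", "2024-03-01T10:00:00Z"),
    ],
)

rows = known_at(conn, "C-1001", "2024-03-02T00:00:00Z")
```

In this demo, `rows` holds the two events recorded for `C-1001` before 2 March and excludes the payment chain record that arrived afterwards, which is the distinction an examiner cares about.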

The institution’s ability to demonstrate comprehensive data coverage, consistent alert review, and thorough investigation is directly tied to how well its AML data sources are integrated. The mapping and normalization work is invisible to the examiner. The result — a complete, consistent, correlated view of AML intelligence — is not.