
Enterprise Data Quality

AI-Powered Rule Profiling

Writing validation rules manually means making assumptions about what your data looks like. AI analyzes the values, patterns, and anomalies in your data and suggests rules based on what it finds.

Typos that appear once among thousands, casing inconsistencies, impossible ranges, placeholder text, and PII leakage are detected automatically. Each suggestion includes rationale with the failing values and their counts, so you can decide whether to accept, modify, or dismiss it.

  • Complete value distributions, not statistical samples
  • Detects typos, casing inconsistencies, placeholder text, and outliers
  • Email, phone, date, and URL format validation
  • PII leakage detection — SSN patterns, credit card numbers in wrong fields
An example of an AI-suggested rule:

  Field: customer_status
  Rule: must be one of active, inactive, pending
  Rationale: found 3 valid values (active: 8,421; inactive: 2,103; pending: 847). Also found "actve" (2 occurrences) and "ACTIVE" (14 occurrences) — likely typos and casing errors.
  Estimated violations: 16 (error / quarantine)
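The frequency profiling behind a suggestion like this can be sketched in a few lines. Everything below — the `suggest_allowed_values` name, the 1% rarity threshold, the edit-distance-1 typo check — is an illustrative assumption, not the product's actual algorithm:

```python
from collections import Counter

def suggest_allowed_values(values, rare_threshold=0.01):
    """Profile a column's full value distribution and flag rare
    variants (likely typos or casing errors) of common values."""
    counts = Counter(values)
    total = sum(counts.values())
    common = {v for v, c in counts.items() if c / total >= rare_threshold}
    suspects = {}
    for value, count in counts.items():
        if value in common:
            continue
        # A rare value whose lowercase form matches a common value is
        # likely a casing error; a close spelling is a likely typo.
        for canonical in common:
            if value.lower() == canonical.lower() or _close(value, canonical):
                suspects[value] = (canonical, count)
                break
    return sorted(common), suspects

def _close(a, b):
    # Crude edit-distance-1 check: same length with one differing
    # character, or one character inserted/deleted.
    if abs(len(a) - len(b)) > 1:
        return False
    if len(a) == len(b):
        return sum(x != y for x, y in zip(a, b)) == 1
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))
```

Running this over the customer_status column above would surface "actve" and "ACTIVE" as suspected variants of "active", with their counts, which is exactly the evidence the rationale presents.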
Rules created from plain-English descriptions:

  • "must be a valid email address" — rule created, tested against 11,371 rows, 99.2% pass
  • "price must be positive" — rule created, tested against 11,371 rows, 99.8% pass
  • "country code must be exactly 2 characters" — rule created, tested against 11,371 rows, 94.7% pass (602 violations)

Plain English Editing

You never have to think about validation logic. Describe what you want to check — "must be between 0 and 100" or "must not be empty" — and the system creates the rule, tests it against your data, and shows you the results.

When a rule needs to change, you describe the change the same way. The system updates the logic, re-tests, and shows you the new results. You accept or reject. That's it.

  • Validated against your data before you accept
  • See pass rate and sample violations immediately
  • Edit existing rules in natural language — same workflow

Three Enforcement Actions

Each rule specifies what happens when data fails validation. You choose the action per rule, and enforcement applies automatically during processing.

Quarantine

The row is removed from output and saved to a separate quarantine file with the failure reason attached, so you can inspect and reprocess it later.

Flag

The row continues processing but gets marked for your review. Output is unaffected — you decide what to do after the run.

Stop Job

The pipeline halts. Use this for critical rules where bad data should stop output from being produced.
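The three actions amount to a dispatch over per-rule outcomes. The sketch below is a minimal illustration, assuming rules are (field, predicate, action, name) tuples; none of these names come from the product:

```python
from enum import Enum

class Action(Enum):
    QUARANTINE = "quarantine"
    FLAG = "flag"
    STOP_JOB = "stop_job"

class JobStopped(Exception):
    pass

def enforce(rows, rules):
    """Apply each rule's action: quarantined rows leave the output,
    flagged rows stay but are marked, a stop rule halts the run."""
    output, quarantined = [], []
    for row in rows:
        keep, flags = True, []
        for field, predicate, action, name in rules:
            if predicate(row.get(field)):
                continue
            if action is Action.STOP_JOB:
                raise JobStopped(f"rule {name!r} failed on {row!r}")
            if action is Action.QUARANTINE:
                # Saved separately with the failure reason attached.
                quarantined.append({**row, "_reason": name})
                keep = False
            else:  # Action.FLAG: output unaffected, marked for review
                flags.append(name)
        if keep:
            out = dict(row)
            if flags:
                out["_flags"] = flags
            output.append(out)
    return output, quarantined
```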

Test Before Production

Dry-run all your active quality rules against your source data without processing anything. You see per-rule pass rates, violation counts, and sample failures — so you can tune rules before they affect production output.

The same rules run the same way in test and production. What you see in the preview is what happens during a production run.

  • Per-rule pass rates and violation counts
  • Sample violations with row context
  • Consistent behavior between test and production
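A dry run over several rules at once might look like the sketch below. The `dry_run` helper and its report shape are assumptions for illustration; the key point is that nothing is enforced, only measured:

```python
def dry_run(rows, rules):
    """Evaluate every active rule against source data without
    enforcing anything; report per-rule and overall pass rates."""
    report, failing_rows = {}, set()
    for name, field, predicate in rules:
        fails = [i for i, r in enumerate(rows) if not predicate(r.get(field))]
        failing_rows.update(fails)  # a row may fail several rules
        report[name] = {"violations": len(fails),
                        "pass_rate": round(1 - len(fails) / len(rows), 3)}
    report["_overall"] = round(1 - len(failing_rows) / len(rows), 3)
    return report
```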
Example test run results:

  • customer_status allowed values — 99.8% pass
  • email format — 97.2% pass
  • price positive — 94.1% pass
  • country code length — 98.5% pass
  • 11,371 rows evaluated — overall: 97.4% pass

Example violation trends (7-day):

  • email format — 12 violations today
  • price positive — 3 violations today
  • Top violating values: "actve" (2), "N/A" (7), "test" (4)

Violation Tracking and Trends

Violations are recorded with the row data, the value that failed, the rule it failed against, and the action taken. You see the context for each failure — enough to decide whether to fix the data upstream or adjust the rule.

7-day and 30-day trends show which rules are catching more violations over time — a signal that upstream data quality is degrading. The most common violating values per rule help you target cleanup.

  • 7-day and 30-day trends per rule with daily sparklines
  • Most common violating values for targeted cleanup
  • Row context for each violation
  • Run history with pass rates across all autopilot executions
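Deriving a per-rule daily sparkline and top violating values from recorded violations can be sketched as below. The `trend` function and the violation-record shape are hypothetical, chosen only to mirror what the reports show:

```python
from collections import Counter, defaultdict
from datetime import date, timedelta

def trend(violations, days=7, today=None):
    """Daily violation counts per rule over a trailing window, plus
    the most common failing values for targeted cleanup."""
    today = today or date.today()
    window = [today - timedelta(d) for d in range(days - 1, -1, -1)]
    daily = defaultdict(Counter)
    top_values = defaultdict(Counter)
    for v in violations:  # each record: rule, failing value, date
        if v["date"] in window:
            daily[v["rule"]][v["date"]] += 1
            top_values[v["rule"]][v["value"]] += 1
    sparklines = {rule: [c[d] for d in window] for rule, c in daily.items()}
    return sparklines, {rule: c.most_common(3) for rule, c in top_values.items()}
```

A rule whose sparkline trends upward is the degradation signal described above; its top values tell you what to clean up first.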

Aggregation-Aware Enforcement

When your mapping includes aggregation, quality rules apply at the group level. If any row in a group fails a quarantine rule, the group is excluded — preventing partial aggregations from reaching your output.

  • Group-level enforcement for aggregated data
  • Propagated violations tracked separately in reports
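Group-level exclusion boils down to: if any member row fails a quarantine rule, drop the whole group. The `enforce_groups` helper below is an illustrative assumption, not the product's implementation:

```python
def enforce_groups(rows, group_key, field, predicate):
    """If any row in a group fails a quarantine rule, exclude the
    whole group so no partial aggregate reaches the output."""
    groups = {}
    for row in rows:
        groups.setdefault(row[group_key], []).append(row)
    clean, excluded = [], []
    for key, members in groups.items():
        if all(predicate(r.get(field)) for r in members):
            clean.extend(members)
        else:
            # Passing rows in a failing group are the "propagated"
            # violations: excluded because a sibling row failed.
            excluded.append(key)
    return clean, excluded
```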

Flexible Rule Management

Data quality is enabled at the organization level as part of your subscription. Toggle individual rules on or off without deleting them — useful when you're tuning rules or onboarding new data sources. Pause quality checks at the organization level when you need to.

  • Enable or disable individual rules without deleting
  • Organization-level quality toggle