Dataset Drift Detector

Compare a reference dataset against a current dataset to detect distribution shift in each feature. The tool uses the two-sample Kolmogorov-Smirnov (KS) test for statistical significance and the Population Stability Index (PSI), a widely used metric for monitoring model inputs.

How it works: numeric features are binned using quantiles of the reference distribution. PSI is computed for every feature and the KS test for numeric features; the results are then aggregated into an overall stability score. PSI thresholds: < 0.10 = stable · 0.10–0.20 = moderate · ≥ 0.20 = significant drift.
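The PSI step for a numeric feature can be sketched as follows. This is a minimal illustration, not the tool's actual implementation: the function names `psi_numeric` and `psi_severity`, the `eps` smoothing for empty bins, and the handling of out-of-range values are all assumptions. It uses the standard definition PSI = Σ (cur% − ref%) · ln(cur% / ref%) over bins whose edges come from the reference data only.

```python
import numpy as np

def psi_numeric(reference, current, bins=10, eps=1e-6):
    """PSI for one numeric feature; bin edges come from the reference only."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)

    # Quantile edges of the reference; np.unique collapses duplicate edges
    # that appear when the reference has many tied values.
    edges = np.unique(np.quantile(reference, np.linspace(0.0, 1.0, bins + 1)))
    # Keep interior edges only, so current values outside the reference
    # range fall into the first or last bin instead of being dropped.
    inner = edges[1:-1]

    n_bins = len(inner) + 1
    ref_counts = np.bincount(np.searchsorted(inner, reference, side="right"), minlength=n_bins)
    cur_counts = np.bincount(np.searchsorted(inner, current, side="right"), minlength=n_bins)

    # eps is an assumed smoothing choice that avoids log(0) for empty bins.
    ref_pct = np.clip(ref_counts / ref_counts.sum(), eps, None)
    cur_pct = np.clip(cur_counts / cur_counts.sum(), eps, None)
    return float(np.sum((cur_pct - ref_pct) * np.log(cur_pct / ref_pct)))

def psi_severity(psi):
    """Map a PSI value onto the thresholds quoted above."""
    if psi < 0.10:
        return "stable"
    if psi < 0.20:
        return "moderate"
    return "significant"
```

Taking the edges from the reference alone keeps the bins fixed across monitoring runs, so PSI values computed on different days remain comparable.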

Reference dataset (baseline / training)

CSV file of historical / training data

Current dataset (new / production)

CSV file of recent / production data

PSI bins (numeric features)

More bins give finer granularity but require larger datasets to keep every bin populated.

Feature Drift Summary

Sorted by drift severity. Click any row to view its distribution chart.

Feature	Type	PSI	PSI Severity	KS Stat	KS p-value	KS Drift

Distribution Comparison

Legend: reference vs current

Numeric Feature Statistics

Mean, std, and median for reference vs current. Red = >20% shift, amber = >5% shift.

Feature	Ref Mean	Cur Mean	Ref Std	Cur Std	Ref Median	Cur Median
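The red and amber flags in this table could be derived along these lines. The page does not say how "shift" is measured, so this sketch assumes it is the absolute change relative to the reference value; `relative_shift`, `shift_colour`, and `summarize` are hypothetical names.

```python
import statistics

def relative_shift(ref_value, cur_value):
    """Absolute change relative to the reference (assumed definition of 'shift')."""
    if ref_value == 0:
        return None  # undefined when the reference statistic is zero
    return abs(cur_value - ref_value) / abs(ref_value)

def shift_colour(shift):
    """Apply the >20% (red) and >5% (amber) thresholds described above."""
    if shift is None:
        return "n/a"
    if shift > 0.20:
        return "red"
    if shift > 0.05:
        return "amber"
    return "none"

def summarize(reference, current):
    """Mean, std, and median for both datasets, each with a colour flag."""
    stats = (("mean", statistics.fmean),
             ("std", statistics.stdev),
             ("median", statistics.median))
    rows = {}
    for name, fn in stats:
        ref_v, cur_v = fn(reference), fn(current)
        rows[name] = (ref_v, cur_v, shift_colour(relative_shift(ref_v, cur_v)))
    return rows
```

A relative measure is unstable when the reference statistic is close to zero, which is why the sketch returns "n/a" rather than a colour in that case.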

PSI thresholds (0.1 / 0.2) are industry conventions originating from credit risk monitoring. Appropriate thresholds vary by domain and sample size. KS p-values are sensitive to sample size — large datasets may show statistically significant but practically irrelevant drift. Always interpret results in context.

What you get

Stability score

A 0–100 score based on the proportion and severity of drifted features, with a colour-coded verdict.

PSI per feature

Population Stability Index for every column — numeric features use quantile bins, categoricals use frequency bins.
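For categorical columns, each distinct value acts as its own bin. The sketch below is one plausible reading of "frequency bins", not the tool's confirmed method: `psi_categorical` is a hypothetical name, and the `eps` floor for categories missing from one side is an assumption.

```python
import math
from collections import Counter

def psi_categorical(reference, current, eps=1e-6):
    """PSI for one categorical feature, treating each category as a bin."""
    ref_counts, cur_counts = Counter(reference), Counter(current)
    n_ref, n_cur = len(reference), len(current)

    psi = 0.0
    # Use the union of categories so values that appear or vanish
    # in the current data still contribute to the score.
    for category in set(ref_counts) | set(cur_counts):
        # eps is an assumed floor that avoids log(0) for unseen categories.
        ref_pct = max(ref_counts[category] / n_ref, eps)
        cur_pct = max(cur_counts[category] / n_cur, eps)
        psi += (cur_pct - ref_pct) * math.log(cur_pct / ref_pct)
    return psi
```

Because a brand-new category is compared against the `eps` floor, even a small share of unseen values produces a large PSI; a larger `eps` would soften that effect.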

KS test results

Two-sample Kolmogorov-Smirnov statistic and p-value for numeric features. Flags shift at p < 0.05.
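A minimal sketch of this check using SciPy's `ks_2samp`. The wrapper `ks_drift` and its return keys are hypothetical, and dropping NaN values before the test is an assumption about how missing data would be handled.

```python
import numpy as np
from scipy.stats import ks_2samp

def ks_drift(reference, current, alpha=0.05):
    """Two-sample KS test for one numeric feature; flags drift at p < alpha."""
    reference = np.asarray(reference, dtype=float)
    current = np.asarray(current, dtype=float)
    # Assumed preprocessing: ignore missing values rather than failing on them.
    reference = reference[~np.isnan(reference)]
    current = current[~np.isnan(current)]

    result = ks_2samp(reference, current)
    return {
        "ks_stat": float(result.statistic),   # max gap between the two ECDFs
        "p_value": float(result.pvalue),
        "drift": bool(result.pvalue < alpha),
    }
```

The statistic itself is the effect size; with very large samples the p-value can fall below 0.05 even when `ks_stat` is tiny, so both values are worth reading together.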

Distribution charts

Overlay histogram (numeric) or grouped bar chart (categorical) comparing reference vs current for each feature.