About xariff
Free, no-signup ML dataset tools for practitioners who care about data quality.
Our mission
Garbage in, garbage out. Every ML practitioner knows this — yet tools for systematically auditing training data are expensive, complex, or locked behind enterprise SaaS. xariff exists to change that. We provide free, fast, browser-based analysis tools that give you the insights you need without the overhead.
Privacy first
CSV files are processed in memory for analysis and are not retained by default. We do store limited operational data, such as analytics events, feedback submissions, email capture entries, and generated scorecard reports. We use a session cookie for anonymous usage tracking.
Current tools
- Quality Checker: Missing values, outliers, duplicates, mixed types, overall quality score.
- ML Readiness Scorecard: Class balance, completeness, volume, and uniqueness, scored against your ML stage.
- Dataset Profiler: Per-feature statistics, distributions, and correlation insights for faster EDA.
- Feature-Space Coverage: PCA + grid-based coverage checks to highlight sparse and empty regions.
- Outlier Detection: Isolation Forest, DBSCAN, and ensemble detection with feature-level explanations.
- Drift Detection: Reference-vs-current drift checks using PSI and KS tests.
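For readers unfamiliar with PSI (Population Stability Index), here is a minimal sketch of how a reference-vs-current drift score can be computed for one numeric column. It is an illustration only, not xariff's implementation: the function name `psi`, the equal-width binning, the default of 10 bins, and the `eps` floor for empty bins are all assumptions made for this example.

```python
import math


def psi(reference, current, bins=10, eps=1e-6):
    """Population Stability Index between two numeric samples.

    Hypothetical helper for illustration; bin edges are equal-width
    and derived from the reference sample only.
    """
    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if the reference column is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(values):
        counts = [0] * bins
        for v in values:
            # Clamp so values outside the reference range land in the edge bins.
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # The eps floor keeps empty bins from causing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    # Each term is non-negative, so the sum is 0 only when the binned
    # distributions match.
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


reference = [float(v) for v in range(100)]
shifted = [v + 30.0 for v in reference]

same_score = psi(reference, reference)
shift_score = psi(reference, shifted)
print(same_score, shift_score)
```

A score of zero means the two binned distributions are identical, and larger scores mean more drift. A commonly quoted rule of thumb treats values below about 0.1 as stable and values above about 0.25 as a significant shift, though the right threshold depends on the dataset and the number of bins.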
Planned next
- Label quality audit
- Feature redundancy checker
- Bias detection