About xariff

Free, no-signup ML dataset tools for practitioners who care about data quality.

Our mission

Garbage in, garbage out. Every ML practitioner knows this, yet tools for systematically auditing training data are often expensive, complex, or locked behind enterprise SaaS. xariff exists to change that. We provide free, fast, browser-based analysis tools that surface data quality issues without sign-up overhead.

Privacy first

CSV files are processed in memory for analysis and are not retained by default. We do store limited operational data: analytics events, feedback submissions, email addresses submitted through capture forms, and generated scorecard reports. We use a session cookie for anonymous usage tracking.

Current tools

  • Quality Checker: Missing values, outliers, duplicates, mixed types, overall quality score.
  • ML Readiness Scorecard: Class balance, completeness, volume, and uniqueness, scored against your ML stage.
  • Dataset Profiler: Per-feature statistics, distributions, and correlation insights for faster EDA.
  • Feature-Space Coverage: PCA + grid-based coverage checks to highlight sparse and empty regions.
  • Outlier Detection: Isolation Forest, DBSCAN, and ensemble detection with feature-level explanations.
  • Drift Detection: Reference-vs-current drift checks using PSI and KS tests.
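
To give a sense of what a PSI-based drift check involves, here is a minimal sketch of the Population Stability Index for a single numeric feature. It is an illustration only, not xariff's implementation: the function name `psi`, the equal-width binning over the reference range, the default of 10 bins, and the `eps` floor for empty bins are all assumptions made for this example.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a reference and a current sample.

    Assumption: bin edges are equal-width over the reference range.
    Quantile-based bins are another common choice.
    """
    if not reference or not current:
        raise ValueError("both samples must be non-empty")

    lo, hi = min(reference), max(reference)
    # Fall back to a width of 1.0 if the reference column is constant.
    width = (hi - lo) / bins or 1.0

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp values outside the reference range into the edge bins.
            idx = min(max(idx, 0), bins - 1)
            counts[idx] += 1
        n = len(values)
        # Floor each proportion at eps so the logarithm is always defined.
        return [max(c / n, eps) for c in counts]

    ref_p = proportions(reference)
    cur_p = proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


if __name__ == "__main__":
    reference = [i / 100 for i in range(100)]
    shifted = [v + 0.5 for v in reference]
    print("same data:", psi(reference, reference))
    print("shifted:  ", psi(reference, shifted))
```

Identical samples give a PSI of zero, and the value grows as the current distribution moves away from the reference. A commonly quoted rule of thumb treats values above roughly 0.25 as significant drift, though the right threshold depends on the dataset and the binning.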

Planned next

  • Label quality audit
  • Feature redundancy checker
  • Bias detection