For data & ML teams

Synthetic data with receipts.

Four production-grade products. One cryptographically-sealed evidence chain. From a one-line prompt to a fully-synthesised, provenance-stamped dataset — and an autonomous agent (or plain English) to run the whole pipeline.

Flagship engine fidelity: 95.69 %
ADS tools: 43
ADS modules: 48
Connectors: 14
customer-cohort-q3 · /agentplan-mode
Build me a clean training set from /datasets/customers.csv, drop nulls > 5 %, encode categoricals, train a baseline.
proposed plan · 4 steps
data_clean · drop_threshold=0.05 · ~5 c
encode_categorical · method=onehot · ~3 c
train_predictive · model=randomforest · ~12 c
shap_explain · sample=200 · ~3 c
est. total: 23 credits · ~38 s
▶ data_clean ✓ 4.2 c 2.1 s
▶ encode_categorical ✓ 2.8 c 1.4 s
▶ train_predictive … 7.1 / 12 c
▷ shap_explain queued
estimated 14.2 s remaining · evidence chain rolling

Four products. One contract. One evidence chain.

Pick the entry point that matches the data you have today — the underlying evidence pipeline is identical, so you move between them without changing your downstream tooling.

Evidence pipeline

Six stages from prompt to sealed bundle.

Every dataset shipped from any product on this pillar passes through the same six stages. The bundle that lands in your bucket can be verified offline by anyone with the open-source evidence verifier CLI.

evidence pipeline · job_9b3df1 · live
K Sealed contract → E Engine run → Q Quality gates → B BLAKE3 chain → S Sealed bundle
rows generated: 10 000 · qa score: 0.957 · chain root: a4f2…d801
Stage 1

Sealed contract

Schema, constraints, intent and seed sealed into a JSON artefact before any data is generated.
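A minimal sketch of what sealing might look like. The field names and the contract shape here are assumptions (the real sealed-contract schema is not published), and SHA-256 stands in for whatever hash the engine actually uses:

```python
import hashlib
import json

# Hypothetical contract fields -- illustrative only, not the engine's schema.
contract = {
    "schema": {"age": "int", "plan": "category"},
    "constraints": ["age >= 18"],
    "intent": "churn-model training set",
    "seed": 42,
}

# Canonical serialisation (sorted keys, no whitespace) so the seal is
# reproducible: the same contract always produces the same digest.
canonical = json.dumps(contract, sort_keys=True, separators=(",", ":")).encode()
seal = hashlib.sha256(canonical).hexdigest()
```

The point of sealing before generation is that any later change to schema, constraints, intent, or seed produces a different digest, so the contract the data was generated against is fixed and checkable.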

Stage 2

Engine run

Mock, Synthesize, or ADS-driven pipeline executes against the sealed contract; per-step I/O recorded.

Stage 3

Quality gates

Kolmogorov-Smirnov, Pearson correlation, χ², constraint satisfaction, and per-column drift checks. Fail-closed: any regression aborts the run.
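One gate from that list, sketched in plain Python: the two-sample Kolmogorov-Smirnov statistic comparing a real column against its synthetic counterpart, with a fail-closed check. The threshold here is illustrative, not the engine's real cut-off:

```python
def ks_statistic(real, synth):
    """Max distance between the two empirical CDFs (two-sample K-S D)."""
    real, synth = sorted(real), sorted(synth)
    d = 0.0
    for v in sorted(set(real) | set(synth)):
        cdf_r = sum(x <= v for x in real) / len(real)
        cdf_s = sum(x <= v for x in synth) / len(synth)
        d = max(d, abs(cdf_r - cdf_s))
    return d

# Fail-closed gate: abort rather than ship a drifted column.
THRESHOLD = 0.25  # illustrative only
d = ks_statistic([1, 2, 2, 3, 4], [1, 2, 3, 3, 4])
assert d <= THRESHOLD, f"K-S gate failed: D={d:.3f}"
```

"Fail-closed" means the assert fires and the job aborts when a gate regresses; nothing is sealed or shipped past a failed check.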

Stage 4

Cryptographic chain

Each step's inputs and outputs hashed and chained. Tamper-evident, verifiable offline.
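A sketch of how such a chain can be built and replayed offline. `hashlib.blake2b` stands in here because BLAKE3 is not in the Python standard library (it is available via the third-party `blake3` package); the step I/O and the chaining formula are assumptions for illustration:

```python
import hashlib

def h(data: bytes) -> str:
    # blake2b as a stand-in for BLAKE3 (pip package `blake3` in practice).
    return hashlib.blake2b(data, digest_size=32).hexdigest()

# Each link commits to the previous link plus this step's recorded I/O:
#   link_n = H(link_{n-1} || H(inputs_n) || H(outputs_n))
steps = [
    (b"contract.json", b"plan.json"),          # illustrative step I/O
    (b"plan.json", b"rows.parquet"),
    (b"rows.parquet", b"quality_report.json"),
]
link = h(b"genesis")
for inputs, outputs in steps:
    link = h((link + h(inputs) + h(outputs)).encode())

chain_root = link
```

Offline verification is just replaying the same fold over the bundle's recorded step I/O: if any input or output was tampered with, the recomputed root no longer matches the sealed `chain_root`.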

Stage 5

Evidence bundle

Signed .tar.zst with contract, run-log, quality report, artefact manifest, engine SBOM.

Stage 6

Tenant isolation

Per-tenant Fernet keys, per-tenant artefact prefixes, per-tenant evidence keys.
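A minimal sketch of the per-tenant key pattern using `Fernet` from the `cryptography` package. The tenant names and the key-per-tenant dict are illustrative assumptions; the idea is that one tenant's key can never decrypt another tenant's artefacts:

```python
from cryptography.fernet import Fernet

# Hypothetical tenants -- each gets its own independent Fernet key.
tenant_keys = {tenant: Fernet.generate_key() for tenant in ("acme", "globex")}

def encrypt_for(tenant: str, payload: bytes) -> bytes:
    """Encrypt an artefact under the named tenant's key."""
    return Fernet(tenant_keys[tenant]).encrypt(payload)

token = encrypt_for("acme", b"connector-credentials")
# Decrypting with the right tenant key round-trips...
assert Fernet(tenant_keys["acme"]).decrypt(token) == b"connector-credentials"
# ...while the wrong tenant's key raises cryptography.fernet.InvalidToken.
```

Combined with per-tenant artefact prefixes, this keeps both the ciphertext and the key material partitioned by tenant.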

Numbers we’ll defend.

95.69 %
the flagship engine · benchmark-certified fidelity

Validated under an independent third-party QA harness. Full reproducibility certificate published on the /verify page.

43
ADS planner tools

From data_clean to shap_explain — same engines callable from chat or SDK.

14
Encrypted connectors

Snowflake, BigQuery, Databricks, Postgres, S3, GCS, Azure Blob — all Fernet-vaulted.

0
Plain-text secrets

Connector credentials are auto-hoisted to the encrypted vault and never written to logs.

Bring a CSV. We’ll show you the evidence bundle.

30-minute working session: you upload (or we mock) a representative dataset, we run it through Synthesize and the Autonomous Data Scientist, and you keep the signed evidence bundle and quality report.
