For data & ML teams

Synthetic data with receipts.

Four production-grade products. One cryptographically-sealed evidence chain. From a one-line prompt to a fully-synthesised, provenance-stamped dataset — and an autonomous agent (or plain English) to run the whole pipeline.

Flagship engine fidelity: 95.69 %
ADS tools: 43
ADS modules: 48
Connectors: 14
customer-cohort-q3 · /agentplan-mode
Build me a clean training set from /datasets/customers.csv, drop nulls > 5 %, encode categoricals, train a baseline.
proposed plan · 4 steps
data_clean · drop_threshold=0.05 · ~5 c
encode_categorical · method=onehot · ~3 c
train_predictive · model=randomforest · ~12 c
shap_explain · sample=200 · ~3 c
est. total: 23 credits · ~38 s
▶ data_clean ✓ 4.2 c 2.1 s
▶ encode_categorical ✓ 2.8 c 1.4 s
▶ train_predictive … 7.1 / 12 c
▷ shap_explain queued
estimated 14.2 s remaining · evidence chain rolling

Four products. One contract. One evidence chain.

Pick the entry point that matches the data you have today — the underlying evidence pipeline is identical, so you move between them without changing your downstream tooling.

Evidence pipeline

Six stages from prompt to sealed bundle.

Every dataset shipped from any product on this pillar passes through the same six stages. The bundle that lands in your bucket can be verified offline by anyone with the open-source evidence verifier CLI.

evidence pipeline · job_9b3df1 · live
K Sealed contract → E Engine run → Q Quality gates → B BLAKE3 chain → S Sealed bundle
rows generated: 10 000 · qa score: 0.957 · chain root: a4f2…d801
Stage 1

Sealed contract

Schema, constraints, intent and seed sealed into a JSON artefact before any data is generated.
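A minimal sketch of what sealing might look like. The field names and the contract shape here are assumptions (the real sealed-contract schema is not published), and SHA-256 stands in for whatever hash the engine actually uses:

```python
import hashlib
import json

# Hypothetical contract fields -- illustrative only, not the engine's schema.
contract = {
    "schema": {"age": "int", "plan": "category"},
    "constraints": ["age >= 18"],
    "intent": "churn-model training set",
    "seed": 42,
}

# Canonical serialisation (sorted keys, no whitespace) so the seal is
# reproducible: the same contract always produces the same digest.
canonical = json.dumps(contract, sort_keys=True, separators=(",", ":")).encode()
seal = hashlib.sha256(canonical).hexdigest()
```

The point of sealing before generation is that any later change to schema, constraints, intent, or seed produces a different digest, so the contract the data was generated against is fixed and checkable.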

Stage 2

Engine run

Mock, Synthesize, or ADS-driven pipeline executes against the sealed contract; per-step I/O recorded.

Stage 3

Quality gates

Kolmogorov-Smirnov, Pearson correlation, χ², constraint satisfaction, and per-column drift checks. Fail-closed: any regression aborts the run.
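One gate from that list, sketched in plain Python: the two-sample Kolmogorov-Smirnov statistic comparing a real column against its synthetic counterpart, with a fail-closed check. The threshold here is illustrative, not the engine's real cut-off:

```python
def ks_statistic(real, synth):
    """Max distance between the two empirical CDFs (two-sample K-S D)."""
    real, synth = sorted(real), sorted(synth)
    d = 0.0
    for v in sorted(set(real) | set(synth)):
        cdf_r = sum(x <= v for x in real) / len(real)
        cdf_s = sum(x <= v for x in synth) / len(synth)
        d = max(d, abs(cdf_r - cdf_s))
    return d

# Fail-closed gate: abort rather than ship a drifted column.
THRESHOLD = 0.25  # illustrative only
d = ks_statistic([1, 2, 2, 3, 4], [1, 2, 3, 3, 4])
assert d <= THRESHOLD, f"K-S gate failed: D={d:.3f}"
```

"Fail-closed" means the assert fires and the job aborts when a gate regresses; nothing is sealed or shipped past a failed check.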

Stage 4

Cryptographic chain

Each step's inputs and outputs hashed and chained. Tamper-evident, verifiable offline.
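A sketch of how such a chain can be built and replayed offline. `hashlib.blake2b` stands in here because BLAKE3 is not in the Python standard library (it is available via the third-party `blake3` package); the step I/O and the chaining formula are assumptions for illustration:

```python
import hashlib

def h(data: bytes) -> str:
    # blake2b as a stand-in for BLAKE3 (pip package `blake3` in practice).
    return hashlib.blake2b(data, digest_size=32).hexdigest()

# Each link commits to the previous link plus this step's recorded I/O:
#   link_n = H(link_{n-1} || H(inputs_n) || H(outputs_n))
steps = [
    (b"contract.json", b"plan.json"),          # illustrative step I/O
    (b"plan.json", b"rows.parquet"),
    (b"rows.parquet", b"quality_report.json"),
]
link = h(b"genesis")
for inputs, outputs in steps:
    link = h((link + h(inputs) + h(outputs)).encode())

chain_root = link
```

Offline verification is just replaying the same fold over the bundle's recorded step I/O: if any input or output was tampered with, the recomputed root no longer matches the sealed `chain_root`.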

Stage 5

Evidence bundle

Signed .tar.zst with contract, run-log, quality report, artefact manifest, engine SBOM.

Stage 6

Tenant isolation

Per-tenant Fernet keys, per-tenant artefact prefixes, per-tenant evidence keys.
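A minimal sketch of the per-tenant key pattern using `Fernet` from the `cryptography` package. The tenant names and the key-per-tenant dict are illustrative assumptions; the idea is that one tenant's key can never decrypt another tenant's artefacts:

```python
from cryptography.fernet import Fernet

# Hypothetical tenants -- each gets its own independent Fernet key.
tenant_keys = {tenant: Fernet.generate_key() for tenant in ("acme", "globex")}

def encrypt_for(tenant: str, payload: bytes) -> bytes:
    """Encrypt an artefact under the named tenant's key."""
    return Fernet(tenant_keys[tenant]).encrypt(payload)

token = encrypt_for("acme", b"connector-credentials")
# Decrypting with the right tenant key round-trips...
assert Fernet(tenant_keys["acme"]).decrypt(token) == b"connector-credentials"
# ...while the wrong tenant's key raises cryptography.fernet.InvalidToken.
```

Combined with per-tenant artefact prefixes, this keeps both the ciphertext and the key material partitioned by tenant.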

Numbers we’ll defend.

95.69 %
the flagship engine · benchmark-certified fidelity

Validated under an independent third-party QA harness. Full reproducibility certificate published on the /verify page.

43
ADS planner tools

From data_clean to shap_explain — same engines callable from chat or SDK.

14
Encrypted connectors

Snowflake, BigQuery, Databricks, Postgres, S3, GCS, Azure Blob — all Fernet-vaulted.

0
Plain-text secrets

Connector credentials are auto-hoisted to the encrypted vault and never written to logs.

Bring a CSV. We’ll show you the evidence bundle.

30-minute working session: you upload (or we mock) a representative dataset, we run it through Synthesize and the Autonomous Data Scientist, and you keep the signed evidence bundle and quality report.
