Mock Data

Schema in. Sealed dataset out.

Describe a dataset in one English sentence, get back a fully-sealed synthetic CSV in under a minute — with the contract, the seed, the BLAKE3 chain, and the same offline verifier as every other engine on the platform. Same prompt, same seed: byte-for-byte equal across clusters, regions, and years.

Latency · ~38 s
Industries · 47
Determinism · byte-equal
Bundle · sealed
mock / saas-customers · 200 rows · seed 42 · rapid-rrf
prompt

“200 SaaS customer accounts with MRR 50–5000, plan Starter/Growth/Enterprise, region AMER/EMEA/APAC, signup date in the last 18 months.”

contract K · sealed
{
  "rows":   200,
  "seed":   42,
  "fields": ["id","name","plan","mrr","region","signup"],
  "ranges": { "mrr":[50,5000] },
  "enums":  { "plan":["S","G","E"], "region":["AMER","EMEA","APAC"] }
}
rows.preview · first 6 of 200 · byte-equal on rerun

id        name               plan        mrr       region  signup
cust_a17  Aurora Kade        Growth      $1284.50  EMEA    2025-08-12
cust_b22  Marcus Yan         Enterprise  $4192.00  AMER    2024-11-30
cust_c08  Inès Bouchard      Starter     $79.00    EMEA    2026-02-04
cust_d31  Hiroshi Tanaka     Growth      $962.10   APAC    2025-05-19
cust_e44  Priya Ramanathan   Enterprise  $3580.75  APAC    2025-01-22
cust_f17  Liam O'Connor      Growth      $1107.40  AMER    2025-09-03
latency · 38 s · 200 rows
contract sha · 9c10ab…
blake3 root · a4f2…d801

Six promises Mock Data keeps every time.

These aren’t configuration toggles; they are how the engine was built. There is nothing to switch off and nothing to opt into. The engine simply behaves this way for every prompt, every tenant, every run.

Promise · 01

Sub-minute, every time

Our rapid fabricator is built for latency, not training. Most prompts complete in 20–60 s for up to 200 k rows — including evidence sealing — so it sits comfortably inside an interactive UX or a CI pipeline.

Promise · 02

Deterministic by construction

Same prompt + same seed + same engine version = byte-equal output. We test this in CI: the regression suite re-runs 47 reference contracts and checks every cell against a stored hash. If a single byte drifts, the build fails.
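The contract below is a minimal sketch of how "same prompt + same seed = byte-equal output" can hold by construction: if every random draw flows from one seeded generator and the contract pins every constraint, the output bytes are fully determined. All names here are illustrative, not the real engine, and cross-year stability additionally assumes a pinned engine version, as the promise states.

```python
import hashlib
import random

def generate(contract: dict) -> bytes:
    """Deterministically fabricate CSV bytes from a sealed contract.

    Hypothetical sketch: every random draw comes from one generator
    seeded by the contract, so the output is pinned byte-for-byte.
    """
    rng = random.Random(contract["seed"])
    rows = []
    for i in range(contract["rows"]):
        plan = rng.choice(contract["enums"]["plan"])
        lo, hi = contract["ranges"]["mrr"]
        mrr = round(rng.uniform(lo, hi), 2)
        rows.append(f"cust_{i:03d},{plan},{mrr}")
    return ("\n".join(rows) + "\n").encode()

contract = {"rows": 3, "seed": 42,
            "ranges": {"mrr": [50, 5000]},
            "enums": {"plan": ["S", "G", "E"]}}

a = hashlib.sha256(generate(contract)).hexdigest()
b = hashlib.sha256(generate(contract)).hexdigest()
assert a == b  # byte-equal on rerun: same contract, same hash
```

A regression suite in this style only needs to store the expected hash per reference contract and compare it after each rebuild.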

Promise · 03

Any industry, on demand

Mock is LLM-driven schema fabrication, not a fixed template list. Describe a banking dataset, an EHR cohort, a smart-meter feed, a retail basket, a manufacturing line — the engine drafts a domain-aware sealed contract with sensible defaults, and you can override any of them per prompt.

Promise · 04

Same evidence chain as the rest

Mock outputs join the same BLAKE3-chained ledger as Synthesize, Virtual SCADA, and ICS Security. The bundle that ships with a Mock dataset is structurally identical — your downstream tooling treats them the same way.

Promise · 05

Two surfaces, same engine

Use the slt CLI for shell pipelines and one-off ops, or call client.mock.create() from the typed Python SDK. The contract goes through the same compiler; the audit log records who called it and from where.

Promise · 06

Quality fail-closed

Even on the fast path we check column-level distribution sanity, range bounds, enum compliance, and nullability against the contract before the bundle is sealed. A regression aborts; you never receive a quietly-broken dataset.

How it works

From sentence to sealed bundle in six stages.

Each stage writes a typed artefact into the chain. Even the prompt itself joins the audit trail — so an auditor can trace any byte in the dataset back to the operator who asked for it.

CLI · sample
slt mock create \
  --prompt "200 SaaS customers, MRR 50-5000, plan
            Starter/Growth/Enterprise, region AMER/EMEA/APAC,
            signup date in last 18 months" \
  --rows 200 --seed 42 \
  --evidence ./customer-bundle.tar.zst

# in 38 s →
✓ contract K sealed   sha 9c10ab…
✓ 200 rows generated  byte-equal across cluster, region, year
✓ blake3 chain root   a4f2…d801
✓ bundle on disk      4.7 MB
Stage 01

Describe what you want

One natural-language sentence, an inline JSON schema, or a YAML file pulled from a referenced industry template — pick the surface that fits your tooling. The compiler accepts all three and normalises them to the same sealed contract.
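A minimal sketch of the normalisation idea: different input surfaces collapse to one contract shape. The natural-language path goes through the LLM drafter and is omitted here; the function name and field handling are assumptions for illustration, not the real compiler.

```python
import json

def normalise(spec) -> dict:
    """Normalise an inline dict or a JSON string to one contract shape.

    Hypothetical sketch: whichever surface the spec arrives on, the
    compiler emits the same canonical contract fields.
    """
    if isinstance(spec, str):
        spec = json.loads(spec)
    return {
        "rows": int(spec["rows"]),
        "seed": int(spec.get("seed", 0)),
        "fields": list(spec["fields"]),
        "ranges": dict(spec.get("ranges", {})),
        "enums": dict(spec.get("enums", {})),
    }

as_dict = normalise({"rows": 200, "seed": 42, "fields": ["id", "mrr"]})
as_json = normalise('{"rows": 200, "seed": 42, "fields": ["id", "mrr"]}')
assert as_dict == as_json  # both surfaces compile to the same contract
```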

Stage 02

The contract seals before any row is born

Schema, ranges, enums, constraints, seed, engine version and operator intent are sealed into a JSON artefact. The contract hash is the unique identity of every dataset that comes from this run.
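One way to give a contract a stable identity is to hash its canonical serialisation, so key order never matters but any value change does. A stdlib sketch, with SHA-256 standing in for the platform's BLAKE3:

```python
import hashlib
import json

def contract_sha(contract: dict) -> str:
    """Hash a contract into its identity.

    Sketch: canonical JSON (sorted keys, fixed separators) hashed with
    SHA-256 as a stdlib stand-in for BLAKE3.
    """
    canonical = json.dumps(contract, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode()).hexdigest()

k = {"rows": 200, "seed": 42, "fields": ["id", "plan", "mrr"]}

# key order never changes the identity; any value change does
assert contract_sha(k) == contract_sha(dict(reversed(list(k.items()))))
assert contract_sha(k) != contract_sha({**k, "seed": 43})
```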

Stage 03

Generation

The fabricator pulls from the contract: sampling rules, range/enum constraints, and correlation rules where defined. No external model call by default; fully air-gappable.

Stage 04

Quality gates check

Distribution sanity, range bounds, enum compliance, nullability and constraint satisfaction are checked. Fail-closed — a mismatch aborts before sealing.
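The fail-closed shape can be sketched in a few lines: every gate either passes or raises before sealing, so a broken dataset is never delivered. Names and error messages below are illustrative assumptions, not the engine's actual gates.

```python
def check_gates(rows, contract):
    """Fail-closed quality gates: range bounds and enum compliance.

    Hypothetical sketch; a violation raises before the bundle is
    sealed, so the caller never receives a quietly-broken dataset.
    """
    for i, row in enumerate(rows):
        for field, (lo, hi) in contract.get("ranges", {}).items():
            v = row[field]
            if v is None or not (lo <= v <= hi):
                raise ValueError(f"row {i}: {field}={v!r} outside [{lo}, {hi}]")
        for field, allowed in contract.get("enums", {}).items():
            if row[field] not in allowed:
                raise ValueError(f"row {i}: {field}={row[field]!r} not in {allowed}")

contract = {"ranges": {"mrr": [50, 5000]}, "enums": {"plan": ["S", "G", "E"]}}
check_gates([{"mrr": 79.0, "plan": "S"}], contract)       # in-bounds: passes

aborted = False
try:
    check_gates([{"mrr": 9001.0, "plan": "S"}], contract)  # out of range
except ValueError:
    aborted = True                                         # run aborts, nothing sealed
assert aborted
```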

Stage 05

BLAKE3-chain + sign

Per-step IO hashed and chained; the final bundle is signed and ready to verify offline with the open-source evidence verifier CLI.
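The chaining step can be sketched as a classic hash chain: each link digests the previous digest concatenated with that step's payload, so the final root commits to every step in order. SHA-256 stands in for BLAKE3 here, and signing is omitted; this is an illustration of the structure, not the platform's implementation.

```python
import hashlib

def chain_root(step_outputs):
    """Chain per-step outputs: each link hashes (previous digest || payload)."""
    digest = b"\x00" * 32  # genesis link
    for payload in step_outputs:
        digest = hashlib.sha256(digest + payload).digest()
    return digest.hex()

steps = [b"contract", b"rows", b"qa-report"]
root = chain_root(steps)

assert root == chain_root(steps)  # replayable: same steps, same root
# any tampered step changes the root, so mutation is evident offline
assert root != chain_root([b"contract", b"rows!", b"qa-report"])
```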

Stage 06

Bundle delivered

.tar.zst lands on disk, in S3, in your tenant artefact prefix, or as a stream — your choice. Re-running the contract anywhere yields the same hash.

Why it’s not just another faker.js with a UI.

Most “mock data” tools are shell scripts behind a SaaS skin. Our rapid fabricator treats every output as evidence — sealed, hashed, replayable, and auditable on the same chain as the rest of the platform.

Capability                                RadMah AI Mock                      Typical mock-data tool
Latency for 200 k rows                    20–60 s incl. evidence              Minutes to hours
Same prompt → same output                 Byte-equal, contract-pinned         Run-to-run drift
Industries                                47, all overridable                 Hand-built per project
Evidence bundle per run                   BLAKE3 chain by default             Optional, paid add-on
Air-gapped operation                      No external model call              LLM round-trip required
Quality fail-closed                       Distribution / range / enum gates   Best-effort, silent drift
Same chain as Synthesize / SCADA / ICS    Yes — one ledger                    Separate tools, separate audit trails

Where Mock Data earns its keep.

Four work patterns the team uses every week. None of them require touching production data; all of them produce a bundle you can hand to anyone in the business.

Sandbox / demo data

Stand up a realistic-looking demo environment for sales, customer success, or training without touching production. Re-seed in 30 seconds for a fresh run.

Load & soak testing

Generate millions of rows that pass your validation but exercise edge ranges your real data doesn't carry. Seed-pinned so test failures are reproducible.

Test fixtures in CI

Drop slt mock into your CI to generate fixture data per branch. The contract is checked into git; the bundle is not — but it rebuilds byte-equal on demand.

Rapid prototyping

Prototype an analytics dashboard, an ML model, an API contract — all against synthetic data shaped like the real thing, before legal even joins the call.

Posture you don’t configure.

All four guarantees apply by default. None of them are paid add-ons.

Tenant-isolated

Per-tenant Fernet at rest, per-tenant artefact prefix, ORM-level row filtering — same as the rest of the platform.

Air-gappable

No external LLM call by default; runs entirely inside your VPC or air-gapped enclave.

Tamper-evident

Offline verifier flags any in-place mutation of the bundle, regardless of how it travelled.
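A stdlib sketch of what "tamper-evident" means in practice: recompute the bundle digest and compare it to the root recorded at seal time. SHA-256 stands in for BLAKE3, and the real verifier also walks the per-step chain and checks the signature; the file name and payload below are placeholders.

```python
import hashlib
import tempfile
from pathlib import Path

def verify(path: Path, expected_root: str) -> bool:
    """Offline check: recompute the bundle digest and compare it to the
    root recorded at seal time. No network, no service dependency."""
    return hashlib.sha256(path.read_bytes()).hexdigest() == expected_root

with tempfile.TemporaryDirectory() as d:
    bundle = Path(d) / "customer-bundle.tar.zst"
    bundle.write_bytes(b"sealed-bytes")                  # stand-in payload
    root = hashlib.sha256(b"sealed-bytes").hexdigest()   # recorded at seal
    ok_before = verify(bundle, root)
    bundle.write_bytes(b"sealed-bytes-mutated")          # in-place mutation
    ok_after = verify(bundle, root)

assert ok_before and not ok_after  # mutation flagged, however it travelled
```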

Audit-ready

Caller, scope, contract hash and bundle root recorded in the audit log on every call.

Bring a sentence. We’ll ship a sealed bundle.

30-minute working session: tell us the dataset shape you need and one downstream task you want to drive with it. We’ll cut the contract, generate the bundle, and walk you through verifying it offline.

evidence pipeline · job_9b3df1 · live
K Sealed contract → E Engine run → Q Quality gates → B BLAKE3 chain → S Sealed bundle
rows generated · 10 000
qa score · 0.957
chain root · a4f2…d801