The deterministic synthetic data platform — tabular, healthcare, industrial, physics — with cryptographically-sealed evidence on every run.
One platform for domains where other vendors cover at most one. Benchmark-certified tabular synthesis under an independent QA harness. HL7 FHIR R4 healthcare bundles with the clinical vocabularies already shipped. Physics-honest Virtual SCADA across the six industrial OT protocols on a deep library of CI-gated plant templates. Ground-truth-labelled MITRE ATT&CK ICS attack datasets. A contract-constrained trajectory projector. An autonomous data scientist that plans, executes, and self-heals across every engine. Every run cryptographically sealed into a multi-artefact evidence bundle, reproducible byte-for-byte across clusters.
- Benchmark-certified tabular: 95.69 %
- Industrial OT protocols: 6
- ATT&CK ICS coverage: comprehensive
- SCADA plant templates: deep library
- FHIR R4 core resources: shipped
- Agent tool surface: typed
- Self-heal on failure: native
- Cryptographic evidence: every run
- Encrypted connectors: 14
- Reproducibility: byte-exact
| Scenario | Kind | Engine | Status | Duration | Started |
|---|---|---|---|---|---|
| customer-cohort-q3 | synthesize | synthesize | Succeeded | 38 s | 2 m ago |
| pump-station-w14 | simulate | virtual_scada | Succeeded | 4 m 22 s | 19 m ago |
| soc-kill-chain-mix | simulate | ics_security | Succeeded | 6 m 04 s | 41 m ago |
| adult-income-v12 | train | gpu_s | Succeeded | 1 m 53 s | 1 h ago |
| hotel-bookings-mock | mock | mock | Succeeded | 2.1 s | 1 h ago |
| ehr-cohort-small | fabricate | ai_orchestrator | Succeeded | 14 s | 2 h ago |
| wastewater-plant-a | simulate | virtual_scada | Succeeded | 5 m 11 s | 3 h ago |
Built for three audiences, integrated by construction.
We didn’t glue three products together. The pillars share one sealed-job format, one evidence chain, one tenant model, one connector vault, one agent runtime — so a SCADA run can feed a Synthesize job, an attack mix can be labelled by the same agent that cleans your CSV, and every artefact lands in the same auditable place.
Tabular synthesis at 95.69 % benchmark-certified fidelity — five engines under one contract.
A flagship enterprise engine plus four alternate engines — including a relational cascade for linked tables — all selected and driven through one sealed contract. Mock from a one-line description, synthesise from your CSV, or hand the wheel to the autonomous agent. Every artefact joins a cryptographic evidence chain; the AI Assistant lets a non-engineer drive it in plain English with explicit cost gates.
Open the synthetic data pillar
HL7 FHIR R4 bundles with shipped clinical vocabularies and zero PHI.
Eight FHIR R4 resource types (Patient / Encounter / Condition / Observation / MedicationRequest / AllergyIntolerance / Procedure / Immunization), generated deterministically with 100% referential integrity by construction. Ships a large LOINC subset, a broad RxNorm set, and the full US ICD-10-CM catalogue under free licences. SNOMED CT stays BYO-licence. Two-stage validator gate (in-house structural + full R4 datatype conformance, no Java runtime required).
Open the healthcare FHIR pillar
Six OT protocols at IEEE-spec binary level, 67 MITRE ATT&CK ICS techniques, 67 CI-gated plant templates.
Six industrial OT protocols — Modbus, OPC-UA, BACnet, MQTT, DNP3, IEC 61850 — with real wire-level packet capture. Physics-honest process kernels spanning water-treatment, power, and chemical facilities. Air-gapped virtual controllers for red-team and operator-training scenarios. Ground-truth-labelled attack datasets mapped to the public MITRE ATT&CK ICS framework with blast-radius cascade modelling. Compose the three into a sealed cyber-range bundle your IDS or SOC platform can score against.
Open the industrial simulators pillar
One typed surface, three entry points.
Typed Python SDK with Pydantic v2 throughout and an async-first primitive set, OpenAPI 3.1 REST surface with idempotency keys and HMAC-SHA256-signed webhook payloads, and 14 encrypted source connectors that auto-vault inline secrets to a Fernet-encrypted store. A single engineer can wire RadMah AI into a real data plane in an afternoon — and a contract-test runner is already waiting for your CI.
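For the signed webhooks mentioned above, a receiver typically recomputes the HMAC-SHA256 of the raw request body and compares it in constant time. The following is a minimal sketch using only the Python standard library. The secret value, the payload shape, and the idea that the signature arrives as a hex string are assumptions for illustration, not the documented RadMah AI wire format.

```python
import hashlib
import hmac


def sign_payload(secret: bytes, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify_webhook(secret: bytes, body: bytes, signature_hex: str) -> bool:
    """Recompute the signature and compare it in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected, signature_hex)


# Hypothetical values, for illustration only.
secret = b"whsec_example"
body = b'{"event": "job.succeeded", "job_id": "job_9b3df1"}'
signature = sign_payload(secret, body)  # what a sender would attach

print(verify_webhook(secret, body, signature))          # True
print(verify_webhook(secret, body + b" ", signature))   # False
```

Verifying against the raw bytes, before any JSON parsing, avoids false mismatches caused by re-serialisation.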
Open the developer platform pillar
Six surfaces, one evidence chain.
A walk through the real surface — the agent chat that drives the whole platform, a live SCADA HMI driving real industrial protocols, the sealed ICS attack timeline, the OpenAPI REST surface, the typed SDK and REPL, the Autonomous Data Scientist runtime, and the cryptographically-chained evidence bundle an auditor opens offline. Every panel below is wired through the same sealed-job format, the same cryptographic chain, the same tenant vault.
Six industrial tags on real wire protocols.
Modbus/TCP 502 and OPC-UA 4840 streaming at a 2 ms cycle. Six tags — discharge pressure, suction pressure, motor temperature, VFD output, flow, and the MT-621 over-speed alarm — all driven by a physics-honest plant model, not canned CSVs. The HMI shows the same values a real operator sees on the panel; the seal at the bottom is the run's BLAKE3 root, not a decoration.
- ◆ Deep library of pre-built plant templates across verticals
- ◆ 6 protocols — Modbus, OPC-UA, BACnet, MQTT, DNP3, IEC 61850
- ◆ Sealed signals + alarms + commands streams per run
Ground-truth attacks, not heuristic guesses.
A six-minute window with six MITRE ATT&CK ICS events overlaid on the SCADA pressure trace — command injection, view spoofing, parameter modification, alarm suppression, program modification. Every pin is a row in truth.ndjson with start time, stop time, technique ID, and asset ID. Your IDS no longer needs a human to label the validation set.
- ◆ 9 MITRE ATT&CK ICS classes wired end-to-end
- ◆ pcapng + signals.parquet + truth.ndjson in one sealed bundle
- ◆ Per-event severity + impact classes for regression-testing detection rules
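To illustrate how a ground-truth file like truth.ndjson can replace hand labelling, here is a small sketch that scores IDS alerts against labelled attack windows. The field names (`start`, `stop`, `technique_id`, `asset_id`), the use of seconds as the time unit, and the sample rows are assumptions; the real schema may differ.

```python
import json

# Hypothetical truth.ndjson content; field names and values are illustrative.
TRUTH_NDJSON = """\
{"start": 30.0, "stop": 45.0, "technique_id": "T0855", "asset_id": "PLC-1"}
{"start": 120.0, "stop": 150.0, "technique_id": "T0836", "asset_id": "PLC-1"}
{"start": 200.0, "stop": 210.0, "technique_id": "T0878", "asset_id": "HMI-2"}
"""


def load_truth(text):
    """Parse newline-delimited JSON into a list of event dicts."""
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def score(truth, alerts):
    """Match (timestamp, asset_id) alerts to truth windows.

    Returns (number of truth events detected, number of false alarms).
    """
    detected = set()
    false_alarms = 0
    for ts, asset in alerts:
        hits = [
            i for i, ev in enumerate(truth)
            if ev["asset_id"] == asset and ev["start"] <= ts <= ev["stop"]
        ]
        if hits:
            detected.update(hits)
        else:
            false_alarms += 1
    return len(detected), false_alarms


truth = load_truth(TRUTH_NDJSON)
alerts = [(32.5, "PLC-1"), (90.0, "PLC-1"), (205.0, "HMI-2")]  # IDS output
detected, false_alarms = score(truth, alerts)

recall = detected / len(truth)
precision = (len(alerts) - false_alarms) / len(alerts)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

Because every window carries a technique ID, the same loop can be grouped per technique to see which detection rules regress between releases.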
| Step | Parameters | Estimate |
|---|---|---|
| data_clean | drop_threshold=0.05 | ~5c |
| encode_categorical | method=onehot | ~3c |
| train_predictive | model=randomforest | ~12c |
| shap_explain | sample=200 | ~3c |
Plain-English driver, explicit cost gate.
The same chat surface drives Mock, Synthesize, Virtual SCADA, ICS attack composition, and the Autonomous Data Scientist. Before anything expensive runs, the agent returns a plan card with concrete steps, the compute class it intends to use, and a credit estimate. Nothing spends until you tap approve — and the entire transcript, including the plan, the approvals, the tool calls, and the sealed outputs, lands in the same evidence bundle.
- ◆ Soft-cap per turn, hard-cap per project — the agent stops before it overspends
- ◆ Every tool call is typed, versioned, and signed into the run ledger
- ◆ Transcript + plan + approvals are part of the BLAKE3-sealed bundle
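The soft-cap and hard-cap behaviour described above can be sketched as a small gate that every plan passes through before it spends. This is an illustrative model only: the class name, the return values, and the cap sizes are hypothetical, and the plan estimate assumes the `~5c`-style figures on a plan card are credits.

```python
from dataclasses import dataclass


class HardCapExceeded(Exception):
    """Raised when a plan would push the project past its hard cap."""


@dataclass
class SpendGate:
    soft_cap_per_turn: int      # above this, a human must approve
    hard_cap_per_project: int   # never exceeded, approved or not
    spent: int = 0

    def check(self, estimate: int, approved: bool = False) -> str:
        if self.spent + estimate > self.hard_cap_per_project:
            raise HardCapExceeded(
                f"{self.spent} + {estimate} exceeds {self.hard_cap_per_project}"
            )
        if estimate > self.soft_cap_per_turn and not approved:
            return "needs_approval"   # nothing is spent yet
        self.spent += estimate
        return "ok"


# A plan with four steps estimated at 5, 3, 12 and 3 credits.
plan_estimate = 5 + 3 + 12 + 3

gate = SpendGate(soft_cap_per_turn=20, hard_cap_per_project=40)
print(gate.check(plan_estimate))                 # needs_approval
print(gate.check(plan_estimate, approved=True))  # ok
print(gate.spent)                                # 23
```

A second approved run of the same plan would raise `HardCapExceeded`, since 46 credits is over the project cap of 40.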
Python on the left, real-time run events on the right.
The Python SDK is Pydantic-v2-typed throughout, async-first, and fully type-checked — no untyped dicts, no string-built payloads. The REPL side shows the live stream from the same run: epoch loss, quality gates, contract seal events, and the BLAKE3 root verification at the end. The SDK is the same object model the REST API speaks, so migrating from prototype to production is a one-line client swap.
- ◆ 100% type coverage · async-first primitives
- ◆ Identical object model across SDK and REST surface
- ◆ Contract-test runner drops into your CI out of the box
```python
# Top-level await works in an asyncio REPL (python -m asyncio).
from radmah_sdk import Client

client = Client(api_key="rm_live_…")

# Synthesize from a CSV upload
ds = await client.datasets.upload("./customers.csv")
job = await client.synthesize.run(
    dataset_id=ds.id,
    model="synthesize",
    rows=10_000,
    seed=42,
)
await job.wait()

bundle = await client.evidence.fetch(job.id)
assert bundle.verify()  # BLAKE3, offline
print(job.metrics.qa_score)
```
- ▶ uploading customers.csv ………… 4.7 MB
- dataset ds_4a7c81 · 20 cols · 482 931 rows
- ▶ POST /v1/synthesize/jobs 202
- job_9b3df1 · estimate 38 s · 14 credits
- ▶ training the engine on 17 numeric / 3 cat …
- epoch 12 / 20 loss 0.083
- epoch 20 / 20 loss 0.041
- ▶ generating 10 000 rows ………… 1.6 s
- ✓ K-S gate 0.018 · χ² gate ok · constraints ok
- ▶ fetching evidence bundle …
- ✓ chain verified root a4f2…d801
- >>> 0.9569
Every job runs through six guarantees.
The same six guarantees apply whether the engine is Mock, Synthesize, Virtual SCADA, ICS Security, or an autonomous agent. They are not premium features you turn on — they are the only path through the platform.
Sealed before run
Every job opens with a sealed job specification that captures what you asked for and how it must be generated. Once committed it cannot be silently mutated; the spec's cryptographic hash is the root of the entire downstream evidence chain.
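One common way to seal a specification is to serialise it canonically and hash the bytes, so that any change to the spec changes the hash. The sketch below assumes a JSON spec with hypothetical field names and uses `hashlib.blake2b` from the standard library as a stand-in, because BLAKE3 is not part of the Python standard library. It is not the platform's actual sealing format.

```python
import hashlib
import json


def seal_spec(spec: dict) -> str:
    """Hash a canonical JSON encoding of the job spec."""
    canonical = json.dumps(
        spec, sort_keys=True, separators=(",", ":"), ensure_ascii=True
    )
    # blake2b is a stand-in here; the platform itself names BLAKE3.
    return hashlib.blake2b(canonical.encode("utf-8"), digest_size=32).hexdigest()


# Hypothetical spec fields, for illustration only.
spec_a = {"engine": "synthesize", "rows": 10_000, "seed": 42}
spec_b = {"seed": 42, "rows": 10_000, "engine": "synthesize"}  # same content
spec_c = {**spec_a, "seed": 43}                                # one field changed

print(seal_spec(spec_a) == seal_spec(spec_b))  # True: key order does not matter
print(seal_spec(spec_a) == seal_spec(spec_c))  # False: any mutation is visible
```

Sorting keys and fixing separators is what makes the hash stable across machines; without a canonical form, two equal specs could seal differently.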
Per-step provenance
As the engine runs, each step's inputs, outputs, parameters, hardware, and timing are written to a per-step proof packet. Any tool-generated code is versioned, and every quality gate is watched in real time — if a gate trips, the failing sub-graph is repaired and the chain is rebuilt without losing what already passed.
Quality fail-closed
Distribution distance, correlation structure, constraint satisfaction, and per-column drift are checked at the end of every generation step. There is no silent degradation — a regression aborts the run and the operator sees the exact gate that failed with a recommended fix.
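A fail-closed distribution gate can be sketched with a two-sample Kolmogorov-Smirnov statistic: compute the largest gap between the two empirical CDFs and abort if it exceeds a threshold. The sketch below is a pure-Python illustration; the threshold value, the exception name, and the sample columns are assumptions, and the platform's real gates are not shown here.

```python
from bisect import bisect_right


class GateFailed(Exception):
    """Raised when a quality gate trips; the run should stop here."""


def ks_statistic(a, b):
    """Two-sample K-S statistic: max gap between the empirical CDFs."""
    a_sorted, b_sorted = sorted(a), sorted(b)
    d = 0.0
    for x in a_sorted + b_sorted:
        fa = bisect_right(a_sorted, x) / len(a_sorted)
        fb = bisect_right(b_sorted, x) / len(b_sorted)
        d = max(d, abs(fa - fb))
    return d


def ks_gate(real, synthetic, threshold):
    """Return the distance if it passes; raise instead of degrading silently."""
    d = ks_statistic(real, synthetic)
    if d > threshold:
        raise GateFailed(f"K-S distance {d:.3f} exceeds threshold {threshold:.3f}")
    return d


real = [1, 2, 3, 4, 5]
close = [1, 2, 3, 4, 6]         # nearly the same distribution
shifted = [11, 12, 13, 14, 15]  # clearly different

print(f"{ks_gate(real, close, threshold=0.25):.3f}")  # 0.200
try:
    ks_gate(real, shifted, threshold=0.25)
except GateFailed as err:
    print("aborted:", err)
```

Raising an exception rather than returning a flag is the fail-closed part: a caller that forgets to check the result still cannot continue past a failed gate.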
Cryptographically-chained ledger
Each step's IO + parameters are hashed and chained into a tamper-evident ledger. The chain root makes the bundle uniquely identifiable, and any in-place mutation of any artefact downstream is immediately visible to the offline verifier — no trust in the runtime required.
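The tamper-evidence property follows from chaining: each entry's hash covers the previous entry's hash, so changing any record changes every hash after it, including the root. Here is a minimal sketch of that idea. The record fields are hypothetical, and `hashlib.blake2b` stands in for BLAKE3, which is not in the Python standard library.

```python
import copy
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode()
    return hashlib.blake2b(prev_hash.encode() + payload, digest_size=32).hexdigest()


def build_chain(records):
    """Link each record to the hash of the entry before it."""
    chain, prev = [], GENESIS
    for rec in records:
        h = _digest(prev, rec)
        chain.append({"record": rec, "prev": prev, "hash": h})
        prev = h
    return chain


def verify_chain(chain, expected_root: str) -> bool:
    """Offline check: recompute every link and compare the final root."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or _digest(prev, entry["record"]) != entry["hash"]:
            return False
        prev = entry["hash"]
    return prev == expected_root


# Hypothetical step records, for illustration only.
steps = [
    {"step": "load", "params": {"rows": 482931}},
    {"step": "train", "params": {"epochs": 20}},
    {"step": "generate", "params": {"rows": 10000, "seed": 42}},
]
chain = build_chain(steps)
root = chain[-1]["hash"]

tampered = copy.deepcopy(chain)
tampered[1]["record"]["params"]["epochs"] = 5  # in-place mutation

print(verify_chain(chain, root))     # True
print(verify_chain(tampered, root))  # False
```

The verifier needs only the chain and the published root, which is why this kind of check can run offline without trusting the system that produced the bundle.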
Signed, portable bundle
The final .tar.zst bundle ships with the contract, the per-step run-log, the quality report, the BLAKE3 manifest, the SBOM of the engine version that produced it, and a plain-English narrative for the auditor. Re-running the contract on a different cluster yields the same dataset hash — byte for byte.
Tenant-isolated, end to end
Per-tenant Fernet keys at rest, per-tenant artefact prefixes, per-tenant evidence keys, JWT-scoped API surface, ORM-level row filtering. There is no path through the system where data from tenant A can be physically retrieved by tenant B — not in logs, not in caches, not in run state, not in evidence bundles.
What RadMah AI does, in concrete numbers.
A single platform that covers tabular synthesis, healthcare FHIR, industrial OT, ICS attack data, physics-constrained sampling, agentic orchestration, and cryptographic evidence. Each row below maps to code we ship today.
| Capability | What ships today |
|---|---|
| Tabular synthesis | Benchmark-certified fidelity under an independent QA harness; a flagship engine plus four alternates (including a relational cascade for linked tables) selected through one sealed contract |
| Deterministic re-generation | Sealed contract + seed — byte-identical output across clusters and time |
| Cryptographic evidence per run | Multi-artefact cryptographic bundle by default on every job; offline verifier ships with the SDK |
| Constraint-aware generation | Primary-key / foreign-key, monotonicity, sum, and rate-limit constraints; hard-projection post-processor for trajectories |
| Healthcare FHIR R4 | Core resource types with the standard clinical vocabularies shipped; premium vocabularies BYO; conformance validator at run time; zero PHI |
| Industrial OT simulators | Six OT protocols at binary-spec level (Modbus, OPC-UA, BACnet, MQTT, DNP3, IEC 61850); wire-level packet capture; physics-honest process kernels |
| ICS attack datasets | Comprehensive coverage of the public MITRE ATT&CK ICS framework with per-event ground-truth labels; blast-radius cascade modelling; configurable attack density |
| Autonomous data scientist | Planner + executor + self-healer driving a typed tool surface; cryptographically-chained decision audit; human-approval gate on credit-spending steps |
| Connector secrets | Per-tenant encrypted vault, auto-hoisted from inline config, per-tenant key-encryption-key rotation |
| Deployment options | Managed SaaS (multi-tenant); Enterprise Data Never Leaves (signed container delivery on customer's network, licence-bound distribution) |
Benchmark numbers are reproducible under the same independent QA harness using matched train / test splits. Full per-release benchmark certificates are published in the customer dashboard and on the public /verify page, where you can download and byte-verify a real 9-artefact BLAKE3 bundle.
Posture that survives procurement.
The four guarantees below are not toggles in a settings page. They are architectural — turning them off would mean re-writing the platform.
Tenant-isolated end-to-end
Per-tenant Fernet at rest, per-tenant artefact prefixes, ORM-level row filtering, JWT-scoped API.
Spend-bounded by design
Soft caps per turn, hard caps per project; the agent halts and asks before any large spend.
Audit-ready by default
Sealed transcripts and signed bundles are not premium — they are how the platform works.
Replayable across clusters
Re-run the same sealed job spec + seed on any cluster: dataset hash matches byte-for-byte.
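The replay guarantee rests on a simple principle: if all randomness comes from the seed and serialisation is fixed, the output bytes, and therefore the hash, are a pure function of the spec. The toy generator below illustrates that principle only. The column layout is invented, SHA-256 is used for convenience, and a real engine additionally has to pin library versions and avoid non-deterministic numerical operations.

```python
import hashlib
import random


def generate(seed: int, rows: int) -> bytes:
    """Toy generator: all randomness flows from one seeded RNG."""
    rng = random.Random(seed)
    lines = [
        f"{i},{rng.randint(18, 90)},{rng.random():.6f}" for i in range(rows)
    ]
    return ("\n".join(lines) + "\n").encode("utf-8")


def dataset_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


first = dataset_hash(generate(seed=42, rows=1000))
replay = dataset_hash(generate(seed=42, rows=1000))
other = dataset_hash(generate(seed=43, rows=1000))

print(first == replay)  # True: same spec and seed, same bytes
print(first == other)   # False: a different seed gives a different dataset
```

Using a private `random.Random(seed)` instance rather than the module-level functions keeps the generator independent of any other code that touches global random state.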
Bring a real dataset. We’ll ship a sealed bundle.
30-minute working session: bring (or we mock) a representative dataset and one open question. We drive Mock, Synthesize, the Autonomous Data Scientist, or any combination, end-to-end. You keep the BLAKE3-sealed evidence bundle, the quality report, and a sandbox API key to keep iterating.