Cryptographic evidence · offline

Verify a real RadMah AI evidence bundle. Offline. Without an account.

Every RadMah AI run — SaaS or Data-Never-Leaves — emits a multi-artefact BLAKE3-sealed evidence bundle. Download the real example below, unzip it, inspect every JSON file, and recompute the release-seal hash yourself. The bundle is tamper-evident: any in-place edit to any of the eight prior artefacts invalidates the ninth, the release seal, and the offline verifier catches it without trusting our runtime.

The bundle. Every run. Every tier.

The bundle below is the real output of a RadMah AI FHIR R4 generation run with seed 42 and a ten-patient sealed contract. Every file is JSON, every hash is BLAKE3, and every byte is committed to the evidence chain.

Sealed contract — the job specification the run was committed to. Carries the contract hash that is the root of the evidence chain.

Download

Engine identity — which engine ran, its version, the pinned library versions, and the model-state hash. Auditable without shipping weights.

Download

Run telemetry — per-entity row counts, total generated rows, and any structured warnings raised during the run.

Download

Constraint report — pass / fail counts against every constraint the contract declared. The hard-violation count drives the release gate.

Download

Determinism report — seed, final BLAKE3 hash of the synth output, RNG stream metadata, and drift certificates. Re-running the same contract on a different cluster reproduces the same final hash byte-for-byte.

Download

Utility report — distributional-fidelity metrics and task scores. In the SaaS bundle this carries benchmark scores (Kolmogorov-Smirnov similarity, two-sample classifier test, train-synth-test-real utility, and similar standard metrics); in local-mode bundles the assessment method is explicitly marked 'local-mode evaluator'.

Download

Privacy report — membership-inference, linkage, and attribute-inference risk numbers. Marked 'local-mode evaluator' in local-mode bundles because those evaluators require a reference dataset that local SDKs do not have.

Download

Artifact index — name, type, row count, and BLAKE3 hash of every data artefact the run produced. The index itself is hashed and folded into the release seal.

Download
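For orientation, a single entry in the artifact index might look like the sketch below. The field names and values here are hypothetical illustrations, not the real schema — the bundle you download is the source of truth:

```json
{
  "name": "patients.ndjson",
  "type": "fhir_r4_patient",
  "row_count": 10,
  "blake3": "<64-hex-char BLAKE3 digest of the artefact bytes>"
}
```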

Release seal — status, evaluator verdict, hard / soft violation counts, and the BLAKE3 release-seal hash over the canonical concatenation of the prior artefacts. Verifying this hash is the offline integrity check.

Download
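To make the description above concrete, a release seal might carry fields along these lines. Only release_seal_hash is exercised by the verifier script in the next section; the other field names are hypothetical illustrations of the status, verdict, and violation counts described above:

```json
{
  "status": "released",
  "evaluator_verdict": "pass",
  "hard_violations": 0,
  "soft_violations": 2,
  "release_seal_hash": "<64-hex-char BLAKE3 digest over the canonical concatenation>"
}
```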

Reproduce the release seal yourself.

The release seal is the BLAKE3 hash over the canonical concatenation of the eight prior artefacts: files in lexicographic filename order, two lines per file (the filename, then the file's content), joined by newlines. Any in-place edit to any of those eight files changes the recomputed hash, and the bundle fails verification without needing to trust our runtime.

# 1. Download the prior artefacts
mkdir evidence && cd evidence
for f in artifact_index.json constraint_report.json contract.json \
         determinism_report.json engine_manifest.json \
         privacy_report.json run_telemetry.json utility_report.json; do
  curl -fsSL -O https://radmah.ai/evidence/example/$f
done

# 2. Also download the release seal (the 9th file)
curl -fsSL -O https://radmah.ai/evidence/example/release_seal.json

# 3. Recompute the release seal hash — sorted by filename,
#    two lines per file ("<name>\n<content>"), BLAKE3-hashed
python3 - <<'PY'
import json
from pathlib import Path
try:
    import blake3
except ImportError:
    import subprocess, sys
    subprocess.check_call([sys.executable, "-m", "pip", "install", "blake3"])
    import blake3

names = sorted([p.name for p in Path(".").glob("*.json") if p.name != "release_seal.json"])
parts = []
for n in names:
    parts.append(n)
    parts.append(Path(n).read_text())
payload = "\n".join(parts).encode("utf-8")
computed = blake3.blake3(payload).hexdigest()

declared = json.loads(Path("release_seal.json").read_text())["release_seal_hash"]

print(f"computed  {computed}")
print(f"declared  {declared}")
assert computed == declared, "SEAL MISMATCH — bundle was altered"
print("OK — bundle integrity verified offline")
PY

Determinism, proven

Re-run the same sealed contract with the same seed on any host and the final hash recorded in determinism_report.json matches byte-for-byte.

Tamper-evident

Any edit to any of the prior artefacts shifts the release seal. No way to silently mutate the bundle after the fact.
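The tamper-evidence property is easy to demonstrate on a toy bundle. A minimal sketch, using the blake3 package when available and falling back to stdlib BLAKE2 purely so the sketch runs anywhere (the real seal is BLAKE3 only):

```python
try:
    from blake3 import blake3 as seal_hash
except ImportError:
    from hashlib import blake2b as seal_hash  # stand-in so the sketch runs without blake3

def release_seal(artefacts: dict) -> str:
    # Canonical concatenation: lexicographic filename order,
    # two lines per file ("<name>\n<content>"), newline-joined.
    parts = []
    for name in sorted(artefacts):
        parts.append(name)
        parts.append(artefacts[name])
    return seal_hash("\n".join(parts).encode("utf-8")).hexdigest()

# Toy two-file bundle with made-up contents.
bundle = {
    "contract.json": '{"seed": 42}',
    "run_telemetry.json": '{"rows": 10}',
}
original = release_seal(bundle)

# Flip a single character in one artefact: the recomputed seal moves.
bundle["run_telemetry.json"] = '{"rows": 11}'
tampered = release_seal(bundle)

assert original != tampered
```

The same property holds for the real nine-artefact bundle: the seal is a function of every byte of every prior file, so no in-place edit survives recomputation.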

No runtime trust

Verification runs on the auditor's machine with nothing but Python and the blake3 package. RadMah AI's servers are not in the trust boundary.

Benchmark methodology — where 95.69 % comes from.

The flagship engine's 95.69 % benchmark-certified fidelity claim is measured under an independent third-party QA harness on a canonical public tabular benchmark. Full methodology below, so you can reproduce the claim end-to-end.

Dataset: UCI Adult (``adult-income``) — 32 561 rows, 14 mixed continuous + categorical columns, canonical public benchmark
Split: canonical 24 K-row shared train split; held-out test partition per benchmark convention
Engine + config: flagship engine at its default configuration — architecture and hyperparameters sealed in the engine manifest
QA harness: independent third-party tabular-synthesis QA library — distributional fidelity (KS), inter-column dependency (Pearson Frobenius), discriminator AUC (C2ST), membership-inference AUC, nearest-neighbour DCR
Overall score: 95.69 %
KS median: strong on the flagship engine; strong on the parametric baseline (same slice)
Correlation Frobenius Δ: strong on the flagship engine; strong on the diffusion engine
Reproducibility: the same sealed contract + seed 42 produces byte-identical output on any host; the full hash is written to the sealed determinism_report.json artefact
Claim ledger: internal methodology source of truth is docs/BENCHMARK_CLAIMS.md — every qualified claim with exact dataset, split, seed, and commit-of-record
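To make the KS figure concrete: the two-sample Kolmogorov-Smirnov statistic is the largest gap between the empirical CDFs of a real column and its synthetic counterpart, and a similarity score is one minus that gap. The sketch below is illustrative only — the QA harness's exact formula and per-column aggregation are not specified here:

```python
import numpy as np

def ks_similarity(real: np.ndarray, synth: np.ndarray) -> float:
    """1 - max |ECDF_real - ECDF_synth|, evaluated at every observed point."""
    grid = np.sort(np.concatenate([real, synth]))
    ecdf_real = np.searchsorted(np.sort(real), grid, side="right") / real.size
    ecdf_synth = np.searchsorted(np.sort(synth), grid, side="right") / synth.size
    return 1.0 - float(np.max(np.abs(ecdf_real - ecdf_synth)))

rng = np.random.default_rng(42)
real = rng.normal(size=5_000)
close = rng.normal(size=5_000)             # same distribution: similarity near 1
shifted = rng.normal(loc=3.0, size=5_000)  # shifted distribution: similarity drops

print(round(ks_similarity(real, close), 3))
print(round(ks_similarity(real, shifted), 3))
```

A synthetic column that tracks the real marginal closely scores near 1; a badly drifted one scores near 0, which is the intuition behind quoting a bundle-level fidelity percentage.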

The numbers above are the anchor benchmark. RadMah AI emits the same KS / correlation / privacy metrics into utility_report.json and privacy_report.json on every run (SaaS tiers). In local-mode SDK bundles, the evaluator fields are marked "local-mode evaluator" because those metrics require a reference dataset the local SDK does not have. Full per-release benchmark certificates are published in the customer dashboard.

The full forensic bundle lives on the SaaS path.

This example is the SDK local-mode bundle, which omits the reference-dataset-dependent privacy and utility evaluators. The managed SaaS runs the full evaluator chain (release gate, privacy runners, utility runners) and seals the forensic-complete bundle in the same multi-artefact format.

Every RadMah AI run ships a bundle like the one above — by default, on every tier.