May 4, 2026
6 min read

The Trusted Setup: How Datasent Keeps Raw Data Local

A trusted setup is often associated with cryptographic ceremonies and secret key generation. Datasent's version is simpler — and more powerful. No secrets required.

Not a Cryptographic Ceremony

In many zero-knowledge proof systems, a trusted setup is a one-time ceremony that generates public parameters and, critically, toxic waste: secret values that must be destroyed. If they are not destroyed, the security of the entire system collapses. These ceremonies require multiple independent parties, elaborate coordination, and significant operational overhead.

Datasent's trusted setup shares the name but not the complexity. It is an agreement between sender and receiver on three things before any data-dependent computation occurs: the basis type and parameters to use for modelling, the segmentation strategy, and the quantisation and rounding rules. No secrets are generated. No secret destruction is required. The setup parameters are fully public.

Their role is simply to ensure that both parties can independently regenerate the same basis matrix from a compact metadata description, without transmitting the basis itself and without ever needing to see the raw data.
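As a rough illustration of that idea, the sketch below regenerates a basis matrix from a small public description. The DCT basis, the field names (`basis`, `segment_length`, `num_terms`, `round_digits`) and the `build_basis` helper are all assumptions made for this example, not Datasent's actual metadata format.

```python
import json
import math


def build_basis(meta: dict) -> list[list[float]]:
    """Regenerate a basis matrix from public setup metadata alone.

    Hypothetical helper: only a cosine (DCT-II style) basis is supported here.
    """
    if meta["basis"] != "dct":
        raise ValueError("this sketch only knows the 'dct' basis")
    n = meta["segment_length"]
    k = meta["num_terms"]
    digits = meta["round_digits"]
    # Row j holds the j-th cosine sampled at n points. Values are rounded
    # according to the agreed rule so both sides store identical numbers.
    return [
        [round(math.cos(math.pi * (i + 0.5) * j / n), digits) for i in range(n)]
        for j in range(k)
    ]


# The setup travels as a compact public description, never as a matrix.
setup = json.dumps(
    {"basis": "dct", "segment_length": 8, "num_terms": 3, "round_digits": 12}
)

sender_basis = build_basis(json.loads(setup))
receiver_basis = build_basis(json.loads(setup))
assert sender_basis == receiver_basis
```

Nothing in `setup` depends on the data, which is why it can be public. The rounding step stands in for the agreed quantisation and rounding rules; a production system would likely need a stricter numeric specification than floating-point rounding to guarantee bit-identical results across platforms.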

"Raw data never traverses organisational or network boundaries in any Datasent configuration. What crosses the network is only the residual or, in governed deployments, secret shares of coefficient data."

Three Roles, One Protocol

The Datasent protocol involves three logical actors. The Data Holder possesses the raw data, performs local fitting, and transmits residuals. The Reconstruction Party receives residuals and reconstructs the data under authorisation, but never possesses the raw source. The Custodian is an optional gatekeeper that holds shares of coefficient data and releases them only upon explicit authorisation.

In a standard deployment, the Data Holder sends residuals to the Reconstruction Party. The Custodian holds a secret share of the coefficient data. Reconstruction requires the Custodian to release its share, which it does only after an authorisation check. Every reconstruction event is logged. Every authorisation can be revoked.

This architecture reframes data sharing as a governed token exchange. Neither party can reconstruct the data alone. Raw data never leaves the Data Holder. And because reconstruction is deterministic, every output can be verified against a stored hash of the original canonical representation.
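The Custodian's gatekeeping behaviour described above can be sketched in a few lines. The `Custodian` class, its method names and the in-memory log are hypothetical, chosen only to show the authorise, release, log and revoke steps; a real deployment would use authenticated requests and durable audit storage.

```python
from dataclasses import dataclass, field


@dataclass
class Custodian:
    """Hypothetical gatekeeper holding one share of the coefficient data."""

    share: bytes
    authorised: set[str] = field(default_factory=set)
    log: list[tuple[str, str]] = field(default_factory=list)

    def authorise(self, party: str) -> None:
        self.authorised.add(party)

    def revoke(self, party: str) -> None:
        self.authorised.discard(party)

    def release_share(self, party: str) -> bytes:
        # Every request is logged, whether or not it succeeds.
        if party not in self.authorised:
            self.log.append((party, "denied"))
            raise PermissionError(f"{party} is not authorised")
        self.log.append((party, "released"))
        return self.share


custodian = Custodian(share=b"\x01\x02")
custodian.authorise("recon-party")
released = custodian.release_share("recon-party")

# After revocation the same request is refused.
custodian.revoke("recon-party")
try:
    custodian.release_share("recon-party")
    denied = False
except PermissionError:
    denied = True
```

Without the released share, the Reconstruction Party holds only residuals and its own share, which is the sense in which neither party can reconstruct alone.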

Threshold Sharing

In the two-of-two XOR sharing scheme, the coefficient data for a segment is split into two shares using a uniformly random bitstring. Either share alone reveals nothing about the underlying coefficients — this is information-theoretic secrecy, not computational security. Only the combination of both shares recovers the original.
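A minimal sketch of two-of-two XOR sharing follows, assuming the coefficient data has already been serialised to bytes. The function names `split_xor` and `combine_xor` are illustrative, not part of any Datasent API.

```python
import secrets


def split_xor(data: bytes) -> tuple[bytes, bytes]:
    """Split data into two shares; each share alone is uniformly random."""
    pad = secrets.token_bytes(len(data))  # uniformly random bitstring
    masked = bytes(a ^ b for a, b in zip(data, pad))
    return pad, masked


def combine_xor(share_a: bytes, share_b: bytes) -> bytes:
    """Recover the original data by XORing both shares together."""
    if len(share_a) != len(share_b):
        raise ValueError("shares must have equal length")
    return bytes(a ^ b for a, b in zip(share_a, share_b))


coefficients = b"example coefficient bytes"
holder_share, custodian_share = split_xor(coefficients)
assert combine_xor(holder_share, custodian_share) == coefficients
```

The secrecy guarantee depends on the pad being drawn fresh from a cryptographically secure source for every segment, which is why the sketch uses `secrets` rather than `random`.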

For deployments requiring more complex governance, Shamir secret sharing generalises this to k-of-n: any k shares reconstruct the data, but any k minus one shares reveal nothing. This supports multi-custodian models where data reconstruction requires sign-off from a quorum of authorised parties.
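For readers unfamiliar with the k-of-n scheme, here is a compact sketch of Shamir sharing over a prime field. The choice of prime, the integer encoding of the secret and the function names are assumptions for illustration; the source does not specify how Datasent parameterises it.

```python
import secrets

PRIME = 2**127 - 1  # a Mersenne prime; the secret must be smaller than this


def split_shamir(secret: int, k: int, n: int) -> list[tuple[int, int]]:
    """Return n shares (x, y); any k of them recover the secret."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret out of range for the field")
    if not 1 <= k <= n:
        raise ValueError("need 1 <= k <= n")
    # Random polynomial of degree k - 1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]

    def evaluate(x: int) -> int:
        acc = 0
        for c in reversed(coeffs):  # Horner's rule
            acc = (acc * x + c) % PRIME
        return acc

    return [(x, evaluate(x)) for x in range(1, n + 1)]


def combine_shamir(shares: list[tuple[int, int]]) -> int:
    """Lagrange-interpolate the polynomial at x = 0 to recover the secret."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


shares = split_shamir(123456789, k=3, n=5)
assert combine_shamir(shares[:3]) == 123456789
assert combine_shamir([shares[0], shares[2], shares[4]]) == 123456789
```

Any three of the five shares work, so a quorum can be formed from whichever custodians sign off. The modular inverse via `pow(den, -1, PRIME)` requires Python 3.8 or later.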

Integrity Without Zero-Knowledge Proofs

Because the Datasent representation is deterministic, reconstructed data can be re-canonicalised and its hash compared to a digest stored at encoding time. Tampering with residuals or coefficients is detectable without any cryptographic machinery beyond a standard collision-resistant hash function.
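The check itself is small. In the sketch below, SHA-256 is the hash and compact JSON stands in for the canonical representation; both are assumptions, as are the helper names `canonicalise`, `digest` and `verify`.

```python
import hashlib
import hmac
import json


def canonicalise(values: list[int]) -> bytes:
    """Hypothetical canonical form: compact JSON with fixed separators."""
    return json.dumps(values, separators=(",", ":")).encode("utf-8")


def digest(values: list[int]) -> str:
    return hashlib.sha256(canonicalise(values)).hexdigest()


def verify(reconstructed: list[int], stored_digest: str) -> bool:
    """Re-canonicalise the output and compare against the stored digest."""
    return hmac.compare_digest(digest(reconstructed), stored_digest)


original = [12, 15, 11, 18]
stored = digest(original)  # recorded at encoding time

ok = verify([12, 15, 11, 18], stored)        # faithful reconstruction
tampered = verify([12, 15, 11, 19], stored)  # one value altered
assert ok and not tampered
```

The comparison only works because canonicalisation is deterministic: the same values must always serialise to the same bytes, or a faithful reconstruction would fail the check.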

This provides a verifiable chain of custody from raw data collection to final output. Future work on ZK proof integration would allow parties to prove that residuals correspond to a valid fitting, that reconstruction occurred under authorisation, and that coefficients satisfy error bounds, all without revealing any underlying data. The architecture is designed to support this extension.

Conclusion

In most systems, “trust” is introduced after the fact — through encryption, controlled environments, or complex access layers. But these approaches all share the same limitation: raw data has already moved.

Datasent redefines what a trusted setup means.

Instead of relying on secrets, keys, or isolated environments, trust is established structurally — through a shared model basis and deterministic encoding. Raw data never leaves its original environment, and only the minimal, non-sensitive residual is ever transmitted.

John Rhye

