The assumption that raw data must move between systems to be useful is baked into almost every layer of modern infrastructure. Datasent was built to break it.

Every time a file, database record, or sensor reading crosses a network boundary, three things accumulate: cost, risk, and compliance exposure. Cloud egress fees are charged per gigabyte. Data in transit is a target for interception. And every organisation that touches your raw data becomes a node in a liability chain you did not design.
Yet almost nothing in the conventional data stack questions this assumption. Compression reduces the size of what travels. Encryption wraps it in a protective layer. Columnar formats make storage more efficient. But the underlying model remains unchanged: to use data, you ship it.

"Datasent is designed around the opposite assumption. If sender and receiver agree on a deterministic representation upfront, most of the data can be regenerated locally. Only the unpredictable part needs to move."
At the heart of Datasent is a mathematical insight: structured data contains enormous predictability. A time-series of sensor readings follows polynomial trends. An audio file follows spectral patterns. A table of financial records follows relationships between columns. A sufficiently expressive model can capture most of that structure — and what it misses, the residual, can be represented compactly.
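To make the intuition concrete, here is a minimal sketch in Python (assuming NumPy; the sensor series and the polynomial model family are purely illustrative, not Datasent's API):

```python
import numpy as np

# Illustrative only: a noisy sensor series with a strong quadratic trend.
t = np.arange(1000)
readings = 0.02 * t**2 - 3.0 * t + 50 + np.random.normal(0, 0.5, t.size)

# Fit an agreed-upon model family (here, a degree-2 polynomial).
coeffs = np.polyfit(t, readings, deg=2)
prediction = np.polyval(coeffs, t)

# The residual is what the model misses: typically far smaller in
# magnitude than the raw values, so it encodes far more compactly.
residual = readings - prediction
print(np.abs(readings).mean())   # raw values in the thousands
print(np.abs(residual).mean())   # residual close to zero
```

The raw readings span large values; the residual hovers near zero, and a near-zero signal is exactly the kind of thing that compresses well.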
When both sides of a data exchange agree on a set of model families upfront — what Datasent calls a trusted setup — the sender never needs to transmit the raw data. They transmit only the residual: the difference between what the agreed model predicts and what actually happened. The receiver reconstructs the original, exactly, from the residual plus locally regenerated predictions.
This is not an approximation. Datasent's reconstruction guarantee is mathematically exact: every bit of the original is recoverable, regardless of how well or poorly the model fits the data. If the model captures nothing, the residual equals the original, and nothing is lost. If the model captures a great deal, the transmitted payload may be orders of magnitude smaller than the raw data.
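A minimal sketch of that roundtrip, again in Python with NumPy. The predict helper, the linear model, and the payload shape are hypothetical stand-ins for whatever model families a trusted setup would fix; the essential point is that both sides evaluate the model deterministically, so prediction plus residual reproduces the original bit for bit.

```python
import numpy as np

# Trusted setup (illustrative): both sides agree on the model family
# and on a deterministic, integer-rounded way of evaluating it.
def predict(params, n):
    """Deterministic prediction shared by sender and receiver."""
    slope, intercept = params
    return np.rint(slope * np.arange(n) + intercept).astype(np.int64)

# --- Sender side ---
original = np.array([100, 103, 105, 108, 111, 113, 116, 119], dtype=np.int64)
params = (2.6, 100.0)                        # fitted or agreed model parameters
residual = original - predict(params, len(original))
payload = (params, residual)                 # only this crosses the network

# --- Receiver side ---
params_rx, residual_rx = payload
reconstructed = predict(params_rx, len(residual_rx)) + residual_rx
assert np.array_equal(reconstructed, original)   # bit-exact recovery
```

The assert holds whether the model fits well or badly; a poor fit only makes the residual larger, never lossy.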
The architectural consequences are significant. Raw data stays with the organisation that collected it. The only thing that crosses boundaries is a structured, governed artifact — a residual token — that carries none of the original PII, PHI, or sensitive business logic on its own.
This reframes data sharing from a compliance burden into a governed exchange. Partners can work with your data — running models, feeding dashboards, training AI — without ever possessing it. Every reconstruction event is logged. Every share can be revoked. The custodian model means that even the party holding the residuals cannot reconstruct the data without an authorisation step.
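As a rough illustration of that custodian pattern, here is a hypothetical sketch; the token fields, revocation set, audit log, and predict helper are all invented for the example and do not reflect Datasent's actual interfaces:

```python
import datetime

import numpy as np

REVOKED_SHARES = set()      # shares the data owner has withdrawn
AUDIT_LOG = []              # every reconstruction attempt is recorded


def predict(params, n):
    """Deterministic prediction agreed in the trusted setup (illustrative)."""
    slope, intercept = params
    return np.rint(slope * np.arange(n) + intercept).astype(np.int64)


def reconstruct(token, authorised):
    """Rebuild the original only if this share is authorised and not revoked."""
    AUDIT_LOG.append({
        "share_id": token["share_id"],
        "time": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "authorised": authorised,
    })
    if not authorised or token["share_id"] in REVOKED_SHARES:
        raise PermissionError("reconstruction not authorised for this share")
    # The token carries only the residual and model parameters; the
    # prediction is regenerated locally, so holding the token alone
    # does not expose the raw data.
    return predict(token["model_params"], len(token["residual"])) + token["residual"]


token = {
    "share_id": "share-42",
    "model_params": (2.6, 100.0),
    "residual": np.array([0, -2, -2, -2, -3], dtype=np.int64),
}
data = reconstruct(token, authorised=True)   # succeeds, and the event is logged
REVOKED_SHARES.add("share-42")
# reconstruct(token, authorised=True)        # would now raise PermissionError
```

The design point is that possession and authorisation are separated: the residual is inert until the custodian's gate allows the prediction to be regenerated alongside it.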
The data centre footprint shrinks. Bandwidth costs fall. And the surface area for breach, interception, or regulatory exposure contracts with every kilobyte that never leaves the building.

The belief that raw data must move in order to be useful has shaped decades of data infrastructure — but it no longer holds up.
As systems scale and regulations tighten, moving raw data across environments introduces unnecessary risk, cost, and complexity at every layer. What once enabled analytics is now one of the biggest blockers to secure, real-time intelligence.
Datasent challenges this assumption by separating what matters from what doesn't. By keeping raw data at the source and transmitting only structured representations, organisations can still generate insights, train models, and collaborate without exposing sensitive information.

John Rhye