Privacy-Enhancing Technologies: a practical guide to using data without exposing people
Privacy-Enhancing Technologies (PETs) have become a lot more than an academic idea or a specialist security feature. They are increasingly the difference between a programme that can use data confidently — and one that spends its time stuck in approvals, exceptions, and arguments about risk.
Most organisations are trying to do three things at once:
- use more data (for insight, automation, and AI),
- reduce privacy and security risk, and
- keep pace with regulation and stakeholder expectations.
PETs exist in the overlap. They help you get value from data while reducing exposure of personal information, limiting re-identification, and making it easier to demonstrate “due care” in design.
What matters is not the buzzword list. What matters is choosing the right approach for the problem you actually have: Are you trying to share data? Analyse it? Train models? Reduce breach impact? Protect logs? PETs are a toolkit, and like any toolkit the outcome depends on what you select and how well you integrate it into delivery and operations.
What counts as a PET (and what doesn’t)
A useful way to think about PETs is that they change the shape of risk. Instead of relying only on policy (“don’t do X”) or process (“get approval for Y”), PETs add technical constraints that make it harder for privacy failures to occur and easier to defend the design when challenged.
PETs usually help in one of four ways:
- Reduce what you collect and store (so there is less to lose, leak, or misuse).
- Reduce identifiability (so the dataset remains useful without being easily tied to a person).
- Restrict who can see what (so access is controlled and auditable).
- Enable analysis without sharing raw data (so collaboration becomes possible without centralising risk).
Not everything branded as “privacy” is a PET. A policy document isn’t a PET; a consent banner alone isn’t a PET. They’re important, but PETs are about technical mechanisms and designs that materially reduce exposure.
The two families: “protect the data” and “make privacy operational”
In practice, PETs fall into two families. The first is the one people talk about most: cryptography, anonymisation methods, privacy-preserving analytics. The second is less glamorous but often more impactful: the tools and patterns that make privacy controls repeatable across teams.
Data-centric PETs are about the data itself: encryption, tokenisation, differential privacy, secure multi-party computation, federated learning, secure enclaves, and privacy-preserving identity methods.
Process-centric PETs are about repeatability: privacy-by-design patterns, discovery and classification, DPIA workflows, data subject rights tooling, consent management, and continuous monitoring of controls.
If you only adopt the “clever” data-centric PETs without the operational layer, they tend to remain stuck in pilots. If you only adopt process-centric tooling, you often end up with good paperwork but weak technical constraint. Strong programmes combine both.
Start with the problem: where PETs fit in real delivery
A simple way to avoid “PowerPoint PETs” is to anchor the conversation in the lifecycle of data. Most privacy risk appears at predictable points: when data is collected, when it’s stored, when it’s processed, and when it’s shared.
A few examples illustrate this.
If your risk is breach impact, PETs like encryption and key management matter most. The aim is not just “encrypt everything”, but to ensure keys are controlled, access is auditable, and the blast radius is reduced if something goes wrong.
If your risk is exposure in non-production, then tokenisation, masking, and synthetic data often deliver faster and cheaper risk reduction than complex cryptography. Many organisations leak privacy through test environments and logs long before they leak it through production databases.
If your risk is data sharing between organisations, this is where more advanced PETs begin to earn their keep. Secure multi-party computation allows joint analytics without anyone handing over raw datasets. Federated learning allows model training without centralising sensitive records (with caveats, which we’ll come to).
If your risk is publishing or reporting insights, differential privacy becomes relevant. It’s one of the few approaches that can provide mathematically framed privacy guarantees — but it requires careful handling to avoid undermining either privacy or usefulness.
This framing keeps PET selection grounded. You aren’t choosing a technology because it sounds modern; you’re choosing it because it addresses a specific risk and supports a specific outcome.
The PETs you’ll use most often (and why)
Encryption is still the foundation. It protects data at rest and in transit, and it reduces the impact of many incidents. But encryption is only as strong as your key management and access model. If the same users and services that can query the database can also decrypt everything, you haven’t reduced misuse risk — you’ve mainly reduced exposure to outsiders. That’s still valuable, but it’s not the whole story.
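To make the key-management point concrete, here is a minimal envelope-encryption sketch in Python using the `cryptography` package. The locally generated key-encryption key stands in for a KMS- or HSM-held key, purely for illustration; the point is that a service able to read ciphertext does not automatically hold the key that unwraps it.

```python
# Minimal envelope-encryption sketch: the data key that protects a record is
# itself encrypted with a key-encryption key (KEK). In a real deployment the
# KEK would live in a KMS/HSM with its own access policy and audit trail;
# here it is a local key purely for illustration.
from cryptography.fernet import Fernet

kek = Fernet(Fernet.generate_key())          # stand-in for a KMS-held KEK

def encrypt_record(plaintext: bytes) -> dict:
    data_key = Fernet.generate_key()         # one data key per record/object
    ciphertext = Fernet(data_key).encrypt(plaintext)
    wrapped_key = kek.encrypt(data_key)      # only KEK holders can unwrap it
    return {"ciphertext": ciphertext, "wrapped_key": wrapped_key}

def decrypt_record(record: dict) -> bytes:
    data_key = kek.decrypt(record["wrapped_key"])   # an auditable KMS call in practice
    return Fernet(data_key).decrypt(record["ciphertext"])

record = encrypt_record(b"customer-id=12345; notes=...")
assert decrypt_record(record) == b"customer-id=12345; notes=..."
```

Separating the KEK from the querying path is what reduces insider misuse as well as outsider exposure: decryption becomes an explicit, loggable event rather than a side effect of database access.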
Tokenisation and masking are workhorses. They reduce exposure while preserving operational usefulness, particularly in logs, analytics, and development. They are also easier to operationalise and audit than some advanced PETs. The key design question is always: who can re-link tokens to real identities, under what controls, and how is that evidenced?
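As an illustration only (not a production design), the sketch below shows the shape of vault-based tokenisation and simple masking. The `_vault` dictionary and the `fraud-investigation` role are assumptions standing in for a separately controlled token store and a real authorisation check.

```python
# Illustrative tokenisation sketch: real values are swapped for random tokens,
# and the mapping lives in a "vault" that only a narrowly permissioned
# re-linking path may read.
import secrets

_vault: dict[str, str] = {}   # stand-in for a separately controlled token vault

def tokenise(value: str) -> str:
    token = "tok_" + secrets.token_hex(8)
    _vault[token] = value
    return token

def relink(token: str, caller_role: str) -> str:
    # The design question from the text: who can re-link, under what controls?
    if caller_role != "fraud-investigation":
        raise PermissionError("re-linking not permitted for this role")
    return _vault[token]

def mask_email(email: str) -> str:
    local, _, domain = email.partition("@")
    return local[:1] + "***@" + domain     # enough for debugging, not for contact

print(tokenise("alice@example.com"))       # e.g. tok_3f9c...
print(mask_email("alice@example.com"))     # a***@example.com
```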
Pseudonymisation is often more realistic than “anonymisation”. True anonymisation is hard because identifiability depends on context: other datasets, uniqueness, and how outputs are used. Many privacy failures happen when organisations assume anonymisation is permanent, and then later combine datasets or add attributes that re-enable re-identification. The safe approach is to treat anonymisation as a claim that must be tested and revalidated, not assumed.
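A minimal sketch of keyed pseudonymisation, assuming an HMAC over the identifier with a secret key held apart from the data; the key name and value here are placeholders. As long as that key exists, the result is pseudonymised rather than anonymised, which is exactly why the claim needs revisiting over time.

```python
# Minimal keyed-pseudonymisation sketch: identifiers are replaced with an HMAC
# under a secret key held separately from the data. Without the key, the
# pseudonyms cannot be regenerated or reversed; with it, the same person maps
# to the same pseudonym, so joins and longitudinal analysis still work.
import hmac, hashlib

PSEUDONYM_KEY = b"replace-with-a-managed-secret"   # illustrative; keep in a secrets manager

def pseudonymise(identifier: str) -> str:
    return hmac.new(PSEUDONYM_KEY, identifier.encode(), hashlib.sha256).hexdigest()

# Same input -> same pseudonym, so datasets remain linkable without exposing the ID.
assert pseudonymise("customer-1234567890") == pseudonymise("customer-1234567890")
```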
Differential privacy is most useful when you need to produce statistics about groups while protecting individuals. It works by adding carefully calibrated noise to outputs. The trade-off is explicit: the stronger the privacy guarantee, the more accuracy you may lose. Implementations require governance of the “privacy budget” so that repeated queries don’t slowly erode protection.
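A toy example of the Laplace mechanism for a single counting query (sensitivity 1), using NumPy. It shows the noise-versus-epsilon trade-off but deliberately omits the budget accounting a real deployment would need.

```python
# Sketch of the Laplace mechanism for a single counting query (sensitivity 1).
# Smaller epsilon -> stronger privacy -> more noise: the explicit trade-off
# described above. Real systems also track a cumulative privacy budget across
# queries, which this toy example does not.
import numpy as np

rng = np.random.default_rng()

def dp_count(true_count: int, epsilon: float, sensitivity: float = 1.0) -> float:
    scale = sensitivity / epsilon            # Laplace scale b = sensitivity / epsilon
    return true_count + rng.laplace(0.0, scale)

print(dp_count(1_000, epsilon=1.0))   # noise std ~ 1.4
print(dp_count(1_000, epsilon=0.1))   # noise std ~ 14: stronger privacy, less accuracy
```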
Secure multi-party computation (SMPC) is about collaboration without raw sharing. It’s attractive for fraud detection, joint analytics, and regulated partnerships. The challenge is less the theory and more the operational overhead: protocol selection, latency, participant coordination, and resilience. SMPC is a strong option when you have a clearly defined computation that multiple parties need and when the cost of centralising data is unacceptable.
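The core idea is easier to see in a toy example. The sketch below uses additive secret sharing over a prime field to compute a joint sum without any party revealing its input; real SMPC frameworks add networking, authentication, and protection against dishonest participants, none of which is modelled here.

```python
# Toy additive secret sharing over a prime field: each party splits its private
# value into random shares, one per participant, so no single share reveals
# anything on its own. Combining the locally summed shares recovers only the total.
import secrets

P = 2**61 - 1   # prime modulus for the toy field

def share(value: int, n_parties: int) -> list[int]:
    shares = [secrets.randbelow(P) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % P)
    return shares

def secure_sum(private_values: list[int]) -> int:
    n = len(private_values)
    all_shares = [share(v, n) for v in private_values]
    # Party i locally sums the i-th share it receives from every participant...
    partial_sums = [sum(col) % P for col in zip(*all_shares)]
    # ...and only these partial sums are combined, never the raw inputs.
    return sum(partial_sums) % P

print(secure_sum([120, 340, 95]))   # 555, computed without pooling raw values
```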
Federated learning has become popular because it promises “AI without centralising data.” It can work well, but it is not automatically private. Model updates can leak information, so serious deployments usually combine federated learning with additional controls such as secure aggregation and, in some cases, differential privacy. The lesson is that federated learning is an architecture pattern — privacy comes from the full design, not the label.
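A minimal federated-averaging sketch with NumPy illustrates the architecture: clients train locally and only model weights cross the boundary. The linear-regression step and synthetic client data are invented for illustration, and the secure aggregation and differential privacy mentioned above are deliberately left out to keep the pattern visible.

```python
# Minimal federated-averaging sketch: clients compute updates on local data and
# only send model weights, never records. As the text notes, the updates can
# still leak information, so real deployments layer on secure aggregation and
# sometimes differential privacy (omitted here for brevity).
import numpy as np

def local_step(weights: np.ndarray, X: np.ndarray, y: np.ndarray, lr: float = 0.1) -> np.ndarray:
    # One gradient step of linear regression on the client's local data.
    grad = 2 * X.T @ (X @ weights - y) / len(y)
    return weights - lr * grad

def federated_round(global_weights: np.ndarray,
                    clients: list[tuple[np.ndarray, np.ndarray]]) -> np.ndarray:
    updates = [local_step(global_weights.copy(), X, y) for X, y in clients]
    sizes = np.array([len(y) for _, y in clients], dtype=float)
    # Weighted average of client models (FedAvg); only weights cross the boundary.
    return np.average(updates, axis=0, weights=sizes)

rng = np.random.default_rng(0)
clients = [(rng.normal(size=(50, 3)), rng.normal(size=50)) for _ in range(4)]
w = np.zeros(3)
for _ in range(20):
    w = federated_round(w, clients)
print(w)   # converges towards a shared model without pooling the raw data
```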
Zero-knowledge proofs and selective disclosure are best understood through identity use cases: proving something about a person without revealing everything about them. They can reduce data collection in verification flows, but they require careful implementation and are most valuable when you have repeated verification needs and high sensitivity.
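The sketch below is not a zero-knowledge proof; it is a simplified salted-hash selective-disclosure scheme in the same spirit, with the issuer's signature over the commitments stubbed out. It shows how a holder can reveal one claim, such as "over 18", without disclosing the rest of the record; production systems use standardised credential formats and proper signatures, and genuine ZKPs go further by avoiding disclosure of the claim value itself.

```python
# Simplified selective-disclosure sketch: each claim is committed to as a
# salted hash, the issuer signs the set of commitments (signature omitted here),
# and the holder later reveals only the claims a verifier actually needs.
import hashlib, secrets

def commit(claims: dict) -> tuple[dict, dict]:
    salted = {k: (secrets.token_hex(16), str(v)) for k, v in claims.items()}
    commitments = {k: hashlib.sha256((salt + val).encode()).hexdigest()
                   for k, (salt, val) in salted.items()}
    return commitments, salted                 # commitments are what the issuer would sign

def verify_disclosure(commitments: dict, key: str, salt: str, value: str) -> bool:
    return hashlib.sha256((salt + value).encode()).hexdigest() == commitments[key]

commitments, salted = commit({"over_18": True, "name": "Alice", "dob": "1990-01-01"})
salt, value = salted["over_18"]
print(verify_disclosure(commitments, "over_18", salt, value))   # True; dob never shared
```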
How to implement PETs without getting stuck in “pilot-land”
A PET programme succeeds when it looks like a delivery capability, not a research project. The common path looks like this:
First, pick a small number of high-value use cases where privacy risk is blocking progress or increasing cost. Good candidates are usually analytics, non-prod environments, logging, data sharing, or ML workflows — areas where exposure accumulates quickly.
Second, do enough data mapping to be accurate. You don’t need a perfect enterprise-wide model before starting, but you do need to understand the specific flows, the systems involved, who accesses what, and where data leaves your control.
Third, design for evidence from day one. PETs are often adopted to make compliance and assurance easier. That only works if you can show what control is applied, where, how it is monitored, and whether re-identification is possible and, if so, how it is controlled. If you can’t evidence it, you’ll still end up in exceptions and debate.
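One way to make this concrete is to attach a small, machine-readable control record to each protected flow, so assurance can query what was applied rather than rely on a document. The field names and values below are purely illustrative, not a standard.

```python
# Illustrative evidence record for one protected data flow. The exact schema is
# an assumption; the point is that the answers to "what control, where, how
# monitored, can it be re-linked?" live somewhere queryable.
control_evidence = {
    "dataset": "payments.transactions",
    "control": "tokenisation",
    "applied_at": "ingestion pipeline, masking step",
    "monitoring": "weekly scan alerts on untokenised card numbers",
    "relinking": {
        "possible": True,
        "who": "fraud-investigation role only",
        "audit": "vault access logged and reviewed",
    },
}
```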
Finally, industrialise what works. Turn the outcome into patterns: standard tokenisation approaches, approved libraries, pipeline checks, reference architectures, and short “how to use this safely” guidance that engineering teams can follow without a specialist in every squad.
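For example, a pipeline check of the kind described above can be as simple as scanning a sample of log output for unmasked identifiers and failing the build when any are found. The patterns and exit behaviour below are assumptions to adapt; the point is that the safe path is enforced automatically rather than documented and hoped for.

```python
# Illustrative pipeline check: scan a sample of log lines for obvious unmasked
# identifiers (emails, UK-style mobile numbers) and fail the build if any are
# found. Patterns and thresholds are assumptions to tune per environment.
import re, sys

PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "uk_mobile": re.compile(r"\b(?:\+44|0)7\d{9}\b"),
}

def scan_log_sample(lines: list[str]) -> list[tuple[int, str]]:
    findings = []
    for i, line in enumerate(lines, start=1):
        for name, pattern in PII_PATTERNS.items():
            if pattern.search(line):
                findings.append((i, name))
    return findings

if __name__ == "__main__":
    sample = open(sys.argv[1], encoding="utf-8").read().splitlines()
    findings = scan_log_sample(sample)
    for line_no, kind in findings:
        print(f"unmasked {kind} on line {line_no}")
    sys.exit(1 if findings else 0)   # non-zero exit blocks the pipeline step
```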
This is where the process-centric PETs matter. Discovery tooling, DPIA workflows, retention automation, and rights handling don’t replace technical PETs — they make them sustainable.
The trade-offs nobody should pretend away
There are three trade-offs that show up repeatedly.
Privacy vs utility: The more you constrain identifiability, the more you may lose detail or accuracy. The right answer depends on the use case and the harm model. “Maximum privacy” is not always the goal; “acceptable privacy for acceptable value” is.
Security vs usability: Strong controls can slow delivery if they’re bolted on late or implemented inconsistently. The best programmes build “golden paths” so the secure option is the easiest option.
Innovation vs governance: PETs can enable new capabilities, but they need governance that is lightweight and engineering-friendly. If governance becomes a gate, teams route around it. If it becomes a pattern catalogue and evidence framework, teams adopt it.
A practical view of “best practice”
If you want PETs to land well, focus on a few principles:
- Treat PETs as part of architecture and delivery, not just privacy policy.
- Start with the most common leakage points: logs, non-prod, unmanaged sharing.
- Use advanced PETs where they genuinely unlock collaboration or analytics that would otherwise be too risky.
- Build repeatable patterns and evidence, so teams can adopt safely at scale.
- Reassess routinely: identifiability and threat models change as datasets and uses evolve.
Conclusion: PETs are a capability, not a checkbox
PETs are not about being trendy or “privacy-washing” a system design. They are about changing what is technically possible, so privacy protection becomes a property of the system — not just a promise made in a policy.
For most organisations, the fastest wins come from getting the basics right (minimisation, access, encryption, tokenisation, safe non-prod, controlled logging), and then selectively adopting advanced PETs to enable higher-value use cases such as cross-organisation analytics or privacy-preserving AI.
The organisations that do this well end up with something measurable: reduced data exposure, fewer exceptions, faster approvals, safer collaboration, and stronger trust. In other words, PETs stop being “a privacy initiative” and become part of how modern services are designed and operated.
