Unlocking secure, private AI with confidential computing
All of a sudden, it seems that AI is everywhere, from executive assistant chatbots to AI code assistants.
But despite the proliferation of AI in the zeitgeist, many organizations are proceeding with caution. This is due to the perception of the security quagmires AI presents. For the emerging technology to reach its full potential, data must be secured through every stage of the AI lifecycle including model training, fine-tuning, and inferencing.
This is where confidential computing comes into play. Vikas Bhatia, head of product for Azure Confidential Computing at Microsoft, explains the significance of this architectural innovation: “AI is being used to provide solutions for a lot of highly sensitive data, whether that’s personal data, company data, or multiparty data,” he says. “Confidential computing is an emerging technology that protects that data when it is in memory and in use. We see a future where model creators who need to protect their IP will leverage confidential computing to safeguard their models and to protect their customer data.”
Understanding confidential computing
“The tech industry has done a great job in ensuring that data stays protected at rest and in transit using encryption,” Bhatia says. “Bad actors can steal a laptop and remove its hard drive but won’t be able to get anything out of it if the data is encrypted by security features like BitLocker. Similarly, nobody can run away with data in the cloud. And data in transit is secure thanks to HTTPS and TLS, which have long been industry standards.”
But data in use, when data is in memory and being operated upon, has typically been harder to secure. Confidential computing addresses this critical gap—what Bhatia calls the “missing third leg of the three-legged data protection stool”—via a hardware-based root of trust.
Essentially, confidential computing ensures the only thing customers need to trust is the data running inside of a trusted execution environment (TEE) and the underlying hardware. “The concept of a TEE is basically an enclave, or I like to use the word ‘box.’ Everything inside that box is trusted, anything outside it is not,” explains Bhatia.
Until recently, confidential computing only worked on central processing units (CPUs). However, NVIDIA has recently brought confidential computing capabilities to the H100 Tensor Core GPU and Microsoft has made this technology available in Azure. This has the potential to protect the entire confidential AI lifecycle—including model weights, training data, and inference workloads.
“Historically, devices such as GPUs were controlled by the host operating system, which, in turn, was controlled by the cloud service provider,” notes Krishnaprasad Hande, Technical Program Manager at Microsoft. “So, in order to meet confidential computing requirements, we needed technological improvements to reduce trust in the host operating system, i.e., its ability to observe or tamper with application workloads when the GPU is assigned to a confidential virtual machine, while retaining sufficient control to monitor and manage the device. NVIDIA and Microsoft have worked together to achieve this.”
Attestation mechanisms are another key component of confidential computing. Attestation allows users to verify the integrity and authenticity of the TEE, and the user code within it, ensuring the environment hasn’t been tampered with. “Customers can validate that trust by running an attestation report themselves against the CPU and the GPU to validate the state of their environment,” says Bhatia.
Additionally, secure key management systems play a critical role in confidential computing ecosystems. “We’ve extended our Azure Key Vault with Managed HSM service which runs inside a TEE,” says Bhatia. “The keys get securely released inside that TEE such that the data can be decrypted.”
Confidential computing use cases and benefits
GPU-accelerated confidential computing has far-reaching implications for AI in enterprise contexts. It also addresses privacy issues that apply to any analysis of sensitive data in the public cloud. This is of particular concern to organizations trying to gain insights from multiparty data while maintaining utmost privacy.
Another of the key advantages of Microsoft’s confidential computing offering is that it requires no code changes on the part of the customer, facilitating seamless adoption. “The confidential computing environment we’re building does not require customers to change a single line of code,” notes Bhatia. “They can redeploy from a non-confidential environment to a confidential environment. It’s as simple as choosing a particular VM size that supports confidential computing capabilities.”
Some industries and use cases that stand to benefit from confidential computing advancements include:
- Governments and sovereign entities dealing with sensitive data and intellectual property.
- Healthcare organizations using AI for drug discovery and doctor-patient confidentiality.
- Banks and financial firms using AI to detect fraud and money laundering through shared analysis without revealing sensitive customer information.
- Manufacturers optimizing supply chains by securely sharing data with partners.
Further, Bhatia says confidential computing helps facilitate data “clean rooms” for secure analysis in contexts like advertising. “We see a lot of sensitivity around use cases such as advertising and the way customers’ data is being handled and shared with third parties,” he says. “So, in these multiparty computation scenarios, or ‘data clean rooms,’ multiple parties can merge in their data sets, and no single party gets access to the combined data set. Only the code that is authorized will get access.”
The current state—and expected future—of confidential computing
Although large language models (LLMs) have captured attention in recent months, enterprises have found early success with a more scaled-down approach: small language models (SLMs), which are more efficient and less resource-intensive for many use cases. “We can see some targeted SLM models that can run in early confidential GPUs,” notes Bhatia.
This is just the start. Microsoft envisions a future that will support larger models and expanded AI scenarios—a progression that could see AI in the enterprise become less of a boardroom buzzword and more of an everyday reality driving business outcomes. “We’re starting with SLMs and adding in capabilities that allow larger models to run using multiple GPUs and multi-node communication. Over time, [the goal is eventually] for the largest models that the world might come up with could run in a confidential environment,” says Bhatia.
Bringing this to fruition will be a collaborative effort. Partnerships among major players like Microsoft and NVIDIA have already propelled significant advancements, and more are on the horizon. Organizations like the Confidential Computing Consortium will also be instrumental in advancing the underpinning technologies needed to make widespread and secure use of enterprise AI a reality.
“We’re seeing a lot of the critical pieces fall into place right now,” says Bhatia. “We don’t question today why something is HTTPS. That’s the world we’re moving toward [with confidential computing], but it’s not going to happen overnight. It’s certainly a journey, and one that NVIDIA and Microsoft are committed to.”
Microsoft Azure customers can start on this journey today with Azure confidential VMs with NVIDIA H100 GPUs. Learn more here.
This content was produced by Insights, the custom content arm of MIT Technology Review. It was not written by MIT Technology Review’s editorial staff.