In response to growing consumer demand for privacy, Apple introduced App Tracking Transparency (ATT) in 2021, enabling users to choose whether or not apps on their devices could track them. Meta, caught in one of the first privacy-related shake-ups, opened February 2022 with what was then the biggest single-day loss of market value in stock market history. Because Meta relies heavily on targeted advertising as a fundamental part of its business model, Apple’s decision put its revenue at risk. On top of that, the company had just seen its first quarter-over-quarter decline in daily active users, which is speculated to be an outcome of users’ privacy concerns.
Years earlier, Microsoft – much like Apple – had bet on privacy being a key differentiator for its business. In 2018, Microsoft CEO Satya Nadella called privacy a “human right,” a statement that one could directly link to Article 12 of the Universal Declaration of Human Rights:
“No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence, nor to attacks upon his honour and reputation. Everyone has the right to the protection of the law against such interference or attacks.”
While a handful of major tech companies were leading the way in creating trusted software environments with privacy at the forefront, privacy had yet to become one of the main topics of conversation when a new technology emerged.
Privacy and Generative AI
Enter ChatGPT. When Sam Altman, CEO of OpenAI, testified in front of the Senate Judiciary Committee in May 2023, privacy came up at the very beginning of the hearing. Company after company is banning ChatGPT as a result of privacy concerns. Chief Information Security Officers (CISOs), Chief Information Officers (CIOs), and Chief Privacy Officers (CPOs) are scrambling to find a solution while their companies’ employees bang down their doors for access to a technology that can be the difference between succeeding and failing at their jobs. The adoption of large language models and generative AI is even being perceived as necessary for organizational survival. And yet privacy concerns are *the* blocker preventing adoption.
Why Now?
There have been a number of events bringing privacy further and further into the spotlight over the past decade – from Edward Snowden’s revelations in 2013 to the GDPR taking effect in 2018 and sending organizations scrambling to get their internal affairs in order. Yet, while each of those events brought privacy to the forefront of media discussion, this is the first time we are seeing a massive paradigm shift in the understanding of data privacy, its importance, and how to go about ensuring control. Three causes stand out: (1) a change in consumer expectations of data privacy, and companies responding to it, (2) the recent realization that the technology behind ChatGPT, namely large language models, can enable a massive rise in employee productivity, with companies that fail to adopt it being left behind, and (3) the expansion of the concept of privacy to include not only consumer data but also company data. Expanding on (3), the definition of privacy, according to the International Association of Privacy Professionals, is:
“[…] privacy is the right to be let alone, or freedom from interference or intrusion. Information privacy is the right to have some control over how your personal information is collected and used.”
In a similar vein to individuals having control over their personal data, businesses rely on copyright, trade secrets, patents, business associate agreements, non-disclosure agreements, etc. to try to maintain complete control over how their internal company data is used. However, consumer products like ChatGPT are proving to be a significant risk to that control, due to their ease of adoption and the massive productivity gains promised to employees. We see organizations’ control over their data break down in events like Samsung’s leak of trade secrets to ChatGPT and in alleged copyright violations.
The future of software is being built with organizational as well as consumer personal data control in mind, and what we are seeing is behemoths and startups alike keenly tackling trust enablement in increasing levels of depth under a widening spotlight.
The Trust by Design Pioneers for ChatGPT
Microsoft
Only a few months after the launch of ChatGPT, we are already seeing strong leadership in building Trust by Design into LLM ecosystems, from startups to major public companies. Microsoft was one of the first to offer ChatGPT in a secure environment through Azure OpenAI Service. Then, on June 19th, Microsoft announced an additional capability in Azure OpenAI Service, which allows organizations to use OpenAI models on their own data. Part of the purpose of this service is to avoid the need for fine-tuning models on one’s own data, instead supplying that data to the models as context.
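To make the idea of using data as context rather than fine-tuning more concrete, here is a minimal sketch in Python. It is not how Azure OpenAI Service is implemented: the naive keyword-overlap retrieval, the function names `retrieve` and `build_prompt`, and the sample documents are all assumptions made purely for illustration.

```python
import re


def tokenize(text: str) -> set[str]:
    """Lowercased word set used for a naive keyword-overlap score."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: list[str], top_k: int = 2) -> list[str]:
    """Return the top_k documents sharing the most words with the question.

    Hypothetical stand-in for a real search index or vector store.
    """
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_prompt(question: str, documents: list[str]) -> str:
    """Place the retrieved documents in the prompt; the model is never retrained."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


# Made-up internal documents, for illustration only.
docs = [
    "Expense reports are due on the fifth business day of each month.",
    "The cafeteria is open from 8am to 3pm.",
    "Travel expense claims require a manager's approval.",
]
prompt = build_prompt("When are expense reports due?", docs)
```

The resulting `prompt` string is what would be sent to the model. Because the company data only ever appears in the prompt, it can be updated, filtered, or scrubbed without touching the model itself.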
Private AI’s PrivateGPT leverages the secure environment of Microsoft’s OpenAI GPT offering, adding yet another layer of privacy by allowing users to automatically remove sensitive data from their prompts before they are sent to Azure OpenAI. In addition, Private AI can remove sensitive information from any datasets used to provide Microsoft’s Azure OpenAI Service with additional context. This prevents that sensitive data from being regurgitated in some of the use cases Microsoft proposes for this new service, including chat playgrounds.
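The general pattern of scrubbing a prompt before it leaves the organization can be sketched as follows. This is a toy illustration only and does not represent Private AI’s actual method or API: the regular expressions, the placeholder labels, and the `redact` function are assumptions, and real detection of sensitive data requires far more than pattern matching.

```python
import re

# Illustrative patterns only (assumed for this sketch); they will miss many
# real-world formats and should not be relied on for actual redaction.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def redact(prompt: str) -> str:
    """Replace each matched span with a typed placeholder such as [EMAIL]."""
    for label, pattern in PATTERNS.items():
        prompt = pattern.sub(f"[{label}]", prompt)
    return prompt


redacted = redact(
    "Email jane.doe@example.com or call 555-123-4567 about SSN 123-45-6789."
)
# redacted == "Email [EMAIL] or call [PHONE] about SSN [SSN]."
```

Typed placeholders, rather than blanket deletion, keep the prompt readable for the model while the original values stay inside the organization’s environment.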
Salesforce
The announcement of Azure OpenAI Service On Your Data was preceded only shortly by Salesforce’s EinsteinGPT, which also provides a secure and privacy-preserving environment for running LLMs on your Salesforce data and corporate data alike, addressing a growing demand for rapid innovation with LLMs without shattering trust. As mentioned in the Wall Street Journal:
“Benioff said during the event that every CEO he has spoken to has brought up the “AI trust gap” between their “desire to rapidly move forward” with the technology, and the problems that large language models introduce in corporate environments.”
Salesforce has already announced a partnership with the Auto Club Group (AAA’s second largest club) for this new push, and we will inevitably see other major organizations banking on Trust by Design LLM environments in droves.
All of this said, companies should still be aware that they are themselves responsible for how they handle data, in particular Protected Health Information (PHI), as per the Salesforce BAA:
“With respect to Einstein Bots, Customer may not: (a) submit PHI to, or use PHI in, any utterance records; or (b) enable any Answer Automation or Input Recommender features or functionality of Einstein Bots that could result in the submission or use of PHI therein.”
Therefore, while Salesforce does make it easy to be compliant and safe with applications meant for an organization’s internal use, the onus is still on that organization to ensure that consumer-facing applications are compliant. This is a huge gap. Best practice is to ensure that PHI is removed, or that data is de-identified, within the organization’s own environment before it is sent to an LLM.
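As a rough illustration of de-identifying a structured record inside the organization’s environment before any of it reaches an LLM, consider the sketch below. The field names, the salt handling, and the `deidentify` function are hypothetical, and this is not a recipe for regulatory compliance: it only handles named fields and does nothing about identifiers buried in free text.

```python
import hashlib

# Hypothetical field names; a real schema and PHI policy would differ.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "address"}
PSEUDONYMIZED_FIELDS = {"patient_id"}


def deidentify(record: dict, salt: str) -> dict:
    """Drop direct identifiers and replace linking IDs with salted hashes."""
    clean = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # never leaves the organization's environment
        if key in PSEUDONYMIZED_FIELDS:
            digest = hashlib.sha256((salt + str(value)).encode()).hexdigest()
            clean[key] = digest[:12]  # stable pseudonym; salt must stay secret
        else:
            clean[key] = value
    return clean


# Made-up record, for illustration only.
record = {
    "patient_id": "P-1001",
    "name": "Jane Doe",
    "email": "jane.doe@example.com",
    "symptom": "persistent cough",
}
safe_record = deidentify(record, salt="keep-this-secret")
```

Only `safe_record` would be passed on to the LLM. The salted hash keeps records for the same patient linkable across requests without exposing the original identifier.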
Ultimately, a larger array of privacy-preserving services will be made available to organizations and their developers – especially as Salesforce and Microsoft lead the way, making privacy and trust the standard rather than nice-to-haves. With increasing consumer demand for privacy, it will be up to each organization to build systems with Trust by Design, ensuring that all of its data, its customers’ data, and its employees’ data are collected, processed, and stored in a privacy-preserving, secure, and compliant manner.
The Future of Software
As we have seen, even colossal organizations like Salesforce and Microsoft cannot tackle every aspect of trust and privacy for their customers on their own. Necessary components like personal information identification and removal remain missing puzzle pieces that customers have to integrate themselves within their own environments. What their excellent tools and frameworks provide organizations with, beyond trust when using their respective systems, is a blueprint for how to start thinking about software design and architecture under the world’s growing expectations of trust, privacy, and corporate responsibility.
The push we are seeing toward the serious integration of Privacy by Design in order to achieve Trust by Design in the LLM ecosystem is just the tip of the iceberg of how foundational software thinking is changing. Merely a decade ago, the very concept of privacy as a part of software design was alien to most computer scientists and software engineers, and now it is one of the major points of discourse. For this, we have Dr. Ann Cavoukian to thank, as the inventor of Privacy by Design and a ceaseless evangelist for the cause.
Having privacy as an essential requirement for LLM use will start echoing across software used throughout organizations, including within ETL pipelines, API gateways, and products they provide to their own customers. This is, in essence, the raison d’être of Private AI. Following closely behind as a fundamental component of software development will be bias reduction, especially as we as a community continue to discover more regarding its harms, its causes, and effective preventative measures. Interestingly, it seems privacy and bias reduction may very well go hand in hand, as information that can be used to identify an individual (like their origin, race, address, etc.) often also introduces bias in algorithms.
What has remained a constant in the software community is a push from a powerful subset to ensure equitable access to software, with Red Hat being one of the most famous organizations enabling open source software. Tying this back to the Big Tech pioneers we’ve discussed in this article, Salesforce recently announced an AI accelerator “[…] that brings the full power of Salesforce with unrestricted grants, pro-bono expertise, and our technology to create a more equitable AI world[,]” thus extending its focus on trust and equity to also include AI. M12 (Microsoft’s venture fund), in turn, announced its partnership with GitHub to launch the GitHub Fund in November 2022. Its purpose is to support open source software with venture capital.
With privacy, bias reduction, trust, and equity all being core to the current software development dialog, and with reliable building blocks being made available to developers, building software with Trust by Design is becoming easier than ever. Software developers will have to adapt to the new reality or be left behind due to consumers’ and businesses’ preferences and an increasingly stringent global regulatory landscape.