OpenAI is an organization known for developing a variety of artificial intelligence (AI) models and tools, which are accessible through various platforms and services. Microsoft Azure is a cloud computing platform that offers a range of services and tools for building, deploying, and managing applications. Azure also provides AI services that enable developers and customers to leverage the power of AI in their solutions. One of these services is Azure OpenAI Service, which is a collaboration between Microsoft and OpenAI to offer access to OpenAI’s products through Azure.
Individuals as well as organizations are keen to implement these tools into their work processes to benefit from the immensely increased efficiency they provide when performing a wide variety of tasks. However, it is difficult to keep up with the different offerings and the pros and cons that come with them.
In this blog post we’ll compare five distinct offerings from OpenAI vs. Azure OpenAI: ChatGPT (free version) and Whisper, ChatGPT Plus, DALL-E, OpenAI’s API Services, ChatGPT Enterprise, and Microsoft Azure OpenAI Services, with a specific focus on the required technical expertise for their respective use and privacy considerations relevant for each of them.
ChatGPT and Whisper
ChatGPT, the free version, is an AI language model designed for end users. Users interact with it via a web-based interface, typing in prompts and reading the AI’s generated completions. It requires no programming knowledge or technical expertise of significance – all you need to do is type. This makes ChatGPT accessible to a broad audience. This model currently has a knowledge cut-off as of September 2021, meaning it cannot provide information on any events after that date.
Whisper is an open source, i.e. free, speech transcription system, but it does not have a web version like ChatGPT. That means you have to install it locally on your computer (or use it in the cloud), guided by instructions written by developers, and it requires a bit of coding too.
These are not all of the open-source models OpenAI has released. At the time of writing, there are also Point-E (a system for generating 3D Point Clouds from prompts), Jukebox (music generating model), and CLIP (visual classification tool). These models are considerably less accessible to the greater public and require technical expertise to utilize.
On the privacy front, OpenAI retains data “to provide our Service to you, or for other legitimate business purposes such as resolving disputes, safety and security reasons, or complying with our legal obligations.” In their current privacy policy, they advise that the retention period can differ, depending on the amount and sensitivity of the personal information, which includes user input. In their settings, on the other hand, OpenAI advises that unsaved conversations will always be retained for 30 days, without stating a particular purpose. Even if the chat history is turned on, users can delete conversations selectively, following which they will presumably be deleted after 30 days.
In addition, users have a choice to turn off chat history and training in their settings or to submit an opt-out form to OpenAI, in which case the chat history will not be used to train OpenAI’s models, but users also lose the benefit of being able to revisit a previous conversation. This can be relevant if a lot of effort has gone into providing context to ChatGPT on a particular topic or the tool’s output has been tweaked with additional prompts to achieve the right tone or other user-specific preferences, such as the usage of certain words as a translation of the input.
On their opt-out form and in this article, OpenAI advises that they (take steps to) remove personally identifiable information (PII) from the data they use to train their models. Furthermore, if a user detects inaccurate PII about them in the model’s output, they have a right to request rectification, and if that is technically not feasible, they can request the output to be removed by filling out this form.
With regard to the disclosure of PII, OpenAI’s privacy policy clearly states that this information is not used for commercial purposes in the sense that it is not sold to anyone. However, it may be shared with third-party services providers, such as “hosting services, cloud services, and other information technology services providers, email communication software, and web analytics services, among others” for purposes of aiding OpenAI with providing its services. OpenAI also advises that they “may also send select portions of content to third-party contractors (subject to confidentiality and security obligations) for data annotation and safety purposes.” In the past, OpenAI has been criticized for outsourcing their data annotation tasks to Kenyan workers. Since the contract was cancelled by the Kenyan company in 2021 due to the toxicity of the work of labelling violent and explicit content, further details around the current annotation work required to train OpenAI’s models are not available.
ChatGPT Plus and DALL-E
ChatGPT Plus and DALL-E are premium, paid versions of OpenAI’s services.
ChatGPT Plus provides user access to features that are still in Beta, such as Plugins, i.e., the Browsing Plugin, which has been discontinued for the moment, but which would provide the same ability as the Bing chat with up-to-date information and references to internet sources, and Code interpreter. Code interpreter knows how to write and execute python code, and it can work with file uploads. It can conduct data analysis, image conversions, and edit code files. OpenAI, in its settings, advises that files will not persist beyond a single session.
ChatGPT Plus relies on OpenAI’s GPT-3.5 and GPT-4 transformer models. It also comes with higher availability during peak hours and more accurate and safer responses, OpenAI advises. Most recently, OpenAI has added image capabilities to GPT-4 and advised that its privacy policies around image input remain the same as for text input. One aspect to highlight here is that the model advises that it is “unable to identify or provide sensitive inferences based on images of real people.”
DALL-E is an AI system that can create realistic images or art from text input. Retouching or altering images is also possible with DALL-E. More specifically, DALL-E can do “outpainting,” that is, expand on an image to add what has previously not been on the canvas, and “inpainting,” meaning it can add to or remove elements from a picture or painting. It can also provide variations of the original. In addition to generating images from text input, DALL-E also allows you to upload a photo and work with that as a starting point. You can then download the result.
OpenAI continuously works on adding and enhancing safety features, limiting the content of what kinds of images DALL-E can generate, and such as preventing realistic images of real individuals and harmful or explicit content. They also have a Content Policy in place, requiring users to refrain from creating harmful content using their tools. On July 18, 2022, OpenAI implemented a technique that makes DALL-E generate images that reflect the diversity of the population more accurately.
Users have to pay $25USD per month for access to ChatGPT Plus and DALL-E works as a pay-per-credit model ($15USD for 115 credits; it takes approximately 15 credits to generate a usable image from text). These services are also user-friendly and require minimal technical expertise, similar to the free version of ChatGPT.
These services follow the same data retention policy as the free version and the same disclosure practices apply as they concern personal information contained in the users’ prompts. However, as mentioned, some additional safety features in terms of output controls exist.
OpenAI’s API Services
OpenAI’s API services require a moderate to high level of technical expertise. Users need to understand how to construct API calls, handle the responses, and integrate the results into their applications. This typically involves programming knowledge and understanding of REST APIs. Equipped with this knowledge, the OpenAI services discussed above (and more) can be integrated directly into the user’s own product, e.g., DALL-E API, ChatGPT and Whisper (language-to-text) API.
In terms of privacy, the API services provide a more secure option than the services discussed so far. First, data from user prompts and completions is by default not used to train OpenAI’s models, but users can opt in to support the development of the services. Second, data is encrypted both in transit and at rest, preventing unauthorized access. However, once the prompt is sent, it is retained for 30 days for abuse and misuse monitoring and cannot be deleted by the user, relying instead on OpenAI’s data retention policy. This means users have less control over their data compared to the chat UI versions. With regard to data submitted by the user through the Files endpoint, for instance to fine-tune a model, is retained until the user deletes the file.
When it comes to data storage and sharing with third parties, OpenAI’s API Data Usage Policy states that data is stored on OpenAI’s and their sub processors’ systems located in the US or “worldwide.” The 30 days data retention period also applies to these sub processors.
OpenAI now also offers to enter into a Data Protection Addendum for its API offerings, removing one obstacle for compliance with various data protection laws and regulations, such as the GDPR.
OpenAI’s API services come with a price tag, but they remain cheaper than the Microsoft Azure OpenAI Services. Users currently pay for their specific usage in tokens or per image or minute, depending on the model used. OpenAI also provides a tiered pricing system with different perks for flexibility. When signing up, the user will be granted a spending limit that increases over time. A quota increase can also be requested, where required. For skeptics, OpenAI provides $5USD in free credits to be used over the first three months.
ChatGPT Enterprise
ChatGPT Enterprise is the latest offering OpenAI has announced at the time of writing. It provides access to ChatGPT-4, no usage caps, performance up to 2x faster, and API credits allowing businesses to build their own solutions. It will allow companies to train the model on their own data, customize it, and optimize it for their industry and desired use cases. OpenAI delivered ChatGPT Enterprise after less than one year of development, and it plans to release another version for smaller organizations, ChatGPT Business.
In terms of privacy and security, OpenAI advises that it does not train its models on the input data or the generated outputs. The data is encrypted in transit and at rest, it may be submitted to automated content classifiers, but humans will only access it for trouble-shooting and retrieving end user conversation with explicit permission by the organization, or where required by law. ChatGPT Enterprise has been audited for SOC2 Type I compliance, and is currently pursuing a SOC2 Type II certification as well. Businesses’ compliance with privacy laws can further be supported by entering into a Data Processing Addendum. The tool also supports single sign-on, domain verification, and an admin console to manage members. Lastly, the business data is retained for purposes of enabling chat history, providing businesses with full control over how long they are retained. Once the end user deletes a chat, the data will be retained for up to 30 days and then removed from OpenAI’s systems.
Pricing information is not available publicly and will depend on each subscriber’s size and use case. In terms of technical expertise required to deploy the solution, there is likewise little known at this time.
At this early stage, it is difficult to say how exactly ChatGPT Enterprise compares to Microsoft Azure OpenAI services (considered next). When OpenAI announced this new feature, the COO simply remarked that it is independent of Microsoft and that businesses can choose for themselves which platform is right for them.
Microsoft Azure OpenAI Services
Microsoft Azure OpenAI Services require the highest level of technical expertise. Users must set up an Azure account, configure their environment, manage resources, and understand how to use the cloud-based services. This involves knowledge of cloud computing and APIs.
At the same time, Azure OpenAI Services provide the highest level of data privacy and security. They offer end-to-end encryption and comply with various regulations and standards, such as GDPR, HIPAA, and ISO 27001. Users have granular control over their data, including who can access them, how long they’re stored, and how they’re used. Azure also provides monitoring and auditing capabilities, allowing users to track their data usage and activity. No data, whether prompts, outputs, embeddings, or training data, are shared with other customers or OpenAI, and they are by default not used to improve Microsoft’s AI offerings. The OpenAI models are hosted in the Azure environment with several possible server locations on different continents.
Microsoft retains all prompts and generated content for abuse detection and mitigation for up to 30 days. However, customers can apply for an exemption to the abuse monitoring, ensuring that no human review is performed and no prompts or other content is stored at any time.
Microsoft further provides a Data Protection Addendum which, as mentioned above, is required under various data protection laws.
The basic pay-per-usage pricing does not differ from OpenAI’s API call costs. However, Microsoft makes the disclaimer on its website that the pricing details may vary depending on the type of agreement entered into with Microsoft, date of purchase, and the currency exchange rate. Furthermore, since Azure hosts the models, there is an additional hourly hosting cost that depends on the base model but ranges roughly from a few cents to a few dollars per hour.
Summary (table generated by ChatGPT Plus’s GPT-4)
|
ChatGPT Plus and DALL-E |
OpenAI’s API Services |
ChatGPT Enterprise |
Microsoft Azure OpenAI Services |
|
Use Cases |
General conversation, answering questions, providing explanations (ChatGPT), Speech-to-text transcription (Whisper) |
Extended capabilities of free version, image generation from text (DALL-E), language-to-text (Whisper), file handling (Code interpreter) |
Integration into own product (DALL-E, ChatGPT, Whisper), requires programming knowledge |
Advance Data Analysis (formerly Code Interpreter), usage dashboard, sharable chat templates, at this time still limited customization |
Highest level of customization and security, requires extensive technical expertise |
Free or Paid |
Free |
Paid (ChatGPT Plus: $25/month, DALL-E: pay-per-credit) |
Paid, usage-based pricing |
Paid, pricing dependent on size of business and use case |
Paid, usage- and hosting-based pricing |
Suited for Enterprise Use |
No |
Yes, for individual developers or small teams |
Yes, suitable for larger teams and businesses |
Yes |
Yes, best for large enterprises with high security needs |
Privacy Protection |
Data retained for 30 days, deletion anytime, opt-out form available for model training |
Same as free version, additional safety features for output controls, certain data not retained beyond a single session |
Data encrypted in transit and at rest, data retained for 30 days for abuse and misuse monitoring, file data retained until user deletes |
Data encrypted in transit and at rest, file data retained until user deletes + 30 days after deletion |
Highest security, end-to-end encryption, compliance with various regulations and standards, granular control over data |
Technical Expertise Required |
Low for ChatGPT, Moderate for Whisper (Requires local installation or cloud setup, some coding knowledge) |
Low for ChatGPT Plus, Moderate for DALL-E and Whisper |
Moderate to High |
Likely moderate |
High |
Data Usage |
Data used to provide service, legitimate business purposes, safety, security, legal obligations |
Same as free version |
Data from user prompts and completions not used for model training by default, user can opt in |
Data is not used for training models |
No data shared with other customers or OpenAI, not used to improve Microsoft’s AI offerings by default |
Conclusion
The landscape of AI services offered by OpenAI and Microsoft Azure is diverse, catering to a wide spectrum of use cases and technical expertise levels. From the user-friendly, free-to-use ChatGPT and Whisper, to the more advanced and premium services like ChatGPT Plus, DALL-E, most recently ChatGPT Enterprise, and finally to the robust and highly customizable API Services from both OpenAI and Microsoft Azure, there’s a tool for almost every need.
Each service comes with its own considerations regarding cost, technical proficiency required, and data privacy. It’s crucial for users, be they individuals or organizations, to understand these aspects when selecting the right service for their specific needs. As technology continues to evolve and these services expand and improve, staying informed about these offerings will be key to leveraging the power of AI effectively and responsibly.
Remember, while this article provides a comprehensive overview, it’s always best to refer directly to OpenAI’s and Microsoft Azure’s official documentation and support for the most accurate, up-to-date, and detailed information.
If your organization wants to use OpenAI’s API offerings or needs to eliminate any last privacy concerns while benefiting from Microsoft Azure’s abuse monitoring, Private AI’s PrivateGPT can help with your compliance needs. With the help of PrivateGPT, users can easily scrub out any personal information that would pose a privacy risk, and unlock deals blocked by companies not wanting to use ChatGPT. In particular, depending on the sensitivity of the data, it may simply not be permitted to transfer the PII to any third party, at least not without an individual’s consent, which may be difficult to obtain. In that case, filtering out and redacting PII from a data set before transmission can remove this compliance challenge.
With PrivateGPT Headless you can: