Understanding the Features of the ChatGPT Plugin in One Article

In line with our iterative deployment philosophy, we are gradually rolling out plugins in ChatGPT so we can study their use, impact, and challenges in the real world, including their safety and alignment challenges, all of which we must get right in order to achieve our mission.

Since the launch of ChatGPT, users have been asking for plugins (and many developers have been experimenting with similar ideas) because they unlock a vast range of possible use cases. We are starting with a small set of users and plan to gradually roll out larger-scale access as we learn more: to plugin developers, to ChatGPT users, and, after an alpha period, to API users who would like to integrate plugins into their own products. We are excited to build a community that shapes the future of human-AI interaction.

Plugin developers invited from our waitlist can use our documentation to build a plugin for ChatGPT; the enabled plugins are then listed in the prompt shown to the language model, along with documentation instructing the model how to use each one. The first batch of plugins was created by Expedia, FiscalNote, Instacart, KAYAK, Klarna, Milo, OpenTable, Shopify, Slack, Speak, Wolfram, and Zapier.

We also host two plugins of our own, one for web browsing and one for code interpretation. In addition, we have open-sourced the code for a knowledge retrieval plugin, which any developer with information they wish to surface can self-host to augment ChatGPT.

Today, we will begin providing alpha access to plugins for users and developers on our waitlist. While we will initially prioritize a small number of developers and ChatGPT Plus users, we plan to expand access over time.

Overview

Today's language models, while useful for a variety of tasks, are still limited. The only information they can learn from is their training data, which can be out-of-date and is one-size-fits-all across applications. Furthermore, the only thing language models can do out of the box is emit text. This text can contain useful instructions, but to actually follow those instructions, another process is needed.

Although not a perfect analogy, plugins can be the "eyes and ears" of language models, allowing them to access information that is too new, too personal, or too specific to be included in their training data. Plugins can also enable language models to perform secure and restricted actions on behalf of users, thereby increasing the overall utility of the system.

We look forward to the emergence of open standards to unify the way applications expose AI interfaces. We are early in our efforts to develop such a standard and are seeking feedback from developers interested in collaborating with us.

Today, we are starting to enable existing plugins from our early collaborators for ChatGPT users, starting with ChatGPT Plus users. Additionally, we are also allowing developers to create their own plugins for ChatGPT.

In the coming months, as we learn from deployment and continue to improve our safety systems, we will iterate on this protocol and plan to enable developers using OpenAI models to integrate plugins into their own applications, not just ChatGPT.

Safety and Broader Impact

Connecting language models to external tools brings both new opportunities and new risks.

Plugins have the potential to address various challenges associated with large language models, including "hallucinations," keeping up with recent events, and (with permission) accessing proprietary sources of information. By integrating explicit access to external data - such as the latest information online, code-based computations, or information retrieved by custom plugins - language models can enhance their responses with evidence-based references.

These references not only enhance the model's utility but also let users assess the trustworthiness of its output and double-check its accuracy, potentially mitigating the risks of overreliance discussed in our recent GPT-4 system card. Finally, the value of plugins may go well beyond addressing existing limitations, helping users with a wide variety of new use cases, from browsing product catalogs to booking flights or ordering food.

At the same time, there is a risk that plugins could raise safety challenges by taking harmful or unintended actions, or by increasing the capabilities of bad actors engaged in fraudulent, misleading, or abusive behavior. By broadening the range of possible applications, plugins may also raise the risk of the model taking incorrect or misaligned actions in new domains. From day one, these factors have guided the development of our plugin platform, and we have implemented several safeguards.

We have conducted red teaming exercises internally and with external collaborators, which have surfaced potential concerns. For example, our red teamers discovered ways that plugins, if released without safeguards, could be used for sophisticated prompt injection, sending fraudulent and spam emails, bypassing safety restrictions, or misusing information sent to the plugin. We are using these findings to inform safety-by-design mitigations that restrict risky plugin behaviors and to improve transparency about how and when plugins operate as part of the user experience. They are also informing our decision to deploy plugin access gradually.

If you are interested in security risks or mitigations in this area, we encourage you to participate in our researcher access program. We will also invite developers and researchers to submit security and capability assessments related to plugins as part of our recently open-sourced Evals framework.

Plugins can have a wide range of social impacts. For example, we recently published a working paper that found language models with tooling can have greater economic impact than language models without tooling. More generally, in line with findings from other researchers, we anticipate that the current AI technology wave will have a significant impact on the speed of job transformation, displacement, and creation. We are eager to collaborate with external researchers and our customers to study these impacts.

Browsing (Alpha Stage)

An experimental model that knows when and how to browse the internet.

Motivated by past work (our own WebGPT, as well as GopherCite, BlenderBot2, LaMDA2, etc.), allowing language models to read information from the internet significantly expands the range of topics they can discuss beyond their training corpus and into up-to-date information.

Below is an example of the browsing experience for ChatGPT users. Previously, when asked such questions, the model would politely note that its training data did not contain enough information to answer. In this example, ChatGPT retrieves the latest information about the Oscars and then recites a poem about it, showing how browsing can be an additive experience.

In addition to providing obvious utility to end-users, we believe enabling language and chat models to conduct thorough and interpretable research has exciting prospects for scalable alignment.

Safety Considerations
We have created a web browsing plugin that allows language models to access a web browser, designed with a focus on safety and operating as a good citizen of the web. The text-based web browser in the plugin is limited to making GET requests, reducing (but not eliminating) certain categories of security risks. This makes the browsing plugin useful for retrieving information but does not include "transactional" operations such as form submissions, which involve more security and assurance issues.

Browsing uses the Bing Search API to retrieve content from the web. Therefore, we inherit a significant amount of work done by Microsoft in terms of reliability and authenticity of information sources and preventing retrieval of problematic content through "safe mode." The plugin runs in a separate service, so browsing activities of ChatGPT are separate from other parts of our infrastructure.

To respect content creators and abide by web norms, our browser plugin's user agent is identified as ChatGPT-User and is configured to honor websites' robots.txt files. This may occasionally result in a "click failed" message, which indicates that the plugin is honoring a website's instruction not to crawl it. The user agent is only used to take direct actions on behalf of ChatGPT users and not for crawling the web automatically. We have also published our IP egress ranges. Additionally, we have implemented rate-limiting measures to avoid sending excessive traffic to websites.

You can use the robots.txt file to block ChatGPT from crawling your website, and ChatGPT will display an error message in that case.
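Since the plugin honors the Robots Exclusion Protocol, opting out is a matter of adding a directive for the ChatGPT-User agent named above. A minimal, illustrative robots.txt (the paths are placeholders, not part of the original announcement):

```text
# Block ChatGPT's browsing plugin from the entire site
User-agent: ChatGPT-User
Disallow: /

# Alternatively, block only a specific section while allowing the rest
# User-agent: ChatGPT-User
# Disallow: /private/
```

Serve the file at the site root (e.g. /robots.txt); per the text above, ChatGPT will then show an error message instead of fetching blocked pages.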

Our browsing plugin displays visited websites and references their sources in ChatGPT's responses. This added transparency helps users verify the accuracy of the model's responses and gives credit to content creators. We see this as a new way of interacting with the web and welcome feedback on other approaches to drive traffic back to the source and increase the overall health of the ecosystem.

Code Interpreter (Alpha Stage)

An experimental ChatGPT model that can use Python and handle uploads and downloads.

We provide our models with a working Python interpreter in a sandboxed, firewalled execution environment, along with some ephemeral disk space. Code run by our interpreter plugin is evaluated in a persistent session that stays alive for the duration of the chat conversation (with an upper-bound timeout), so successive invocations can build on each other. We support uploading files to the current conversation workspace and downloading the results of your work.

We hope to let our models use their programming skills to provide a much more natural interface to the fundamental capabilities of our computers. Having access to a very eager junior programmer working at the speed of thought can make entirely new workflows effortless and efficient, and open the benefits of programming to new audiences.

From our initial user research, we have identified several useful use cases for the code interpreter:

Solving mathematical problems, both quantitative and qualitative
Performing data analysis and visualization
Converting files between different formats
We invite users to try the code interpreter integration and discover other useful tasks.
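The file-conversion use case above is the kind of task the interpreter handles with a few lines of ordinary Python. As an illustrative sketch (not code from the announcement), here is a standard-library CSV-to-JSON conversion of the sort one might ask the plugin to perform:

```python
import csv
import io
import json

def csv_to_json(csv_text: str) -> str:
    """Convert CSV text into a JSON array of row objects, one per data row."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return json.dumps(rows, indent=2)

# Toy input standing in for an uploaded file in the session workspace.
sample = "name,score\nAda,90\nGrace,95\n"
print(csv_to_json(sample))
```

In the actual plugin the input would come from an uploaded file and the JSON would be offered back as a download; the logic is the same.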

Safety Considerations
The primary consideration in connecting our models to a programming language interpreter is properly sandboxing execution so that AI-generated code does not have unintended side effects in the real world. We execute code in a secured environment and use strict network controls to prevent executed code from accessing the external internet. Additionally, we have set resource limits on each session. Disabling internet access limits the functionality of our code sandbox, but we believe it is the right initial trade-off. Connecting our models to the outside world is instead handled by third-party plugins designed with safety as a core principle.

Retrieval

The open-source retrieval plugin allows ChatGPT to access personal or organizational sources of information (with permission). It enables users to retrieve the most relevant snippets from their data sources, such as files, notes, emails, or public documents, by asking questions or expressing needs in natural language.

As a self-hosted and open-source solution, developers can deploy their own versions of the plugin and register them with ChatGPT. The plugin leverages OpenAI Embeddings and allows developers to choose a vector database (Milvus, Pinecone, Qdrant, Redis, Weaviate, or Zilliz) for indexing and searching documents. Information sources can be synchronized with the database using webhooks.
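At its core, the retrieval flow described above embeds documents and queries as vectors and returns the documents whose vectors are most similar to the query. The real plugin uses OpenAI Embeddings and one of the vector databases listed; the following is a minimal in-memory sketch of the ranking step only, with toy hand-written vectors standing in for embedding output:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query_vec, doc_vecs, k=2):
    """Rank document ids by cosine similarity to the query vector."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    return sorted(scored, key=lambda s: s[1], reverse=True)[:k]

# Toy 3-dimensional "embeddings"; real embeddings have far more dimensions.
docs = {
    "notes.txt": [0.9, 0.1, 0.0],
    "email.eml": [0.1, 0.8, 0.1],
    "memo.md":   [0.2, 0.2, 0.9],
}
print(top_k([1.0, 0.0, 0.0], docs, k=1))
```

A production deployment replaces the dictionary with a vector database index and the toy vectors with embedding-API calls, but the nearest-neighbor ranking is the same idea.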

To get started, you can access the retrieval plugin library.

Safety Considerations
The retrieval plugin allows ChatGPT to search the content of a vector database and add the best results to the ChatGPT conversation. This means it has no external impact, and the main risks are data authorization and privacy. Developers should only add content to their retrieval plugins that they are authorized to use and can share in a user's ChatGPT conversation.

Third-Party Plugins (Alpha Stage)

An experimental model that knows when and how to use plugins.

Third-party plugins are described by a manifest file that includes a machine-readable description of plugin capabilities and how to invoke them, as well as user-facing documentation.

The process for creating plugins is as follows:

Build an API with the endpoints you'd like the language model to call (this can be a new API, an existing API, or a wrapper around an existing API designed specifically for LLMs).
Create an OpenAPI specification documenting your API, along with a manifest file that links to the OpenAPI spec and includes plugin-specific metadata.
When starting a conversation on chat.openai.com, users can select which third-party plugins they want to enable. Documentation about the enabled plugins is shown as part of the conversation context to the language model, enabling the model to make appropriate API calls based on user intent. Currently, plugins are designed for calling backend APIs, but we are also exploring plugins that can call client-side APIs.
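To make the manifest-plus-OpenAPI-link structure concrete, here is an abridged sketch of what such a manifest file might look like. The plugin name, descriptions, and example.com URLs are hypothetical, and field names may differ across revisions of the plugin specification; consult the official plugin documentation for the authoritative schema.

```text
{
  "schema_version": "v1",
  "name_for_human": "TODO Manager",
  "name_for_model": "todo_manager",
  "description_for_human": "Manage your to-do list.",
  "description_for_model": "Plugin for creating, listing, and deleting a user's to-do items.",
  "auth": { "type": "none" },
  "api": {
    "type": "openapi",
    "url": "https://example.com/openapi.yaml"
  },
  "logo_url": "https://example.com/logo.png",
  "contact_email": "support@example.com",
  "legal_info_url": "https://example.com/legal"
}
```

The description_for_model field is the documentation shown to the language model in the conversation context, so it should spell out when and how to call each endpoint.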

Future Outlook

We are working hard to develop plugins and bring them to a wider audience. There is much to learn, and with everyone's help, we hope to build something that is both useful and safe.

Note: The article contains multiple video demonstrations, which can be viewed in the original source.

Source: https://www.8btc.com/article/6810742
