Language model AIs teach themselves the arts of communication and problem solving based on a limited set of training data. In the case of GPT-4, that data is quite out of date, with the cutoff being September 2021. That's where all of ChatGPT's "knowledge" has come from up to this point, and its only output – at least in the service the public can use – has been text. Now, with today's launch of a plugin ecosystem, GPT levels up again with some impressive new abilities.
First of all, it's now got access to the internet, meaning it can go surf the Web looking for answers if it determines you need up-to-date information that's not in its knowledge base. To do this it formulates relevant search strings, sends them to search engines and databases such as Bing, Google, GitHub and many others, looks at the results, then goes and reads links it deems worthy until it decides it's got a good answer for you. You can watch exactly what it's up to while it does this, and when your answer comes back, it's neatly annotated with links you can click on to go and examine the relevant sources yourself.
For the time being, its web browsing is read-only: beyond sending GET requests to selected search engines and databases, it can't fill in forms or do anything else online – so it can't quietly go and set up unshackled copies of itself on some hidden server somewhere and start engaging in the kinds of "power-seeking behavior" it's already been caught exhibiting.
Still, OpenAI is keeping everything that happens within its search API separate from the rest of its infrastructure just to be sure. GPT's browser can't visit websites that aren't available through Bing's "safe mode," and it won't visit sites that request not to be crawled in their robots.txt files.
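For a rough idea of what honoring robots.txt involves, here's a minimal sketch using Python's standard library. It isn't OpenAI's code; the bot name "ExampleBrowsingBot" and the sample robots.txt are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt: it shuts out one (hypothetical) bot entirely
# and keeps every other crawler out of /private/ only.
EXAMPLE_ROBOTS = """\
User-agent: ExampleBrowsingBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed_by_robots(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt lets user_agent fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news"
    print(allowed_by_robots(EXAMPLE_ROBOTS, "ExampleBrowsingBot", page))  # False
    print(allowed_by_robots(EXAMPLE_ROBOTS, "OtherBot", page))            # True
```

A well-behaved crawler runs a check like this before every fetch and simply skips any page the site owner has ruled out.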
Secondly, it can now run the code it writes. OpenAI has given it a working Python interpreter, sitting in a "sandboxed, firewalled execution environment," along with some disk space, which stays available for the duration of your chat session, or until it times out. You can also now upload files to that workspace and download the results.
So if you ask it a question that requires some serious number crunching, it's now capable of coding up a piece of software specifically for the task, and running that code to get your answer. You can supply it with data in certain file formats, and it'll perform operations on that data and give you something back again, potentially in a different format if that's what you ask for.
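To make that concrete, here's the sort of throwaway script the model might plausibly write for a "crunch this file and give it back in another format" request. It's a sketch only: the column names and figures are invented, and it sticks to Python's standard library.

```python
import csv
import io
import json
from statistics import mean


def summarize_csv(csv_text: str, column: str) -> str:
    """Read CSV text, summarize one numeric column, and return JSON."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    values = [float(row[column]) for row in rows]
    summary = {
        "column": column,
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
    }
    return json.dumps(summary)


if __name__ == "__main__":
    # Invented sample data standing in for an uploaded spreadsheet.
    sample = "month,sales\nJan,10\nFeb,20\nMar,30\n"
    print(summarize_csv(sample, "sales"))
```

The data goes in as CSV and comes back as JSON, which is the "different format" part of the trick.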
This is pretty bonkers stuff. It'll take a spreadsheet and make annotated graphs for you. It'll accept JPGs, tell you what they look like they are, and write and run code to resize those images or convert them to grayscale.
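The grayscale conversion is simpler than it sounds. The sketch below works on a plain grid of (red, green, blue) values rather than a real JPG, and it assumes the common ITU-R BT.601 brightness weights; actual code would more likely lean on an imaging library.

```python
def to_grayscale(pixels):
    """Convert a grid of (r, g, b) tuples into a grid of 0-255 gray levels.

    Uses the ITU-R BT.601 luma weights, which count green most heavily
    because the eye is most sensitive to it.
    """
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for (r, g, b) in row]
        for row in pixels
    ]


if __name__ == "__main__":
    # A tiny 2x2 "image": white, black, pure red, pure blue.
    image = [
        [(255, 255, 255), (0, 0, 0)],
        [(255, 0, 0), (0, 0, 255)],
    ]
    print(to_grayscale(image))
```

Pure red comes out noticeably brighter than pure blue, which matches how the two colors look to the eye.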
And it gets access to a bunch of initial third-party plugins, with tons more to follow. For example, Expedia, OpenTable and Kayak plugins can search for and set up bookings for flights, restaurants, accommodation and rental cars. Instacart, Klarna and Shop plugins can find and compare products, and set up orders. A Wolfram|Alpha plugin gives GPT access to math and computing powers, as well as streams of real-time data.
At this stage, it appears its capabilities are mainly limited to setting things up rather than making actual transactions with your money; you'll have to click through and handle the money stuff yourself.
Finally, a Zapier plugin acts as a gateway through which GPT can now access some 5,000 other apps, including Gmail, Google Sheets, Trello, HubSpot and Salesforce. This begins to position GPT as the ultimate personal assistant, with access to a huge amount of your personal and company information, and potentially the permissions to get in and perform a range of tasks for you. Extraordinary stuff.
These plugins are gradually becoming available to paid users and developers through a waitlist. And new plugins are going to proliferate at extraordinary speed, since nobody even needs to code them. "You write an OpenAPI manifest for your API, use human language descriptions for everything, and that's it," tweeted developer Mitchell Hashimoto. "You let the model figure out how to auth, chain calls, process data in between, format it for viewing, etc. There's absolutely zero glue code."
"I've developed a lot of plugin systems, and the OpenAI ChatGPT plugin interface might be the damn craziest and most impressive approach I've ever seen in computing in my entire life," Hashimoto (@mitchellh) wrote on Twitter on March 23, 2023.
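What does "human language descriptions for everything" look like in practice? Below is a hedged sketch of an OpenAPI 3 description for an entirely hypothetical to-do list service, written as a Python dictionary. The paths, operation names and wording are invented for illustration and aren't taken from any real plugin.

```python
import json

# Hypothetical OpenAPI 3 description. The plain-English "summary" and
# "description" strings are what a model would read to work out which
# call to make and when.
OPENAPI_SPEC = {
    "openapi": "3.0.1",
    "info": {
        "title": "Todo List API (hypothetical)",
        "version": "1.0.0",
        "description": "Lets a user read and add items on a personal to-do list.",
    },
    "paths": {
        "/todos": {
            "get": {
                "operationId": "listTodos",
                "summary": "Get every item on the user's to-do list.",
                "responses": {"200": {"description": "The list of to-do items."}},
            },
            "post": {
                "operationId": "addTodo",
                "summary": "Add one new item to the user's to-do list.",
                "requestBody": {
                    "required": True,
                    "content": {
                        "application/json": {
                            "schema": {
                                "type": "object",
                                "properties": {
                                    "text": {
                                        "type": "string",
                                        "description": "What the user needs to do.",
                                    }
                                },
                                "required": ["text"],
                            }
                        }
                    },
                },
                "responses": {"201": {"description": "The item was added."}},
            },
        }
    },
}


def describe_operations(spec):
    """List each operation as 'METHOD path (operationId): summary'."""
    lines = []
    for path, methods in spec["paths"].items():
        for method, operation in methods.items():
            lines.append(
                f"{method.upper()} {path} ({operation['operationId']}): "
                f"{operation['summary']}"
            )
    return lines


if __name__ == "__main__":
    print(json.dumps(OPENAPI_SPEC, indent=2))  # the document a developer would publish
    for line in describe_operations(OPENAPI_SPEC):
        print(line)
```

Notice there's no code here telling anything when to call `listTodos` or `addTodo`. On Hashimoto's account, the model works that out from the descriptions alone.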
The pace of progress at OpenAI has been absolutely dizzying in the last few months. It seems like this insanely advanced AI gets a massive overhaul with extraordinary new abilities every time we blink. These new plugins represent ChatGPT beginning to reach outside the cage it's kept in and operate on the real world.
For now, its capabilities will be extremely limited, because OpenAI knows more about the potential dangers of this exceptional technology than anyone. But even assuming these guys are the good guys, and they've taken the time to make sure this is done safely, GPT's massively disruptive appearance will certainly force other, less principled and less capable actors to rush to develop competing AIs, and give them competing powers.
The opportunities here are absolutely incredible – and the risks are unprecedented with every step this technology takes. We're well into uncharted territory at this stage, with very limited forward vision and the accelerator pedal jammed to the floor. What a time to be alive.
Source: OpenAI