China's Baidu launches its "ERNIE Bot" AI early in response to GPT
Rushed by investors and partners after the stunning launch of ChatGPT, Chinese web giant Baidu has launched its own anything-machine multimodal AI, with plans to integrate it into all its apps and services – but the company admits it may not be ready.
OpenAI's ChatGPT has made the public aware of the insane power, potential and threat of large language models. The ripples from ChatGPT's launch late last year are being felt in every industry – you can't responsibly plan for the future of nearly any business without factoring in how these ludicrously capable and seemingly intelligent bots are going to disrupt nearly every process.
When OpenAI released the even more astonishing GPT-4 a couple of days ago, it did so with a measure of trepidation. As outlined in the GPT-4 paper, the language model itself was built and pre-trained some eight months ago, and the company spent those eight months working feverishly to make it safe and sanitized for public consumption.
Knowing that many competitors are working on similar AI models, OpenAI recognized that launching this thing would kick off a furious technology race – and that once a "racing dynamic" developed, other companies would be forced by their shareholders and customers to accelerate their own AI programs. Safety measures, the company reasoned, would be among the first things on the chopping block. As a result, the company considered sitting on GPT-4 for six months to give the competition time to proceed with caution – but decided against it.
Yesterday, Baidu proved these concerns well-founded, with what appears to be a hasty launch of its "ERNIE Bot" AI in an attempt to placate its shareholders and partners. Baidu Co-Founder, Chairman and CEO Yanhong "Robin" Li was upfront about this in his introductory comments.
"Many people have been asking me why we're releasing this product at this time," said Li through a translator. "Are we really ready? In fact, over the past decade, Baidu has invested in the R&D for ERNIE Bot. Indeed, the initial version was launched in 2019, and every year a new version has been launched... Others – including Google, Facebook, etc – have no product at the same level. Baidu is the first to launch such a product."
"Our expectations for ERNIE Bot are close to GPT – or even GPT-4," he continued. "During our initial internal testing, we experienced the capabilities of ERNIE Bot. We feel it's not perfect yet. So why are we launching it today? Because there's huge demands in the market. For various product lines including search, AI cloud, autonomous driving... Everyone's waiting for this technology. More importantly, our customers and our partners are waiting for such a technology and a product. They're urging us to launch such a product. Of course, for large models, once products are launched, they'll have real feedback from users, and with users' feedback, they'll iterate and improve their capabilities very quickly. We hope ERNIE Bot will grow quickly and provide value for our customers and users as soon as possible so everyone can benefit from it."
He proceeded to give five short, apparently pre-recorded demonstrations of the AI's capabilities. In the first, he asked it to summarize the plot of the Three-Body science fiction series by Cixin Liu (a banger of a read, if you haven't dipped a toe into Chinese sci-fi). He then asked the AI how it might continue the books from a philosophical perspective, and what the lead actors in the films had in common.
Without speaking Chinese, it's hard to gauge how the model performed, but Li claimed that these tasks would typically lead to a high probability of errors in a language model like this, and said ERNIE Bot was able to breeze through them. It appears the AI has access to search results and knowledge beyond its initial training data: "ERNIE Bot adopts a search augmentation and knowledge enhancement," said Li. "We have 550 billion knowledge data points, so that we can guarantee that the answers from ERNIE Bot will be basically correct."
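Baidu gave no detail on how its "search augmentation" works. As a rough illustration of the general idea only, the Python sketch below looks up relevant facts and places them in the prompt before the model answers. Everything in it is an assumption made for illustration: the three-line KNOWLEDGE list, the word-overlap scoring and the function names are hypothetical and are not taken from ERNIE Bot.

```python
from collections import Counter

# Hypothetical toy knowledge store; a stand-in for search results or a knowledge graph.
KNOWLEDGE = [
    "The Three-Body Problem is a science fiction novel by Liu Cixin.",
    "Luoyang is a city in Henan province, China.",
    "Supply and demand determine prices in a market.",
]

def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,?!").lower() for w in text.split())
    return [w for w in words if w]

def score(query, passage):
    """Count the words the query and the passage share (a crude relevance score)."""
    q = Counter(tokenize(query))
    p = Counter(tokenize(passage))
    return sum(min(q[w], p[w]) for w in q)

def retrieve(query, k=1):
    """Return up to k passages that share at least one word with the query."""
    ranked = sorted(KNOWLEDGE, key=lambda p: score(query, p), reverse=True)
    return [p for p in ranked[:k] if score(query, p) > 0]

def build_prompt(query, k=1):
    """Prepend the retrieved facts to the question before it goes to a language model."""
    context = "\n".join(f"- {fact}" for fact in retrieve(query, k))
    return f"Use these facts if relevant:\n{context}\n\nQuestion: {query}"

print(build_prompt("Who wrote the Three-Body Problem?"))
```

A production system would presumably swap the word-overlap score for a real search index, but the shape is the same: retrieve first, then generate with the retrieved text in view.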
A second task was more business-oriented and creative, asking for suggested business names and related slogans for a high-tech service company, as well as an initial 600-word company newsletter. The third challenged it on "mathematical logical reasoning" and its ability to point out flaws in the input prompts.
The fourth showed off ERNIE Bot's understanding of Chinese language and context, as well as its ability to work within creative constraints. A well-known Chinese idiom translating roughly to "the paper from Luoyang is expensive" was put in, and the AI correctly recognized it as an allegory for supply and demand economics. It was then able to write a Chinese poem, in which every character of the idiom was embedded in each sentence of the poem.
To see if this separates ERNIE Bot from English-based competition like ChatGPT, I ran a similar prompt through GPT-4, and found that it had no problem doing the same – assuming its Chinese doesn't come out sounding stilted to native speakers.
The final demo showed off its multi-modal creativity – and here, ERNIE Bot showed off some abilities that GPT-4 hasn't turned on yet, since OpenAI is only just beginning to open up access to image-based inputs, and the model currently outputs only text.
It was asked to create a poster image for the 2023 World Intelligent Transport conference, which it appeared to do with impressive speed. Then it was asked which cities would be most suitable for the development of said intelligent transport. Then it was told to read out its answer in the Sichuan dialect, which it did. Then, Li simply told it to "make a video out of it." It completed this task with staggering speed, taking about ten seconds to be ready for playback. You can see this demo in the video below – skip to around 1:20 if you just want to see the video.
Personally, I found this demo absolutely jaw-dropping to begin with – it spat out a minute's worth of terrific-looking video in about the time DALL-E 2 takes to respond to a single image request. Looking a little closer, though, it seems it's nowhere near as sophisticated; the AI is simply linking together sections of stock footage that seem relevant to the narration content, rather than generating footage of its own.
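To make that distinction concrete, here is a minimal Python sketch of what "linking together stock footage" could look like: each narration sentence is matched to whichever clip shares the most keywords with it. This is a guess at the general approach, not Baidu's pipeline; the STOCK_CLIPS library, its tags and the function names are all hypothetical.

```python
# Hypothetical stock library: clip filename -> descriptive tags.
STOCK_CLIPS = {
    "city_traffic.mp4": {"city", "traffic", "cars", "road"},
    "train_station.mp4": {"train", "rail", "station"},
    "server_room.mp4": {"data", "computer", "server"},
}

def pick_clip(sentence):
    """Return the clip whose tags overlap most with the sentence, or None if nothing matches."""
    words = {w.strip(".,").lower() for w in sentence.split()}
    best = max(STOCK_CLIPS, key=lambda name: len(STOCK_CLIPS[name] & words))
    return best if STOCK_CLIPS[best] & words else None

def storyboard(narration):
    """Split the narration into sentences and pair each one with a stock clip."""
    sentences = [s.strip() for s in narration.split(".") if s.strip()]
    return [(s, pick_clip(s)) for s in sentences]

for sentence, clip in storyboard(
    "Smart traffic lights ease city congestion. High-speed rail links each station."
):
    print(f"{clip}: {sentence}")
```

Nothing here generates a single pixel, which is why this kind of assembly can be so much faster than image synthesis.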
And such a service is already well-known to Chinese users, since it's already used extensively on Baidu news services like Haokan, and has been for several years now. "I believe if you're a creator on [one of Baidu's video services], this will already be familiar to you," said Li. "Every day, tens of thousands of articles are converted to video automatically and distributed on Baidu's platform. But we've now connected all these technology points to that."
Chinese media in attendance were given invitations to begin testing the service. "From ERNIE Bot's performance, you can tell that it can to some extent understand and express natural language just like humans," said Li. "It can also do some reasoning. All of these features are being improved continuously. Sometimes, we may feel really surprised, but of course other times we identify some mistakes. But one thing can be sure: it's progressing very fast, and there will be more very fast progress in the near future. We're going to fine-tune the model to make it adaptive to all Baidu products, so ERNIE Bot can show its powerful capabilities and user-friendliness in the user interfaces and experiences, and draw all Baidu products closer to our clients and users."
Li went on to reveal that Baidu is looking to build additional layers on top of the generalized ERNIE Bot AI model, using training sources specifically chosen for pretty much every industry, giving the examples of energy, transport and media.
"There are estimates that by 2030, some workers will have their productivity quadrupled using artificial intelligence," said Li. "There would be an irreversible change to the nature of their work because of AI... We want to use industry-specific big models, because we think there might be a middle layer for each industry's quite specific and unique data or knowledge. If this is integrated with the grounded large model, huge creativity and productivity would be generated. Maybe that kind of capability can not be owned by Baidu. Maybe those data would be provided to Baidu. But Baidu offers the capability to fine-tune such knowledge or data to avoid mistakes and make the models more adaptive to each and every industry."
This somewhat rushed launch did not appear to impress Baidu's investors, and as CNBC reports, the company's shares plunged by as much as 10% during the presentation, ending the day some 6.4% down in Hong Kong for the company's lowest close since January 19.
Li was keen to tell the rest of the world that ERNIE Bot comes in peace: "What I want to say is the ERNIE Bot is not used for the Sino-US conflict of technology," he said. "It represents the dream that we pursue as technology developers. It's a platform we want to use to empower all industries, clients and users. It's a testament to development driven by innovation."
He did not specifically address Baidu's safety strategy or the sanitization of the AI's output. Perhaps this is unsurprising, given that China's "Great Firewall" policies already heavily censor nine very broad categories of information across the country's entire domestic internet, and completely ban a long list of foreign websites that might host "spiritual pollution." This is all done in partnership with internet companies like Baidu, and ERNIE Bot can definitely be expected to toe the line.
But these AIs have extraordinary capabilities, and can be misused in a myriad of creative and unexpected ways; ChatGPT users have had a lot of fun finding and exploiting its weaknesses since it launched, and OpenAI has teams working hard to plug these holes. Baidu can expect plenty of the same.
You can see the entire launch below, with spoken English translation.