Sam Altman asks: Should OpenAI let GPT-4 off the leash?
Does OpenAI have a responsibility to let GPT-4 off the chain right now as a shock-and-awe demonstration of AI power, a Hiroshima moment that might spur the world into action? In an interview yesterday, CEO Sam Altman hinted that he's considering it.
In a few short months, ChatGPT has convinced a lot of people – particularly the ones closest to it – that we're standing at the inflection point of the most significant technological leap humanity has ever made. Fire, the wheel, science, money, electricity, the transistor, the internet – each of these made humanity vastly more powerful. But AI is different; it seeks to create machines that will in some sense be our equals, and will eventually become our superiors.
OpenAI strikes me as an incredible organization of insanely smart, highly effective, and, I believe, genuinely well-intentioned people. From the semi-capitalist way the business is structured, to the remarkably open way in which it's dealing with its creations, this company appears to be trying to do the world a great public service and limit not only its own potential for unimaginable societal destruction, but the potential of the many other AIs that are in development.
That's why ChatGPT exists: it's OpenAI telling the world "hey humanity, this is a broken, janky, toddler version of what's coming. You need to look at it closely, and work with it. You need to understand what this is, the amazing things it can do, and the massive risks it carries, up to and including an existential risk for humanity itself. You need to move on this thing immediately, and have a say in where it goes next, because it will not be a broken, janky toddler for long. Soon it will work very, very well. Soon it will be indispensable. And soon it might become uncontrollable."
It's a radically different, radically open and radically cautious approach than what you might expect from the tech world. If anyone should be at the forefront of technologies like these, it should be people that truly understand the weight of responsibility that falls on their shoulders. Listening to OpenAI CEO Sam Altman's two and a half hour interview yesterday with podcast host Lex Fridman – who's heavily involved in the AI field himself – made me thankful that OpenAI is at the pointy end of this blade. It also made me wonder if there's really any human equal to the responsibility Altman now carries.
Whether you take a utopian or dystopian view of AI, this interview documents an incredible point in history, as Altman wrestles with the potentially transformative benefits, as well as the potentially existential consequences, of his life's work. Most people are at least a little scared by what's happening right now, and Altman is too.
Fundamentally, humans can't build AIs this advanced. Nobody could sit down and code you a ChatGPT; like the human brain itself, language models are too mysterious and complex. What OpenAI and others have done instead is to create the systems and circumstances under which GPT has effectively built itself.
This incredible piece of alchemy has not created a human-like consciousness. It has created an intelligence of a kind entirely alien to us – nobody can truly say what it's like to be GPT, or how exactly it generates a response to a given input. Nobody truly understands how it works. But it's been trained and fed with so much human writing and expression that it has learned to imitate consciousness, and to translate messages between human and machine and back again in the most fluid and beautiful way ever shown.
GPT is not like us – but it is of us. It has read more of humanity's writing than any human, ever, by orders of magnitude. All of its behaviors, good and bad, hold up a mirror to the human soul. We are capable of immense good, and true evil, and while definitions of these terms are widely varied across different cultures, a freshly trained-up GPT model will happily apply the full weight of its power to any request without judgement, or answer questions about its own sentience in the same ways a human would. That's why OpenAI spent eight months attempting to tame, cage and shackle GPT-4 before it was let out to be seen and prodded at by the public.
Dr. Frankenstein may have wondered whether to throw the switch on his creation, but Altman doesn't have that luxury. He knows OpenAI is only one of many companies working toward advanced AI through language models. Most are still working behind closed doors, and each is coming up with its own approach to ethics and safety. But everyone knows there are trillions of dollars on the table, and that every minute they spend on ethics and safety is a minute they're not scooping up that loot.
If the world wants to steer toward the utopian vision, governments and the private sector need to adapt to this new technology faster than they've ever adapted to anything before. Even if they do, Stanford has already demonstrated that bad actors can go and build themselves a rudimentary copy of ChatGPT for a hundred bucks; there will soon be thousands of these things, each loaded with a huge chunk of humanity's knowledge and the capability to communicate at extraordinary levels, and each imbued with the ethics, morals and safety standards of its owner.
Further to that, even if the world as a whole could agree on limits for AIs tomorrow, there's no guarantee that we'll be able to control them once they advance to a certain point. In the AI world, this is referred to as "alignment" – as in, somehow making sure the AI's interests are aligned with our own. It's not exactly clear how we can possibly do that once these things reach a certain level.
Indeed, the extreme pessimist's view is that a superior Artificial General Intelligence, or AGI, will kill us all, with near-100% certainty. As decision theory and AI researcher Eliezer Yudkowsky put it, "The big ask from AGI alignment, the basic challenge I am saying is too difficult, is to obtain by any strategy whatsoever a significant chance of there being any survivors."
"I want to be very clear," Altman told Fridman in yesterday's interview, "I do not think we have yet discovered a way to align a super powerful system. We have something that works for our current scale, called RLHF (Reinforcement Learning from Human Feedback)."
So knowing what's at stake, here are some choice quotes from Altman, pulled from Lex's excellent interview, that should give you a sense of the moment we're living in.
On whether GPT represents an Artificial General Intelligence:
Someone said to me over the weekend, "you shipped an AGI and somehow, like I'm just going about my daily life. And I'm not that impressed." And I obviously don't think we shipped an AGI. But I get the point. And the world is continuing on.
If I were reading a sci-fi book, and there was a character that was an AGI, and that character was GPT-4, I'd be like, well, this is a shitty book. That's not very cool. I would have hoped we had done better.
I think that GPT-4, although quite impressive, is definitely not an AGI – still, isn't it remarkable that we're having this debate? But I think we're getting into the phase where specific definitions of AGI really matter.
Someone else said to me this morning – and I was like, oh, this might be right – this is the most complex software object humanity has yet produced. And it will be trivial in a couple of decades, right? It'll be like kind of anyone can do it, whatever.
On building GPT in public:
We are building in public, and we are putting out technology, because we think it is important for the world to get access to this early. To shape the way it's going to be developed, to help us find the good things and the bad things. And every time we put out a new model, the collective intelligence and ability of the outside world helps us discover things we cannot imagine, that we could never have done internally. Both great things that the model can do, new capabilities, and real weaknesses we have to fix.
And so this iterative process of putting things out, finding the great parts, the bad parts, improving them quickly, and giving people time to feel the technology and shape it with us and provide feedback, we believe it's really important. The tradeoff of building in public is, we put out things that are going to be deeply imperfect. We want to make our mistakes while the stakes are low, we want to get it better and better each rep.
I can't emphasize enough how much the collective intelligence and creativity of the world will beat OpenAI and all of the red teamers we can hire. So we put it out. But we put it out in a way we can make changes.
On whether the tool is currently being used for good or evil:
I don't – and nor does anyone else at OpenAI – sit there reading all the ChatGPT messages. But from what I hear, at least the people I talk to, and from what I see on Twitter, we are definitely mostly good. But not all of us are all the time. We really want to push on the edges of these systems. And, you know, we really want to test out some darker theories of the world.
There will be harm caused by this tool. There will be harm, and there'll be tremendous benefits. Tools do wonderful good and real bad. And we will minimize the bad and maximize the good.
On whether OpenAI should release the base model of GPT-4 without safety and ethics restrictions:
You know, we've talked about putting out the base model at least for researchers or something, but it's not very easy to use. Everyone's like, give me the base model. And again, we might do that. I think what people mostly want is they want a model that has been RLHFed to the worldview they subscribe to. It's really about regulating other people's speech. Like in the debates about what showed up in the Facebook feed, having listened to a lot of people talk about that, everyone is like, well, it doesn't matter what's in my feed, because I won't be radicalized, I can handle anything. But I really worry about what Facebook shows you.
On how the hell humanity as a whole should deal with this challenge:
Let's say the platonic ideal, and we can see how close we get, is that every person on Earth would come together, have a really thoughtful, deliberative conversation about where we want to draw the boundaries on this system. And we would have something like the US constitutional convention, where we debate the issues, and we look at things from different perspectives and say, well, this would be good in a vacuum, but it needs a check here... And then we agree on, like, here are the overall rules of the system.
And it was a democratic process, none of us got exactly what we wanted, but we got something that we feel good enough about. And then we and other builders build a system that has that baked in. Within that, then different countries, different institutions, can have different versions. So there's different rules about, say, free speech in different countries. And then different users want very different things. And that can be done within the bounds of what's possible in their country. So we're trying to figure out how to facilitate that. Obviously, that process is impractical, as stated, but what is something close to that, that we can get to?
We have the responsibility if we're the one like putting the system out. And if it breaks, we're the ones that have to fix it, or be accountable for it. But we know more about what's coming. And about where things are harder, or easier to do than other people do. So we've got to be heavily involved, we've got to be responsible, in some sense, but it can't just be our input.
I think one of the many lessons to take away from the Silicon Valley Bank collapse is, how fast and how much the world changes, and how little I think our experts, leaders, business leaders, regulators, whatever, understand it. The speed with which the SVB bankruptcy happened, because of Twitter, because of mobile banking apps, whatever, was so different than the 2008 collapse, where we didn't have those things really. And I don't think that the people in power realize how much the field has shifted. And I think that is a very tiny preview of the shifts that AGI will bring.
I am nervous about the speed with which this changes and the speed with which our institutions can adapt. Which is part of why we want to start deploying these systems really early, while they're really weak, so that people have as much time as possible to do this.
I think it's really scary to like, have nothing, nothing, nothing and then drop a super powerful AGI all at once on the world. I don't think people should want that to happen. But what gives me hope is like, I think the less zero-sum and the more positive-sum the world gets, the better. And the upside of the vision here, just how much better life can be? I think that's gonna unite a lot of us. And even if it doesn't, it's just gonna make it all feel more positive-sum.
On the possibility that super-powerful AIs might decide to kill us all:
So first of all, I will say, I think that there's some chance of that. And it's really important to acknowledge it. Because if we don't talk about it, if we don't treat it as potentially real, we won't put enough effort into solving it. And I think we do have to discover new techniques to be able to solve it.
I think a lot of the predictions, this is true for any new field. But a lot of the predictions about AI in terms of capabilities, in terms of what the safety challenges and the easy parts are going to be, have turned out to be wrong. The only way I know how to solve a problem like this is iterating our way through it, learning early and limiting the number of "one-shot-to-get-it-right scenarios" that we have.
I think it's got to be this very tight feedback loop. I think the theory does play a real role, of course, but continuing to learn what we learn from how the technology trajectory goes is quite important. I think now is a very good time, and we're trying to figure out how to do this, to significantly ramp up technical alignment work. I think we have new tools, we have new understanding. And there's a lot of work that's important to do. That we can do now.
On whether he's afraid:
I think it's weird when people think it's, like, a big dunk that I say I'm a little bit afraid. And I think it'd be crazy not to be a little bit afraid. And I empathize with people who are a lot afraid.
The current worries that I have are that there are going to be disinformation problems or economic shocks, or something else, but at a level far beyond anything we're prepared for. And that doesn't require super intelligence, that doesn't require a super deep alignment problem in the machine waking up and trying to deceive us. And I don't think it gets enough attention. It's starting to get more, I guess.
Like, how would we know if on Twitter we were mostly having language models direct whatever is flowing through that hive mind? And as on Twitter, so everywhere else, eventually. My statement is we wouldn't, and that's a real danger.
On what the solutions might be:
I think there's a lot of things you can try. But at this point, it is a certainty: there are soon going to be a lot of capable open-source LLMs with very few to none, no safety controls on them. And so you can try with regulatory approaches, you can try with using more powerful AIs to detect this stuff happening. I'd like us to start trying a lot of things very soon.
We can't control what other people are going to do. We can try to like build something and talk about it and influence others, and provide value and, you know, good systems for the world. But they're going to do what they're going to do. I think right now, there's like, extremely fast and not super deliberate motion inside of some of these companies. But already, I think, as they see the rate of progress, people are grappling with what's at stake here. And I think the better angels are going to win out.
The incentives of capitalism to create and capture unlimited value, I'm a little afraid of. But again, no, I think no one wants to destroy the world. No one wakes up saying like, "today, I want to destroy the world." So we've got the Moloch problem. On the other hand, we've got people who are very aware of that. And I think a lot of healthy conversation about how can we collaborate to minimize some of these very scary downsides?
I think you want decisions about this technology, and certainly decisions about who is running this technology to become increasingly democratic over time. We haven't figured out quite how to do this. But part of the reason for deploying like this is to get the world to have time to adapt, and to reflect and to think about this, to pass regulation, for institutions to come up with new norms for that, people working out together. Like that is a huge part of why we deploy. Even though many of the AI safety people think it's really bad, even they acknowledge that this is of some benefit.
On whether OpenAI is being open enough about GPT:
It's closed in some sense, but we give more access to it than, like... If this had just been Google's game, I feel it's very unlikely that anyone would have put this API out. There's PR risk with it. I get personal threats because of it all the time. I think most companies wouldn't have done this. So maybe we didn't go as open as people wanted. But like, we've distributed it pretty broadly.
I think there's going to be many AGIs in the world. So we don't have to like out-compete everyone. We're going to contribute one, and other people are going to contribute some, I think multiple AGIs in the world with some differences in how they're built and what they do and what they're focused on – I think that's good. We have a very unusual structure. So we don't have this incentive to capture unlimited value. I worry about the people who do but you know, hopefully, it's all gonna work out.
I think people at OpenAI feel the weight of responsibility of what we're doing. It would be nice if like, you know, journalists were nicer to us and Twitter trolls gave us more benefit of the doubt. But I think we have a lot of resolve in what we're doing and why, and the importance of it. But I really would love – and I ask this of a lot of people, not just if cameras are rolling – like, any feedback you've got for how we can be doing better. We're in uncharted waters here. Talking to smart people is how we figure out what to do better.
How do you think we're doing? Like honest, how do you think we're doing so far? Do you think we're making things better or worse? What can we do better? Do you think we should open-source GPT-4?
While no single quote makes it crystal clear, here's what I believe Altman is suggesting: GPT-4 is capable and impressive enough that, if unleashed without safety protocols and given free rein to do whatever it's told, it's likely to result in some seriously shocking consequences. Enough to stop the world in its tracks and spur rapid and widespread action, but since this is still embryonic and crude tech compared to what's coming, it's probably not yet powerful enough to wipe out civilization.
I believe – and I may be wrong – that Altman is asking whether his company has a responsibility to let GPT-4 off the chain right now as a shock-and-awe demonstration of its power, a Hiroshima/Nagasaki moment that the world simply can't ignore and keep going about its business. OpenAI can't control how anyone else is building their AIs, but maybe by allowing, or even encouraging, a bit of chaos and destruction, the company might be able to force the world to take action before subsequent GPTs and other AIs launch that truly do have the power to end us.
If that's what he's asking, then first of all: good grief. Such a decision could put him up there with some of the best-intentioned supervillains in all of fiction – or it could genuinely give the world a badly-needed early jolt – or it could prove a woefully inadequate gesture made too late. Or heck, it could backfire as a gesture by not really doing anything all that bad, and in doing so, might lull people further into a false sense of security.
Two and a half hours is a decent whack of time out of anyone's schedule, but given the nature of what's being discussed here, I wholeheartedly recommend you take the time to check out Lex's interview to get a sense of who Altman is, and what he's wrestling with. It's complicated.
And both Altman and I would love to hear what your thoughts are in the comments section.
Source: Lex Fridman/OpenAI
Humans make mistakes, tell lies, and sometimes tell lies thinking they're truth. GPT is expected to mirror its training in its outputs. The training is us. So it's absurd to expect GPT to yield anything perfect, or moral, or always correct. We aren't those things, therefore GPT can't be either.
I think what's happening is that working with these networks, these models, is showing us that we're really not all that intelligent, not all that creative, and generally just not all that.
Which is a hard pill for many to swallow.
But I for one welcome our new AI overlords, lol. If you don't want it to shoot things, don't give it a gun. ;-)
In the middle of political conflict with Russia, China, I am curious to see the result of the applications of this technology.
("Get out the popcorn to admire the show")
When Altman says, "one of the many lessons to take away from the Silicon Valley Bank collapse is, how fast and how much the world changes, and how little I think our experts, leaders, business leaders, regulators, whatever, understand it," he's right. He gave the example of the SVB collapse, but there are many other examples like that. A commentator in the Washington Post said, "Congress doesn't have more than a handful of actual elected leaders who understand any technical issues at all. This is beyond them. Facts, science itself, are beyond them. A few years ago, Mark Zuckerberg was in front of Congress, and the questions, assumptions, the bizarre speeches, and general comments from Congress members were a carnival of fools, unprepared and unaware dolts. They knew so little, they didn't know what questions to ask, and didn't fully understand many of Zuckerberg's answers."
When Altman says, "I am nervous about the speed with which this changes and the speed with which our institutions can adapt", I think he should be MORE worried than he is. As another commentator said, "we are no longer a serious country capable of serious debate about serious issues. The main reason for that is that the Republican Party is riddled with people who simply have not done nor mastered the 'homework' necessary to uphold their side of that conversation."
So, no, I don't have any hope that elected leaders, business leaders, regulators will respond to the challenge of AGI effectively or quickly enough to put guardrails in place to avoid major harm.
Where do we go from here? I was quite surprised (and happy) to hear Altman asking for "feedback for how we can be doing better". I would like to point out that the National Institute of Standards and Technology (NIST) released version 1.0 of its AI Risk Management Framework on January 26, 2023 (https://nvlpubs.nist.gov/nistpubs/ai/NIST.AI.100-1.pdf). The AI RMF is "intended to be practical, to adapt to the AI landscape as AI technologies continue to develop, and to be operationalized by organizations in varying degrees and capacities so society can benefit from AI while also being protected from its potential harms."
So that's a place to start. If Altman is serious about seeking feedback on the development of ChatGPT, the assessments from applying the NIST framework should be made public.
For a set of high level principles to serve as guidance for developers, I'd refer to the Asilomar Principles - https://futureoflife.org/principles/principled-ai-discussion-asilomar/. I was hoping Altman would mention them.
"The 2017 Asilomar Conference took place against a backdrop of growing interest from wider society in the potential of artificial intelligence (AI), and a sense that those playing a part in its development have a responsibility and opportunity to shape it for the best. The final list of 23 “Asilomar Principles” have since become one of the most influential sets of governance principles, and serve to guide our work on AI. "
I'd also like to reference the European Union's AI Act (https://artificialintelligenceact.eu) for awareness, not because I think current elected leaders in the US or Canada will introduce something similar. We, as individuals, should be aware of how AI systems should be governed and demand similar treatment. The proposed EU AI law assigns applications of AI to three risk categories. First, applications and systems that create an unacceptable risk, such as government-run social scoring of the type used in China, are banned. Second, high-risk applications, such as a CV-scanning tool that ranks job applicants, are subject to specific legal requirements. Lastly, applications not explicitly banned or listed as high-risk are largely left unregulated.
Hope this helps.
Sam Altman, OpenAI CEO
Sam, the road to hell is paved with good intentions, and arrogance.
As far as letting GPT-4 loose to show how good it is, I think that step needs to be taken with care and a very good sense of humour! It can still miss the point of a question or request or give totally incorrect or even irrelevant answers. That Hiroshima event could very easily be a Hiroshima-size facepalm moment!
Just keep on improving things day by day, help people to understand the benefits and limitations better and let’s just see how far we can get with this.