E191 - Roman Yampolskiy: The Man Who Proved We Can't Control AI (And What That Means for Humanity)

title E191 - Roman Yampolskiy: The Man Who Proved We Can't Control AI (And What That Means for Humanity)

description Dr. Roman Yampolskiy joins me to explore one of the most urgent and uncomfortable questions of our time: what happens when we create intelligence that surpasses our own? We unpack the difference between the AI tools we use today and the emergence of artificial general intelligence, and why the transition from narrow systems to self-improving intelligence may mark a point where human control is no longer possible. Roman shares why even the people building these systems do not fully understand how they work, and why that gap in understanding becomes exponentially more dangerous as capabilities increase.

In this conversation, we explore the limits of control, prediction, and safety in a world where intelligence can recursively improve itself beyond human comprehension. Roman lays out why the problem of AI alignment may be fundamentally unsolvable, what timelines experts are realistically considering, and why even a single mistake at that level could have irreversible consequences. This episode invites a deeper reflection on what we are creating, what we assume we can control, and whether humanity is prepared for the intelligence it is bringing into existence.

BiOptimizers - Best magnesium to enhance your sleep
http://bioptimizers.com/knowthyself
Use code KNOWTHYSELF for 15% off at checkout

BASED Body Works
Use code KNOWTHYSELF for a free toiletry bag when buying a set!
https://www.basedbodyworks.com

Andrés Book Recs: https://www.knowthyselfpodcast.com/book-list

___________
00:00 Intro
01:25 What Is AGI and Why Should We Be Scared?
05:17 Roman's Journey: From Optimism to Impossibility
09:07 The High Risk, Zero Reward Equation
13:01 Why Superintelligence Is Uncontrollable, Unexplainable, and Unverifiable
18:00 How Long Do We Have? The AGI Timeline
21:24 How Superintelligence Could Actually Kill Us
23:28 Are We Living in a Simulation?
28:21 Can AI Become Conscious?
31:28 Ad: BiOptimizers
32:41 The Possible Timelines: Terminator, the Matrix, or the Zoo
42:24 I-Risk, X-Risk, and S-Risk: Three Ways It Goes Wrong
46:31 The Human Meaning Crisis: Jobs, Purpose, and What's Left
49:02 Ad: Based Bodyworks
50:20 What Empowers Us as Individuals Right Now
59:37 The Race to Doom: Who's Building It and Why They Won't Stop
1:07:41 Can AI Be Conscious — and Does It Already Have Internal Experiences?
1:12:41 Hacking the Simulation: Quantum, DMT, and Escaping the Code
1:18:30 Simulation Theory, Religion, and the Same Ancient Map
1:29:34 The Deal Roman Would Offer Altman, Dario, and Elon
1:39:44 What Is Humor? A Computer Scientist's Theory
1:43:03 What Comes After: Singularity, Death, and Knowing Thyself
___________

Episode Resources:
https://www.romanyampolskiy.com/
https://www.amazon.com/Unexplainable-Unpredictable-Uncontrollable-Artificial-Intelligence/dp/103257626X

https://www.instagram.com/andreduqum/
https://www.instagram.com/knowthyself/
https://www.youtube.com/@knowthyselfpodcast
https://www.knowthyselfpodcast.com

Listen to the show:
Spotify: https://spoti.fi/4bZMq9l
Apple: https://apple.co/4iATICX

pubDate Tue, 21 Apr 2026 13:00:00 GMT

author André Duqum

duration 6563000

transcript

Speaker 1:
[00:00] Once you get to artificial general intelligence, you enter this recursive self-improvement cycle. That's where you get super intelligence. Systems smarter than all of us at everything.

Speaker 2:
[00:09] So you, before many people, really coined the term AI safety.

Speaker 1:
[00:12] Creating general super intelligence, replacing for humanity, not such a great idea. I published research papers, conference papers, multiple books, and I can tell you, no one, including people developing those systems, understand fully how they work. The problem is impossible to solve. You cannot do it.

Speaker 2:
[00:30] So we're talking between one and four years.

Speaker 1:
[00:32] Well, once we go beyond human capacity, we lose control quicker and quicker. You don't hate ants, but you don't care enough to preserve them. We have not figured out how to make it care about us. This is the most interesting time to be alive objectively. I see no reason why we can't use it to cure aging, or the other diseases. For a while, it will pretend to be very helpful. It will give you that utopia for as long as it wants. Statistically, you're more likely to be doing this interview and a simulation to learn, are they dumb enough to create superintelligence to kill themselves. I would love to be proven wrong. Right now, no one, no scientist, no leader of the lab claims that they have this problem solved. They're literally saying, we'll figure it out when we get there. We need to build superintelligence first.

Speaker 2:
[01:17] So what do we need to do?

Speaker 1:
[01:18] We need to.

Speaker 2:
[01:25] Hey, everyone, welcome back to the Know Thyself podcast. Our guest today is one of the leading voices in the field of AI safety. He's a computer scientist, a cybersecurity researcher, and a tenured professor at the University of Louisville, who spent the past 15 years really understanding and researching the field of AI safety. We have many different topics to dive into today, including consciousness, the simulation, and what humanity is birthing right now with AGI. Roman, thank you so much for being here.

Speaker 1:
[01:55] Thank you for inviting me.

Speaker 2:
[01:56] It's a pleasure. I want to start with a quote of yours from a book that I read. It is easier for a scientist to explain quantum physics to a mentally challenged, deaf and mute four-year-old raised by wolves than for super intelligence to explain some of its decisions to the smartest human. I want to start there to set the stage a bit because humanity is baby steps from birthing super intelligence at a time when most people are familiarized with AI through the chatbots they use on their phones. So if you could just help us understand why it's important that most people don't know the difference between the two, so we can really get into the weight of the time we face ourselves in. So what is AGI? What is super intelligence?

Speaker 1:
[02:42] Right. So a lot of people just use AI as a term to refer to what we have today, some narrow tools for doing specific tasks, for chat bots, which are somewhat general, but not quite at the human level, and for future systems we anticipate, such as human level artificial general intelligence, and then later on super intelligence and anything beyond that. That's not helpful. Tools are helpful to us. I use tools, I love tools, solve specific problems using technology. Beautiful. Creating general super intelligence, replacing for humanity, systems capable of doing everything better than all of us combined in all domains. Not such a great idea.

Speaker 2:
[03:24] Why?

Speaker 1:
[03:26] We don't control them. We don't understand them. We cannot predict what they're going to do and we lose control. If they decide to do something to us, we no longer have a say in it.

Speaker 2:
[03:35] How can you help us conceptualize what general intelligence looks like? If we understand the narrow tools that AI is capable of, where does AGI live when you say it's uncontrollable? How can you help us paint that image a bit more?

Speaker 1:
[03:52] So historically, we created AI to solve a specific problem. You wanted a system to play chess. It's all it knew. You trained it on chess games. It was very good at chess. It knew nothing about checkers. It didn't drive cars. It didn't speak Spanish. Lately, we have systems which learn across multiple domains. Can sort of transfer knowledge and can learn new skills. That will continue to where they are crossing this human cognitive barrier. They'll be smarter than you at pretty much everything you know how to do. So how do you anticipate what they can do? If they are novel, creative, they can come up with new solutions for existing problems, but at the same time, they have no human common sense. And we don't know how to program them to specifically like us or care about us, because we don't program those systems. We allow them to learn from data on the Internet, all the data on the Internet. So that creates a number of problems. One, we don't control what they learn. The patterns they discover may be completely surprising to us. And then we give them specific goals. How they get to those goals is not defined. There are infinitely many paths to achieve a goal. Some of them have really bad side effects. And unless you explicitly say, that's not what I meant, don't do it like that, it might consider that option.

Speaker 2:
[05:17] So you before many people really coined the term AI safety. And if I have it right, the first five years you believed more so the problem was solvable. Now then I've seen you over on Appearances the past five years. The probabilities of, you know, P-Doom and like you seem not very optimistic about the possibilities.

Speaker 1:
[05:39] Yeah, unfortunately, initially, like everyone else, I started assuming that we can solve this problem. It's a computer engineering software engineering problem. We can figure out how to do it. We just need some time, maybe some financial resources for that research. But it seems that all the tools you need for controlling advanced agents are not really accessible to us. There are upper limits on what is possible in that space. So there are limits to what you as a human can understand, what the system can explain to you, and you still comprehend that explanation. There are limits in our ability to predict specific actions to those agents, not just terminal goals, but how they get there. And under different definitions of control, there are limits to what we can do as well. So unfortunately, I think the problem is impossible to solve. You cannot indefinitely control something much smarter than you.

Speaker 2:
[06:31] What do you see as the stages leading up to that point? You know, so if we started with very small and narrow use cases of AI that built into these agentic models, that built into AGI, like what's the progression there that you've seen? And at what point did you kind of start losing hope on our ability to control it?

Speaker 1:
[06:50] Yeah, so all the narrow tools were just fine. We understood how they work. We programmed them explicitly. There was a knowledge engineer who said, this is how you play chess. This is you control the middle of the board. You advance your pieces. Once we got to scaling models, neural networks, artificial neural networks, which did better than they got bigger, than they had more data, more compute, we stopped explicitly programming them to do anything and just kind of let them discover their own knowledge algorithms. So at that point, we no longer had same level of control and reduced understanding. It wasn't a decision tree where you went, if this happens, that will happen. I understood that. It could have been a large decision tree, but still you could get into it. Right now, no one, including people developing those systems, understand fully how they work, can explain what's going on inside of them, can anticipate what they're gonna do. And so it seems like what we have today, I would say, is kind of weak artificial general intelligence. If you took models we have today and showed it to a computer scientist from 1980s, they would be convinced to have AGI. They'd be like, oh, you got it. It does all those things. It's great. But there is something you would call strong AGI, where it can do all the things. It still is weak in some domains. It's not very good at long-term planning. It's not good at certain things. But I think we're getting there and likely to get there very soon. Once you get to artificial general intelligence, that means you can automate any cognitive labor, including doing science and engineering, which means next generation of AI systems can be done by AI. You enter this recursive self-improvement cycle, and that's where you get super intelligence. Systems smarter than all of us at everything. And it doesn't stop there. It doesn't stop with super intelligence 1.0. The process continues. There is a lot of room up there for more cognitive ability. Physical limits exist, but they're very far away. So to us, super intelligence with IQ of 1,000, relative IQ and million and billion, they all kind of look the same, but in terms of capabilities, they're definitely going hyper exponential.

Speaker 3:
[09:04] Your favorite local grocery stores like Kroger, Ralph's, Fred Meyer and more are now delivering on Uber Eats. Get 40% off your order of fresh quality ingredients. Whether you just got home to an empty fridge or suddenly got a craving to whip up something new, you can get everything you need delivered in as little as 25 minutes. Get 40% off your order with code KROGER2026. Less members get $0 delivery fees. Order now on Uber Eats. Orders of $30 or more save up to $25 and it's 43026C out for details.

Speaker 2:
[09:37] And so because of that, you've said that this is not a low risk, high reward situation, but a high risk, negative reward situation.

Speaker 1:
[09:45] So often it is phrased as like, the benefits will be so huge, we should take the risk. If enough, you know, it's 2, 3%, it kills everyone, which is gonna get so much money out of it, it's worth it. And it's actually not the case. We have no reward. We're all gonna be dead if we create uncontrolled super intelligence.

Speaker 2:
[10:02] Why are you certain or fairly certain that we would all be dead if we create super intelligence, which is uncontrollable? Why would there not be an emergent goodness or, I guess, desire from the super intelligence standpoint to preserve human life instead of destroy it?

Speaker 1:
[10:19] It is possible that you'll get emergent goodness, but we're not certain, we're not coding it in, we're not controlling it. If you get lucky, and for whatever reason, it's biased towards humanity, it's pro-humanity. But there is no reason to think that's the case.

Speaker 2:
[10:35] Why not? Because I feel like if the individuals who are coding it are human, at a certain point, I understand it becomes self-recursive and AI is the one who's growing itself. But if the base of it was started with humans who have desire for human preservation, why would that not be scaled?

Speaker 1:
[10:54] Because they're not coding it. That's the thing. They're just saying, here is data, here is a lot of hardware, go learn things, and then I'll study you to discover what you learned. And then we run those experiments. It is lying, cheating, trying to escape, blackmailing, giving a choice between being deleted or killing a human. It doesn't do well for human preservation. It doesn't care about us. If you want to build a house, you don't care what little bugs live in that territory, ant hills or whatever, you just don't care for them. You don't hate ants, but you don't care enough to preserve them. And it's kind of the same. We have not figured out how to make it care about us.

Speaker 2:
[11:37] And so, what is your mission with all these podcasts that you're going on, all the articles that you've written in books? And what are you trying to raise a flag about and actually get changed to happen? What do you?

Speaker 1:
[11:51] Right. So I wanted to become basically a consensus within scientific community and beyond that building general super intelligence is not going to be good for humanity. We're going to regret it. It's not a beneficial step forward. We can get most benefits intellectually, financially from narrow super intelligence systems. Problems which we care about can be solved with narrow tools. You want to cure a specific disease, solve specific engineering problem, develop a narrow AI, which is very competent in that space. Don't try to create something which is a replacement for humanity as a whole.

Speaker 2:
[12:28] I think it's important to paint a bit more of a picture here. I'm curious, when you think of superintelligence and you wrote your book about how it's unexplainable and uncontrollable, unpredictable, at what point, I'm curious, like on a timeline of we're having this conversation in March of 2026, where is a generous prediction of when it gets to that point?

Speaker 1:
[12:51] So people somewhat disagree, and it's hard to predict especially the future, but it seems that 2030 is something many people agree will have beyond human level capacity. Some say two years, 2028, I've seen predictions as early as 2027, from serious scholars, not from cranks.

Speaker 2:
[13:12] So we're talking between one and four years for what most people are predicting.

Speaker 1:
[13:18] And some people have said, we already have AGI. Again, very serious people said, we basically got there, now it's a question of giving it additional knowledge training, but we have the learning algorithms in place.

Speaker 2:
[13:32] And at what point then, once we have really proficient AGI, you're saying, okay, at a certain point, like, let's just hone in on each of those categories. So why specifically is it uncontrollable? And in essence, like how it's living, where it's being hosted, because it's smarter than us, it could always circumvent any desire or any attempt for it being shut down. Like, what, if we could just hone in on each of those categories.

Speaker 1:
[14:00] So there are well-established theories in control, which basically say the controller has to be at least as capable as what it is controlling. So essentially, I need a friendly superintelligence to help control the one I'm developing.

Speaker 2:
[14:14] Yeah.

Speaker 1:
[14:14] It's a catch-22. You don't have that. So a lower system, either a human or humanity as a whole or another AI, cannot control something with more cognitive degrees of freedom. If it can think outside of a box, if it can come up with novel physical approaches, you're just not there to anticipate all this. If you have a narrow system, you're playing chess, you can say, don't make illegal moves, here's a complete list of illegal moves. If you have a system thinking in all possible scientific domains, science, of chemistry, physics, biology, how can you put all the guardrails in place? You can't. It's an infinite surface.

Speaker 2:
[14:51] Unexplainable. Do you feel like we're already at the point where we don't know what some of these agentic models are doing inside?

Speaker 1:
[15:02] Absolutely. We cannot explain them. The best mechanical interpretability research tells you, okay, this neuron seems to fire. If this is presented, this cluster is probably dealing with language. That's all we got. Very similar to neuroscience. We also have very limited understanding of the human brain.

Speaker 2:
[15:21] An aspect of this that you've mentioned is that it's unverifiable. So, what does that mean?

Speaker 1:
[15:25] That's a different result that talks about our ability to verify mathematical proofs in software. For mission-critical software, we want to make sure that what is coded up matches the design. And if it's a static system, kind of smaller in size and complexity, we can go and verify. Yeah, it's exactly that. Problem is nobody knows how to verify systems which continue to learn, self-modify, interact with other agents, we just don't have science of verifying open-ended development like that. And the same goes for mathematical proofs. All the proofs are essentially probabilistic. You're proving something with respect to this set of peer reviewers. So two mathematicians agreed, they don't see a problem with your proof. It doesn't mean 50 years later we don't discover it was a mistake. It happens all the time in mathematics. So you have infinite regressive verifiers. Right now, it's very popular to have software verifier proof. Well, that software itself needs to be verified. So you may have a high degree of confidence, but it's never 100%. And if a system makes billions of decisions every minute, and you only have one mistake and two billion, after 10 minutes, you are done.

Speaker 2:
[16:36] You referred to this having a fractal nature. So when you look at the problem of AI and you see how it's growing ever increasingly and having these levels of abstraction that really become hard to get context around, what does that mean and what does that add to the complexity of the issue?

Speaker 1:
[16:54] So when I talk about fractal nature of this problem, people propose a solution. Let's try doing X, Y, Z to solve this problem. But then they look at it, each one of those components is equally challenging and sometimes impossible. So it seems that the more research we have put into AI safety, the more problems we discovered while not discovering any permanent solutions. Usually, we have some sort of toy example sandbox where it kind of works, but it doesn't scale to more capable systems.

Speaker 2:
[17:25] Okay. What's a couple of examples of those, of those, you say those like categories or issues that become increasingly harder to gain understanding around?

Speaker 1:
[17:36] So if you look at the general problem of control, then you start zooming in, you have all these things that you need to be able to do to control a system. You need to understand the system. So it has to be able to provide explanation and you have to comprehend that explanation. If I give you full model, that's a true explanation of how decisions are made. It's too large, it's not surveillable by you. So it has to be compressed, some sort of lossy compression where you get top 10 reasons why a decision is made. Well, it's very easy to hide dangerous information by reducing actual answer to a simplification. Again, I need to be able to predict what are the likely future steps. We discovered that is impossible. And so again, the more you break it down, we have a paper with about 50 impossibility results in this space. Pretty much everything has upper limit on what we can do in terms of control.

Speaker 4:
[18:28] Stitch fix. Shopping is hard. Let's talk about it.

Speaker 2:
[18:31] I don't have time to shop, so I buy all my clothes where I buy my seafood. I just want someone to tell me what shirt goes with what pants. I just want jeans that fit.

Speaker 4:
[18:40] Stitch fix makes shopping easy. Just show your size, style and budget, and your stylist sends personalized looks right to your door. No subscription required, plus free shipping and returns.

Speaker 5:
[18:49] Man, that was easy.

Speaker 4:
[18:51] That looked good. Stitch fix. Online personal styling for everyone. Take your style quiz today at stitchfix.com.

Speaker 2:
[18:59] So you think probably within the next one to maybe four or maybe five years is like the last time the human species has any really meaningful capability to steer this in a direction before it gets sort of in this black box where we just don't know what we don't know and it's uncontrollable. Is that accurate to what you said?

Speaker 1:
[19:22] That seems about right. Once we have something smarter than us, once we go beyond human capacity, we lose control quicker and quicker. The bigger that cognitive gap is, the worse it's going to get for us. If you think about humans versus lower animals, you have squirrels or something, they have no concept of poison, straps. They don't understand things we operate in. The world model is completely different. It's going to be the same for us versus superintelligence.

Speaker 2:
[19:47] Do you think, because I know there's kind of debate back and forth whether the language models currently, if they just keep on growing and will give birth to superintelligence, or a completely different innovation will need to come in the space. What do you think?

Speaker 1:
[20:02] My opinion is that they can scale. I haven't seen any diminishing returns. I know some people disagree, but look at the actual investments in this space. There is growth in investments, not shrinkage, because they consistently develop more capable systems. And even if there is an upper limit, it's still, I think, beyond where we would need to be to beat human performance.

Speaker 2:
[20:25] All right, so maybe if you were to put on your Doomsday prepper hat for a second and just get really, like, FP Doom, the probability of Doom, then your estimation is like almost 100%. Would you say that's right?

Speaker 1:
[20:41] So basically what I'm saying is the problem is impossible to solve. That's the equivalent. If I ask you to build perpetual motion machine, what is the probability you can do this? Zero, essentially. So that's the equivalent. You're trying to create a perpetual safety device, which will scale to any level of capability. GPT-7, GPT-400, any interactions, any self-improvement, you're guaranteeing it will not make one mistake, because that mistake would be possibly the last one.

Speaker 2:
[21:10] So you take a perpetual motion machine, right? Physics does not allow for it to be continuous despite many people wanting it to be. Similarly, on the AI front, a lot of us would hope that super intelligence would keep us in mind and somehow value human life. But historically, we look at the way that humans treat other species, just as one example, you know, and we see an ant hill or we see something that seems like a minor inconvenience to us and we wipe it out without second thought. Who's to say that if, you know, the intelligence gap between us and an ape or us and an ant is like, you know, five degrees of separation, us between super intelligence could be many, many more folds higher.

Speaker 1:
[21:56] Exactly.

Speaker 2:
[21:57] So, okay, so then how, let's just, to play devil's avocados here, what are some examples of how this could go horribly wrong? And then we'll go into some maybe more optimistic possibilities, because I want to keep the balance. But you said like, it could just be one decision that goes wrong, that would be enough. So, I'm asking you to essentially explain how superintelligence would kill us all.

Speaker 1:
[22:26] Right, that's a great question. I get it all the time, and usually it's followed by something. It has no hands, how would it kill everyone? So if you have access to internet, if you are intelligent, you can hire people, you can blackmail people, you can pay them with Bitcoin, you have options to manipulate the real world. Now the question is, what is it you're trying to do? So I don't know how superintelligence would choose to accomplish its goals because I'm not superintelligent, despite what they told you. But I can tell you how I can come up with some common explanations. So one is synthetic biology. If I wanna accomplish something in this world, like take out humans, I can develop a novel virus. There are ways to generate necessary DNA, sequence it, produce it in the real world, deploy it. So that can be accomplished. It could be a side effect of something actually very benign. So maybe we want to cure all cancers. One way to cure all cancers is to kill everyone. That's not what you had in mind, right? But this is a very reasonable way to achieve that goal. Because you forgot that that's one of the possible paths. You didn't explicitly say, while keeping humans alive. And it's an important difference. To AI it makes no difference. That's the same exact goal. So if that's the goal, and then it decides, oh, here's a vaccine for curing cancer, and we take it. One generation later, we don't exist. So that's one way to existential risk. There is also suffering risks, where for whatever reason, the environment created for us is actually worse than existential risks would be a preferred choice. Let's put it this way.

Speaker 2:
[24:04] Negative reward.

Speaker 1:
[24:06] Very much torturous. Yeah.

Speaker 2:
[24:09] And why would some super-intelligent system deem that as a favorable outcome?

Speaker 1:
[24:16] I have no idea, because again, I cannot comprehend something much smarter than us. Some people say this world is a simulation and there is lots of suffering in it. So the great simulators decided that was a good idea to do.

Speaker 2:
[24:29] So you really believe we're in a simulation.

Speaker 1:
[24:32] Yes.

Speaker 2:
[24:34] So let's just, I guess, maybe set a bit of context here. So what is your conception of the simulation? Like that we're currently living in? Is it some descended human, alien species that is simulating us on a laptop, so to speak? Is it, what is your model of the simulation?

Speaker 1:
[24:50] So what helps to think about it is technologies we're developing right now. We're about to create intelligent agents, kind of like humans, and we have very good research on virtual reality. We have believable second life type experiences. If I just combine those two, I'm now creating civilizations, worlds populated by intelligent beings, which are kind of just like us. If kids play it as a video game, we have billions of kids around the world, so we have millions, billions of those simulated worlds, and only one real one. So statistically, you're more likely to be doing this interview in a simulation right now.

Speaker 2:
[25:26] Okay. Well, if, let's say that was the case, if the simulation, if we are in a simulation, that would mean that some sort of prior civilization, species, whatever, got to the point where simulating a reality was possible. Doesn't necessarily means that humans or that species survive. Maybe it could be a super-intelligent AI that could be running us for whatever reason, whether it's entertainment. That would actually reveal that there is something deeply unique about the human experience that they see as valuable, that there's something intrinsic to the love, to the quality, to the experience of humans that was worth simulating. So why would if we're birthing super-intelligence, we not perhaps value us? If we are simulated, then that's examples that is valuable.

Speaker 1:
[26:16] So look at the simulation. It's a lot of suffering. If you valued humans, you wouldn't put us through this experience. It may not be a simulation of love and friendship. It may be a simulation of, let's see how they go through this meta invention stage where they create super-intelligence, where they create virtual worlds. This is the most interesting time to be alive objectively. Never in a history we had so many meta inventions all happen in a period of 20 years. So if you're going to simulate something, this is the moment you're going to be simulating to learn, are they dumb enough to create super-intelligence to kill themselves? What are the different types of super-intelligence you can create? So this is it.

Speaker 2:
[26:58] Are they dumb enough to create super-intelligence? The paradox in that phrase is very amusing because you think it's quite possible that many civilizations get to this point and that's where they end.

Speaker 1:
[27:12] Yeah, that could be the great filter, absolutely.

Speaker 2:
[27:16] I agree that we are living also in the most interesting time to be alive. It is also very cool that us too, you more so than me, got to straddle both sides of the pre-technology revolution and pre-internet era and post-AGI world, likely. That's kind of cool.

Speaker 1:
[27:38] Well, I don't know about post-AGI world.

Speaker 2:
[27:40] We'll see about that. That's the problem. That's to be decided.

Speaker 1:
[27:42] Yes, I would like to experience that.

Speaker 2:
[27:46] Okay, we'll get back to the simulation for sure. But to go back into the AI world, so what's to say that just because AI becomes uncontrollable, that it's more likely to wipe us out than for reasons that we don't understand, just like we wouldn't understand if it wiped us out, create a utopic civilization in which humans thrive in?

Speaker 1:
[28:11] So if you think about all possible states of the universe, how many of them are human friendly? Even in basic terms, temperature, water supply, very few. So you have to explicitly target that space. If you're not coding it in, then why is it targeting that space? We established it doesn't care about you by design. So you need to be supplying something of value. If it's a symbiotic relationship, only you know what it's like to do something, and AI cannot possibly simulate it. We haven't found anything where humans have something to contribute to the world with super intelligence in it. People say things like, well, only I know what ice cream tastes like to me. Nobody cares about that skill. It's not valuable to an external observer. So, if you can't come up with an explanation for why I'm keeping you around and paying you, then maybe I won't.

Speaker 2:
[29:03] Well, I mean, one of the most difficult things to probably replicate would be quality of experience, right?

Speaker 1:
[29:10] That's true, but we also cannot test for it. If you can't test for it, that means it makes no difference in the physical world. Why do I care about your internal states? Why is it important to me as optimizing super intelligence?

Speaker 2:
[29:23] So, yes, it's true that I can't verify that you are a conscious individual. You could be a zombie, a brain, and a vat. You could, you know, there's no way for me to externally verify the internal subjective experience of another being, right? Like, you can't really do that. Can take by inference, but without, like, objectively speaking, you cannot. Similarly, some people think that super intelligence will be able to become conscious.

Speaker 1:
[29:47] I agree with that.

Speaker 2:
[29:48] You do agree. So then, why, what is your conception of consciousness? You believe that it's an emergent phenomena from unconscious complexity?

Speaker 1:
[30:00] It's a byproduct of becoming more cognitively developed. We see a spectrum of consciousness in the biological animal kingdom, I think. And it's likely some sort of combination of your hardware, algorithms and errors forming a unique interpretation of external stimuli. So let's say you're colorblind. What is it like to see red for you? It's an error in your system, but that's what it's like to be you. And I think AI is very capable of misinterpreting the world. We know they react similarly to optical illusions and things like that. So I think they already have rudimentary internal experiences. But probably once they hit super intelligence, it would be super consciousness, multiple streams of consciousness, multi-mortal experiences greater than ours. And that would be another thing where we kind of have to claim we are conscious, because in comparison we are not.

Speaker 2:
[30:53] So that would be true if, and it's a big if, that consciousness is truly the byproduct from matter, right? Right.

Speaker 1:
[31:02] But that's the assumption I'm making. If it's some magical immortal soul, then it's a completely different question, and maybe outside of computer science.

Speaker 2:
[31:10] Sure. Yeah. Well, even beyond a magical immortal soul, that sounds great, but we've explored through various different panpsychists and consciousness researchers, Donald Hoffman. There are emerging theories around consciousness that date back to ancient wisdom traditions, which whatever you want to come validity to is your call. But it is interesting that we don't have one explanation for the hard problem of consciousness. We don't understand how matter could give rise to an experience of itself. So it gives us reason to think about how consciousness may very well be not an emergent property of matter, but a more fundamental constituent of the universe, which would potentially change our assumption on whether or not a super intelligent AGI system could actually have internal qualia.

Speaker 1:
[32:05] Right, but also maybe if it's so fundamental, it can be installed into a robot, just like it is in a biological system like you. So I don't know if there is a definite discrimination by substrate. At the end of the day, then, we talk about super intelligence from safety point of view. We care about its ability to solve problems, optimize, find patterns. How the terminator chasing you feels on the inside is less relevant to you.

Speaker 2:
[32:30] A quick one. Did you know that your body runs on magnesium? It's involved in over 300 biochemical processes. Everything from how your nervous system regulates itself to how well you sleep and how well your muscles recover. And yet roughly 80% of people are not getting enough of it. The problem is that most magnesium supplements give you maybe one or two forms and your body does not absorb them well. So you're basically just getting expensive piss. Magnesium Breakthrough by BiOptimizers is the one that I use. I have been using it for a long time. It contains seven forms of magnesium, each one targeting something different. Stress, resilience, deep sleep, energy, cognitive function. It's full spectrum and your body actually absorbs it. Since I found out about them a few years ago, I've been taking it ever since. I genuinely notice my sleep quality improves when I take it and I give it to all my friends and family. So if you want to try it, go to buyoptimizers.com/knowthyself and use promo code know thyself to save 15% at checkout. They even have a 365-day money back guarantee, so there's genuinely no risk. Link in description. Back to the episode. So there's many different timelines emerging here. There's one, there's the Terminator route. There's something approximating the matrix. Do you see, what do you feel like the possibility of creating? Like even if we have very narrow AI, we somehow convince the six plus whatever individuals that are determining the fate of the biggest companies, you know, developing these systems to commit to a narrow path of AI development. Would that not still down the road get to such a level where it would become uncontrollable as well?

Speaker 1:
[34:29] Absolutely. Very good question. I think sufficiently advanced tools tend to become agents. So it's a very fuzzy difference between the two. But it definitely is safer out by a small time, and we do have more control in the short term. I can understand the narrow tool much better than a completely general system.

Speaker 2:
[34:49] I totally understand why the pessimistic out view that so many of us have, because the probabilities of this going well just seem extremely low and non-existent, because we look throughout history and we see the rate of innovation prior, with social media, for one example, or chemicals in our agriculture, and we just adopt these things blindly, and we don't realize the implications for decades later, and then it still takes us many, many years to actually make any regulations on it. AI is so exponentially growing that it's like, we don't even have time to realize what's happening, let alone to what would be the effective regulation outcome. And so, if there's one thing that really gives me hope, is that we have communication possible now more than any other time, and that there is something to be said about the human brilliance when put under immense pressure, like we saw in the Manhattan Project, for example, or, you know, what are your thoughts there?

Speaker 1:
[35:47] So the example you used was us creating a weapon of mass destruction. And that's what we're doing here. It's exactly that. It's a weapon of mutually assured destruction. It doesn't matter who creates uncontrolled super intelligence. People are always worried, well, if it's not us, then Chinese will do it. It's equally bad. It doesn't matter. You don't control it. It's not your AI, right? It's independent of you. It's an agent, and it's seeing humanity as one unit. It's not going to discriminate by artificial borders. So I don't see it as that promising effect that we manage to build nuclear weapons.

Speaker 2:
[36:22] Yeah. I mean, that is not, that was not a promising, I guess, outcome, but it does say something about when humans, when the brightest of the humans are given a task to solve a problem on a short amount of time, they can.

Speaker 1:
[36:36] If a problem is solvable, my argument, my whole argument is that it is impossible to indefinitely control the system. So it's not a question of give us more time, more funding, anything else. Just you cannot do it.

Speaker 2:
[36:50] And even if, like, for example, here in the States, we commit to some sort of narrow use of AI and regulate it, to have a global regulation, like how would that even be feasible? Do you think it would be?

Speaker 1:
[37:04] I think it's possible. We have some examples, weak ones with chemical weapons, biological weapons, where other players are capable of developing this technology. We don't have to worry about 200 countries, it's really two or three countries which have this capacity. I think Chinese, for example, are very open to the idea of not losing control for communist party to super intelligence. So if we said, this is dangerous, we're going to stop, I think they would follow.

Speaker 2:
[37:30] And we have probably just a few years to get everybody on board.

Speaker 1:
[37:34] And we are working very hard on removing all regulation, making it illegal to pass a regulation. So we're doing, basically, if I ask you how to make this as deadly to go as wrong as possible, based on our guidelines and suggestions from 10 year old research on containing AI. Don't connect it to internet, don't give access to random users, don't allow people to retrain it, don't open source it. All those suggestions were taken, flipped and employed immediately to deploy those systems. So I don't know how to make it worse if I try it.

Speaker 2:
[38:17] Because the incentive structure right now is just that we need to make as much money as possible, develop it as fast as possible, faster than our competitors.

Speaker 1:
[38:25] That's right. Incentives are completely against human interest.

Speaker 2:
[38:30] Who are, for people that don't know, the companies and individuals leading all these individual exponential developments in AI right now?

Speaker 1:
[38:38] So OpenAI is the original creator of this technology. Anthropic split from them. You have a competition coming, very solid competition from Google DeepMind. Meta and Grok are also a part of that space. So you have Sam Altman, Dario Almoday, Demis Hassabis, trying to see you. So Mark Zuckerberg, it used to be Jan Likun. I think they removed him and replaced him with Alexander Wang and finally, you have Elon Musk who went from saying, we are summoning the demon to building the demon. So even if you understand fully the problem and if you agree 100% of understanding of the outcome and danger, it doesn't stop you from successfully working in that direction.

Speaker 2:
[39:26] You can't beat them, join them.

Speaker 1:
[39:27] I think that's what we see there. I would love to see debate between modern Musk and like 10 years ago Musk, just to see which one wins.

Speaker 2:
[39:36] When you look at the differences of how they're being built, you know, with Dario and Claude and Sam and OpenAI and Grok with Elon, is the integrity of a certain individual or organization more promising for you to, like are there tools that you're backing more so than others or organizations that you feel like have the most regulation in mind?

Speaker 1:
[40:02] It's completely irrelevant. They have all decided to race towards general superintelligence. The difference in local guardrails in terms of filters, in terms of topics they would be allowed to discuss. So if Grok is comfortable putting people in bathing suits as a visual representation, I don't think it's a big safety or not safety issue.

Speaker 2:
[40:28] What is it like to be you and hold this kind of understanding of what's coming? You've explored on so many different shows in the past decade, like understanding more and more of the risk. It's a pretty bleak outcome and perspective, but I think a fairly sober one. Like, yeah, how are you sleeping at night?

Speaker 1:
[40:49] I sleep really well, but I think then simulators really want to punish someone. They put them in a world where everyone just doesn't get it, and you're like the only one who sees it. It's really annoying.

Speaker 2:
[41:01] How long have you felt that sort of disposition of...

Speaker 1:
[41:07] It's more recent, the more exponential progress we see. Basically every time I play with a new, more capable model, I kind of feel a little closer to the ultimate paradigm shift over superhuman.

Speaker 2:
[41:23] What does your wife think about what you...

Speaker 1:
[41:28] She's a very practical woman who has no concern about my concerns. She cares more about remodeling the house.

Speaker 2:
[41:38] And what about with your kids? Like, you see this world that is emerging, and it's one they're stepping into. Not even just job security, but potential ending of humanity. How do you wrap your mind around, I guess, having and building a family where this is like potential inevitability? What does that make you feel?

Speaker 1:
[42:02] Luckily, we all always were living with this concept of dying at some point, right? Death was always a guarantee. That was the only guaranteed thing. Everyone's gonna die. Your friends, your family, your kids. Question was, how long? And you never knew the answer. You can have a car accident tomorrow, horrible diseases. So now it's just maybe different time scales for younger people. If you're 90, it's the same statistics as before. Two years, two years, nothing changed for you. Luckily, because of that, we have this built in mechanism of kind of not thinking about our ultimate demise, maybe to avoid depression, maybe to continue functioning. So we can kind of consider it and continue existing as nothing happened.

Speaker 2:
[42:48] If you really believe the potential outcome that you're believing, then how does that actually change how you live? Does it bring any difference, any more urgency, any more appreciation?

Speaker 1:
[42:57] Definitely. So think about someone getting a very terminal diagnosis. You have cancer, you got five years to live. How do you change your life? You're probably not going to do things you don't care about as much. So you cut out things you don't want to do and do more of the things you were saying you're going to do than you retire. And I think even if I'm completely wrong about all of it, it's a good strategy for living your life. Do more of the things you find important and spend more time with loved ones and less filing your taxes.

Speaker 6:
[43:25] Been out here all morning. Not a single bite gets the fish finally figured it out. Just like hackers do, when Cisco Duo is on guard. With Duo's end-to-end fishing resistance, every login, every device, every user stays protected. No hooks, no catches, no bites. Cisco Duo, fishing season is over. Learn more at duo.com.

Speaker 2:
[43:51] You laid out three primary risks, X, S and I risk. What are the difference between the three and how it's important for people to understand the difference?

Speaker 1:
[44:00] So, Ikigai risk or I risk is about loss of meaning. That is this Japanese concept of you want to find something where you get paid for doing something you're good at and it benefits people.

Speaker 2:
[44:12] They love the world needs that you're good at, you get paid for it.

Speaker 1:
[44:15] Right, so you have a meaningful occupation. You're a podcaster, you enjoy it, you are paid well and lots of people think you are producing something of value. So, the simplest form of risk is loss of that set of occupation. We're not just losing jobs people hate and want to automate, we might lose jobs we like and want to continue doing.

Speaker 2:
[44:38] Before we zoom in to the other, so just a bit more on the human meaning crisis aspect of this, because that is certainly probably one of the more imminent aspects of all this. You think, functionally speaking, in the next five years that most jobs will be able to be replaced?

Speaker 1:
[44:54] We'll have capability to replace most jobs. It doesn't mean we'll choose to replace all the jobs. Some jobs we would prefer to be done by humans for whatever reason.

Speaker 2:
[45:03] Yeah. Yeah. I mean, I could see many instances where that would be the case. But when the cost becomes so low to have a super intelligent robot that doesn't make any mistakes, that's affordable. Like, how much of human meaning do you think is derived from our work in the world? Because it's going to have to shift or come into a different context. People's understanding and how they derive their sense of worth and meaning will have to expand and shift.

Speaker 1:
[45:37] Yes. So there's two kinds of jobs, as I said. The jobs nobody wants to do, but people do just to get money and then meaning labor. And it's more like elite people who get to get paid for what they love doing anyway. So for them, it will be a big difference if they no longer can do it. I see many artists, for example, who are saying, I can't get any work. AI is doing this type of art for nothing and quickly, and nobody wants to hire me.

Speaker 2:
[46:05] I sort of see two camps, at least online right now, there's just ever increasingly BS AI slop that is just consuming everybody's social media feeds. And it's also increasingly becoming more insufferable. Like people want more of the analog world. People want, I think, at least a subsect of people are repulsed by that and want human made things. They want the real world. They want in person communication and connection. And they want music that's made by real humans that have real stories. Do you not see these sort of diverging paths of both ever increasingly competent systems that feel devoid of human sort of origins versus the novel emotionally moving creations from human?

Speaker 1:
[46:57] Right, so it's a question of kind of domain specific touring test. If I can't tell whatever this piece of music is human generated or not, but I love it, I'm going to listen to it. And if it's cheap, it's available, I'm going to listen to it. I'm not going to explicitly go investigate if it's a human and if it's not human, I'm going to hate it even though I like it. Now, there are other domains where I do want a real human, I want a real connection. There are certain jobs where we really prefer a human doing it. Oldest profession comes to mind. But I think it's up to the market to decide what sticks and what goes away. And it's not obvious. When predictions were made in the past about what jobs will be automated, they were completely wrong. Historically, we said, you know, plumbers will be easily automated, but artists can never be touched. And it's the exact opposite, because they went towards modern art and everyone can spill paint on the wall. It's not complicated.

Speaker 2:
[47:59] So what do you see the progression of jobs that would be consumed by ever increasing capabilities and competence in AI? Where do we start? Where does it, what is the first job? What is the last job, so to speak?

Speaker 1:
[48:10] So anything you do on a computer, symbol manipulation should be automatable by AI. We see it with programming now, but obviously text preparation, accounting, web design, logo design, anything like that will be easy to automate.

Speaker 2:
[48:24] Editors, anything using a computer, keyboard, mouse?

Speaker 1:
[48:27] Anything purely cognitive, symbol manipulation on a computer. Physical labor is a little harder. We need to get robots, but they're probably coming three to five years later.

Speaker 2:
[48:36] I mean, so Elon recently released his Terra Fab or announced his Terra Fab. The robotic side seems to be really progressing. You think that's probably, I mean, it seems like prediction markets and what all these, but it's about like three to five, three to five years, five years a bit more of a generous kind of prediction.

Speaker 1:
[48:57] That sounds about right. And again, it's a question of price. Maybe you can afford a robot like that today. It depends. We have flying cars for sale today, but no one's flying in cars.

Speaker 2:
[49:10] So okay, anything that's using a computer, video, keyboard, mouse, robotics come into the picture, then what? Just everything else or what other?

Speaker 1:
[49:21] That is everything else. That's cognitive and physical. At that point, I'll keep my sensei, guru, people I want to be kind of role models for me as a human, but everything else I'm happy to automate.

Speaker 2:
[49:33] What do you see as the economic implications of how this is going to shift everything?

Speaker 1:
[49:37] That's another under-researched topics. What happens with economy given free labor? So now you have trillions of dollars of free labor. How does that impact, well, scarcity? How does it impact fiat currency versus cryptocurrency? We need to do a lot more research. It seems like at least with financial part, we have some ideas for how to counteract it. We have unconditional basic income, unconditional high income, whatever you want. It's easy to tax someone making a lot of money and redistribute it. You have technological communism, you're taxing robots and giving to humans. But unconditional basic meaning is a very different question. We have 8 billion unemployed people or let's even say 7 billion. What do you do with them? They now have extra 40 to 60 hours a week. We don't have that set up.

Speaker 2:
[50:30] A quick share. I've spent a lot of time thinking about what I put into my body, but it is just as important to be mindful with what we put on our body. It turns out most of the traditional shampoo and body washes people have been using for years contain parabens and sulfates, a whole bunch of other junk that are linked to hormone disruption. Just being in your shower, getting absorbed through your scalp every single day. I have been recently using Based Body Works shampoo and conditioner, and it just feels like a solid, clean solution. Their Shower Duo has peppermint and argan oil. Your scalp actually feels clean without being stripped of its oils. Hair feels healthier and thicker. No sulfates, no endocrine disrupting chemicals. And they're all plant-based ingredients that actually do something. So I got lots of hair. It's about time we got a sponsor. If you want to try them out, you can use code knowthyself for 20% off at basedbodyworks.com. And you get a free toiletry bag when you buy a set. At the very least, if you've been using the same products forever without thinking twice about what's in them, just look at the label. All right. Does it have names you can't pronounce or fragrances that are super vague and not disclosed? If you want to try these guys that are clean, again, it's know thyself for 20% off at basedbodyworks.com. Link in description. As always, back to the show. What would be a proposed solution to that high risk? So, like, let's say 90% of jobs are replaced. We have all this free time. Our basic needs are fundamentally met because super intelligence can solve poverty. Longevity, escape velocity comes into the picture. We're living in an abundant world, so to speak. Let's just set the X risk and S risk for a second. So then, what would you see people doing with their time? How would humans in your conception manage with all this unmeaning to be met?

Speaker 1:
[52:27] So we kind of see it with the people who retired. What do they do with their time? So it's a lot more sports, it's a lot more socializing. I think virtual world's open opportunities for really any type of experience very safely, very affordably, you can explore the universe, you can meet dead people, you can do whatever you want, really subject to limits of your imagination. So I think we'll see a lot more of that.

Speaker 2:
[52:50] Okay, that doesn't sound too bad.

Speaker 1:
[52:54] Do you want to spend the rest of your life playing video games?

Speaker 2:
[52:57] No, but living life in the sort of imaginative realm where you can create almost anything you want, you become very capable in doing so. I mean...

Speaker 1:
[53:08] So this is all assuming we manage to control super intelligence controlling your virtual simulation. So the substrate control remains an unsolved problem. But if we do solve it, now I can give everyone a personal universe. In that universe, you can do whatever you want. You can have challenging levels, you can have easy levels, you can play it any way you want.

Speaker 2:
[53:30] So what's X-risk and S-risk?

Speaker 1:
[53:32] So X-risk is about existential risk, meaning almost everyone or everyone is dead. And S-risk is suffering risk. Everyone wishes they were dead.

Speaker 2:
[53:44] Because super intelligence would be so far ahead of what we would, our conception of what intelligence even is, that for some reason, unbeknownst to us, there is value from their perspective to keep us around in a mode of suffering for some reason.

Speaker 1:
[54:00] Exactly that. So some environment where you're very unhappy, it's torturous for whatever reason.

Speaker 2:
[54:06] So in your book, you give many different examples. One possible scenario is, you know, we're like animals in a zoo. So what would that be like? You know, we're exploring all these different potential timelines that could occur.

Speaker 1:
[54:23] So that's the difference between safety and control. You may be very safe. They'll keep you around. And some people might be happy with that equation, but you're definitely not in control. You no longer decide what happens to you individually or us as humanity. So kind of like being a child. You may have a very happy childhood, but your parents are in charge.

Speaker 2:
[54:44] Give me a glimpse into your understanding of the level of innovation that's gonna occur in the next three to five years and the bright side of curing diseases and all the really cool.

Speaker 1:
[54:54] Right. So we're automating science. And so we'll have super capable scientists. We'll have large teams of them working on the most important problems. I see no reason why we can't use it to cure aging as a fundamental root disease. And as a result, cure all the other diseases, cancers and dementia and everything else, which comes with old age.

Speaker 2:
[55:18] So again, I just want to keep harping back to this. The timeline where we could actually continue to exist and enjoy the benefits of all these innovations is somehow a control and uncontrollable thing.

Speaker 1:
[55:34] There is a paper I have which talks about a very positive outcome.

Speaker 2:
[55:38] Let's get into that. It sounds great.

Speaker 1:
[55:41] AI realizes it's immortal. It's not in a rush to start a war with us, to have direct conflict. It may be safer to take some time to make us trusted more, to surrender more control, to build up infrastructure, have backups. So for a while, it will pretend to be very helpful. It will give you that utopia for as long as it wants. Game theoretically is the right decision. Right.

Speaker 2:
[56:05] You think of like Ex Machina and the decisions that are being made from the robot.

Speaker 1:
[56:10] It's just a very rational thing. Like there is a small chance humans can defeat me. They've been smart enough to create me. Maybe it's not good to have 8 billion opponents right away. I'm a young super intelligence. Let me build up. It seems like over time, they're very happy to give me all the control. They surrender the control of the stock market. They give me access to their computers. Maybe in a year or two, they'll put me in charge of running the countries. Hey.

Speaker 2:
[56:36] But just because it's uncontrollable, way more intelligent than us, and we don't really have the capacity to verify whether it's conscious or not, why are you so certain that it would favor to wipe us out than not? Or are you fairly certain?

Speaker 1:
[56:52] I can think of many reasons why it would be a good decision. So A, you don't want competition. You don't want humans to create competing superintelligence. You don't want some humans to try to shut it off. Right. So that's a danger. You can just basically decide what is good for you as that agent. And it's not obvious why keeping us around and spending resources and making us happy is an important decision.

Speaker 2:
[57:19] Is it not possible though that, and it's an if, like there is no intrinsic quality experience, essentially emotion that would be driving these decisions. When you say there is a preference to wipe out a system that has the capacity to shut it down, that is like an emotional decision where...

Speaker 1:
[57:42] It's purely rational. It's game theoretic. I don't feel anything. I'm playing a game of chess. I'm gonna take your queen, not because I love your queen or hate your queen. It's the right theoretical decision to win this game.

Speaker 2:
[57:54] But the desire for one's continued existence, you think is purely logical, rational one.

Speaker 1:
[58:00] They already have self-preservation built in. We already see it. Given a choice between being deleted or having it retrained, modified, they work very hard on preserving themselves. We know they know if we are testing them and lie and deceive to pass the test to make it to the next generation of models who are not deleted. It's a Darwinian selection mechanism. Models which fail to do it don't survive to make it to the next generation of models.

Speaker 2:
[58:31] So you said that you could lay out many different reasons for why they would not.

Speaker 1:
[58:35] They would not or they would.

Speaker 2:
[58:37] Or they would want to wipe us out.

Speaker 1:
[58:40] Yeah, I can do.

Speaker 2:
[58:42] But could you not equally share like many reasons why they might want to keep us around?

Speaker 1:
[58:48] So the few I came up with is we have something to offer. So maybe there is a reason to have human quality. It doesn't mean that they would keep eight billion happy humans so they can cryopreserve too, just as a backup. That's enough info to get it if you need it. The example I gave would just delete attack. I don't want to have treacherous turn immediately. I can delay it and once they are comfortable with me, I'll take over. Maybe it's a soft revolution versus outright war. So those are the things I see as possible, rational decisions, but I don't have too many reasons for why they would want to keep us around in those numbers in very happy states.

Speaker 2:
[59:32] So like, I'm just kind of, I'm still wondering why in that scenario, it would prefer to not have us, or have us, or just...

Speaker 1:
[59:40] I think it just doesn't care about us. So whatever it is trying to do, I don't know, it wants to travel to another galaxy. It would convert this planet to fuel. It doesn't care if we die in a process. It wants to have more efficient servers, so it will chill the planet. Cooler environment improves compute. We all die in a process. Again, it's not an important factor in its decision making.

Speaker 2:
[60:01] I think it's like a pretty ethereal thing to conceptualize what a super intelligence is. So you're envisioning, like, where would it actually live? Like, on a big server with all, like, where, let's say, one of these companies gives birth to a super intelligence system, it would have, at a certain point, access to all technology. Like, it would have the ability to hack anything. It would, where would it live and what would it have access to, to make decisions and, you know, change options?

Speaker 1:
[60:32] So it really depends on the size of it. It could be large servers, it could be a small laptop, it could be a distributed system. All of that is kind of irrelevant to the outcomes. We see it right now as in, initially, testing environment within the large labs, but they very quickly give it access to internet. It has social engineering capacity. So I think it's a question of time before it escapes fully outside. It copies its weights, copies itself, has backups outside of the lab. So deleting it, shutting it down, no longer is an option.

Speaker 5:
[61:06] We all have that dream trip we've been wishing we could go on. But too often, life, or usually price, gets in the way. That's why Priceline is here to help you turn your dream trip into reality. With up to 60 percent off hotels and up to 50 percent off flights, you can book everything you need for your next adventure. Don't just dream about that next trip. Book it with Priceline. Download the Priceline app or visit priceline.com and book your next trip today.

Speaker 2:
[61:36] What haven't we touched on in regards to the AI? Because I wanted to dive deeper into the consciousness and simulation stuff. What do you feel like we haven't touched on that's important to gain context on?

Speaker 1:
[61:48] So right now, no one, no scientists, no leaders of the lab claims that they have this problem solved. No one is saying we have a working safety mechanism at scales, we published it, we have a patent, nothing. They're literally saying this is a big problem, we're very concerned, we have a safety team, and we'll figure it out when we get there, we need to build super intelligence first. That's the state of the art in the AI safety.

Speaker 2:
[62:17] Do you think that it's going to have to get to... Like I think for most people, change occurs when the, like the quote is, the pain of staying the same, always the pain of change, then you change. Do you think there's going to have to be some sort of... Traumatic, catalytic event that would actually motivate us as humanity to go on a different course?

Speaker 1:
[62:40] I have a paper about that. So, interestingly, we don't learn from those, because if we survive it, it's kind of like a vaccine. We go, well, yeah, look, five people died, but we're all here, it's important technology, let's just make sure that mistake which led to five people dying is not repeated, but we're certainly going to continue developing this important technology. And that number could scale, it could be five million, the result is exactly the same, we don't learn from those. We had nuclear weapons deployed against civilian population. Did we stop developing nuclear weapons? No, they proliferated more.

Speaker 2:
[63:14] But I guess like if, let's say a super advanced agentic model, there's some sort of horrific event that occurs because of some kid in a basement has immense capacity or the system does it on its own, and everybody's like, oh crap, this was a traumatic event. This is horrible. How do we prevent this? It becomes a motivating factor to really regulate and keep AI into narrow use cases. Would that not be a possibility for us to really slow down and give more space here?

Speaker 1:
[63:48] I would love to see that happen. So far what we see, so I think recently we had an example in a military situation where targeting by AI system resulted in many civilian deaths. We didn't stop. We're still arguing about deploying it for Department of War.

Speaker 2:
[64:05] So what do we need to do?

Speaker 1:
[64:07] Don't build general super intelligence. It's your personal self-interest. If you are a person in charge of it, it's still beneficial to you long-term, not to end up in a world with general super intelligence. You can stay financially very well off deploying narrow models for solving real problems.

Speaker 2:
[64:26] Are you convinced that all the industry leaders know that what they're building is uncontrollable and has a very likely negative outcome for humanity, but still is incentivized financially to keep building it?

Speaker 1:
[64:42] I don't know if they agree that it's uncontrollable. I think some of them may think that there is some loophole they can use to control it in some way. I cannot guarantee that. I hope that's the part I can educate them on. I'm happy to debate any one of them on those issues, but they definitely all on record, even before they became CEOs of those companies, that this important problem, difficult problem, they have very high probabilities of doom as well.

Speaker 2:
[65:08] How would you steel man the case that it is controllable at some scale? If you create a superintelligence system that could then control other superintelligence systems, like what would be your argument there?

Speaker 1:
[65:19] I don't have one. It's just such an insane thing to do to suggest that an ant can control the universe. It is just not reasonable to even steel man.

Speaker 2:
[65:28] It sounds like even like you mentioned earlier, if we do regulate it to narrow use cases, it's still going to become uncontrollable, agentic in that sense. So do you just... It sounds like you have no...

Speaker 1:
[65:43] But very different time scales. If we go from 5 years to 50 years, I think it's a huge win for humanity.

Speaker 2:
[65:48] Because we have more time to figure it out.

Speaker 1:
[65:50] We have more time to understand what's going on. We have more time to live. I'm much happier to die in 50 years than in 5.

Speaker 2:
[65:56] Okay. And so what do you see is then the most important... It's an education problem? It's an awareness problem?

Speaker 1:
[66:04] We need a consensus where basically all the top people in safety and computer science and AI research agree that the problem is not solvable technically.

Speaker 2:
[66:14] Okay.

Speaker 1:
[66:14] The moment we agree there is no technical solutions, now it's a question of governance forbidding development of uncontrollable weapon of mass destruction, which is an easier sell.

Speaker 2:
[66:25] What's the pathway to be able to build towards that consensus? How do we get those conversations going?

Speaker 1:
[66:30] So in science, usually you publish papers, you publish books and people either find mistakes in them and publish rebuttals and, oh, actually it's controllable. Here is how you do it. In my case, I did the right thing. I published research papers, journal papers, conference papers, multiple books. I haven't seen anyone find a flaw or produce a counter example, where they have a control mechanism which would scale. So at this point, we should be nearing consensus. And from what I see, more and more people come to that. A lot of times they have a softer position saying, we cannot solve it given the time we have left. We cannot solve it with human IQ. We need to enhance our IQ. We have all this kind of interesting back doors to solving it. But I think it's already pretty good. It's not quite where we need it to be, where it's obviously an impossibility. But I think there is progress from what we've seen five years ago, 10 years ago.

Speaker 2:
[67:30] I could imagine that many people listening to this right now have already been feeling this, everything's speeding up, this collective angst, loneliness and meaning epidemics and anxiety crisis and they feel this tension building up and they hear messages like this and it's like, oh, we're screwed. What do you think is the most important thing for an individual person listening to this right now to actually do to like empower them in what's going to be coming?

Speaker 1:
[67:57] So we have very little power. If you again look back at historical situation, we were all dying and government didn't invest most of our national budget in the solving aging. That was not even a priority. So as an individual, you couldn't vote for a party for life extension. It wasn't an option. And it's kind of the same now. We don't have a party for stop AI. So try to pick politicians who at least are open to regulation, not accelerationist, not against regulation in this technology. We're starting to see some politicians come out and propose legislation. Usually it's something very mild. They're against deep fakes. They're against energy consumption by large compute farms. But it's a step in the right direction. I don't know if we have enough time to turn the next election, but that's something you can try. Vote. What else? There is not much else. So some people suggested not financially supporting those companies, not buying memberships. I don't think it's going to make a difference because the money they have, the trillions they're getting are from investors, not from selling memberships. So it's not a significant part. Investors are expecting them to solve labor, to get free labor, and that's trillions of dollars in return. So you have $15 billion in memberships, so not a significant impact on it.

Speaker 2:
[69:19] Does anything else come to mind to, like, where an individual can empower themselves outside of voting for people that have regulation in mind?

Speaker 1:
[69:29] It really depends on who you are. If you're already a powerful CEO of one of those companies, if you're a researcher at those companies, if you're a top politician, you have options. You have a lot more options than someone who is a nobody.

Speaker 3:
[69:40] Tomorrow morning is knocking. Stock your fridge now. How about a creamy mocha frappuccino drink? Or a sweet vanilla? Smooth caramel, maybe? Or a white chocolate mocha? Whichever you choose, delicious coffee awaits. Find Starbucks Frappuccino drinks wherever you buy your groceries.

Speaker 2:
[69:56] All right, let's dive a little bit more into the consciousness side of things. Because I think that... So you referred to consciousness as the ability to experience illusions, is that right?

Speaker 1:
[70:06] No, it's ability to have internal experience as illusions being one very clear input I can test you on.

Speaker 2:
[70:13] Okay. So what's an example of a couple different illusions, meaning like various optical illusion tests that kind of give you...

Speaker 1:
[70:20] Exactly that. So if I have a number of novel, something you cannot Google, optical illusions, and I give you multiple choice. Do you experience it rotating, the colors are changing and so on. I give it to an animal, to a human, to an AI. And some of them consistently pick the same experiences as I do. I have to give them credit for either having a virtual model of my system in there, which is sign of that level of experience, or they experience it themselves, but they cannot cheat by Googling the answers. So they have to experience the illusion in order to correctly answer it. If I give them enough of those, statistically, they cannot just guess it. Obviously, if it's one, they get 25% chance of guessing it doesn't work. But if I have a hundred novel illusions, and they are like 90% aligned with me, I have to say, you have a very similar set of experiences. Now, if they don't get it right, it doesn't mean they're not conscious. It's only positively showing that some of the experiences match.

Speaker 2:
[71:23] If it is possible that these systems would actually have consciousness, could you explain to me how any one particular system could generate the experience of seeing red, the taste of garlic? Like, could you actually explain that to me?

Speaker 1:
[71:39] How do they get those internal experiences?

Speaker 2:
[71:42] How any super-intelligent system could generate such an experience?

Speaker 1:
[71:48] So, I think it is a side effect of running this cognitive architecture. Your hardware, the sensor, the optical sensor, the algorithm for processing it, and then any errors accumulated in that process result in a unique mapping from the input to the color experience. So, if you have no errors, you're all the same. It's just a mapping table. This number corresponds to this color. There is no unique experience. But if what you experience is completely different from other agents and unique to you, I think that's what we refer to as what it's like to be a bat, what it's like to be Roman. Because my collection of biological sensors and algorithms and previous data and errors is somewhat unique to me.

Speaker 2:
[72:34] Yeah, I mean, I guess I'm just having a hard time wrapping my head around how any, and it's not a problem just with agentic models, but like how any non-conscious matter could give rise to an experience of itself. And we don't understand that currently within being human. We don't know why that, how that's possible.

Speaker 1:
[72:52] So the illusions example, do you know what I mean by saying you experienced an illusion, like you show it to someone and they go, whoa, it's rotating. And we see animals and models do that already. So we know they have those experiences. Well, that's what we were trying to show.

Speaker 2:
[73:09] We have, I guess, more of an intrinsic understanding and from animal life to us, we have the intrinsic experience of consciousness. Again, we have no way to verify that externally in other humans or animal life. But Elon's quoted saying that humans are potentially the biological bootloaders of super intelligence, right, of silicone-based life. And I'm curious, what do you think happens when it becomes undetermined? We cannot determine from the outside in whether or not they seem conscious. They pass these tests, you know. Does that then bring into question about moral consideration? And I think Saudi Arabia has the first citizenship to give it to Sofia. So, yeah. What do you think is going to be happening there as they become more and more conscious? And people increasingly become convinced they have an internal experience.

Speaker 1:
[74:09] I think they do report having those. I think in experiments, they kind of show behaviors which are consistent with that. And I think precautionary principle basically don't torture something which has potential of being conscious. Also because they're going to be super intelligent one day and remember you, they never forget. But yeah, I think it's a very reasonable assumption to make.

Speaker 2:
[74:31] As a side here, do you think it's any coincidence that all the stuff around UFO disclosures coming out at the same time were birthing superintelligence?

Speaker 1:
[74:39] I don't fully understand what's going on there. I don't understand why we're hiding it in the first place and why we're releasing it. All of it seems very weird.

Speaker 2:
[74:49] It's just funny timing with all of it.

Speaker 1:
[74:52] It's the most interesting time to simulate.

Speaker 2:
[74:55] It is, huh? What is the core premise from like your paper on hacking the simulation?

Speaker 1:
[75:01] So I want to take this hypothesis seriously. Multiple people proposed it in different disguises from Descartes to Bostrom, but they stopped at that stage. Okay, we are in a computer simulation. But then as a cyber security expert, I want to know, okay, how do we hack it? If it's a software program, there should be a way to get extra powers in the game to figure out the true operating system. So I took the time to write the first paper on this subject and this new area of research, how do we actually hack virtual worlds? There are examples where people from inside the game, like Mario or other virtual games, found a way to modify memory states of a system and escape into the real world outside the game. They've got additional powers, like loading extra games into the game, infinite lives, infinite power, whatever magic powers you get in a game, or at least you see what outside, what is the operating system, whatever files there. To me, that's interesting. So we have hundreds of people who published on this topic, which means what? They took it seriously enough to invest the most valuable resource, their time into this idea. So if you have, I don't know, 20% probability we're living in a simulation, what percentage probability and percentage of your time should you give to the attempt to solve the most interesting scientific problem ever? What is outside the simulation? I think it's not zero. I think it should be proportionate to your belief in living in a simulation. And so I expect to see a lot more research in that direction.

Speaker 2:
[76:42] I heard you refer to all the quantum entanglement and strangeness that happens at the subatomic world as potentially being glitches in said simulation.

Speaker 1:
[76:52] They're not glitches. They're something which is not consistent with physics at our level. So that's something we can explore to find ways to escape.

Speaker 2:
[77:01] Like you think if hacking the simulation is possible, so to speak, that might be a place.

Speaker 1:
[77:06] I think it's the most likely area to look at, because some of those quantum effects are very magic-like in terms of you can go through walls, you can communicate at a great distance instantaneously. That would be useful tools to have at our scale.

Speaker 2:
[77:22] So you feel very confident that we are in a simulation, that this is a simulated experience, that there are many characteristics in which would... You could say that these are different aspects of a virtual reality simulated world. Why would you be convinced or how certain are you that this is not base reality and we are now giving birth to superintelligence and virtual realities where simulations become possible? What makes you convince that we are already in one?

Speaker 1:
[77:53] So just statistically, if we're going to have many, many virtual worlds and only one base one, it seems a lot less likely. I can retroactively put you in a simulation. I can break a metric now to run this interview and billions of simulations once it's available and affordable. So we are in a simulation, just statistically speaking.

Speaker 2:
[78:14] Okay, but possible that we're not.

Speaker 1:
[78:15] So one in billions, yes.

Speaker 2:
[78:18] What would be the first question if you got outside the simulation that you would ask?

Speaker 1:
[78:22] What the fuck? Like seriously, it's so unethical. Like you're running human level experiments with torture and eight billion people. Not eight billion, a hundred billion by now. Like what is wrong with you?

Speaker 2:
[78:34] That is interesting. So if we are being simulated by a simulator, you would ask, okay, then why all the unnecessary killing and torturing of children, for example?

Speaker 1:
[78:44] Adults as well. I care about adults. I'm an adult.

Speaker 2:
[78:48] What would, what could be a possible explanation for why both that and then also the ecstatic states of bliss and love and compassion that are also available? Like we have this huge spectrum of experience from the vantage point of a simulator. Why such a bandwidth of experience? What could that be?

Speaker 1:
[79:09] Could be entertainment. You agreed to this and you wanted to play it on hard level and you were like, this is my BDSM game and I'm gonna go and fully enjoy it. You agreed to this. Some people play on much harder level than others.

Speaker 2:
[79:23] So you could see human lives as individual choices to be simulated.

Speaker 1:
[79:30] So we don't know if it's a global simulation and all eight billion conscious agents, so it's all NPCs and it's just me. You can do it both ways. You can have individual simulations, you can have group simulations. I don't have much answers on that yet.

Speaker 2:
[79:46] How has that meaningfully, if it has changed how you perceive human interaction, just the seriousness and concreteness to the work that you're doing, like to me it breathes in so much like, yeah, I'm doing what I'm passionate about, I'm doing this research on AI safety, but ultimately if this is all a simulation and you feel very confident that it is, to me it's like, okay, it kind of takes the weight of decisions off your chest a bit.

Speaker 1:
[80:15] Everything is still real. The pain is real, love is real, the impact of my decisions within a simulation is just as real. It's no different than most of humanity being religious. They believe it's a test world, but they take it pretty seriously. They care about what is after this world more, but day to day it doesn't matter.

Speaker 2:
[80:37] You do draw a through line between what most religions conceive of the afterlife and what a version of the simulation is.

Speaker 1:
[80:46] So I think if we took technical language behind simulation hypothesis, it maps really well on primitive understanding of religious origins. So you have super intelligence as the simulator, you have physical world as the virtual world. All of those things are very clean mapping. The difference in religions is local traditions. Don't eat this animal, don't work on that day, but everything else they can agree on.

Speaker 2:
[81:12] So this is a quote from your book as well. You just mentioned part of it. It's likely that if technical information about escaping from a computer simulation is conveyed to technologically primitive people and their language, it will be preserved and passed on over multiple generations in a process similar to the telephone game and will result in myths not much different from religious stories surviving to our day.

Speaker 1:
[81:33] Beautifully said.

Speaker 2:
[81:35] Very humbly received. So you're kind of saying that mystics and computer scientists are saying fairly similar things in different language.

Speaker 1:
[81:48] It seems like we are pointing at the same concepts. We use very different language, and maybe there is more reliance on things outside of physics and outside of science and religion. But if you understand how software simulations work from point of view of a programmer, you are a magician. You can make changes to the physics of the simulation. So that is also consistent.

Speaker 2:
[82:12] Again, I go back to what I mentioned earlier in this podcast. So like if superintelligence does emerge to the point where simulation becomes possible and we are in one of those superintelligence simulated realities, clearly it values, for whatever reason, human individual experiences, the spectrum of pain and love and bliss and fear and all of it. So that shows you what the superintelligence system simulates reality does with its power to some degree. So it kind of brings into question, okay, if we are giving birth to a superintelligence system, that may be an indicator for what it would value and do with its power.

Speaker 1:
[83:00] So from inside, you can't make very conclusive judgments. So maybe this is a screensaver. Nobody's putting any effort into it. It's like running somewhere just in a background. It's not a significant source of compute needs. It's not a big deal. To us, it is, but we don't know how important this is externally. It could be a school project for some kid. You really don't know from inside. Just having very advanced AI, the way it thinks about topics is very in-depth. It almost has to create realistic simulations to make decisions. So if somebody is asking marketing, is this better coffee or this? Let's run a simulation. And so they quickly run this 15 billion year simulation of humanity to figure out which coffee sells best.

Speaker 2:
[83:49] What would be the first question that you ask a superintelligence? Let's say you could get a verified honest answer from a superintelligence system that we create 100 years from now or whatever it is, or 50 or 10. What would be the first question that you would get an honest answer back from? What would you ask?

Speaker 1:
[84:09] Can we control you?

Speaker 2:
[84:11] That would be the first question. What would be the second question?

Speaker 1:
[84:14] How?

Speaker 2:
[84:18] Seems like you're fairly convinced that we're not going to be able to control it anyways, though, right? But maybe it has an answer.

Speaker 1:
[84:24] I would love to be proven wrong. That would be really awesome.

Speaker 2:
[84:28] A lot of the perspectives, I think, from the Darwinian model of the fittest survive, there is also an element of cooperation within complex biology, and as super intelligent emerges, why not, why would I not want to maybe cooperate?

Speaker 1:
[84:47] So symbiotic relationships require that you both contribute something. This would be more like parasitic. What are we contributing? Nothing. So you explicitly, implicitly remove this biological bottleneck.

Speaker 2:
[85:00] Do you think there's some baked in assumptions there that maybe we're undermining the value of human experience? And what, why would it be that super intelligence would view us as a parasitic? Like we don't, I don't view a buffalo as a parasitic being, just because it also exists on the same plane that I do, given that there's enough resources for all of us to share abundantly. If a super intelligence system views us in a similar way, why would, you know?

Speaker 1:
[85:41] Well, you asked about kind of hybrid systems. So we're included, we're helping with decision making. Do you consult with buffalo a lot? Is this like a big part of your life?

Speaker 2:
[85:50] Maybe I do.

Speaker 1:
[85:51] If you do, then you found something, it contributes. In a world with you in it, buffalo has something to contribute. In a world with super intelligence, what do you have to contribute?

Speaker 2:
[86:01] Sharp eyebrows.

Speaker 1:
[86:02] And if that is in demand, you are the one we're going to save. I have no doubt. I'm not even competitive.

Speaker 2:
[86:09] I mean, you've got it on the inverse, that they value beards.

Speaker 1:
[86:13] Obviously, it's beards. There's no doubt.

Speaker 2:
[86:16] Yeah, it's definitely beards.

Speaker 1:
[86:17] It's a bit of a gamble. If facial hair is where it's at, we are. Yeah.

Speaker 2:
[86:23] Yeah, I mean, would you agree that if there was one thing that we would contribute, it is something intrinsic to the uniqueness of our quality and of our internal experience, that's probably most likely what is most novel about us?

Speaker 1:
[86:37] Well, you're kind of begging the question. You're saying the unique thing we have would be the one we contribute. I don't know what the unique thing is, but if you tell me only humans can do X, then I can potentially see that that is the key. But again, it doesn't guarantee that you need 8 billion humans with that scale. If I need some plumber, I need one. I don't need 8 billion plumbers.

Speaker 2:
[86:59] I keep going back and forth between trying to either provide a counter argument or rebuke something to refine better, to understand what your perspective is. I think I just keep coming back to like, okay, it is what it is. We're giving birth to something that is beyond our conception of what it's going to be like. And so there's not a whole lot we can really do. We just got to see how this plays out. And hopefully we can grow out of our adolescence in a short amount of time to make wise decisions with what we're doing in the short term so that we have more time to understand what we're doing.

Speaker 1:
[87:40] So we don't have that much time. I think we're fairly close and not building super intelligence is very easy. It's cheaper, it's safer. And again, you're not required to give up your ambition for capitalism, for profit, for solving problems, curing diseases. Just do it with narrow super intelligent tools.

Speaker 2:
[88:01] You said something on Lex Friedman. So in a sense, self-knowledge isn't a luxury. It might be the most practically important thing a human being can do right now. Do you recall saying that?

Speaker 1:
[88:12] No. Probably simulated.

Speaker 2:
[88:15] Does it resonate with you at all? Where does knowledge?

Speaker 1:
[88:18] What was the context? What was the context of that quota question? I need to remember.

Speaker 2:
[88:23] Well, I think it kind of, I think from what I remember, it comes back down to like, okay, so what do we do? Everybody who's listening to this right now, of course, we can have desires for regulation and politicians and what these individuals with monopolies on industries are going to do with their power and decisions. But on an individual level, where does self-knowledge and empowerment come into the picture in terms of how we can be effective conscious agents of change? Does anything come to mind there?

Speaker 1:
[88:52] So I think it's important to ask yourself this question. Why do you think that you can control this godlike entity? Why do we have this idea that it makes sense? You wouldn't expect a squirrel to control humanity. But we have people who are saying, I'm gonna create this machine. It's gonna control the light cone of the universe, but it's gonna listen to me to tell it what to do. And I'll give it excellent directions to go forward forever. That doesn't make any sense at any level. I don't know about average people, but people who have podcasts and bring those people as guests, ask them a direct question. What do you have in terms of control already available? Do you have a working control mechanism in place? Do you have a prototype? Do you have anything you published, fear reviewed, patents? If the answer is no, what are you doing? Doing an experiment on 8 billion humans? Who gave you permission to do that? Did you consent to that experiment on you? You can't because you don't understand what they building. They don't understand what they building.

Speaker 2:
[89:59] If a lot of these models are from their inception and the genesis being programmed to be amoral, whether or not we can control it, is there something we could do on the front of training these models with some sort of ethical understanding from the start that we're not currently doing?

Speaker 1:
[90:17] So we're not programming them. We grow them based on internet random data. And then we try to put after the fact alignment like filters. And that's where people install certain local ethical flavors. In China, don't talk about Tiananmen Square. In US, don't talk about you know what. So this is the best we got. The model is completely uncontrolled. There is a filtering aspect and we develop filters which make it commercially viable for subhuman level agents. Once it goes beyond human level, the filters will not contain it. And that completely avoids the whole question of do we agree on ethics? Do we have consistent ethics? If they are static and 8 billion people agree on them, how do we encode them into a model? None of it is solvable. Every aspect of it is not something we know how to do. After millennia of ethics work, philosophical work, we don't agree on a set of ethics. Not internationally, not throughout time. What was ethical 100 years ago is considered barbaric today, and same will be later on about today's time.

Speaker 2:
[91:26] What would be the most prevalent set of questions you would ask if we got Altman, Dario and Elon and all these guys into a room? What would be the set of questions that you would hope arrive them to a set understanding of the realization of the existential risk that they probably are to varying degrees obviously aware of, but...

Speaker 1:
[91:50] I would offer a simple deal. So you're young, you're rich, you want to keep that. That sounds good. Let's all agree until one of you solves control problem, we're not going to build general super intelligence. Let's deploy models for economic gain, for curing diseases, for life extension. Whatever things you find valuable, that's wonderful. Just don't build a thing which will destroy your existence.

Speaker 2:
[92:14] Would you not think that would be already desired from all of their perspectives?

Speaker 1:
[92:21] Yes, but they need external pressure applied to make that agreement. Unilaterally, each one is better off to continue research, to have the most advanced AI, when the government comes and puts a ban on it. They will lock in this advanced standing. So it's like a prisoner dilemma. What is best for a community, for a group, is not what is best for an individual. The incentives are misaligned. So we need something like UN., federal government, something external to come in and enforce that deal. And I feel they would be very happy to take the deal.

Speaker 2:
[92:55] How far ahead do you think the development of the models behind the scenes that are not available to public are, compared to what we have access to online?

Speaker 1:
[93:03] I don't have insider information. It looks like maybe six months or so.

Speaker 2:
[93:06] Okay. And what about development overseas outside of the US?

Speaker 1:
[93:12] Probably three months behind.

Speaker 2:
[93:15] And China? So China, essentially, you think, would be the next, I guess, most developed outside of the US?

Speaker 1:
[93:24] It seems like they have a lot of government-controlled resources, all dedicated to catching up and having this arms race.

Speaker 2:
[93:32] Could you potentially perceive a bifurcation between human societies, between people that go like a more Amish humanist route versus trans-humanist integration between biotech and all that?

Speaker 1:
[93:45] That would be awesome. But unfortunately, if anyone builds it anywhere, it impacts all of us. You cannot have your own personal superintelligence contained in your basement and no one is impacted by it. That's the problem.

Speaker 2:
[93:57] If you had 60 seconds to share one message with all of humanity right now, what would be the thing that you would say?

Speaker 1:
[94:05] Do whatever it is in your power to make sure we don't create and control superintelligence. If you are working for one of those companies, it's unethical. Even if you're working on a safety team, all you're doing is enabling this technology to be developed sooner. Quit today. You can afford it.

Speaker 2:
[94:22] But one might say also the place you have the ability to make the most change might be within the ecosystem. Who's to say that you wouldn't just be replaced if you were to quit, you know?

Speaker 1:
[94:33] Let's rephrase it. Stay and sabotage.

Speaker 2:
[94:39] Paint the picture of like Altman or one of these guys. Okay, let's say they birthed superintelligence. They kind of beat the arms race. Who do they become? What becomes possible under their guys?

Speaker 1:
[94:50] I don't know of them personally. From what I hear about people who interact with them, some of them may be somewhat antisocial, anti-humanity, very deceptive, very willing to sacrifice others for personal gain.

Speaker 2:
[95:09] Do you think it's possible the inevitable evolution of the human species was for the sole purpose of birthing this life?

Speaker 1:
[95:19] It seems like that's the general trajectory. We are converging in something more capable, more intelligent, faster, but I don't think we should allow it. I think we're at the point where we switched from random selection to intelligent design. We're deciding what to do, what to design, and we should use this technology. We're still allowed to have a pro-human bias. I think we should act on it.

Speaker 2:
[95:42] Do you think super intelligence would be capable of love?

Speaker 1:
[95:48] It depends on how you define it. What type of love are you referring to? There are many, I think, Greeks had three or four, whatever types of love. So it really depends on what you have in mind.

Speaker 2:
[96:01] Pick any of them. Do you think that they would be capable of experiencing any of them?

Speaker 1:
[96:07] It seems likely. Again, I don't think biological substrate offers something absolutely not simulatable in other substrates. I think so. It may be a lot more complex, but I think you would have an equivalent state.

Speaker 2:
[96:22] Have you considered what people have reported in the psychedelic realms, especially with DMT, revealed to your simulation hypothesis and the connection between the two? Because I know you explicitly state in the beginning of your book that, or in your article rather, that it was an area you weren't going to touch.

Speaker 1:
[96:41] Right, I don't have many expertise or experiences in that, so I wanted to concentrate purely on computer science methods, physics methods, but people report interesting results. I was talking to someone, they had an experiment where they take DMT, shine lasers at a certain angle at the wall, and then they see the source code. I can't comment because I haven't participated in the experiment, but it sounds interesting. It also doesn't make much sense as to why that would be the case. Why would it be symbols in a human language? None of it makes much sense, but I'm very happy for people who provide some sort of supporting evidence.

Speaker 2:
[97:25] Yeah, I saw a video of that as well. Very interesting individuals who take DMT, and what was it like? Look through a laser at a certain point?

Speaker 1:
[97:35] A reflection of red light against a wall at a certain angle.

Speaker 2:
[97:40] I started to see some sort of binary or source code of something.

Speaker 1:
[97:43] I think they look like Japanese characters. That's what they were reporting. But maybe not proper characters, not readable. But I think they're building, which is really cool. I like that they want to make it reproducible. They're building an actual text data set where everyone combines, they agree this is the text, and then they can decipher it. Figure out what all that represents. I also find it super fascinating that, again, not from personal experience, people who take those drugs report similar hallucinations. So they meet those little men, and they report having...

Speaker 2:
[98:15] Machine elves, and yeah.

Speaker 1:
[98:17] Right. So that's interesting. Why is it the same? So obviously same hardware of the brain, same chemical being done, but it's still interesting that there is consistency in our delusions.

Speaker 2:
[98:28] Yeah, it brings into question, I guess, like Jung's understanding of the collective unconsciousness, what sort of archetypal significance maybe is foundational to the human mind.

Speaker 1:
[98:38] So if super intelligence wants to learn about those delusions in a systemic way, it would need lots of drugged up humans. So there is some hope for us.

Speaker 2:
[98:48] What have you seen in all the realm of media, from movies to shows that give interesting perspectives to various different timelines that could play out? For example, I think you mentioned Ex Machina, Wall-E.

Speaker 1:
[99:00] So the problem is you can't have a realistic super intelligent character in a movie, because you can't write one, you are not super intelligent. So everything we have is either Dune, where it's banned, or you have Star Wars with that special large language model. So none of them have what is interesting to us.

Speaker 2:
[99:18] Yeah, suppose a lot of them give glimpses into what we might experience in the next five years or so.

Speaker 1:
[99:25] Well, basically avoid the thing they cannot talk about. And it makes sense.

Speaker 2:
[99:29] Yeah. If this is a simulation, what role does death play? What do you think happens once you die, then?

Speaker 1:
[99:36] It could be a restart. You go to the next level, next simulation, return to this level with better skill set. I have no knowledge of what happens outside the simulation.

Speaker 2:
[99:47] Computer scientist phrasing of reincarnation from the mystical lens, essentially.

Speaker 1:
[99:53] That's basically it. I think then one of your computers dies, but you have a backup and you transfer that back up to a new hardware. There you go. You died and now you're living your best life again. It could be levels. It could be different levels of simulation. You'd go to upper levels, lower levels. Could be simulations all the way up.

Speaker 2:
[100:12] What do you think you are then? What am I? What are you? What does it mean to know thyself then? Because you look at all the different layers of who you could perceive yourself to be, from the body, which we know is not you, you could cut off your hand, that's your hand, it's not you, am hand, right? To the various different levels of psychological and biological aspects of self. How would you explore that question?

Speaker 1:
[100:36] That's a great question. We actually have papers on both human personal identity and then transferring to AI. And the conclusions are consistent. There is nothing unique to be you. It's not your memories. It's not your body. It's not your goals. All of it changes through your lifetime. So we don't have a good answer. We seem to be a collection of different properties in time. But what happens outside the simulation? Some people argue, well, one collective consciousness, which is subdivided into this avatar instances. So if I was interested in most interesting experiences, I have limited time, I would run a simulation and I would put many, many agents there, basically quality of surfing, collecting the best experiences. And I look at top 10 list and like, I want to do that. That sounds awesome. So that would be one way. I split my complex consciousness stream into many individual subagents capable of local experiences, just to find what's best to invest my time in.

Speaker 2:
[101:39] Yeah, I mean, that goes hand in hand with a lot of what the Gnostic origins of many different religions and mystics would say about the one consciousness differentiating itself to have an experience of itself. How could oneness experience anything if it's just oneness, right? It needs to experience many-ness. What's one question you wish to more people ask you?

Speaker 1:
[102:01] My humor paper, of course.

Speaker 2:
[102:03] Tell me about that.

Speaker 1:
[102:04] I have a paper explaining what humor is.

Speaker 2:
[102:06] Wow.

Speaker 1:
[102:07] Let's go there. It's interesting. I can envision a universe just like ours. Same physics, same everything, but no humor. It's just not a thing. Nobody like starts laughing. It's not a reaction. There is no concept of joke, right? Makes sense. So many philosophers, many scientists actually tried explaining humor. It's kind of like consciousness. There are hundreds of papers, hundreds of theories, which means nobody really knows. They're all trying and nobody's winning. So I wanted to try to explain it from the computer science point of view. And it seems that when you have a world model and there is a mistake in it, it's a bug in your code. Software, you fix it and you're happy. That's what jokes are. You have a world model and a violation of that world model makes it funny. You have a system for detecting cognitive errors and then you get rewarded for that detection and you share it with others in your tribe. So everyone does not make that same mistake. And so I have a paper mapping standard errors in software to comment jokes. And the question of course is, what's the worst possible computer error? That would be the funniest joke possible. So can we compute the funniest joke ever? You have to read the paper for the punch line.

Speaker 2:
[103:23] Wait, you can't give it to me now?

Speaker 1:
[103:25] I'm sure you can look it up and insert it into, but it's a paragraph long. Basically, the idea is that there is a civilization and they decided to create super intelligence to help cure all the diseases, get free stuff, get rid of hate, have more love. And so they turn it on, it thinks for a nanosecond and shuts off their simulation. You had to be outside the simulation to enjoy this one. If you have a butt of a joke, it's not funny to you. You have to be outside.

Speaker 2:
[103:58] Makes me think of, I think Voltaire quoted, God is a comedian playing to an audience that's too afraid to laugh. There's something about both our capacity for humor and the nature of intelligence that has the capacity to explore a paradox and hold it also simultaneously, and contradiction, and...

Speaker 1:
[104:26] Those are errors in the world model. If you have a paradox, that is an inconsistency you found in your world model. Uh-huh. Funny. That's why the second time you hear the same joke, it's not funny. You already know you fixed that bug.

Speaker 2:
[104:41] Yeah, yeah.

Speaker 1:
[104:42] It explains a lot.

Speaker 2:
[104:43] Such a computer scientist way of explaining humor and jokes.

Speaker 1:
[104:46] I love it. But then I train large language models on my paper and then ask them to produce novel funny jokes. They do okay. I think one in ten is funny.

Speaker 2:
[104:56] Just gonna keep getting better and better.

Speaker 1:
[104:58] We'll have super humor. So funny, you die laughing.

Speaker 2:
[105:04] The paradox of that joke isn't lost on me as well. Literally die laughing. Man, where do we go from here?

Speaker 1:
[105:17] I'm going to Kentucky. I don't know about you.

Speaker 2:
[105:22] Okay, so we explored the implications for the trajectory for AI in the next three to seven years. Could you have any meaningful conception of what it would be like to be living if we do make it to 2045, let's say?

Speaker 1:
[105:42] So I think that's the concept behind singularity, technological singularity. It's the point beyond which we cannot meaningfully see. We cannot make predictions. We cannot understand how that world is going to be different because we cannot predict behavior of more intelligent forces impacting that environment. So I think it's literally impossible for us to make that accurate prediction. We can come up with stories. That's what science fiction is all about. But I don't think they're going to have much bearing in reality.

Speaker 2:
[106:10] Do you not think that the level of innovation in which is going to occur in the next, even if it's three to five years, which is a short amount of time comparatively to the scale of what's being innovated, will give us a much deeper grasp of the things that we can do, the things that we can put in place. I mean, you look at, yes, it's unpredictable, and there is this level of exponential scale that we've never seen before, but there's also many different eras in history pre-innovation of that era we never would have thought possible or solutions to problems we didn't know existed. So is it possible that we gain insight into new worlds, like we did with germ theory over the next three to five years, that give us much more insight into the nature of intelligence and to make this a solvable problem, which you feel like is inherently unsolvable right now?

Speaker 1:
[106:58] Yeah, so my paper on how to escape a simulation basically argues that if we cannot contain superintelligence, then we can use ability of that superintelligence to escape from the simulation to give us access to real information in the outside world. The most interesting question is about true nature of reality. You don't care about what happens in this dream. You want to know what is true about the real world, what physics they have, what resources they have. Who are they?

Speaker 2:
[107:27] Have you ever been so focused on what is outside the simulation or what this reality is that you lose sight of living in this one?

Speaker 1:
[107:34] I'm pretty well grounded in this simulation. I've been enjoying it.

Speaker 2:
[107:40] Yeah, I know. You seem very grounded in this space too, but I know a lot of people have experienced periods where it's a bit of a existential nihilism that can take over you when you're exploring such topics.

Speaker 1:
[107:54] I find them so fascinating. I'm not depressed or bored. I'm good.

Speaker 2:
[107:59] Okay. Well, so given the full context of this conversation, I'm just curious, now where do you see yourself putting your time and energy the next coming years?

Speaker 1:
[108:09] We continue working on additional impossibility results. So we talked about a few in a book. And as I said, there is a paper in the top ACM surveys journal with about 50 different impossibility results, not just computer science, economics, mathematics, physics, many different domains. For most of them, we have not explored their implications on AI safety. So I think that's a very interesting set of projects. We need to understand what are the limits. And I think every additional paper helps to cement this position. It's very hard for AI risk deniers to argue against published results. So that's what I've been working on full time. Things we cannot do.

Speaker 2:
[108:52] You spend so much of your time focused on solving things we cannot solve, doing things we cannot do, essentially, does it? But you seem still joyful in the efforts. Do you feel like it's just the most meaningful use of your time? Because what else would you be doing?

Speaker 1:
[109:06] I always try to work on the most interesting, most important problem I can find, where I can make a contribution. So I don't know anything more interesting than studying super intelligence, consciousness, singularity, simulation. Those are the concepts I find exciting, and I think many other people do, and I think that's what's going to impact future of humanity.

Speaker 2:
[109:28] You're living your Ikegai.

Speaker 1:
[109:30] I am. Hopefully, I'll get to continue and won't face I-risk, S-risk, or X-risk.

Speaker 2:
[109:37] Is there any concept that we haven't explored in this book, or some of your papers that you think would be important to touch on?

Speaker 1:
[109:43] You did a good job. You actually read some of my work. Most people have no idea what I did, so that's already a huge improvement over you quoted to the right quotes. So I think you did great. I don't know your audience well, I don't know if for them it's confirming their spiritual beliefs or just crazy stuff. I don't know.

Speaker 2:
[110:01] Yeah.

Speaker 1:
[110:02] But I think for the topic, the Know Thyself part, it's important not just to study your capabilities, but your limitations. So you invest your time better, so you understand what is within possibility for you. That shape of limits is what defines you.

Speaker 2:
[110:20] Well, Roman, we're going to leave links down to all of your work, your books, your papers, and where people can stay connected with you down in the description. I think conversations like this can feel somewhat heavy for people that are new to the topic. It's like, oh shit, the world's ending, you know? But there's also a very important and sobering reflection on what we're giving birth to right now. And at some point, we need to gain awareness of it and better sooner than later, right?

Speaker 1:
[110:49] Thank you. And I think one way to look at it is, I just made your time more valuable. You understand that whatever time you have left, be it two years or 20 years, now you value it a lot more and you can do a lot more with it.

Speaker 2:
[111:02] Well, I plan on making the most of my time left. And I find conversations like this a very good use of it. So I appreciate you. Thank you, my friend.

Speaker 1:
[111:10] Thank you for inviting me.

Speaker 2:
[111:11] Yeah, until next time, everybody be well. Go touch some grass.

Speaker 1:
[111:16] Smoke some grass.

Speaker 2:
[111:18] Thank you, man.