JRE 2345 · July 3, 2025
Roman Yampolskiy
Who is Roman Yampolskiy?
Dr. Roman Yampolskiy is a computer scientist, AI safety researcher, and professor at the University of Louisville. He’s the author of several books, including "Considerations on the AI Endgame," co-authored with Soenke Ziesche, and "AI: Unexplained, Unpredictable, Uncontrollable."
Topics and Timestamps
1. Roman Yampolskiy discusses AI safety concerns and why current AI systems are fundamentally unpredictable and difficult to control
2. The conversation covers existential risks from advanced AI and why alignment with human values is an unsolved problem
3. Yampolskiy explains the concept of AI endgame scenarios and what happens when systems become superintelligent
4. Discussion of adversarial examples and how easily AI models can be fooled or manipulated in unexpected ways
5. The limitations of current AI safety research and why traditional cybersecurity approaches don't work for AGI
6. Exploration of whether we should be building advanced AI systems at all given our inability to control them
- 0:05:30 Roman explains why AI systems are fundamentally unpredictable and can't simply be opened up and understood
- 0:18:45 Discussion of adversarial examples and how easily AI models can be fooled by inputs humans would never fall for
- 0:32:00 The alignment problem explained: why creating superintelligent AI that cares about human values is unsolved
- 0:54:20 Joe and Roman debate whether AI development should be slowed down given safety concerns
- 1:15:00 Yampolskiy discusses AI endgame scenarios and what happens if superintelligent systems escape human control
The Show
Roman Yampolskiy brings serious academic weight to JRE 2345, diving deep into AI safety concerns that go way beyond the typical tech bro optimism you usually hear. He's not here to sell you on the singularity as a good thing. He's here to explain why we're essentially building systems we don't fully understand and can't reliably control, and that should worry everyone.
The core issue Yampolskiy keeps hammering is that modern AI systems are fundamentally unpredictable. You can train a neural network to do something specific, but you can't actually open it up and understand why it makes the decisions it does. It's a black-box problem that only gets worse as systems get more capable. Joe keeps pushing for practical examples, and Yampolskiy walks through how adversarial attacks work: how you can fool AI systems with inputs that humans would never fall for. That's the basic reason to be skeptical about deploying these systems in critical infrastructure before we understand their failure modes.
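The adversarial-attack idea can be sketched in a few lines of code. Below is a toy illustration of the classic fast-gradient-sign trick on a made-up linear classifier — not any model discussed on the show, and every weight and number is invented for the demo. The point it demonstrates is the one Yampolskiy makes: a tiny, uniform nudge to every input feature, individually too small to matter to a human, can flip a confident prediction.

```python
import numpy as np

# Toy demo of a fast-gradient-sign adversarial attack (all values invented).
rng = np.random.default_rng(0)
n_features = 1000

# "Model": logistic regression with small random weights.
w = rng.normal(size=n_features) * 0.05

def predict(x, b):
    """Probability that input x belongs to class 1."""
    return 1.0 / (1.0 + np.exp(-(w @ x + b)))

# A clean input; pick the bias so the model is confident (logit = +3).
x = rng.normal(size=n_features)
b = 3.0 - w @ x
clean_prob = predict(x, b)

# Attack step: nudge every feature a small amount (epsilon) in the
# direction that pushes the logit down. For a linear model the gradient
# with respect to the input is just w, so the nudge direction is sign(w).
epsilon = 0.2  # small relative to the feature scale (features are ~N(0, 1))
x_adv = x - epsilon * np.sign(w)
adv_prob = predict(x_adv, b)

print(f"clean prediction:       {clean_prob:.3f}")  # confidently class 1
print(f"adversarial prediction: {adv_prob:.3f}")    # flipped toward class 0
print(f"max per-feature change: {np.max(np.abs(x_adv - x)):.3f}")
```

The reason this works is dimensionality: each feature moves by only 0.2, but a thousand tiny moves all aligned with the weights add up to a large swing in the output, which is exactly the "fails in ways humans would never fall for" failure mode.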
Yampolskiy's books, particularly "AI: Unexplained, Unpredictable, Uncontrollable," aren't just academic fear-mongering. He's laying out concrete technical problems. The alignment problem gets discussed here too: the idea that even if you create a superintelligent AI, getting it to actually care about human values and human welfare is basically unsolved. You can't just write it into the code. It's not like programming a calculator.
Joe's skeptical in the right way here, pushing back and asking practical questions about whether these risks are overblown, but Yampolskiy keeps bringing it back to first principles. We've never successfully controlled a system more intelligent than ourselves. We're assuming we can figure it out as we go, and that's the gamble. He's not saying AI is definitely going to kill everyone, he's saying the risk profile is insane relative to how much we actually understand about what we're building.
The conversation touches on everything from whether we should slow down AI development to how to even think about AI endgame scenarios. Yampolskiy's perspective is refreshing because he's not trying to sell anything or hype the technology. He's genuinely concerned about where this is heading and he wants people to think more carefully about the implications before we build systems we literally cannot control.
Best Quotes
“We're building systems we don't understand, and we're deploying them in critical infrastructure”
— Roman Yampolskiy
From the JRE 2345 conversation with Roman Yampolskiy.
“The alignment problem is unsolved. You can't just write human values into code and expect a superintelligent system to care about them”
— Roman Yampolskiy
“Neural networks are black boxes. We can train them but we can't explain why they make the decisions they do”
— Roman Yampolskiy
“We've never controlled a system more intelligent than ourselves. Why do we think this time will be different?”
— Roman Yampolskiy
“Adversarial attacks show us how fragile these systems are. They fail in ways we never anticipated”
— Roman Yampolskiy
Mentioned in This Episode
Books, supplements, gear, and other cool things that came up in conversation — not the podcast ads.
Considerations on the AI Endgame
Amazon · Co-authored book by Roman Yampolskiy and Soenke Ziesche examining AI endgame scenarios and existential risks.
AI: Unexplained, Unpredictable, Uncontrollable
Amazon · Roman Yampolskiy's book exploring the fundamental challenges in AI safety, control, and predictability.
True Classic
Amazon · Premium wardrobe essentials and apparel.
As an Amazon Associate we earn from qualifying purchases.