Eliezer Yudkowsky | Encyclopedia MDPI

Eliezer Yudkowsky: Comparison

Please note this is a comparison between Version 3 by Sirius Huang and Version 2 by Conner Chen.


1. Introduction

Eliezer Shlomo Yudkowsky (born September 11, 1979) is an American AI researcher and writer best known for popularizing the idea of friendly artificial intelligence.^[1][2] He is a co-founder^[3] and research fellow at the Machine Intelligence Research Institute (MIRI), a private research nonprofit based in Berkeley, California.^[4] He has no formal secondary education, never having attended high school or college.^[5] His work on the prospect of a runaway intelligence explosion influenced Nick Bostrom's Superintelligence.

2. Work in Artificial Intelligence Safety

2.1. Goal Learning and Incentives in Software Systems

Yudkowsky's views on the safety challenges posed by future generations of AI systems are discussed in Stuart Russell and Peter Norvig's undergraduate textbook Artificial Intelligence: A Modern Approach. Noting the difficulty of formally specifying general-purpose goals by hand, Russell and Norvig cite Yudkowsky's proposal that autonomous and adaptive systems be designed to learn correct behavior over time:

Yudkowsky (2008)^[6] goes into more detail about how to design a Friendly AI. He asserts that friendliness (a desire not to harm humans) should be designed in from the start, but that the designers should recognize both that their own designs may be flawed, and that the robot will learn and evolve over time. Thus the challenge is one of mechanism design – to design a mechanism for evolving AI under a system of checks and balances, and to give the systems utility functions that will remain friendly in the face of such changes.^[1]

In response to the instrumental convergence concern, where autonomous decision-making systems with poorly designed goals would have default incentives to mistreat humans, Yudkowsky and other MIRI researchers have recommended that work be done to specify software agents that converge on safe default behaviors even when their goals are misspecified.^[7]

2.2. Capabilities Forecasting

In the intelligence explosion scenario hypothesized by I. J. Good, recursively self-improving AI systems quickly transition from subhuman general intelligence to superintelligence.^[8] Yudkowsky argues that this is a real possibility. Nick Bostrom's 2014 book Superintelligence sketches out Good's argument in detail, while citing writing by Yudkowsky on the risk that anthropomorphizing advanced AI systems will cause people to misunderstand the nature of an intelligence explosion. In Yudkowsky's words, "AI might make an apparently sharp jump in intelligence purely as the result of anthropomorphism, the human tendency to think of 'village idiot' and 'Einstein' as the extreme ends of the intelligence scale, instead of nearly indistinguishable points on the scale of minds-in-general."^[6][9]

In their textbook on artificial intelligence, Stuart Russell and Peter Norvig raise the objection that there are known limits to intelligent problem-solving from computational complexity theory; if there are strong limits on how efficiently algorithms can solve various computer science tasks, then intelligence explosion may not be possible.^[1] Yudkowsky has debated the likelihood of intelligence explosion with economist Robin Hanson, who argues that AI progress is likely to accelerate over time, but is not likely to be localized or discontinuous.^[10]

3. Rationality Writing

Between 2006 and 2009, Yudkowsky and Robin Hanson were the principal contributors to Overcoming Bias,^[11] a cognitive and social science blog sponsored by the Future of Humanity Institute of Oxford University. In February 2009, Yudkowsky founded LessWrong,^[12] a "community blog devoted to refining the art of human rationality".^[13] Overcoming Bias has since functioned as Hanson's personal blog. LessWrong has been covered in depth in Business Insider.^[14]

Over 300 blog posts by Yudkowsky on philosophy and science (originally written on LessWrong and Overcoming Bias) were released by the Machine Intelligence Research Institute in 2015 as an ebook entitled Rationality: From AI to Zombies.^[15][16]

Yudkowsky has also written several works of fiction. His fanfiction story, Harry Potter and the Methods of Rationality, uses plot elements from J.K. Rowling's Harry Potter series to illustrate topics in science.^[13][17][18][19][20][21][22] The New Yorker describes Harry Potter and the Methods of Rationality as a retelling of Rowling's original "in an attempt to explain Harry's wizardry through the scientific method".^[23]

4. Personal Views

Yudkowsky identifies as a "small-l libertarian."^[24]

5. Family

Yudkowsky's younger brother, Yehuda Nattan Yudkowsky, died in 2004 at the age of nineteen.^[25]

6. Academic Publications

Yudkowsky, Eliezer (2007). "Levels of Organization in General Intelligence". Berlin: Springer. https://intelligence.org/files/LOGI.pdf.
Yudkowsky, Eliezer (2008). "Cognitive Biases Potentially Affecting Judgement of Global Risks". in Bostrom, Nick; Ćirković, Milan. Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504. https://intelligence.org/files/CognitiveBiases.pdf.
Yudkowsky, Eliezer (2008). "Artificial Intelligence as a Positive and Negative Factor in Global Risk". in Bostrom, Nick; Ćirković, Milan. Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504. https://intelligence.org/files/AIPosNegFactor.pdf.
Yudkowsky, Eliezer (2011). "Complex Value Systems in Friendly AI". Berlin: Springer. https://intelligence.org/files/ComplexValues.pdf.
Yudkowsky, Eliezer (2012). "Friendly Artificial Intelligence". in Eden, Ammon; Moor, James; Søraker, John et al. Singularity Hypotheses: A Scientific and Philosophical Assessment. Berlin: Springer. ISBN 978-3-642-32559-5. https://link.springer.com/chapter/10.1007/978-3-642-32560-1_10.
Bostrom, Nick; Yudkowsky, Eliezer (2014). "The Ethics of Artificial Intelligence". in Frankish, Keith; Ramsey, William. The Cambridge Handbook of Artificial Intelligence. New York: Cambridge University Press. ISBN 978-0-521-87142-6. https://intelligence.org/files/EthicsofAI.pdf.
LaVictoire, Patrick; Fallenstein, Benja; Yudkowsky, Eliezer; Bárász, Mihály; Christiano, Paul; Herreshoff, Marcello (2014). "Program Equilibrium in the Prisoner's Dilemma via Löb's Theorem". AAAI Publications. http://www.aaai.org/ocs/index.php/WS/AAAIW14/paper/viewFile/8833/8294.
Soares, Nate; Fallenstein, Benja; Yudkowsky, Eliezer (2015). "Corrigibility". AAAI Publications. http://aaai.org/ocs/index.php/WS/AAAIW15/paper/view/10124/10136.

References

Russell, Stuart; Norvig, Peter (2009). Artificial Intelligence: A Modern Approach. Prentice Hall. ISBN 978-0-13-604259-4.
Leighton, Jonathan (2011). The Battle for Compassion: Ethics in an Apathetic Universe. Algora. ISBN 978-0-87586-870-7.
Dowd, Maureen. "Elon Musk’s Billion-Dollar Crusade to Stop the A.I. Apocalypse". https://www.vanityfair.com/news/2017/03/elon-musk-billion-dollar-crusade-to-stop-ai-space-x. Retrieved 28 July 2018.
Kurzweil, Ray (2005). The Singularity Is Near. New York City: Viking Penguin. ISBN 0-670-03384-7.
Saperstein, Gregory (August 9, 2012). "5 Minutes With a Visionary: Eliezer Yudkowsky". https://www.cnbc.com/id/48538963.
Yudkowsky, Eliezer (2008). "Artificial Intelligence as a Positive and Negative Factor in Global Risk". in Bostrom, Nick; Ćirković, Milan. Global Catastrophic Risks. Oxford University Press. ISBN 978-0199606504. https://intelligence.org/files/AIPosNegFactor.pdf.
Soares, Nate; Fallenstein, Benja; Yudkowsky, Eliezer (2015). "Corrigibility". AAAI Publications. http://aaai.org/ocs/index.php/WS/AAAIW15/paper/view/10124/10136.
Yudkowsky, Eliezer (2013). "Five theses, two lemmas, and a couple of strategic implications". https://intelligence.org/2013/05/05/five-theses-two-lemmas-and-a-couple-of-strategic-implications/.
Bostrom, Nick (2014). Superintelligence. ISBN 0199678111.
Hanson, Robin; Yudkowsky, Eliezer (2013). The Hanson-Yudkowsky AI Foom Debate. Machine Intelligence Research Institute. https://intelligence.org/ai-foom-debate/.
"Overcoming Bias: About". Robin Hanson. http://www.overcomingbias.com/about. Retrieved February 1, 2012.
"Where did Less Wrong come from? (LessWrong FAQ)". http://wiki.lesswrong.com/wiki/FAQ#Where_did_Less_Wrong_come_from.3F. Retrieved September 11, 2014.
Miller, James (2012). Singularity Rising. ISBN 978-1936661657. https://books.google.com/books?id=P5Quj8N2dXAC.
Miller, James (July 28, 2011). "You Can Learn How To Become More Rational". Business Insider. http://www.businessinsider.com/ten-things-you-should-learn-from-lesswrongcom-2011-7. Retrieved March 25, 2014.
"Rationality: From AI to Zombies". Machine Intelligence Research Institute. March 12, 2015. https://intelligence.org/2015/03/12/rationality-ai-zombies/.
Miller, James D. "Rifts in Rationality - New Rambler Review". http://newramblerreview.com/book-reviews/economics/rifts-in-rationality. Retrieved 28 July 2018.
Brin, David (June 21, 2010). "CONTRARY BRIN: A secret of college life... plus controversies and science!". Davidbrin.blogspot.com. http://davidbrin.blogspot.com/2010/06/secret-of-college-life-plus.html. Retrieved August 31, 2012; Snyder, Daniel. "'Harry Potter' and the Key to Immortality". The Atlantic.
Authors (April 2, 2012). "Rachel Aaron interview (April 2012)". Fantasybookreview.co.uk. http://www.fantasybookreview.co.uk/blog/2012/04/02/rachel-aaron-interview-april-2012/. Retrieved August 31, 2012.
"Civilian Reader: An Interview with Rachel Aaron". Civilian-reader.blogspot.com. May 4, 2011. http://civilian-reader.blogspot.com/2011/05/interview-with-rachel-aaron.html. Retrieved August 31, 2012.
Hanson, Robin (October 31, 2010). "Hyper-Rational Harry". Overcoming Bias. http://www.overcomingbias.com/2010/10/hyper-rational-harry.html. Retrieved August 31, 2012.
Swartz, Aaron. "The 2011 Review of Books (Aaron Swartz's Raw Thought)". archive.org. Archived from the original on March 16, 2013. https://web.archive.org/web/20130316081659/http://www.aaronsw.com/weblog/books2011. Retrieved October 4, 2013.
"Harry Potter and the Methods of Rationality". fanfiction.net. February 28, 2010. https://www.fanfiction.net/s/5782108/1/Harry_Potter_and_the_Methods_of_Rationality. Retrieved December 29, 2014.
Packer, George (2011). "No Death, No Taxes: The Libertarian Futurism of a Silicon Valley Billionaire". The New Yorker: 54. http://www.newyorker.com/magazine/2011/11/28/no-death-no-taxes. Retrieved October 12, 2015.
Yudkowsky, Eliezer (September 7, 2011). "Is That Your True Rejection?". Response Essays. https://www.cato-unbound.org/2011/09/07/eliezer-yudkowsky/true-rejection.
"Yehuda Yudkowsky". yudkowsky.net. http://yudkowsky.net/other/yehuda.