How can Factors Underlying Human Preferences Lead to Methods of Formal Characterizations towards Developing Safe Artificial General Intelligence?
Jak mohou faktory určující lidské preference vést k metodám formální charakterizace směřující k rozvoji bezpečné umělé obecné inteligence?
Master's thesis (DEFENDED)
Permanent link
http://hdl.handle.net/20.500.11956/177252
Identifiers
SIS: 225300
Collections
- Qualification theses [19618]
Author
Supervisor
Butler, Eamonn
Opponent
Biagini, Erika
Faculty / Institute
Faculty of Social Sciences
Field of study
International Master in Security, Intelligence and Strategic Studies (IMSISS)
Department / Institute / Clinic
Department of Security Studies
Date of defence
16. 9. 2020
Publisher
Charles University, Faculty of Social Sciences
Language
English
Grade
Very good
Keywords (Czech)
Preferences, AI, Security, Decision-Making, Choice, Evolution, Neurocognitive, Economics
Keywords (English)
Preferences, AI, Security, Decision-Making, Choice, Evolution, Neurocognitive, Economics
Abstract
This research aims to investigate the Artificial Intelligence (AI) value alignment problem, which refers to the challenge of developing a safe and reliable AI that can achieve our goals and adhere to our values as we intend it to. A misaligned AI, especially one that transcends all domains of cognitive ability and has acquired vast computational power, will be nearly impossible to manage and will threaten our security. Research addressing this problem is now focused on understanding how to develop AI that can reliably infer our values from our preferences. Preferences are thus the primary conceptual unit of analysis for the AI value alignment problem. This paper investigates our preferences and seeks to shed light on the problem of obtaining a formal truth that is fundamentally constitutive of them, with the aim of using that formal truth to create a value-aligned AI. To do this, the paper gathers data from economics, biological evolution, and neurocognitive studies to bridge the current gaps in the conceptual problem of preferences. The paper concludes by presenting a new kind of security dilemma that stems from combining a general theoretical framework that fully captures our preferences with the crucial element of uncertainty in AI, effectively showcasing how...
