How can Factors Underlying Human Preferences Lead to Methods of Formal Characterizations towards Developing Safe Artificial General Intelligence?
Jak mohou faktory určující lidské preference vést k metodám formální charakterizace směřující k rozvoji bezpečné umělé obecné inteligence?
Master's thesis (DEFENDED)
Permanent link
http://hdl.handle.net/20.500.11956/177252
Identifiers
SIS: 225300
Collections
- Qualification theses [19618]
Author
Supervisor
Butler, Eamonn
Opponent
Biagini, Erika
Faculty / Institute
Faculty of Social Sciences
Field of study
International Master in Security, Intelligence and Strategic Studies (IMSISS)
Department / Institute / Clinic
Department of Security Studies
Date of defence
16. 9. 2020
Publisher
Charles University, Faculty of Social Sciences
Language
English
Grade
Very good
Keywords (Czech)
Preferences, AI, Security, Decision-Making, Choice, Evolution, Neurocognitive, Economics
Keywords (English)
Preferences, AI, Security, Decision-Making, Choice, Evolution, Neurocognitive, Economics
Abstract
This research aims to investigate the Artificial Intelligence (AI) value alignment problem, which refers to the challenge of developing a safe and reliable AI that can achieve our goals and adhere to our values as we intend it to. A misaligned AI, especially one that transcends all domains of cognitive ability and has acquired vast computational power, will be nearly impossible to manage and will threaten our security. Research addressing this problem is now focused on understanding how to develop AI that can reliably infer our values from our preferences. Preferences are thus the primary conceptual unit of analysis for the AI value alignment problem. This paper investigates our preferences and seeks to shed light on the problem of obtaining a formal truth that is fundamentally constitutive of them, with the aim of using that formal truth to create a value-aligned AI. To do this, the paper gathers data from economics, biological evolution, and neurocognitive studies to bridge the current gaps in the conceptual problem of preferences. The paper concludes by presenting a new kind of security dilemma that stems from combining a general theoretical framework that fully captures our preferences with the crucial element of uncertainty in AI, effectively showcasing how...
