Investigating Large Language Models' Representations Of Plurality Through Probing Interventions
dc.contributor.advisor: Mareček, David
dc.creator: Hanna, Michael
dc.date.accessioned: 2022-10-04T17:28:50Z
dc.date.available: 2022-10-04T17:28:50Z
dc.date.issued: 2022
dc.identifier.uri: http://hdl.handle.net/20.500.11956/175532
dc.description.abstract: Title: Investigating Large Language Models' Representations Of Plurality Through Probing Interventions. Author: Michael Hanna. Institute: Institute of Formal and Applied Linguistics. Supervisor: RNDr. David Mareček, Ph.D., Institute of Formal and Applied Linguistics. Abstract: Large language models (LLMs) have become ubiquitous in natural language processing, but how exactly they process their input and arrive at good downstream task performance is still poorly understood. While much work has used probing to examine LLM internals, or behavioral studies to determine LLMs' linguistic capabilities, these techniques are too weak to allow us to draw conclusions about how LLMs process language. In this paper, I use both probing and causal intervention methods to investigate subject-verb agreement with respect to the subject's plurality. I find that while probing reveals that subject plurality information is distributed throughout a sentence, causal interventions suggest that only information stored in linguistically relevant tokens is used. Probing interventions suggest that some but not all probes capture information in a way that reflects LLMs' usage thereof. Keywords: Interpretability, Probing, Natural Language Processing, Computational Linguistics [en_US]
dc.language: English [cs_CZ]
dc.language.iso: en_US
dc.publisher: Univerzita Karlova, Matematicko-fyzikální fakulta [cs_CZ]
dc.subject: probing|interpretation|language model|neural network [en_US]
dc.subject: probing|interpretace|jazykový model|neuronová síť [cs_CZ]
dc.title: Investigating Large Language Models' Representations Of Plurality Through Probing Interventions [en_US]
dc.type: diplomová práce [cs_CZ]
dcterms.created: 2022
dcterms.dateAccepted: 2022-09-02
dc.description.department: Institute of Formal and Applied Linguistics [en_US]
dc.description.department: Ústav formální a aplikované lingvistiky [cs_CZ]
dc.description.faculty: Matematicko-fyzikální fakulta [cs_CZ]
dc.description.faculty: Faculty of Mathematics and Physics [en_US]
dc.identifier.repId: 248720
dc.title.translated: Zkoumání reprezentace plurálu ve velkých jazykových modelech prostřednictvím sondovacích intervencí [cs_CZ]
dc.contributor.referee: Helcl, Jindřich
thesis.degree.name: Mgr.
thesis.degree.level: navazující magisterské [cs_CZ]
thesis.degree.discipline: Computer Science - Language Technologies and Computational Linguistics [en_US]
thesis.degree.discipline: Computer Science - Language Technologies and Computational Linguistics [cs_CZ]
thesis.degree.program: Computer Science - Language Technologies and Computational Linguistics [en_US]
thesis.degree.program: Computer Science - Language Technologies and Computational Linguistics [cs_CZ]
uk.thesis.type: diplomová práce [cs_CZ]
uk.taxonomy.organization-cs: Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky [cs_CZ]
uk.taxonomy.organization-en: Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics [en_US]
uk.faculty-name.cs: Matematicko-fyzikální fakulta [cs_CZ]
uk.faculty-name.en: Faculty of Mathematics and Physics [en_US]
uk.faculty-abbr.cs: MFF [cs_CZ]
uk.degree-discipline.cs: Computer Science - Language Technologies and Computational Linguistics [cs_CZ]
uk.degree-discipline.en: Computer Science - Language Technologies and Computational Linguistics [en_US]
uk.degree-program.cs: Computer Science - Language Technologies and Computational Linguistics [cs_CZ]
uk.degree-program.en: Computer Science - Language Technologies and Computational Linguistics [en_US]
thesis.grade.cs: Výborně [cs_CZ]
thesis.grade.en: Excellent [en_US]
uk.abstract.en: Title: Investigating Large Language Models' Representations Of Plurality Through Probing Interventions. Author: Michael Hanna. Institute: Institute of Formal and Applied Linguistics. Supervisor: RNDr. David Mareček, Ph.D., Institute of Formal and Applied Linguistics. Abstract: Large language models (LLMs) have become ubiquitous in natural language processing, but how exactly they process their input and arrive at good downstream task performance is still poorly understood. While much work has used probing to examine LLM internals, or behavioral studies to determine LLMs' linguistic capabilities, these techniques are too weak to allow us to draw conclusions about how LLMs process language. In this paper, I use both probing and causal intervention methods to investigate subject-verb agreement with respect to the subject's plurality. I find that while probing reveals that subject plurality information is distributed throughout a sentence, causal interventions suggest that only information stored in linguistically relevant tokens is used. Probing interventions suggest that some but not all probes capture information in a way that reflects LLMs' usage thereof. Keywords: Interpretability, Probing, Natural Language Processing, Computational Linguistics [en_US]
uk.file-availability: V
uk.grantor: Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky [cs_CZ]
thesis.grade.code: 1
uk.publication-place: Praha [cs_CZ]
uk.thesis.defenceStatus: O




© 2017 Univerzita Karlova, Ústřední knihovna, Ovocný trh 560/5, 116 36 Praha 1; email: admin-repozitar [at] cuni.cz

Each constituent part of Charles University is responsible for adherence to all provisions of the copyright law.

Notice: Any retrieved information shall not be used for any commercial purposes or claimed as results of studying, scientific or any other creative activities of any person other than the author.
