Investigating Large Language Models' Representations Of Plurality Through Probing Interventions

Hanna, Michael

Zkoumání reprezentace plurálu ve velkých jazykových modelech prostřednictvím sondovacích intervencí

dc.contributor.advisor	Mareček, David
dc.creator	Hanna, Michael
dc.date.accessioned	2022-10-04T17:28:50Z
dc.date.available	2022-10-04T17:28:50Z
dc.date.issued	2022
dc.identifier.uri	http://hdl.handle.net/20.500.11956/175532
dc.description.abstract	Title: Investigating Large Language Models' Representations Of Plurality Through Probing Interventions Author: Michael Hanna Institute: Institute of Formal and Applied Linguistics Supervisor: RNDr. David Mareček, Ph.D., Institute of Formal and Applied Linguistics Abstract: Large language models (LLMs) have become ubiquitous in natural language processing, but how exactly they process their input and arrive at good downstream task performance is still poorly understood. While much work has used probing to examine LLM internals, or behavioral studies to determine LLMs' linguistic capabilities, these techniques are too weak to support conclusions about how LLMs process language. In this paper, I use both probing and causal intervention methods to investigate subject-verb agreement with respect to the subject's plurality. I find that while probing reveals that subject plurality information is distributed throughout a sentence, causal interventions suggest that only information stored in linguistically relevant tokens is used. Probing interventions suggest that some, but not all, probes capture information in a way that reflects how LLMs use it. Keywords: Interpretability, Probing, Natural Language Processing, Computational Linguistics	en_US
dc.language	English	cs_CZ
dc.language.iso	en_US
dc.publisher	Univerzita Karlova, Matematicko-fyzikální fakulta	cs_CZ
dc.subject	probing|interpretation|language model|neural network	en_US
dc.subject	probing|interpretace|jazykový model|neuronová síť	cs_CZ
dc.title	Investigating Large Language Models' Representations Of Plurality Through Probing Interventions	en_US
dc.type	diplomová práce	cs_CZ
dcterms.created	2022
dcterms.dateAccepted	2022-09-02
dc.description.department	Institute of Formal and Applied Linguistics	en_US
dc.description.department	Ústav formální a aplikované lingvistiky	cs_CZ
dc.description.faculty	Matematicko-fyzikální fakulta	cs_CZ
dc.description.faculty	Faculty of Mathematics and Physics	en_US
dc.identifier.repId	248720
dc.title.translated	Zkoumání reprezentace plurálu ve velkých jazykových modelech prostřednictvím sondovacích intervencí	cs_CZ
dc.contributor.referee	Helcl, Jindřich
thesis.degree.name	Mgr.
thesis.degree.level	navazující magisterské	cs_CZ
thesis.degree.discipline	Computer Science - Language Technologies and Computational Linguistics	en_US
thesis.degree.discipline	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
thesis.degree.program	Computer Science - Language Technologies and Computational Linguistics	en_US
thesis.degree.program	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
uk.thesis.type	diplomová práce	cs_CZ
uk.taxonomy.organization-cs	Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky	cs_CZ
uk.taxonomy.organization-en	Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics	en_US
uk.faculty-name.cs	Matematicko-fyzikální fakulta	cs_CZ
uk.faculty-name.en	Faculty of Mathematics and Physics	en_US
uk.faculty-abbr.cs	MFF	cs_CZ
uk.degree-discipline.cs	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
uk.degree-discipline.en	Computer Science - Language Technologies and Computational Linguistics	en_US
uk.degree-program.cs	Computer Science - Language Technologies and Computational Linguistics	cs_CZ
uk.degree-program.en	Computer Science - Language Technologies and Computational Linguistics	en_US
thesis.grade.cs	Výborně	cs_CZ
thesis.grade.en	Excellent	en_US
uk.abstract.en	Title: Investigating Large Language Models' Representations Of Plurality Through Probing Interventions Author: Michael Hanna Institute: Institute of Formal and Applied Linguistics Supervisor: RNDr. David Mareček, Ph.D., Institute of Formal and Applied Linguistics Abstract: Large language models (LLMs) have become ubiquitous in natural language processing, but how exactly they process their input and arrive at good downstream task performance is still poorly understood. While much work has used probing to examine LLM internals, or behavioral studies to determine LLMs' linguistic capabilities, these techniques are too weak to support conclusions about how LLMs process language. In this paper, I use both probing and causal intervention methods to investigate subject-verb agreement with respect to the subject's plurality. I find that while probing reveals that subject plurality information is distributed throughout a sentence, causal interventions suggest that only information stored in linguistically relevant tokens is used. Probing interventions suggest that some, but not all, probes capture information in a way that reflects how LLMs use it. Keywords: Interpretability, Probing, Natural Language Processing, Computational Linguistics	en_US
uk.file-availability	V
uk.grantor	Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky	cs_CZ
thesis.grade.code	1
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	O

Files in this record

Name: 120426695.pdf
Size: 2.345Mb
Format: application/pdf
Description: Thesis text

Name: 120426689.pdf
Size: 79.96Kb
Format: application/pdf
Description: Abstract (English)

Name: 120430624.pdf
Size: 53.33Kb
Format: application/pdf
Description: Supervisor's review

Name: 120431560.pdf
Size: 76.78Kb
Format: application/pdf
Description: Opponent's review

Name: 120434717.pdf
Size: 348.6Kb
Format: application/pdf
Description: Record of the defence proceedings

This record appears in the following collections

Kvalifikační práce [11325]
Theses
