dc.contributor.advisor | Mareček, David | |
dc.creator | Hanna, Michael | |
dc.date.accessioned | 2022-10-04T17:28:50Z | |
dc.date.available | 2022-10-04T17:28:50Z | |
dc.date.issued | 2022 | |
dc.identifier.uri | http://hdl.handle.net/20.500.11956/175532 | |
dc.description.abstract | Title: Investigating Large Language Models' Representations Of Plurality Through Probing Interventions Author: Michael Hanna Institute: Institute of Formal and Applied Linguistics Supervisor: RNDr. David Mareček, Ph.D., Institute of Formal and Applied Linguistics Abstract: Large language models (LLMs) have become ubiquitous in natural language processing, but how exactly they process their input and arrive at good downstream task performance is still poorly understood. While much work has used probing to examine LLM internals, or behavioral studies to determine LLMs' linguistic capabilities, these techniques are too weak to allow us to draw conclusions about how LLMs process language. In this paper, I use both probing and causal intervention methods to investigate the question of subject-verb agreement with respect to the subject's plurality. I find that while probing reveals that subject plurality information is distributed throughout a sentence, causal interventions suggest that only information stored in linguistically relevant tokens is used. Probing interventions suggest that some but not all probes capture information in a way that reflects LLMs' usage thereof. Keywords: Interpretability, Probing, Natural Language Processing, Computational Linguistics | en_US |
dc.language | English | cs_CZ |
dc.language.iso | en_US | |
dc.publisher | Univerzita Karlova, Matematicko-fyzikální fakulta | cs_CZ |
dc.subject | probing|interpretation|language model|neural network | en_US |
dc.subject | probing|interpretace|jazykový model|neuronová síť | cs_CZ |
dc.title | Investigating Large Language Models' Representations Of Plurality Through Probing Interventions | en_US |
dc.type | diplomová práce | cs_CZ |
dcterms.created | 2022 | |
dcterms.dateAccepted | 2022-09-02 | |
dc.description.department | Institute of Formal and Applied Linguistics | en_US |
dc.description.department | Ústav formální a aplikované lingvistiky | cs_CZ |
dc.description.faculty | Matematicko-fyzikální fakulta | cs_CZ |
dc.description.faculty | Faculty of Mathematics and Physics | en_US |
dc.identifier.repId | 248720 | |
dc.title.translated | Zkoumání reprezentace plurálu ve velkých jazykových modelech prostřednictvím sondovacích intervencí | cs_CZ |
dc.contributor.referee | Helcl, Jindřich | |
thesis.degree.name | Mgr. | |
thesis.degree.level | navazující magisterské | cs_CZ |
thesis.degree.discipline | Computer Science - Language Technologies and Computational Linguistics | en_US |
thesis.degree.discipline | Computer Science - Language Technologies and Computational Linguistics | cs_CZ |
thesis.degree.program | Computer Science - Language Technologies and Computational Linguistics | en_US |
thesis.degree.program | Computer Science - Language Technologies and Computational Linguistics | cs_CZ |
uk.thesis.type | diplomová práce | cs_CZ |
uk.taxonomy.organization-cs | Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky | cs_CZ |
uk.taxonomy.organization-en | Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics | en_US |
uk.faculty-name.cs | Matematicko-fyzikální fakulta | cs_CZ |
uk.faculty-name.en | Faculty of Mathematics and Physics | en_US |
uk.faculty-abbr.cs | MFF | cs_CZ |
uk.degree-discipline.cs | Computer Science - Language Technologies and Computational Linguistics | cs_CZ |
uk.degree-discipline.en | Computer Science - Language Technologies and Computational Linguistics | en_US |
uk.degree-program.cs | Computer Science - Language Technologies and Computational Linguistics | cs_CZ |
uk.degree-program.en | Computer Science - Language Technologies and Computational Linguistics | en_US |
thesis.grade.cs | Výborně | cs_CZ |
thesis.grade.en | Excellent | en_US |
uk.abstract.en | Title: Investigating Large Language Models' Representations Of Plurality Through Probing Interventions Author: Michael Hanna Institute: Institute of Formal and Applied Linguistics Supervisor: RNDr. David Mareček, Ph.D., Institute of Formal and Applied Linguistics Abstract: Large language models (LLMs) have become ubiquitous in natural language processing, but how exactly they process their input and arrive at good downstream task performance is still poorly understood. While much work has used probing to examine LLM internals, or behavioral studies to determine LLMs' linguistic capabilities, these techniques are too weak to allow us to draw conclusions about how LLMs process language. In this paper, I use both probing and causal intervention methods to investigate the question of subject-verb agreement with respect to the subject's plurality. I find that while probing reveals that subject plurality information is distributed throughout a sentence, causal interventions suggest that only information stored in linguistically relevant tokens is used. Probing interventions suggest that some but not all probes capture information in a way that reflects LLMs' usage thereof. Keywords: Interpretability, Probing, Natural Language Processing, Computational Linguistics | en_US |
uk.file-availability | V | |
uk.grantor | Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky | cs_CZ |
thesis.grade.code | 1 | |
uk.publication-place | Praha | cs_CZ |
uk.thesis.defenceStatus | O | |