Understanding cross-lingual abilities in large multilingual language models
Porozumění mezijazykovým vlastnostem ve velkých vícejazyčných jazykových modelech (Czech title)
Master's thesis (DEFENDED)
Permanent link
http://hdl.handle.net/20.500.11956/184175
Identifiers
SIS: 257456
Collections
- Qualification theses [11197]
Advisor
Referee
Limisiewicz, Tomasz
Faculty / unit
Faculty of Mathematics and Physics
Field of study
Computer Science - Language Technologies and Computational Linguistics
Department / institute / clinic
Institute of Formal and Applied Linguistics
Date of defense
6 September 2023
Publisher
Charles University, Faculty of Mathematics and Physics
Language
English
Grade
Excellent
Keywords (Czech)
transfer learning|cross-lingual learning|low-resource|language models
Keywords (English)
transfer learning|cross-lingual learning|low-resource|language models
Abstract (English)
Cross-lingual abilities have been evident in large multilingual language models over the past few years. However, why and under what circumstances they work is not entirely clear. In this work, we aim for a better understanding of these aspects in a specific subset of multilingual models, namely modular multilingual models with cross-lingual transfer learning abilities. We seek to quantify the claims of Pfeiffer et al. [2022] regarding their proposed model, X-MOD, which was tested in a very specific setting that may not align with common low-resource settings. Specifically, we evaluate how the following factors affect downstream performance: the amount of available pre-training data, and hyperparameters such as the number of training steps, the checkpoint selection criterion, and the amount of overlapping lexicon available. Based on our findings, we also aim to provide guidelines on how to best use X-MOD, especially from a low-resource perspective.