Understanding cross-lingual abilities in large multilingual language models
Porozumění mezijazykovým vlastnostem ve velkých vícejazyčných jazykových modelech (Czech title)
Master's thesis (DEFENDED)
Permanent link
http://hdl.handle.net/20.500.11956/184175
Identifiers
SIS: 257456
Collections
- Qualification theses [11197]
Advisor
Referee
Limisiewicz, Tomasz
Faculty / unit
Faculty of Mathematics and Physics
Field of study
Computer Science - Language Technologies and Computational Linguistics
Department / institute / clinic
Institute of Formal and Applied Linguistics
Date of defense
6 September 2023
Publisher
Charles University, Faculty of Mathematics and Physics
Language
English
Grade
Excellent
Keywords (Czech)
transfer learning|cross-lingual learning|low-resource|language models
Keywords (English)
transfer learning|cross-lingual learning|low-resource|language models
Abstract (English)
Cross-lingual abilities have been evident in large multilingual language models over the past few years. However, why and under what circumstances they work is not entirely clear. In this work, we aim for a better understanding of these aspects in a specific subset of multilingual models, namely modular multilingual models with cross-lingual transfer learning abilities. We seek to quantify the claims of Pfeiffer et al. [2022] regarding their proposed model, X-MOD, which was tested in a very specific setting that may not align with common low-resource settings. Specifically, we evaluate how the following factors affect downstream performance: the amount of available pre-training data, and hyperparameters such as the number of training steps, the checkpoint selection criterion, and the amount of overlapping lexicon available. Based on our findings, we also aim to provide guidelines on how to best use X-MOD, especially from a low-resource perspective.