Consistency of Linguistic Annotation
Konzistence lingvistických anotací
diploma thesis (DEFENDED)
Permanent link
http://hdl.handle.net/20.500.11956/120867
Identifiers
Study Information System: 226385
Collections
- Kvalifikační práce [10932]
Author
Advisor
Referee
Lopatková, Markéta
Faculty / Institute
Faculty of Mathematics and Physics
Discipline
Computational Linguistics
Department
Institute of Formal and Applied Linguistics
Date of defense
10. 9. 2020
Publisher
Univerzita Karlova, Matematicko-fyzikální fakulta
Language
English
Grade
Good
Keywords (Czech)
konzistence anotace, nekonzistence anotace, dobývání chyb, jazykově nezávislé, Universal Dependencies, projekt UD, syntax, morfologie
Keywords (English)
Annotation Consistency, Annotation Inconsistency, Error Mining, Language Independent, Universal Dependencies, UD Project, Syntax, Morphology
Thesis Abstract
Akshay Aggarwal, July 2020
This thesis attempts to correct some errors and inconsistencies in different treebanks. The inconsistencies can stem from linguistic constructions, shortcomings in the annotation guidelines, annotators' failure to understand the guidelines, or random annotator errors, among other causes. We propose a metric to assess the POS annotation consistency of different treebanks in the same language when the annotation guidelines remain the same. We offer solutions to some previously identified inconsistencies within the scope of the Universal Dependencies project, and check the viability of a proposed inconsistency detection tool in a low-resource setting. The solutions discussed in the thesis are language-neutral and intended to work efficiently across multiple languages.