dc.contributor.advisor | Zeman, Daniel | |
dc.creator | Aggarwal, Akshay | |
dc.date.accessioned | 2020-10-01T09:56:36Z | |
dc.date.available | 2020-10-01T09:56:36Z | |
dc.date.issued | 2020 | |
dc.identifier.uri | http://hdl.handle.net/20.500.11956/120867 | |
dc.description.abstract | Thesis Abstract Akshay Aggarwal July 2020 This thesis attempts at correction of some errors and inconsistencies in dif- ferent treebanks. The inconsistencies can be related to linguistic constructions, failure of the guidelines of annotation, failure to understand the guidelines on annotator's part, or random errors caused by annotators, among others. We propose a metric to attest the POS annotation consistency of different tree- banks in the same language, when the annotation guidelines remain the same. We offer solutions to some previously identified inconsistencies in the scope of the Universal Dependencies Project, and check the viability of a proposed in- consistency detection tool in a low-resource setting. The solutions discussed in the thesis are language-neutral, intended to work with multiple languages with efficiency. 1 | en_US |
dc.language | English | cs_CZ |
dc.language.iso | en_US | |
dc.publisher | Univerzita Karlova, Matematicko-fyzikální fakulta | cs_CZ |
dc.subject | konzistence anotace | cs_CZ |
dc.subject | nekonzistence anotace | cs_CZ |
dc.subject | dobývání chyb | cs_CZ |
dc.subject | jazykově nezávislé | cs_CZ |
dc.subject | Universal Dependencies | cs_CZ |
dc.subject | projekt UD | cs_CZ |
dc.subject | syntax | cs_CZ |
dc.subject | morfologie | cs_CZ |
dc.subject | Annotation Consistency | en_US |
dc.subject | Annotation Inconsistency | en_US |
dc.subject | Error Mining | en_US |
dc.subject | Language Independent | en_US |
dc.subject | Universal Dependencies | en_US |
dc.subject | UD Project | en_US |
dc.subject | Syntax | en_US |
dc.subject | Morphology | en_US |
dc.title | Consistency of Linguistic Annotation | en_US |
dc.type | diplomová práce | cs_CZ |
dcterms.created | 2020 | |
dcterms.dateAccepted | 2020-09-10 | |
dc.description.department | Ústav formální a aplikované lingvistiky | cs_CZ |
dc.description.department | Institute of Formal and Applied Linguistics | en_US |
dc.description.faculty | Faculty of Mathematics and Physics | en_US |
dc.description.faculty | Matematicko-fyzikální fakulta | cs_CZ |
dc.identifier.repId | 226385 | |
dc.title.translated | Konzistence lingvistických anotací | cs_CZ |
dc.contributor.referee | Lopatková, Markéta | |
thesis.degree.name | Mgr. | |
thesis.degree.level | navazující magisterské | cs_CZ |
thesis.degree.discipline | Computational Linguistics | en_US |
thesis.degree.discipline | Matematická lingvistika | cs_CZ |
thesis.degree.program | Computer Science | en_US |
thesis.degree.program | Informatika | cs_CZ |
uk.thesis.type | diplomová práce | cs_CZ |
uk.taxonomy.organization-cs | Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky | cs_CZ |
uk.taxonomy.organization-en | Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics | en_US |
uk.faculty-name.cs | Matematicko-fyzikální fakulta | cs_CZ |
uk.faculty-name.en | Faculty of Mathematics and Physics | en_US |
uk.faculty-abbr.cs | MFF | cs_CZ |
uk.degree-discipline.cs | Matematická lingvistika | cs_CZ |
uk.degree-discipline.en | Computational Linguistics | en_US |
uk.degree-program.cs | Informatika | cs_CZ |
uk.degree-program.en | Computer Science | en_US |
thesis.grade.cs | Dobře | cs_CZ |
thesis.grade.en | Good | en_US |
uk.abstract.en | Thesis Abstract Akshay Aggarwal July 2020 This thesis attempts at correction of some errors and inconsistencies in dif- ferent treebanks. The inconsistencies can be related to linguistic constructions, failure of the guidelines of annotation, failure to understand the guidelines on annotator's part, or random errors caused by annotators, among others. We propose a metric to attest the POS annotation consistency of different tree- banks in the same language, when the annotation guidelines remain the same. We offer solutions to some previously identified inconsistencies in the scope of the Universal Dependencies Project, and check the viability of a proposed in- consistency detection tool in a low-resource setting. The solutions discussed in the thesis are language-neutral, intended to work with multiple languages with efficiency. 1 | en_US |
uk.file-availability | V | |
uk.grantor | Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky | cs_CZ |
thesis.grade.code | 3 | |
uk.publication-place | Praha | cs_CZ |