Korpusy jako zdroje dat pro úpravy nástrojů automatické morfologické analýzy (Slovotvorné varianty adjektiv na [(ou)|í]cí z hlediska morfologického značkování)

Osolsobě, Klára; Čermák, Petr

Corpora as Data Sources for the Up-Grading of Morphological Tagging

dc.contributor.author	Osolsobě, Klára
dc.contributor.author	Čermák, Petr
dc.date.accessioned	2018-05-28T11:04:06Z
dc.date.available	2018-05-28T11:04:06Z
dc.date.issued	2015
dc.identifier.uri	http://hdl.handle.net/20.500.11956/96413
dc.description.abstract	Adjectives ending in -oucí/-ící are regularly derived from verbs and hence are not usually listed in any of the Czech monolingual dictionaries. At the level of automatic morphological analysis (the dictionary) of Czech, they should be generated from verbal roots and tagged as verbal adjectives (POS tag: AG.*). Data from Czech corpora reveal a) inconsistencies in tagging and b) gaps in the dictionary. The main cause of both kinds of deficiency is the existence of variants among the verbal forms from which the verbal adjectives are potentially derived. Consequently, text corpora are a significant source of knowledge about the formation and use of adjectives ending in -oucí/-ící, which can be important for both a) automatic morphological analysis of Czech and b) the theoretical description of Czech grammar (derivational morphology). Our goal is to present a corpus-based study of the Czech gerund, i.e. verbal adjectives in -oucí/-ící. The link between the inflectional and the word-formation variants will be demonstrated using material from the SYN corpus (2.6 billion tokens of written Czech) and the large web corpus czTenTen12 (5.2 billion tokens of Czech text from the Internet, cleaned and deduplicated).	en
dc.format	pdf
dc.language.iso	cs
dc.publisher	Univerzita Karlova, Filozofická fakulta
dc.source	Časopis pro moderní filologii (Journal for Modern Philology), 2015, 97, 2, 136-145
dc.title	Korpusy jako zdroje dat pro úpravy nástrojů automatické morfologické analýzy (Slovotvorné varianty adjektiv na [(ou)|í]cí z hlediska morfologického značkování)	cs
dc.type	Vědecký článek	cs
dcterms.accessRights	openAccess
dcterms.license	https://creativecommons.org/licenses/by-nc-nd/2.0/
dc.title.translated	Corpora as Data Sources for the Up-Grading of Morphological Tagging	en
dc.publisher.publicationPlace	Praha
uk.internal-type	uk_publication
dc.description.startPage	136
dc.description.endPage	145
dcterms.isPartOf.name	Časopis pro moderní filologii (Journal for Modern Philology)	cs
dcterms.isPartOf.journalYear	2015
dcterms.isPartOf.journalVolume	97
dcterms.isPartOf.journalIssue	2
dcterms.isPartOf.issn	2336-6591
dc.relation.isPartOfUrl	https://casopispromodernifilologii.ff.cuni.cz
dc.subject.keyword	verbální adjektivum	cs
dc.subject.keyword	morfologické značkování	cs
dc.subject.keyword	automatická morfologická analýza	cs
dc.subject.keyword	varianta	cs
dc.subject.keyword	slovotvorba	cs
dc.subject.keyword	gerund/deverbal adjective	en
dc.subject.keyword	pos tagging	en
dc.subject.keyword	automatic morphological analysis	en
dc.subject.keyword	variant	en
dc.subject.keyword	derivational morphology	en

Files in this record

Name: 1346135_klara_osolsobe_136-145.pdf
Size: 691.8Kb
Format: application/pdf
Description: Full text

This record appears in the following collections

Issue 2 [8]