Application of Artificial Neural Networks in Computational Linguistics

Němec, Petr

dc.creator	Němec, Petr
dc.date.accessioned	2021-05-19T16:13:24Z
dc.date.available	2021-05-19T16:13:24Z
dc.date.issued	2006
dc.identifier.uri	http://hdl.handle.net/20.500.11956/3247
dc.description.abstract	Neural networks are a promising approach to problems whose exact algorithmic solution is unknown or not efficient enough. Morphological tagging is one such task in computational linguistics. We applied a backpropagation neural network in several types of experiments. When determining the correct tag from a reliable context, we found that the network is fundamentally capable of learning the problem, although the tagging precision achieved (89.22%) did not reach that of statistical methods (93.47%). We also determined suitable network and context parameters, which we used in the subsequent experiments. An attempt to determine the correct tag from a context of tags assigned beforehand by statistical methods brought a slight decrease in tagging precision (88.71%). Finally, the experiment whose goal was to choose between the outputs of two statistical taggers showed higher tagging precision (93.56%) than either of those taggers alone (92.74%, 92.58%). This is currently the best result achieved on the given training data set (Prague Dependency Treebank). We therefore recommend testing the method on a larger data set (Czech National Corpus).	en_US
dc.description.abstract	Neuronové sítě představují perspektivní přístup k řešení problémů, jejichž přímé algoritmické řešení není známé či dostatečně efektivní. Automatické morfologické značkování je jednou z takových úloh na poli počítačové lingvistiky. K jejímu řešení jsme použili neuronovou síť zpětného šíření (backpropagation) v několika typech experimentů. Při určování správné značky na základě spolehlivého kontextu jsme se přesvědčili o základní schopnosti sítě se problému naučit, ačkoli dosažená úspěšnost (89,22%) nedosahovala přesnosti dosahované statistikou (93,47%). Podařilo se nám též určit vhodné parametry sítě a vstupního kontextu pro další experimenty. Pokus určit správnou značku na základě kontextu značek určených předem statistikou přinesl mírné snížení úspěšnosti (88,71%). Konečný experiment, jehož úkolem bylo volit mezi výstupy dvou statistických metod, vykázal vyšší úspěšnost (93,56%) než libovolné z těchto metod (92,74%, 92,58%). Na daném trénovacím korpusu (Pražský závislostní korpus) jde v současné době o absolutně nejlepší dosažený výsledek. Z dosažených výsledků vyplývá doporučení, aby prezentovaná metoda byla vyzkoušena na rozsáhlejší množině dat (Český národní korpus).	cs_CZ
dc.language	Čeština	cs_CZ
dc.language.iso	cs_CZ
dc.publisher	Univerzita Karlova, Matematicko-fyzikální fakulta	cs_CZ
dc.title	Application of Artificial Neural Networks in Computational Linguistics	cs_CZ
dc.type	rigorózní práce	cs_CZ
dcterms.created	2006
dcterms.dateAccepted	2006-02-09
dc.description.department	Institute of Formal and Applied Linguistics	en_US
dc.description.department	Ústav formální a aplikované lingvistiky	cs_CZ
dc.description.faculty	Faculty of Mathematics and Physics	en_US
dc.description.faculty	Matematicko-fyzikální fakulta	cs_CZ
dc.identifier.repId	43917
dc.title.translated	Application of Artificial Neural Networks in Computational Linguistics	en_US
dc.identifier.aleph	000830789
thesis.degree.name	RNDr.
thesis.degree.level	rigorózní řízení	cs_CZ
thesis.degree.discipline	Computational and Formal Linguistics	en_US
thesis.degree.discipline	Počítačová a formální lingvistika	cs_CZ
thesis.degree.program	Informatics	en_US
thesis.degree.program	Informatika	cs_CZ
uk.thesis.type	rigorózní práce	cs_CZ
uk.taxonomy.organization-cs	Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky	cs_CZ
uk.taxonomy.organization-en	Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics	en_US
uk.faculty-name.cs	Matematicko-fyzikální fakulta	cs_CZ
uk.faculty-name.en	Faculty of Mathematics and Physics	en_US
uk.faculty-abbr.cs	MFF	cs_CZ
uk.degree-discipline.cs	Počítačová a formální lingvistika	cs_CZ
uk.degree-discipline.en	Computational and Formal Linguistics	en_US
uk.degree-program.cs	Informatika	cs_CZ
uk.degree-program.en	Informatics	en_US
thesis.grade.cs	Uznáno	cs_CZ
thesis.grade.en	Recognized	en_US
uk.abstract.cs	Neuronové sítě představují perspektivní přístup k řešení problémů, jejichž přímé algoritmické řešení není známé či dostatečně efektivní. Automatické morfologické značkování je jednou z takových úloh na poli počítačové lingvistiky. K jejímu řešení jsme použili neuronovou síť zpětného šíření (backpropagation) v několika typech experimentů. Při určování správné značky na základě spolehlivého kontextu jsme se přesvědčili o základní schopnosti sítě se problému naučit, ačkoli dosažená úspěšnost (89,22%) nedosahovala přesnosti dosahované statistikou (93,47%). Podařilo se nám též určit vhodné parametry sítě a vstupního kontextu pro další experimenty. Pokus určit správnou značku na základě kontextu značek určených předem statistikou přinesl mírné snížení úspěšnosti (88,71%). Konečný experiment, jehož úkolem bylo volit mezi výstupy dvou statistických metod, vykázal vyšší úspěšnost (93,56%) než libovolné z těchto metod (92,74%, 92,58%). Na daném trénovacím korpusu (Pražský závislostní korpus) jde v současné době o absolutně nejlepší dosažený výsledek. Z dosažených výsledků vyplývá doporučení, aby prezentovaná metoda byla vyzkoušena na rozsáhlejší množině dat (Český národní korpus).	cs_CZ
uk.abstract.en	Neural networks are a promising approach to problems whose exact algorithmic solution is unknown or not efficient enough. Morphological tagging is one such task in computational linguistics. We applied a backpropagation neural network in several types of experiments. When determining the correct tag from a reliable context, we found that the network is fundamentally capable of learning the problem, although the tagging precision achieved (89.22%) did not reach that of statistical methods (93.47%). We also determined suitable network and context parameters, which we used in the subsequent experiments. An attempt to determine the correct tag from a context of tags assigned beforehand by statistical methods brought a slight decrease in tagging precision (88.71%). Finally, the experiment whose goal was to choose between the outputs of two statistical taggers showed higher tagging precision (93.56%) than either of those taggers alone (92.74%, 92.58%). This is currently the best result achieved on the given training data set (Prague Dependency Treebank). We therefore recommend testing the method on a larger data set (Czech National Corpus).	en_US
uk.file-availability	V
uk.grantor	Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky	cs_CZ
thesis.grade.code	U
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	U
dc.identifier.lisID	990008307890106986

Files in this record

Name:: 150011944.pdf
Size:: 462.5Kb
Format:: application/pdf
Description:: Thesis text

Name:: 150011945.pdf
Size:: 80.01Kb
Format:: application/pdf
Description:: Abstract

Name:: 150011946.pdf
Size:: 79.90Kb
Format:: application/pdf
Description:: Abstract (English)

Name:: 150003941.pdf
Size:: 19.80Kb
Format:: application/pdf
Description:: Record of the defence proceedings

This record appears in the following collections

Kvalifikační práce [10690]
Theses