Applications of information theory to the study of deep learning
dc.contributor.advisorSychrovský, David
dc.creatorČerný, Jakub
dc.date.accessioned2025-09-26T09:46:19Z
dc.date.available2025-09-26T09:46:19Z
dc.date.issued2025
dc.identifier.urihttp://hdl.handle.net/20.500.11956/202629
dc.description.abstractDespite deep learning's remarkable empirical success, its theoretical underpinnings lag behind. Information theory provides a powerful framework for analyzing internal network representations, particularly through recent advances in information bottleneck (IB) theory and the information plane. This thesis investigates how the structure of the information plane, specifically the clustering behavior of internal representations, influences neural network performance. We introduce Purity theory, a novel framework for quantifying layer-wise clustering, complementing established IB perspectives. Our analysis reveals a significant correlation between information plane structure and generalization performance in binary classification tasks. Leveraging this correlation, we propose a new information-theoretic metric that effectively predicts model generalization capability. Furthermore, we develop a model selection algorithm based on this metric, which demonstrably outperforms selection based solely on training loss.en_US
dc.description.abstractPřes pozoruhodný empirický úspěch hlubokého učení (deep learning) jeho teoretické základy zaostávají. Teorie informace poskytuje účinný rámec pro analýzu vnitřních reprezentací v sítích, zejména díky nedávnému pokroku v teorii informačního hrdla (information bottleneck, IB) a konceptu informační roviny. Tato práce zkoumá, jak struktura informační roviny, konkrétně shlukování vnitřních reprezentací, ovlivňuje výkon neuronových sítí. Představujeme Purity teorii, nový rámec pro kvantifikaci shlukování reprezentací po vrstvách, který doplňuje stávající IB perspektivy. Naše analýza odhaluje významnou korelaci mezi strukturou informační roviny a schopností generalizace v úlohách binární klasifikace. Na základě této korelace navrhujeme novou metriku založenou na teorii informace, která účinně předpovídá schopnost modelu generalizovat. Dále vyvíjíme algoritmus pro výběr modelu využívající tuto metriku, který prokazatelně překonává výběr založený výhradně na trénovací ztrátě (train loss).cs_CZ
dc.languageČeštinacs_CZ
dc.language.isocs_CZ
dc.publisherUniverzita Karlova, Matematicko-fyzikální fakultacs_CZ
dc.subjectHluboké Učení|Strojové Učení|Teorie Informačního Hrdla|Neuronové Sítěcs_CZ
dc.subjectDeep Learning|information bottleneck theory|Machine Learning|Neural Networksen_US
dc.titleAplikace teorie informace na studium učení hlubokých neuronových sítícs_CZ
dc.typebakalářská prácecs_CZ
dcterms.created2025
dcterms.dateAccepted2025-09-05
dc.description.departmentKatedra aplikované matematikycs_CZ
dc.description.departmentDepartment of Applied Mathematicsen_US
dc.description.facultyFaculty of Mathematics and Physicsen_US
dc.description.facultyMatematicko-fyzikální fakultacs_CZ
dc.identifier.repId282204
dc.title.translatedApplications of information theory to the study of deep learningen_US
dc.contributor.refereeSchmid, Martin
thesis.degree.nameBc.
thesis.degree.levelbakalářskécs_CZ
thesis.degree.disciplineMathematics for Information Technologiesen_US
thesis.degree.disciplineMatematika pro informační technologiecs_CZ
thesis.degree.programMatematika pro informační technologiecs_CZ
thesis.degree.programMathematics for Information Technologiesen_US
uk.thesis.typebakalářská prácecs_CZ
uk.taxonomy.organization-csMatematicko-fyzikální fakulta::Katedra aplikované matematikycs_CZ
uk.taxonomy.organization-enFaculty of Mathematics and Physics::Department of Applied Mathematicsen_US
uk.faculty-name.csMatematicko-fyzikální fakultacs_CZ
uk.faculty-name.enFaculty of Mathematics and Physicsen_US
uk.faculty-abbr.csMFFcs_CZ
uk.degree-discipline.csMatematika pro informační technologiecs_CZ
uk.degree-discipline.enMathematics for Information Technologiesen_US
uk.degree-program.csMatematika pro informační technologiecs_CZ
uk.degree-program.enMathematics for Information Technologiesen_US
thesis.grade.csVýborněcs_CZ
thesis.grade.enExcellenten_US
uk.abstract.csPřes pozoruhodný empirický úspěch hlubokého učení (deep learning) jeho teoretické základy zaostávají. Teorie informace poskytuje účinný rámec pro analýzu vnitřních reprezentací v sítích, zejména díky nedávnému pokroku v teorii informačního hrdla (information bottleneck, IB) a konceptu informační roviny. Tato práce zkoumá, jak struktura informační roviny, konkrétně shlukování vnitřních reprezentací, ovlivňuje výkon neuronových sítí. Představujeme Purity teorii, nový rámec pro kvantifikaci shlukování reprezentací po vrstvách, který doplňuje stávající IB perspektivy. Naše analýza odhaluje významnou korelaci mezi strukturou informační roviny a schopností generalizace v úlohách binární klasifikace. Na základě této korelace navrhujeme novou metriku založenou na teorii informace, která účinně předpovídá schopnost modelu generalizovat. Dále vyvíjíme algoritmus pro výběr modelu využívající tuto metriku, který prokazatelně překonává výběr založený výhradně na trénovací ztrátě (train loss).cs_CZ
uk.abstract.enDespite deep learning's remarkable empirical success, its theoretical underpinnings lag behind. Information theory provides a powerful framework for analyzing internal network representations, particularly through recent advances in information bottleneck (IB) theory and the information plane. This thesis investigates how the structure of the information plane, specifically the clustering behavior of internal representations, influences neural network performance. We introduce Purity theory, a novel framework for quantifying layer-wise clustering, complementing established IB perspectives. Our analysis reveals a significant correlation between information plane structure and generalization performance in binary classification tasks. Leveraging this correlation, we propose a new information-theoretic metric that effectively predicts model generalization capability. Furthermore, we develop a model selection algorithm based on this metric, which demonstrably outperforms selection based solely on training loss.en_US
uk.file-availabilityV
uk.grantorUniverzita Karlova, Matematicko-fyzikální fakulta, Katedra aplikované matematikycs_CZ
thesis.grade.code1
uk.publication-placePrahacs_CZ
uk.thesis.defenceStatusO




© 2025 Univerzita Karlova, Ústřední knihovna, Ovocný trh 560/5, 116 36 Praha 1; email: admin-repozitar [at] cuni.cz

Za dodržení všech ustanovení autorského zákona jsou zodpovědné jednotlivé složky Univerzity Karlovy. / Each constituent part of Charles University is responsible for adherence to all provisions of the copyright law.

Upozornění / Notice: Získané informace nemohou být použity k výdělečným účelům nebo vydávány za studijní, vědeckou nebo jinou tvůrčí činnost jiné osoby než autora. / Any retrieved information shall not be used for any commercial purposes or claimed as results of studying, scientific or any other creative activities of any person other than the author.
