Metoda k-průměrů

Hricová, Jana

K-means method

bachelor thesis (DEFENDED)

Permanent link

http://hdl.handle.net/20.500.11956/50246

Identifiers

Study Information System: 90310

Referee

Legát, David

Faculty / Institute

Faculty of Mathematics and Physics

Discipline

Financial Mathematics

Department

Department of Probability and Mathematical Statistics

Date of defense

12. 9. 2011

Publisher

Univerzita Karlova, Matematicko-fyzikální fakulta

Language

Slovak

Grade

Excellent

Keywords (Slovak)

k-priemerov, zhluková analýza, miera podobnosti, obrysový graf

Keywords (English)

k-means, cluster analysis, dissimilarity measure, silhouette

Názov práce: Metoda k-průměrů
Autor: Jana Hricová
Katedra: Katedra pravděpodobnosti a matematické statistiky
Vedúci bakalárskej práce: prof. RNDr. Jaromír Antoch, CSc., Katedra pravděpodobnosti a matematické statistiky
Abstrakt: Táto bakalárska práca pojednáva predovšetkým o štatistickej metóde k-priemerov, ktorá je súčasťou rozsiahlej množiny metód a algoritmov určených pre zhlukovú analýzu dát. Výsledky zhlukovej analýzy majú široké využitie napríklad pri ďalšej vedeckej činnosti, ale aj v marketingu, vedení firiem, poisťovníctve atď. Štatistické metódy zhlukovej analýzy vytvárajú z analyzovaných dát zhluky, ktoré sú tvorené podobnými objektmi. Podobnosť objektov je vyjadrená pomocou mier podobnosti, prípadne nepodobnosti. Cieľom tejto práce bolo predstaviť algoritmus k-priemerov. Ide o nehierarchickú metódu, ktorá vyžaduje predom určeného počtu hľadaných zhlukov. V prostredí matematického softvéru Matlab sme aplikovali tento algoritmus na simulované a reálne dáta a výsledky interpretovali pomocou grafických a číselných výstupov.
Kľúčové slová: k-priemerov, zhluková analýza, miera podobnosti, obrysový graf

Abstract (English)

Title: k-means method
Author: Jana Hricová
Department: Department of Probability and Mathematical Statistics
Supervisor: prof. RNDr. Jaromír Antoch, CSc., Department of Probability and Mathematical Statistics
Abstract: This thesis deals primarily with the statistical k-means method, which belongs to an extensive family of methods and algorithms designed for cluster analysis of data. The results of cluster analysis are widely used in further scientific work, as well as in marketing, business management, insurance, and other fields. Statistical methods of cluster analysis group the analyzed data into clusters of similar objects. The similarity of objects is expressed by a similarity or dissimilarity measure. The aim of this thesis was to introduce the k-means algorithm, a non-hierarchical method that requires the number of clusters to be specified in advance. We applied this algorithm to simulated and real data in the Matlab mathematical software environment and interpreted the results using graphical and numerical outputs.
Keywords: k-means, cluster analysis, dissimilarity measure, silhouette
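For readers unfamiliar with the method named in the abstract, the following is a minimal sketch of the standard k-means iteration (often called Lloyd's algorithm), assuming squared Euclidean distance as the dissimilarity measure and random selection of initial centroids. It is not taken from the thesis, which worked in Matlab; Python is used here only for illustration, and the names `squared_distance` and `kmeans` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, max_iter=100, seed=0):
    """Cluster `points` (a list of tuples) into `k` groups.

    Returns (centroids, labels). Illustrative sketch only.
    """
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    rng = random.Random(seed)
    # Initialise centroids with k observations drawn without replacement.
    centroids = [tuple(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
            (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, assignment = kmeans(data, k=2)
    print(centers, assignment)
```

The number of clusters `k` is an input, as the abstract notes. The result can depend on the initial centroids, so in practice the procedure is often repeated from several starting points and the partition with the smallest within-cluster sum of squares is kept.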
