Design of an Effective Generic Molecular Representation (Návrh efektivní generické molekulární reprezentace)

Škoda, Petr

Master's thesis (DEFENDED)

Record of the defense proceedings (151.7 KB)

Permanent link

http://hdl.handle.net/20.500.11956/67455

Identifiers

SIS: 143162

Charles University Catalogue: 990017782430106986

Thesis opponent

Mráz, František

Faculty / unit

Faculty of Mathematics and Physics

Field of study

Software Systems

Department / institute / clinic

Department of Software Engineering

Date of defense

26 May 2014

Publisher

Charles University, Faculty of Mathematics and Physics

Language

English

Grade

Good

Keywords (Czech)

chemická informatika, molekulární reprezentace, podobnostní modelování

Keywords (English)

chemical informatics, molecular representation, similarity modeling

Abstract (translated from Czech)

Screening chemical libraries is an important part of drug research. Existing chemical libraries contain millions of compounds. Screening at that scale is very costly, so virtual screening is often used instead. There are several variants of virtual screening; ligand-based virtual screening is one of them. It exploits the similarity of the screened compounds to known compounds. The results of virtual screening depend not only on the similarity method used but also on the chosen representation of chemical compounds. In this thesis we present a fingerprint-based representation of chemical compounds. Our representation uses fragments of a chemical compound to represent the compound as a whole. Each fragment is represented by its physicochemical properties. The representation is highly parameterizable, particularly in the selection of physicochemical properties and how they are applied. To test our representation, we used an existing framework for benchmarking virtual screening. The results showed that our representation is comparable to the best existing ones and, on some datasets, achieved the best results.

Abstract (English)

Screening chemical libraries is an important step in the drug discovery process. Existing chemical libraries contain up to millions of compounds. Because screening at such a scale is expensive, virtual screening is often used instead. Several variants of virtual screening exist; ligand-based virtual screening is one of them. It exploits the similarity of screened chemical compounds to known compounds. Besides the similarity measure employed, the chosen representation of chemical compounds also strongly influences the performance of ligand-based virtual screening. In this thesis, we introduce a fragment-based representation of chemical compounds. Our representation uses a compound's fragments to represent the compound as a whole, and each fragment is represented by its physicochemical descriptors. The representation is highly parameterizable, especially in the selection and application of physicochemical descriptors. To test the performance of our method, we used an existing framework for benchmarking virtual screening. The results show that our method is comparable to the best existing approaches and outperforms them on some datasets.
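
The general idea of ligand-based virtual screening described in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: it assumes that each compound is reduced to a set of fragment labels and that similarity is measured with the Tanimoto coefficient, a common choice that the abstract does not name. The fragment labels, compound names, and the functions `tanimoto` and `screen` are hypothetical and exist only for illustration.

```python
# Toy ligand-based virtual screening: rank library compounds by their
# similarity to a known compound. Fragment labels and compound names are
# invented for illustration; the Tanimoto coefficient is an assumed
# similarity measure, not necessarily the one used in the thesis.

def tanimoto(a: frozenset, b: frozenset) -> float:
    """Tanimoto (Jaccard) coefficient of two fragment sets."""
    if not a and not b:
        return 0.0  # convention chosen here for two empty fingerprints
    return len(a & b) / len(a | b)


def screen(query: frozenset, library: dict) -> list:
    """Return (name, score) pairs sorted by decreasing similarity to query."""
    scored = [(name, tanimoto(query, fp)) for name, fp in library.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# A known compound and a tiny library, each described by its fragments.
known_compound = frozenset({"aromatic_ring", "hydroxyl", "amide"})
library = {
    "compound_a": frozenset({"aromatic_ring", "hydroxyl", "amide", "methyl"}),
    "compound_b": frozenset({"aromatic_ring", "nitro"}),
    "compound_c": frozenset({"alkyl_chain", "carboxyl"}),
}

ranking = screen(known_compound, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this sketch the compounds most similar to the known compound appear first in `ranking`. A representation such as the one the thesis proposes would replace the plain fragment labels with physicochemical descriptors of each fragment.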
