Differentiable Depth Estimation for Bin Picking

Černý, Marek

Derivovatelný estimator hloubky pro bin picking

dc.contributor.advisor	Klusáček, David
dc.creator	Černý, Marek
dc.date.accessioned	2022-09-05T06:29:49Z
dc.date.available	2022-09-05T06:29:49Z
dc.date.issued	2019
dc.identifier.uri	http://hdl.handle.net/20.500.11956/109075
dc.description.abstract	The goal of this thesis was to investigate neural 3D surface reconstruction from multiple views, with the intent of using the resulting depth maps for bin picking. A survey of papers from 2014 to 2018 showed that none of the state-of-the-art methods could be used to control a robot arm in our setup. We therefore decided to create our own low-level neural approach, which we called EmfNet. The network is based on a pyramidal resolution-refining approach. At each layer of the pyramid, three separate networks take part in the computation. Each of them has a well-defined goal, which gives us an almost complete understanding of what is going on inside the network. The EmfNet model was partially usable, but we nevertheless extended it to EmfNet-v2. First, another measuring layer was added, which freed EmfNet from its dependence on an unnecessary hyperparameter. Second, we used geometric constraints so that the network is not confused by occlusions (cases where a certain part of the surface is visible from only a single camera). Both networks were implemented and tested on a corpus created as part of this thesis, containing rendered as well as real data. The process of correspondence pairing inside the network can be observed using the visualization tool. We designed a way to use a robotic arm...	en_US
dc.description.abstract	This thesis explores the possibilities of surface reconstruction for bin picking using neural networks. A review of papers from 2014-2018 showed that the existing methods are not usable. We therefore created our own low-level approach called EmfNet. The network uses pyramidal resolution refinement, in which three separate networks with clearly defined purposes take part in the computation at each level of the pyramid, allowing an almost complete understanding of how the network works. The EmfNet model was already partially usable, but it was extended to EmfNet-v2. It received a new measuring layer so that it no longer depends on an unnecessary hyperparameter, and, more importantly, geometric constraints were used so that the network is not confused by occlusions (cases where a certain part of the surface is visible from only one camera). We implemented both networks and tested them on our own corpus of rendered as well as real data. The process of correspondence pairing inside the network can be observed using a visualization. We designed a way to use a robotic arm and SMF software to obtain, relatively quickly, the amount of data needed to train the model. The best model so far can reconstruct 80% of the surface with an error below 2 mm in under 1 second.	cs_CZ
dc.language	English	cs_CZ
dc.language.iso	en_US
dc.publisher	Univerzita Karlova, Matematicko-fyzikální fakulta	cs_CZ
dc.subject	disparity	en_US
dc.subject	depth perception	en_US
dc.subject	bin picking	en_US
dc.subject	deep convolutional neural networks	en_US
dc.subject	disparita	cs_CZ
dc.subject	vnimani hloubky	cs_CZ
dc.subject	bin picking	cs_CZ
dc.subject	hluboke konvolucni neuronove site	cs_CZ
dc.title	Differentiable Depth Estimation for Bin Picking	en_US
dc.type	bakalářská práce	cs_CZ
dcterms.created	2019
dcterms.dateAccepted	2019-09-05
dc.description.department	Institute of Formal and Applied Linguistics	en_US
dc.description.department	Ústav formální a aplikované lingvistiky	cs_CZ
dc.description.faculty	Matematicko-fyzikální fakulta	cs_CZ
dc.description.faculty	Faculty of Mathematics and Physics	en_US
dc.identifier.repId	214374
dc.title.translated	Derivovatelný estimator hloubky pro bin picking	cs_CZ
dc.contributor.referee	Šikudová, Elena
thesis.degree.name	Bc.
thesis.degree.level	bakalářské	cs_CZ
thesis.degree.discipline	Obecná informatika	cs_CZ
thesis.degree.discipline	General Computer Science	en_US
thesis.degree.program	Computer Science	en_US
thesis.degree.program	Informatika	cs_CZ
uk.thesis.type	bakalářská práce	cs_CZ
uk.taxonomy.organization-cs	Matematicko-fyzikální fakulta::Ústav formální a aplikované lingvistiky	cs_CZ
uk.taxonomy.organization-en	Faculty of Mathematics and Physics::Institute of Formal and Applied Linguistics	en_US
uk.faculty-name.cs	Matematicko-fyzikální fakulta	cs_CZ
uk.faculty-name.en	Faculty of Mathematics and Physics	en_US
uk.faculty-abbr.cs	MFF	cs_CZ
uk.degree-discipline.cs	Obecná informatika	cs_CZ
uk.degree-discipline.en	General Computer Science	en_US
uk.degree-program.cs	Informatika	cs_CZ
uk.degree-program.en	Computer Science	en_US
thesis.grade.cs	Výborně	cs_CZ
thesis.grade.en	Excellent	en_US
uk.abstract.cs	This thesis explores the possibilities of surface reconstruction for bin picking using neural networks. A review of papers from 2014-2018 showed that the existing methods are not usable. We therefore created our own low-level approach called EmfNet. The network uses pyramidal resolution refinement, in which three separate networks with clearly defined purposes take part in the computation at each level of the pyramid, allowing an almost complete understanding of how the network works. The EmfNet model was already partially usable, but it was extended to EmfNet-v2. It received a new measuring layer so that it no longer depends on an unnecessary hyperparameter, and, more importantly, geometric constraints were used so that the network is not confused by occlusions (cases where a certain part of the surface is visible from only one camera). We implemented both networks and tested them on our own corpus of rendered as well as real data. The process of correspondence pairing inside the network can be observed using a visualization. We designed a way to use a robotic arm and SMF software to obtain, relatively quickly, the amount of data needed to train the model. The best model so far can reconstruct 80% of the surface with an error below 2 mm in under 1 second.	cs_CZ
uk.abstract.en	The goal of this thesis was to investigate neural 3D surface reconstruction from multiple views, with the intent of using the resulting depth maps for bin picking. A survey of papers from 2014 to 2018 showed that none of the state-of-the-art methods could be used to control a robot arm in our setup. We therefore decided to create our own low-level neural approach, which we called EmfNet. The network is based on a pyramidal resolution-refining approach. At each layer of the pyramid, three separate networks take part in the computation. Each of them has a well-defined goal, which gives us an almost complete understanding of what is going on inside the network. The EmfNet model was partially usable, but we nevertheless extended it to EmfNet-v2. First, another measuring layer was added, which freed EmfNet from its dependence on an unnecessary hyperparameter. Second, we used geometric constraints so that the network is not confused by occlusions (cases where a certain part of the surface is visible from only a single camera). Both networks were implemented and tested on a corpus created as part of this thesis, containing rendered as well as real data. The process of correspondence pairing inside the network can be observed using the visualization tool. We designed a way to use a robotic arm...	en_US
uk.file-availability	V
uk.grantor	Univerzita Karlova, Matematicko-fyzikální fakulta, Ústav formální a aplikované lingvistiky	cs_CZ
thesis.grade.code	1
uk.publication-place	Praha	cs_CZ
uk.thesis.defenceStatus	O
dc.identifier.lisID	990022928250106986