146 research outputs found
Filtering of image sequences: on line edge detection and motion reconstruction
The subject of this thesis is the processing of image sequences of a scene in which one or
more (possibly deformable) objects move, acquired by a suitable measuring instrument. Because
of the measurement process, the images are corrupted by some level of degradation. We give the
mathematical formalization of the set of images considered, of the set of admissible motions,
and of the degradation introduced by the measuring instrument. Each image of the acquired
sequence is related to all the others through the law of motion of the scene. The idea
proposed in this thesis is to exploit this relation among the images of the sequence in order
to reconstruct quantities of interest that characterize the scene.
When the motion is known, the aim is to reconstruct the edges of the initial image (which can
then be propagated through the same law of motion, so as to reconstruct the edges of any image
in the sequence under consideration), estimating the amplitude of the gray-level jump and its
location.
In the dual case, the arrangement of the edges in the initial image is assumed to be known,
together with a stochastic model describing the motion; the objective is then to estimate the
parameters that characterize this model.
Finally, we present the results of applying the two methodologies above to real data obtained
in a biomedical setting from an instrument called a pupillometer. These results are of great
interest with a view to using this instrument for diagnostic purposes.
I-theory on depth vs width: hierarchical function composition
Deep learning networks with convolution, pooling and subsampling are a special case of hierarchical architectures, which can be represented by trees (such as binary trees). Hierarchical as well as shallow networks can approximate functions of several variables, in particular those that are compositions of low dimensional functions. We show that the power of a deep network architecture with respect to a shallow network is rather independent of the specific nonlinear operations in the network and depends instead on the behavior of the VC-dimension. A shallow network can approximate compositional functions with the same error as a deep network but at the cost of a VC-dimension that is exponential rather than quadratic in the dimensionality of the function. To complete the argument we argue that there exist visual computations that are intrinsically compositional. In particular, we prove that recognition invariant to translation cannot be computed by shallow networks in the presence of clutter. Finally, a general framework that includes the compositional case is sketched. The key condition that allows tall, thin networks to be nicer than short, fat networks is that the target input-output function must be sparse in a certain technical sense. This work was supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.
Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)
The implicit objective of the biennial "international - Traveling Workshop on
Interactions between Sparse models and Technology" (iTWIST) is to foster
collaboration between international scientific teams by disseminating ideas
through both specific oral/poster presentations and free discussions. For its
second edition, the iTWIST workshop took place in the medieval and picturesque
town of Namur in Belgium, from Wednesday August 27th till Friday August 29th,
2014. The workshop was conveniently located in "The Arsenal" building within
walking distance of both hotels and town center. iTWIST'14 has gathered about
70 international participants and has featured 9 invited talks, 10 oral
presentations, and 14 posters on the following themes, all related to the
theory, application and generalization of the "sparsity paradigm":
Sparsity-driven data sensing and processing; Union of low dimensional
subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph
sensing/processing; Blind inverse problems and dictionary learning; Sparsity
and computational neuroscience; Information theory, geometry and randomness;
Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?;
Sparse machine learning and inference. Comment: 69 pages, 24 extended abstracts, iTWIST'14 website:
http://sites.google.com/site/itwist1
Automated Semantic Content Extraction from Images
In this study, an automatic semantic segmentation and object recognition methodology is implemented which bridges the semantic gap between low level features of image content and high level conceptual meaning. Semantically understanding an image is essential in modeling autonomous robots, targeting customers in marketing, or reverse engineering of building information modeling in the construction industry. To achieve an understanding of a room from a single image, we propose a new object recognition framework which has four major components: segmentation, scene detection, conceptual cueing and object recognition. The new segmentation methodology developed in this research extends Felzenszwalb's cost function to include new surface index and depth features as well as color, texture and normal features to overcome issues of occlusion and shadowing commonly found in images. Adding depth allows new features to be captured for the object recognition stage, achieving high accuracy compared to the current state of the art. The goal was to develop an approach to capture and label perceptually important regions, which often reflect a global representation and understanding of the image. We developed a system that uses contextual and common sense information to improve object recognition and scene detection, and fused the information from scene and objects to reduce the level of uncertainty. In addition to improving segmentation, scene detection and object recognition, this study can be used in applications that require physical parsing of the image into objects, surfaces and their relations. The applications include robotics, social networking, intelligence and anti-terrorism efforts, criminal investigations and security, marketing, and building information modeling in the construction industry. In this dissertation a structural framework (ontology) is developed that generates text descriptions based on understanding of objects, structures and the attributes of an image.