12 research outputs found
Discriminative Transfer Learning for General Image Restoration
Recently, several discriminative learning approaches have been proposed for
effective image restoration, achieving a convincing trade-off between image
quality and computational efficiency. However, these methods require separate
training for each restoration task (e.g., denoising, deblurring, demosaicing)
and problem condition (e.g., noise level of input images). This makes it
time-consuming and difficult to encompass all tasks and conditions during
training. In this paper, we propose a discriminative transfer learning method
that incorporates formal proximal optimization and discriminative learning for
general image restoration. The method requires a single training pass and
allows for reuse across various problems and conditions while achieving an
efficiency comparable to previous discriminative approaches. Furthermore, after
being trained, our model can be easily transferred to new likelihood terms to
solve untrained tasks, or be combined with existing priors to further improve
image restoration quality.
Fast, Exact and Multi-Scale Inference for Semantic Image Segmentation with Deep Gaussian CRFs
In this work we propose a structured prediction technique that combines the
virtues of Gaussian Conditional Random Fields (G-CRF) with Deep Learning: (a)
our structured prediction task has a unique global optimum that is obtained
exactly from the solution of a linear system (b) the gradients of our model
parameters are analytically computed using closed form expressions, in contrast
to the memory-demanding contemporary deep structured prediction approaches that
rely on back-propagation-through-time, (c) our pairwise terms do not have to be
simple hand-crafted expressions, as in the line of works building on the
DenseCRF, but can rather be "discovered" from data through deep architectures,
and (d) our system can be trained in an end-to-end manner. Building on standard
tools from numerical analysis we develop very efficient algorithms for
inference and learning, as well as a customized technique adapted to the
semantic segmentation task. This efficiency allows us to explore more
sophisticated architectures for structured prediction in deep learning: we
introduce multi-resolution architectures to couple information across scales in
a joint optimization framework, yielding systematic improvements. We
demonstrate the utility of our approach on the challenging VOC PASCAL 2012
image segmentation benchmark, showing substantial improvements over strong
baselines. We make all of our code and experiments available at
https://github.com/siddharthachandra/gcrf.
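As a rough illustration of points (a) and (b) above, the following sketch treats inference as the solution of a single linear system and obtains the gradient with respect to the unary terms from a second solve with the same matrix. It is a minimal sketch under stated assumptions, not the authors' implementation: the energy form 0.5 * x^T (A + lam*I) x - b^T x, the function names, the regulariser `lam`, and the toy chain graph are all chosen here for illustration only.

```python
import numpy as np


def build_system(pairwise, lam):
    """Assumed system matrix: symmetrised pairwise terms plus lam * I."""
    sym = 0.5 * (pairwise + pairwise.T)
    return sym + lam * np.eye(pairwise.shape[0])


def gcrf_inference(unary, pairwise, lam=1.0):
    """Minimise 0.5 * x^T (A + lam*I) x - b^T x by solving (A + lam*I) x = b.

    The minimiser is unique when the system matrix is positive definite,
    which is assumed here (true for the toy example below).
    """
    return np.linalg.solve(build_system(pairwise, lam), unary)


def unary_gradient(grad_scores, pairwise, lam=1.0):
    """Closed-form gradient of a loss with respect to the unary terms b.

    Because x = (A + lam*I)^{-1} b and the matrix is symmetric,
    dLoss/db = (A + lam*I)^{-1} dLoss/dx, i.e. one more linear solve.
    """
    return np.linalg.solve(build_system(pairwise, lam), grad_scores)


# Toy example (hypothetical data): a 4-node chain whose pairwise matrix is
# the graph Laplacian, which is positive semi-definite.
pairwise = np.array(
    [
        [1.0, -1.0, 0.0, 0.0],
        [-1.0, 2.0, -1.0, 0.0],
        [0.0, -1.0, 2.0, -1.0],
        [0.0, 0.0, -1.0, 1.0],
    ]
)
unary = np.array([1.0, 0.0, 0.0, -1.0])

scores = gcrf_inference(unary, pairwise, lam=1.0)
grad_b = unary_gradient(np.ones(4), pairwise, lam=1.0)
```

In this toy setting the pairwise coupling pulls neighbouring scores towards each other, so the inferred scores are a smoothed version of the unary terms. For image-sized problems a dense solve would presumably be replaced by an iterative solver such as conjugate gradient.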
INDOOR MESH CLASSIFICATION FOR BIM
This work addresses the automatic reconstruction of objects useful for BIM, like walls, floors and ceilings, from meshed and texture-mapped 3D point clouds of indoor scenes. For this reason, we focus on the semantic segmentation of 3D indoor meshes as the initial step for the automatic generation of BIM models. Our investigations are based on the benchmark dataset ScanNet, which aims at the interpretation of 3D indoor scenes. For this purpose, it provides 3D meshed representations as collected from low-cost range cameras. In our opinion, such RGB-D data has great potential for the automated reconstruction of BIM objects.
Non-parametric Blur Map Regression for Depth of Field Extension
Real camera systems have a limited depth of field (DOF) which may cause an image to be degraded due to visible misfocus or too shallow DOF. In this paper, we present a blind deblurring pipeline able to restore such images by slightly extending their DOF and recovering sharpness in regions slightly out-of-focus. To address this severely ill-posed problem, our algorithm relies first on the estimation of the spatially varying defocus blur. Drawing on local frequency image features, a machine learning approach based on the recently introduced Regression Tree Fields is used to train a model able to regress a coherent defocus blur map of the image, labeling each pixel by the scale of a defocus point-spread-function. A non-blind spatially varying deblurring algorithm is then used to properly extend the DOF of the image. The good performance of our algorithm is assessed both quantitatively, using realistic ground truth data obtained with a novel approach based on a plenoptic camera, and qualitatively with real images.
Probabilistic techniques in semantic mapping for mobile robotics
Semantic maps are representations of the world that allow a robot to understand not only the spatial aspects of its workspace, but also the meaning of its elements (objects, rooms, etc.) and how humans interact with them (e.g. functionalities, events and relations). To achieve this, a semantic map augments purely spatial representations, such as geometric or topological maps, with meta-information about the types of elements and relations that can be found in the working environment. This meta-information, called semantic or common-sense knowledge, is typically encoded in Knowledge Bases.
An example of this kind of information could be: "refrigerators are large, rectangular objects, usually placed in kitchens, which can contain perishable food and medication". Encoding and handling this semantic knowledge allows the robot to reason about the information gathered from a given workspace, and to infer new information in order to efficiently execute high-level tasks such as "hello robot! please take the medication to grandma".
This thesis proposes the use of probabilistic techniques to build and maintain semantic maps, which offers three main advantages over traditional approaches:
i) it handles uncertainty (arising from the robot's imprecise sensors and from the models employed),
ii) it provides coherent representations of the environment by exploiting the contextual relations between the observed elements (e.g. refrigerators are usually found in kitchens) from a holistic point of view, and
iii) it produces certainty values that reflect how accurate the robot's understanding of its environment is.
Specifically, the contributions presented can be grouped into two main topics. The first set of contributions addresses the problem of object and/or room recognition, since semantic mapping systems must rely on dependable recognition algorithms to build valid representations. To this end, the thesis explores the use of Probabilistic Graphical Models (PGMs) to exploit the contextual relations between objects and/or rooms while handling the uncertainty inherent to the recognition problem, and the use of Knowledge Bases to improve their performance in different ways, e.g., detecting incoherent results, providing prior information, reducing the complexity of probabilistic inference algorithms, generating synthetic training samples, enabling learning from past experiences, etc.
The second group of contributions accommodates the probabilistic results from the developed recognition algorithms in a new semantic representation, called the Multiversal Semantic Map (MvSmap). This map manages multiple interpretations of the robot's workspace, called universes, which are annotated with the probability of being the correct one according to the robot's current knowledge. This approach thus provides a grounded belief about the accuracy of the robot's understanding of its environment, which allows it to operate more efficiently and coherently.
The proposed probabilistic algorithms have been thoroughly tested and compared with other current and innovative approaches using state-of-the-art datasets. Additionally, this thesis contributes two datasets, UMA-Offices and Robot@Home, which contain sensory information captured in different office and home environments, as well as two software tools, the Undirected Probabilistic Graphical Models in C++ (UPGMpp) library and the Object Labeling Toolkit (OLT), for working with Probabilistic Graphical Models and processing datasets, respectively.
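The idea of universes annotated with probabilities can be sketched in a few lines. This is only an illustrative toy, not the MvSmap or UPGMpp implementation: the belief values, the `compatibility` function, and the scoring rule (product of per-element beliefs times a context factor, then normalised) are assumptions made here, whereas the thesis derives its probabilities from Probabilistic Graphical Models.

```python
from itertools import product


def enumerate_universes(beliefs, compatibility=None):
    """List every joint labelling (universe) with a normalised probability.

    beliefs maps each element name to a dict {category: belief}.
    compatibility is an optional, hypothetical context factor that maps a
    universe (dict of element -> category) to a non-negative weight.
    """
    names = list(beliefs)
    scored = []
    # Iterating over a dict yields its keys, i.e. the candidate categories.
    for labels in product(*(beliefs[name] for name in names)):
        universe = dict(zip(names, labels))
        score = 1.0
        for name, label in universe.items():
            score *= beliefs[name][label]
        if compatibility is not None:
            score *= compatibility(universe)
        scored.append((universe, score))
    total = sum(score for _, score in scored)
    ranked = [(universe, score / total) for universe, score in scored]
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


def compatibility(universe):
    """Toy contextual relation: a refrigerator in a bedroom is unlikely."""
    if universe["obj1"] == "fridge" and universe["room1"] == "bedroom":
        return 0.1
    return 1.0


# Hypothetical recognition results for one object and one room.
beliefs = {
    "obj1": {"fridge": 0.7, "cabinet": 0.3},
    "room1": {"kitchen": 0.6, "bedroom": 0.4},
}

universes = enumerate_universes(beliefs, compatibility)
best_universe, best_probability = universes[0]
```

With these made-up numbers the top-ranked universe labels the object as a fridge and the room as a kitchen. Exhaustive enumeration grows exponentially with the number of elements, so it only serves to show the representation, not a practical inference strategy.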
Image similarity in medical images
Recent experiments have indicated a strong influence of the substrate grain orientation on the self-ordering in anodic porous alumina. Anodic porous alumina with straight pore channels grown in a stable, self-ordered manner is formed on a (001)-oriented Al grain, while a disordered porous pattern is formed on a (101)-oriented Al grain, with tilted pore channels growing in an unstable manner. In this work, numerical simulation of the pore growth process is carried out to understand this phenomenon. The rate-determining step of the oxide growth is assumed to be the Cabrera-Mott barrier at the oxide/electrolyte (o/e) interface, while the substrate is assumed to determine the ratio β between the ionization and oxidation reactions at the metal/oxide (m/o) interface. By numerically solving for the electric field inside the growing porous alumina during anodization, the migration rates of the ions and hence the evolution of the o/e and m/o interfaces are computed. The simulated results show that pore growth is more stable when β is higher. A higher β corresponds to more Al ionized and migrating away from the m/o interface rather than being oxidized, and hence a higher retained O:Al ratio in the oxide. Experimentally measured oxygen content in the self-ordered porous alumina on (001) Al is indeed found to be about 3% higher than that in the disordered alumina on (101) Al, in agreement with the theoretical prediction. The results, therefore, suggest that ionization on the (001) Al substrate is relatively easier than on (101) Al, and this leads to the more stable growth of the pore channels on (001) Al.