12 research outputs found

    Robust Learning Architectures for Perceiving Object Semantics and Geometry

    Parsing object semantics and geometry in a scene is a core task in visual understanding. This includes classifying object identity and category, localizing and segmenting an object from a cluttered background, estimating object orientation, and parsing 3D shape structures. With the emergence of deep convolutional architectures in recent years, substantial progress has been made towards learning scalable image representations for large-scale vision problems such as image classification. However, there still remain some fundamental challenges in learning robust object representations. First, creating object representations that are robust to changes in viewpoint while capturing local visual details continues to be a problem. In particular, recent convolutional architectures employ spatial pooling to achieve scale and shift invariance, but they are still sensitive to out-of-plane rotations. Second, deep Convolutional Neural Networks (CNNs) are purely driven by data and predominantly pose the scene interpretation problem as an end-to-end black-box mapping. However, decades of work on perceptual organization in both human and machine vision suggest that there are often intermediate representations that are intrinsic to an inference task and that provide essential structure to improve generalization. In this dissertation, we present two methodologies to surmount these two issues. We first introduce a multi-domain pooling framework which groups local visual signals within generic feature spaces that are invariant to 3D object transformations, thereby reducing the sensitivity of the output features to spatial deformations. We formulate a probabilistic analysis of pooling which further suggests the multi-domain pooling principle. In addition, this principle guides us in designing convolutional architectures which achieve state-of-the-art performance on instance classification and semantic segmentation.
We also present a multi-view fusion algorithm which efficiently computes multi-domain pooling features on incrementally reconstructed scenes and aggregates semantic confidence to boost long-term performance for semantic segmentation. Next, we explore an approach for injecting prior domain structure into neural network training, which leads a CNN to recover a sequence of intermediate milestones towards the final goal. Our approach supervises hidden layers of a CNN with intermediate concepts that normally are not observed in practice. We formulate a probabilistic framework which formalizes these notions and predicts improved generalization via this deep supervision method. One advantage of this approach is that we are able to generalize a model trained on synthetic CAD renderings of cluttered scenes, where concept values can be extracted, to the real image domain. We implement this deep supervision framework with a novel CNN architecture which is trained on synthetic images only and achieves state-of-the-art performance in 2D/3D keypoint localization on real image benchmarks. Finally, the proposed deep supervision scheme also motivates an approach for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep CNN: an inference scheme that combines both classification and pose regression based on a uniform tessellation of SE(3), fusion of a class prior into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that the proposed multi-view scheme consistently improves the performance of the single-view network. Our approach achieves competitive or superior performance compared with the current state-of-the-art methods on three large-scale benchmarks

    Learning cognitive maps: Finding useful structure in an uncertain world

    In this chapter we describe the central mechanisms that influence how people learn about large-scale space. We focus particularly on how these mechanisms enable people to cope effectively both with the uncertainty inherent in a constantly changing world and with the high information content of natural environments. The major lessons are that humans get by with a less-is-more approach to building structure, and that they are able to adapt quickly to environmental changes thanks to a range of general-purpose mechanisms. By looking at abstract principles, instead of concrete implementation details, it is shown that the study of human learning can provide valuable lessons for robotics. Finally, these issues are discussed in the context of an implementation on a mobile robot. © 2007 Springer-Verlag Berlin Heidelberg

    Visual Perception For Robotic Spatial Understanding

    Humans understand the world through vision without much effort. We perceive the structure, objects, and people in the environment and pay little direct attention to most of it, until it becomes useful. Intelligent systems, especially mobile robots, have no such biologically engineered vision mechanism to take for granted. In contrast, we must devise algorithmic methods of taking raw sensor data and converting it to something useful very quickly. Vision is such a necessary part of building a robot or any intelligent system that is meant to interact with the world that it is somewhat surprising we don't have off-the-shelf libraries for this capability. Why is this? The simple answer is that the problem is extremely difficult. There has been progress, but the current state of the art is impressive and depressing at the same time. We now have neural networks that can recognize many objects in 2D images, in some cases performing better than a human. Some algorithms can also provide bounding boxes or pixel-level masks to localize the object. We have visual odometry and mapping algorithms that can build reasonably detailed maps over long distances with the right hardware and conditions. On the other hand, we have robots with many sensors and no efficient way to compute their relative extrinsic poses for integrating the data in a single frame. The same networks that produce good object segmentations and labels in a controlled benchmark still miss obvious objects in the real world and have no mechanism for learning on the fly while the robot is exploring. Finally, while we can detect pose for very specific objects, we don't yet have a mechanism that detects pose that generalizes well over categories or that can describe new objects efficiently. We contribute algorithms in four of the areas mentioned above. First, we describe a practical and effective system for calibrating many sensors on a robot with up to 3 different modalities.
Second, we present our approach to visual odometry and mapping that exploits the unique capabilities of RGB-D sensors to efficiently build detailed representations of an environment. Third, we describe a 3-D over-segmentation technique that utilizes the models and ego-motion output in the previous step to generate temporally consistent segmentations with camera motion. Finally, we develop a synthesized dataset of chair objects with part labels and investigate the influence of parts on RGB-D based object pose recognition using a novel network architecture we call PartNet

    Irish Machine Vision and Image Processing Conference Proceedings 2017


    Tools and Algorithms for the Construction and Analysis of Systems

    This open access book constitutes the proceedings of the 28th International Conference on Tools and Algorithms for the Construction and Analysis of Systems, TACAS 2022, which was held during April 2-7, 2022, in Munich, Germany, as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2022. The 46 full papers and 4 short papers presented in this volume were carefully reviewed and selected from 159 submissions. The proceedings also contain 16 tool papers of the affiliated competition SV-Comp and 1 paper consisting of the competition report. TACAS is a forum for researchers, developers, and users interested in rigorously based tools and algorithms for the construction and analysis of systems. The conference aims to bridge the gaps between different communities with this common interest and to support them in their quest to improve the utility, reliability, flexibility, and efficiency of tools and algorithms for building computer-controlled systems

    2013 Oklahoma Research Day Full Program

    This document contains all abstracts from the 2013 Oklahoma Research Day held at the University of Central Oklahoma

    Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5

    This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open access. The collected contributions of this volume have either been published or presented in international conferences, seminars, workshops and journals since the dissemination of the fourth volume in 2015, or they are new. The contributions of each part of this volume are chronologically ordered. The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution Rules (PCR) of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence with their Matlab codes.
Because more applications of DSmT have emerged in the years since the appearance of the fourth book of DSmT in 2015, the second part of this volume is about selected applications of DSmT mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender system, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), Silx-Furtif RUST code library for information fusion including PCR rules, and network for ship classification. Finally, the third part presents interesting contributions related to belief functions in general, published or presented over the years since 2015. These contributions are related to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions as well

    Annual Report of the University, 1994-1995, Volumes 1-4

    DEMONSTRATING THE STRENGTH OF DIVERSITY A walk around the UNM campus as students change classes demonstrates UNM's commitment to diversity. Students and professors from a variety of ethnic backgrounds crowd the sidewalks and fill classrooms. Over the past year UNM moved forward with existing and new programs to interest more minority students, faculty and staff in the University and to aid in their success while here. Hispanic Outlook in Higher Education recently recognized the University's endeavors, ranking UNM as one of the best colleges in the nation at graduating Hispanic students. Provost Mary Sue Coleman says diversity contributes to a stimulating environment where faculty and students have different points of view and experiences. The campus becomes a more intellectually alive place, she says. The efforts to build a diverse campus go hand in hand with the University's goals of achieving academic excellence and attracting the best and brightest. MINORITY ENROLLMENT In the fall of 1994 a total of 32 percent of the student body came from underrepresented groups. The UNM School of Law had the largest number of Native Americans enrolled in any law school in the country

    XLIII Jornadas de Automática: libro de actas: 7, 8 y 9 de septiembre de 2022, Logroño (La Rioja)

    [Abstract] The Jornadas de Automática (JA) are the most important event of the Comité Español de Automática (CEA), a scientific and technical body with more than fifty years of history, dedicated to the dissemination and adoption of automatic control in society. This year marks the forty-third edition of the JA, which serve as the meeting point for the automatic control community in Spain. This edition will give visibility to the new challenges and results in the field and to their use in a large number of applications, including renewable energy, bioengineering, and assistive robotics. Beyond the scientific component, which is reflected in this book of proceedings, the JA are a meeting point for different generations of lecturers, researchers, and professionals, including a social component that is of vital importance. The 2022 edition of the JA is held in Logroño, capital of La Rioja, a region known worldwide for the quality of its Denominación de Origen wines and one that has taken on the challenge of gaining competitiveness through the green and digital transformation. It is also known as the cradle of the Spanish language and for promoting the Valle de la Lengua with the help of new technologies, among them intelligent automation. The organizers of these JA, who belong to the Systems Engineering and Automatic Control Area of the Department of Electrical Engineering at the Universidad de La Rioja (UR), are a fundamental pillar of support for the region in studying, implementing, and disseminating these challenges. This edition, the first held entirely in person since the COVID-19 pandemic, has more than 200 attendees and is split between the Edificio Politécnico of the Escuela Técnica Superior de Ingeniería Industrial and the Monasterio de Yuso in San Millán de la Cogolla, two exceptional settings for the JA.
As part of the scientific program, two plenary sessions will focus, respectively, on control solutions for addressing the new energy challenges and on data quality for unbiased and trustworthy artificial intelligence (AI). In addition, two round tables will discuss applications of AI and the adoption of digital technology in professional practice. We also highlight two master classes on state-of-the-art technology, to be given by industry professionals. The JA will also host two competitions: CEABOT, with humanoid robots, and the Concurso de Ingeniería de Control, focused on UAVs. To all these activities must be added the meetings of the CEA thematic groups, the poster exhibitions of the papers submitted to the JA, and the company exhibition stands. Finally, the event will include the presentation of the “Premio Nacional de Automática” (2022 edition) and the “Premio CEA al Talento Femenino en Automática”, sponsored by the Government of La Rioja (in its first edition), along with various awards from the activities of the CEA thematic groups. The proceedings of the XLIII Jornadas de Automática comprise a total of 143 papers, organized around the nine Thematic Groups and the two Strategic Lines of CEA. The selected papers have undergone peer review