S3CNet: A Sparse Semantic Scene Completion Network for LiDAR Point Clouds
With the increasing reliance of self-driving and similar robotic systems on
robust 3D vision, the processing of LiDAR scans with deep convolutional neural
networks has become a trend in academia and industry alike. Prior attempts at
the challenging Semantic Scene Completion task - which entails the inference of
dense 3D structure and associated semantic labels from "sparse" representations
- have been, to a degree, successful in small indoor scenes when provided with
dense point clouds or dense depth maps often fused with semantic segmentation
maps from RGB images. However, the performance of these systems drops
drastically when applied to large outdoor scenes characterized by dynamic and
exponentially sparser conditions. Likewise, processing of the entire sparse
volume becomes infeasible due to memory limitations, and workarounds introduce
computational inefficiency as practitioners are forced to divide the overall
volume into multiple equal segments and infer on each individually, rendering
real-time performance impossible. In this work, we formulate a method that
subsumes the sparsity of large-scale environments and present S3CNet, a sparse
convolution based neural network that predicts the semantically completed scene
from a single, unified LiDAR point cloud. We show that our proposed method
outperforms all counterparts on the 3D task, achieving state-of-the-art results
on the SemanticKITTI benchmark. Furthermore, we propose a 2D variant of S3CNet
with a multi-view fusion strategy to complement our 3D network, providing
robustness to occlusions and extreme sparsity in distant regions. We conduct
experiments for the 2D semantic scene completion task and compare the results
of our sparse 2D network against several leading LiDAR segmentation models
adapted for bird's eye view segmentation on two open-source datasets.
Comment: 14 page
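The memory argument in the S3CNet abstract above can be illustrated with a minimal sketch. This is not the S3CNet implementation; it only contrasts the number of cells in a dense grid with the number of voxels a sparse LiDAR scan actually occupies. The function names, the toy points, the voxel sizes and the scene extent are all assumptions chosen for illustration.

```python
from collections import Counter
from math import floor


def voxelize_sparse(points, voxel_size):
    """Map (x, y, z) points to a dict {voxel index: point count}.

    Only occupied voxels are stored, so memory grows with the number of
    occupied cells rather than with the full volume of the scene.
    """
    counts = Counter()
    for x, y, z in points:
        key = (floor(x / voxel_size), floor(y / voxel_size), floor(z / voxel_size))
        counts[key] += 1
    return dict(counts)


def dense_cell_count(extent, voxel_size):
    """Number of cells a dense grid needs for an (X, Y, Z) extent in metres."""
    cells = 1
    for length in extent:
        cells *= round(length / voxel_size)
    return cells


# Hypothetical toy scan: four returns, two of which fall in the same voxel.
points = [(0.1, 0.1, 0.1), (0.3, 0.2, 0.4), (10.2, 2.3, 0.6), (25.7, -3.1, 1.2)]
occupied = voxelize_sparse(points, voxel_size=0.5)

# Assumed scene extent and resolution, for illustration only.
dense_cells = dense_cell_count((51.2, 51.2, 6.4), voxel_size=0.2)
```

Here `occupied` holds three entries, while a dense grid over the assumed extent would allocate `dense_cells` cells regardless of how many are empty. Sparse convolution libraries typically operate on this kind of coordinate list plus per-voxel features.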
Aerospace medicine and biology: A continuing bibliography with indexes (supplement 335)
This bibliography lists 143 reports, articles and other documents introduced into the NASA Scientific and Technical Information System during March 1990. Subject coverage includes aerospace medicine and psychology, life support systems and controlled environments, safety equipment, exobiology and extraterrestrial life, and flight crew behavior and performance.
Advances towards behaviour-based indoor robotic exploration
215 p.
The main contributions of this research lie in object recognition by computer vision, on the one hand, and in robot localisation and mapping, on the other. The first contribution area addresses object recognition in mobile robots. Here, door handle recognition is of great importance, as it helps the robot identify doors in places where the camera cannot view the whole door. A new two-step algorithm based on feature extraction is presented; it refines the extracted features to reduce the number of superfluous keypoints to be compared, which improves accuracy and reduces computation time. Unlike segmentation-based paradigms, the feature-extraction-based two-step method can easily be generalized to other types of handles, or even to other types of objects such as road signs. Experiments showed very good accuracy when the method was tested in real environments with different kinds of door handles. With respect to the second contribution, a new technique is presented for constructing a topological map during the exploration phase that a robot performs in an unseen office-like environment. First, a preliminary approach is proposed to merge Markovian localisation in a distributed system, which requires low storage and computational resources and is suitable for dynamic environments. In the same area, a second contribution to terrain inspection level behaviour-based navigation concerns the development of an automatic mapping method for acquiring the procedural topological map. The new approach is based on a typicality test called INCA to perform the so-called loop-closing action. The method was integrated into a behaviour-based control architecture and tested in both simulated and real robot/environment systems. The developed system also proved useful for localisation purposes.
Multi-Scale Architectures for Human Pose Estimation
In this dissertation we present multiple state-of-the-art deep learning methods for computer vision using multi-scale approaches for two main tasks: pose estimation and semantic segmentation. For pose estimation, we introduce a complete framework that expands the fields-of-view of the network through a multi-scale approach, significantly increasing the effectiveness of conventional backbone architectures for several pose estimation tasks without requiring a larger network or postprocessing. Our multi-scale pose estimation framework contributes to research on methods for single-person pose estimation in both 2D and 3D scenarios, pose estimation in videos, and the estimation of multiple people’s poses in a single image for both top-down and bottom-up approaches. In addition to the enhanced capability for multi-person pose estimation provided by our multi-scale approach, our framework also demonstrates a superior capacity to expand to the more detailed and heavier task of full-body pose estimation, including up to 133 joints per person. For segmentation, we present a new efficient architecture for semantic segmentation, based on a “Waterfall” Atrous Spatial Pooling architecture, that achieves a considerable accuracy increase while decreasing the number of network parameters and the memory footprint. The proposed Waterfall architecture leverages the efficiency of progressive filtering in the cascade architecture while maintaining multi-scale fields-of-view comparable to spatial pyramid configurations. Additionally, our method does not rely on a postprocessing stage with conditional random fields, which further reduces complexity and required training time.
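The contrast between cascade and spatial pyramid configurations in the abstract above can be made concrete with standard receptive-field arithmetic for atrous (dilated) convolutions. The sketch below is not the dissertation's code. It assumes 3×3 kernels, stride 1, and the dilation rates 6, 12, 18 and 24, none of which are stated in the abstract, and the function names are hypothetical.

```python
def effective_kernel(kernel_size, rate):
    """Spatial extent covered by one atrous convolution kernel."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def parallel_fields_of_view(kernel_size, rates):
    """Field-of-view of each branch when all branches read the same input."""
    return [effective_kernel(kernel_size, r) for r in rates]


def cascade_fields_of_view(kernel_size, rates):
    """Field-of-view after each stage when stride-1 stages feed one another."""
    fields, current = [], 1
    for r in rates:
        # Each stride-1 stage widens the receptive field by (extent - 1).
        current += effective_kernel(kernel_size, r) - 1
        fields.append(current)
    return fields


rates = [6, 12, 18, 24]  # assumed dilation rates, for illustration only
parallel = parallel_fields_of_view(3, rates)
cascade = cascade_fields_of_view(3, rates)
print(parallel)  # [13, 25, 37, 49]
print(cascade)   # [13, 37, 73, 121]
```

Under these assumptions, taking the output of every cascade stage yields a set of progressively larger fields-of-view from a single chain of filters, which is the general idea the abstract attributes to the Waterfall design.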
INTELLIGENT VISION-BASED NAVIGATION SYSTEM
This thesis presents a complete vision-based navigation system that can plan and
follow an obstacle-avoiding path to a desired destination on the basis of an internal map
updated with information gathered from its visual sensor.
For vision-based self-localization, the system uses new floor-edges-specific filters
for detecting floor edges and their pose, a new algorithm for determining the orientation of
the robot, and a new procedure for selecting the initial positions in the self-localization
procedure. Self-localization is based on matching visually detected features with those
stored in a prior map.
For planning, the system demonstrates for the first time a real-world application of
the neural-resistive grid method to robot navigation. The neural-resistive grid is modified
with a new connectivity scheme that allows the representation of the collision-free space of
a robot with finite dimensions via divergent connections between the spatial memory layer
and the neuro-resistive grid layer.
A new control system is proposed. It uses a Smith Predictor architecture that has
been modified for navigation applications and for intermittent delayed feedback typical of
artificial vision. A receding horizon control strategy is implemented using Normalised
Radial Basis Function nets as path encoders, to ensure continuous motion during the delay
between measurements.
The system is tested in a simplified environment where an obstacle placed
anywhere is detected visually and is integrated in the path planning process.
The results show the validity of the control concept and the crucial importance of a
robust vision-based self-localization process.
Modern Information Systems
The development of modern information systems is a demanding task. New technologies and tools are designed, implemented and brought to market on a daily basis. User needs change dramatically fast, and the IT industry strives to reach the level of efficiency and adaptability its systems need in order to be competitive and up to date. Thus, the realization of modern information systems with rich characteristics and functionalities, implemented for specific areas of interest, is a fact of our modern and demanding digital society, and this is the main scope of this book. The book, titled "Modern Information Systems", therefore aims to present a number of innovative and recently developed information systems and comprises 8 chapters. It may assist researchers in studying the innovative functions of modern systems in various areas such as health, telematics and knowledge management. It can also help young students grasp the new research trends in information systems development.
Image Classification of High Variant Objects in Fast Industrial Applications
Recent advances in machine learning and image processing have expanded the applications of computer vision
in many industries. In industrial applications, image classification is a crucial task since high variant objects
present difficult problems because of their variety and constant change in attributes. Computer vision algorithms
can function effectively in complex environments, working alongside human operators to enhance efficiency and
data accuracy. However, there are still many industries facing difficulties with automation that have not yet been
properly solved and put into practice. They have the need for more accurate, convenient, and faster methods.
These solutions drove my interest in combining multiple learning strategies as well as sensors and image formats
to enable the use of computer vision for these applications. The motivation for this work is to answer a number of
research questions that aim to mitigate the current problems that hinder their practical application. This work therefore
aims to present solutions that contribute to enabling these applications. I demonstrate why standard methods cannot
simply be applied to an existing problem. Each method must be customized to the specific application scenario
in order to obtain a working solution.
One example is face recognition where the classification performance is crucial for the system’s ability to
correctly identify individuals. Additional features would allow higher accuracy, robustness, safety, and make
presentation attacks more difficult. The detection of attempted attacks is critical for the acceptance of such
systems and significantly impacts the applicability of biometrics. Another application is tailgating detection
at automated entrance gates. Especially in high-security environments, it is important to prevent authorized
persons from taking an unauthorized person into the secured area. There is a plethora of technologies that seem potentially
suitable, but there are several practical factors to consider that increase or decrease applicability depending
on which method is used. The third application covered in this thesis is the classification of textiles when they are
not spread out. Finding certain properties on them is complex, as these properties might be inside a fold, or differ
in appearance because of shadows and position.
The first part of this work provides in-depth analysis of the three individual applications, including background
information that is needed to understand the research topic and its proposed solutions. It includes the state of
the art in the area for all researched applications. In the second part of this work, methods are presented to
facilitate or enable the industrial applicability of the presented applications. New image databases are initially
presented for all three application areas. In the case of biometrics, three methods that identify and improve
specific performance parameters are shown. It will be shown how melanin face pigmentation (MFP) features
can be extracted and used for classification in face recognition and PAD applications. In the entrance control
application, the focus is on the sensor information with six methods being presented in detail. This includes the
use of thermal images to detect humans based on their body heat, depth images in the form of RGB-D images and
2D image series, as well as data from a floor-mounted sensor grid. For textile defect detection, several methods and
a novel classification procedure in free fall are presented.
In summary, this work examines computer vision applications for their practical industrial applicability and
presents solutions to mitigate the identified problems. In contrast to previous work, the proposed approaches are
(a) effective in improving classification performance, (b) fast in execution, and (c) easily integrated into existing
processes and equipment.
Towards a framework for socially interactive robots
250 p.
In recent decades, research in the field of social robotics has grown considerably. The development of different types of robots and their roles within society are gradually expanding. Robots endowed with social skills are intended for use in a variety of applications: for example, as interactive teachers and educational assistants, to support diabetes management in children, to help elderly people with special needs, as interactive actors in the theatre, or even as assistants in hotels and shopping centres. The RSAIT research group has been working in several areas of robotics, in particular control architectures, robot exploration and navigation, machine learning and computer vision. The work presented here aims to add a new layer to that earlier development: the human-robot interaction layer, which focuses on the social capabilities a robot should display when interacting with people, such as expressing and perceiving emotions, sustaining high-level dialogue, learning models of other agents, establishing and maintaining social relationships, using natural means of communication (gaze, gestures, etc.), displaying a distinctive personality and character, and learning social competencies. In this doctoral thesis, we try to make our small contribution to the basic questions that arise when we think about social robots: (1) How do we humans communicate with (or operate) social robots? and (2) How do social robots act with us? Along those lines, the work was carried out in two phases: in the first, we focused on exploring, from a practical point of view, several ways in which humans communicate with robots in a natural manner. In the second, we investigated how social robots should act with the user. With respect to the first phase, we developed three natural user interfaces intended to make interaction with social robots more natural. To test these interfaces, two applications with different uses were developed: guide robots and a humanoid robot control system for entertainment purposes. Working on these applications allowed us to equip our robots with some basic skills, such as navigation, robot-to-robot communication, and speech recognition and understanding capabilities. In the second phase, on the other hand, we focused on identifying and developing the basic behaviour modules that this type of robot needs in order to be socially believable and trustworthy while acting as a social agent. An architecture (framework) for socially interactive robots was developed that allows robots to express different types of emotions and display natural, human-like body language according to the task at hand and the environmental conditions. The different development stages of our social robots were validated through public performances. Exposing our robots to the public in those performances has become an essential tool for qualitatively measuring the social acceptance of the prototypes we are developing. In the same way that robots need a physical body to interact with the environment and become intelligent, social robots need to participate socially in the real tasks for which they were developed in order to improve their sociability