
    Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5

    This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of application and in mathematics, and is available in open access. The collected contributions of this volume have either been published or presented in international conferences, seminars, workshops and journals after the dissemination of the fourth volume in 2015, or they are new. The contributions of each part of this volume are chronologically ordered. The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution (PCR) rules of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of the (quasi-)vacuous belief assignment in the fusion of sources of evidence, together with their Matlab codes.

    Because more applications of DSmT have emerged since the publication of the fourth DSmT book in 2015, the second part of this volume covers selected applications of DSmT, mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender systems, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), the Silx-Furtif RUST code library for information fusion including PCR rules, and a network for ship classification.

    Finally, the third part presents contributions related to belief functions in general that have been published or presented over the years since 2015. These contributions are related to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of the Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, the negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions.
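    As a point of reference for the PCR rules discussed in the first part, below is a minimal Python sketch of the classical two-source PCR5 combination (not the improved variants collected in this volume), assuming basic belief assignments are represented as dictionaries mapping frozensets of the frame to masses:

```python
from itertools import product

def pcr5_combine(m1, m2):
    """Classical two-source PCR5: conjunctive combination, then proportional
    redistribution of each partial conflict back to its two focal elements."""
    combined = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:  # non-empty intersection: keep the conjunctive mass
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        elif wa + wb > 0:  # conflict: redistribute proportionally to a and b
            combined[a] = combined.get(a, 0.0) + wa * wa * wb / (wa + wb)
            combined[b] = combined.get(b, 0.0) + wb * wb * wa / (wa + wb)
    return combined

# Toy example on the frame {theta1, theta2}
t1, t2 = frozenset({"theta1"}), frozenset({"theta2"})
m1 = {t1: 0.6, frozenset({"theta1", "theta2"}): 0.4}
m2 = {t2: 0.7, frozenset({"theta1", "theta2"}): 0.3}
print(pcr5_combine(m1, m2))  # masses still sum to 1
```

    Each conflicting product m1(A)m2(B) with A and B incompatible is given back to A and B in proportion to the masses they contributed, so the combined masses remain normalized.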

    Why do we optimize what we optimize in multiple view geometry?

    For a computer to be able to understand the 3D geometry of its environment, we need to derive the geometric relationships between 2D images and the 3D world. Multiple view geometry is the research area that studies this problem. Most existing methods solve small parts of this large problem by minimizing a certain objective function. These functions are typically composed of algebraic or geometric errors that represent the deviations from the observation model. In short, we generally try to recover the 3D structure of the world and the camera motion by finding the model that minimizes the discrepancy with respect to the observations. The focus of this thesis is mainly on two aspects of multi-view reconstruction problems: the error criteria and robustness. First, we study the error criteria used in various geometric problems and ask ourselves 'Why do we optimize what we optimize?' Specifically, we analyze their pros and cons and propose novel methods that combine the existing criteria or adopt a better alternative. Second, we aim for state-of-the-art robustness against outliers and challenging scenarios, which are often encountered in practice. To this end, we propose multiple novel ideas that can be incorporated into optimization-based methods. Specifically, we study the following problems: monocular SLAM, two-view and multi-view triangulation, single and multiple rotation averaging, rotation-only bundle adjustment, robust averaging of numbers, and quantitative evaluation of trajectory estimation.

    For monocular SLAM, we propose a novel hybrid approach that combines the strengths of direct and feature-based methods. Direct methods minimize the photometric errors between corresponding pixels in several images, while feature-based methods minimize the reprojection errors. Our method loosely couples direct odometry and feature-based SLAM, and we show that it improves robustness in challenging scenarios, as well as accuracy when the camera motion involves frequent revisits.

    For two-view triangulation, we propose optimal methods that minimize the angular reprojection errors in closed form. Since the angular error is rotationally invariant, these methods can be used for perspective, fisheye or omnidirectional cameras. Moreover, they are much faster than the optimal methods existing in the literature. Another two-view triangulation method we propose takes a completely different approach: we slightly modify the classical midpoint method and show that it provides a superior balance of 2D and 3D accuracy, even though it is not optimal. For multi-view triangulation, we propose a robust and efficient method using two-view RANSAC. We present several early termination criteria for two-view RANSAC using the midpoint method and show that they improve efficiency when the proportion of outlier measurements is high. Furthermore, we show that the uncertainty of a triangulated point can be modeled as a function of three factors: the number of cameras, the mean reprojection error, and the maximum parallax angle. By learning this model, the uncertainty can be interpolated for each case.

    For single rotation averaging, we propose a robust method based on the Weiszfeld algorithm. The main idea is to start with a robust initialization and perform an implicit outlier rejection scheme within the Weiszfeld algorithm to further increase robustness. In addition, we use an approximation of the chordal median in SO(3) that provides a significant speed-up of the method. For multiple rotation averaging, we propose HARA, a novel approach that incrementally initializes the rotation graph based on a hierarchy of triplet compatibility. Essentially, we build a spanning tree by prioritizing the edges with many strong triplet supports and gradually adding those with fewer and weaker supports. As a result, we reduce the risk of adding outliers to the initial solution, which allows us to filter out the outliers before the nonlinear optimization. Moreover, we show that the results can be improved by using the smoothed L0+ function in the local refinement step.

    Next, we propose rotation-only bundle adjustment, a novel method for estimating the absolute rotations of multiple views independently of the translations and the scene structure. The key is to minimize a specially designed cost function based on the normalized epipolar error, which is closely related to the optimal L1 angular reprojection error, among other geometric quantities. Our approach provides multiple benefits, such as complete immunity to inaccurate translations and triangulations, robustness against pure rotations and planar scenes, and improved accuracy when used after the rotation averaging described above. We also propose RODIAN, a robust method for averaging a set of numbers contaminated by a large proportion of outliers. In our method, we assume that the outliers are uniformly distributed within the range of the data, and we search for the region that is least likely to contain only outliers. We then take the median of the data within this region. Our method is fast, robust and deterministic, and does not rely on a known inlier error bound.

    Finally, for quantitative trajectory evaluation, we point out the weakness of the commonly used Absolute Trajectory Error (ATE) and propose a novel alternative called the Discernible Trajectory Error (DTE). In the presence of just a few outliers, the ATE loses its sensitivity to the trajectory error of the inliers and to the number of outliers. The DTE overcomes this weakness by aligning the estimated trajectory with the ground truth using a robust method based on several different types of medians. Using similar ideas, we also propose a rotation-only metric called the Discernible Rotation Error (DRE). In addition, we propose a simple method for calibrating the camera-to-marker rotation, which is a prerequisite for computing the DTE and DRE.
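    To illustrate the Weiszfeld-style single rotation averaging described above, here is a minimal Python sketch of the plain Weiszfeld iteration for the geodesic L1 mean of rotations; the robust initialization and the implicit outlier rejection proposed in the thesis are deliberately omitted, and the use of SciPy's Rotation class is an implementation choice for the sketch, not the thesis code:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def l1_rotation_mean(rotations, iters=50, eps=1e-9):
    """Plain Weiszfeld iteration for the geodesic L1 mean of rotations.
    `rotations` is a list of scipy Rotation objects; returns a Rotation."""
    # naive initialization via the mean rotation vector (the thesis uses a robust one)
    est = R.from_rotvec(np.mean([r.as_rotvec() for r in rotations], axis=0))
    for _ in range(iters):
        # residual rotation vectors in the tangent space at the current estimate
        vs = np.array([(est.inv() * r).as_rotvec() for r in rotations])
        norms = np.linalg.norm(vs, axis=1)
        w = 1.0 / np.maximum(norms, eps)              # Weiszfeld weights 1/distance
        step = (w[:, None] * vs).sum(axis=0) / w.sum()
        est = est * R.from_rotvec(step)               # update on the manifold
        if np.linalg.norm(step) < 1e-10:
            break
    return est

# Toy usage: noisy copies of a ground-truth rotation plus one gross outlier
rng = np.random.default_rng(0)
gt = R.from_rotvec([0.3, -0.2, 0.1])
samples = [gt * R.from_rotvec(0.02 * rng.standard_normal(3)) for _ in range(20)]
samples.append(R.from_rotvec([2.0, 0.0, 0.0]))        # outlier
print((l1_rotation_mean(samples).inv() * gt).magnitude())  # small angular error
```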

    WiFi-Based Human Activity Recognition Using Attention-Based BiLSTM

    Recently, significant efforts have been made to explore human activity recognition (HAR) techniques that use information gathered by existing indoor wireless infrastructures through WiFi signals, without requiring the monitored subject to carry a dedicated device. The key intuition is that different activities introduce different multi-path effects in WiFi signals and generate different patterns in the time series of channel state information (CSI). In this paper, we propose and evaluate a full pipeline for a CSI-based human activity recognition framework for 12 activities in three different spatial environments, using two deep learning models: ABiLSTM and CNN-ABiLSTM. Evaluation experiments have demonstrated that the proposed models outperform state-of-the-art models. The experiments also show that the proposed models can be applied to other environments with different configurations, albeit with some caveats. The proposed ABiLSTM model achieves an overall accuracy of 94.03%, 91.96%, and 92.59% across the three target environments, while the proposed CNN-ABiLSTM model reaches an accuracy of 98.54%, 94.25%, and 95.09% across those same environments.
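    As an illustration of the model family evaluated in this paper, the following is a hedged PyTorch sketch of an attention-based BiLSTM classifier over CSI windows; the number of subcarriers, the hidden size, and the exact attention formulation are assumptions for the sketch rather than the paper's configuration:

```python
import torch
import torch.nn as nn

class ABiLSTM(nn.Module):
    """Sketch of an attention-based BiLSTM classifier for CSI sequences
    shaped (batch, time, subcarriers). Sizes here are illustrative only."""
    def __init__(self, n_subcarriers=90, hidden=128, n_classes=12):
        super().__init__()
        self.lstm = nn.LSTM(n_subcarriers, hidden, batch_first=True,
                            bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)   # one attention score per time step
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, x):
        h, _ = self.lstm(x)                            # (B, T, 2*hidden)
        alpha = torch.softmax(self.attn(h), dim=1)     # attention weights over time
        context = (alpha * h).sum(dim=1)               # weighted sum of hidden states
        return self.fc(context)                        # class logits

# Toy forward pass: batch of 4 CSI windows, 200 time steps, 90 subcarriers
model = ABiLSTM()
logits = model(torch.randn(4, 200, 90))
print(logits.shape)  # torch.Size([4, 12])
```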

    Machine learning for the automation and optimisation of optical coordinate measurement

    Camera-based methods for optical coordinate metrology are growing in popularity due to their non-contact probing technique, fast data acquisition time, high point density and high surface coverage. However, these optical approaches are often highly user dependent, depend heavily on accurate system characterisation, and can be slow in processing the raw data acquired during measurement. Machine learning approaches have the potential to remedy the shortcomings of such optical coordinate measurement systems. The aim of this thesis is to remove dependence on the user entirely by enabling full automation and optimisation of optical coordinate measurements for the first time. A novel software pipeline is proposed, built, and evaluated which enables automated and optimised measurements to be conducted. No such automated and optimised system for performing optical coordinate measurements currently exists. The pipeline can be roughly summarised as follows: intelligent characterisation -> view planning -> object pose estimation -> automated data acquisition -> optimised reconstruction. Several novel methods were developed in order to enable the embodiment of this pipeline.

    In Chapter 4, an intelligent camera characterisation (the process of determining a mathematical model of the optical system) is performed using a hybrid approach wherein an EfficientNet convolutional neural network provides sub-pixel corrections to feature locations provided by the popular OpenCV library. The proposed characterisation scheme is shown to robustly refine the characterisation result, as quantified by a 50 % reduction in the mean residual magnitude. The camera characterisation is performed before measurements are taken and its results are fed as an input to the pipeline. In Chapter 5, a novel genetic optimisation approach is presented to create an imaging strategy, i.e. the positions from which data should be captured relative to the part's specific geometry. This approach exploits the computer aided design (CAD) data of a given part, ensuring any measurement is optimal for the specific target geometry. This view planning approach is shown to give reconstructions with closer agreement to tactile coordinate measurement machine (CMM) results from 18 images than unoptimised measurements using 60 images. The view planning algorithm assumes the part is perfectly placed in the centre of the measurement volume, so it is first adjusted for an arbitrary placement of the part before being used for data acquisition. In Chapter 6, a generative model for the creation of surface texture data is presented, allowing the generation of synthetic but realistic datasets for the training of statistical models. The surface texture generated by the proposed model is shown to be quantitatively representative of real focus variation microscope measurements. The model developed in this chapter is used to produce large synthetic but realistic datasets for the training of further statistical models. In Chapter 7, an autonomous background removal approach is proposed which removes superfluous data from images captured during a measurement. Using images processed by this algorithm to reconstruct a 3D measurement of an object is shown to be effective in reducing data processing times and improving measurement results. Reconstructions using the proposed background removal are shown to benefit from up to a 41 % reduction in data processing times, a reduction in superfluous background points of up to 98 %, an increase in point density on the object surface of up to 10 %, and improved agreement with the CMM, as measured by both a reduction in outliers and a reduction of up to 51 microns in the standard deviation of point-to-mesh distances. The background removal algorithm is used both to improve the final reconstruction and within stereo pose estimation. Finally, in Chapter 8, two methods (one monocular and one stereo) for establishing the initial pose of the part to be measured relative to the measurement volume are presented. This is an important step towards automation, as it allows the user to place the object at an arbitrary location in the measurement volume and the pipeline to adjust the imaging strategy to account for this placement, enabling the optimised view plan to be carried out without the need for special part fixturing. It is shown that the monocular method can locate a part to within an average of 13 mm and the stereo method can locate a part to within an average of 0.44 mm, as evaluated on 240 test images. Pose estimation is used to provide a correction to the view plan for an arbitrary part placement without the need for specialised fixturing or fiducial marking.

    This pipeline enables an inexperienced user to place a part anywhere in the measurement volume of a system and, from the part's associated CAD data, the system will perform an optimal measurement without the need for any user input. Each new method developed as part of this pipeline has been validated against real experimental data from current measurement systems and shown to be effective. In Section 9.1, a possible hardware integration of the methods developed in this thesis is presented as future work, although the creation of this hardware is beyond the scope of this thesis.
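    For context on the characterisation step refined in Chapter 4, the sketch below shows the standard OpenCV chessboard calibration baseline (detect corners, refine them to sub-pixel precision, fit the camera model); the folder name, target geometry and square size are hypothetical, and the thesis's EfficientNet-based corrections are not reproduced here:

```python
import glob
import cv2
import numpy as np

# Standard OpenCV chessboard characterisation baseline (the thesis refines the
# detected corner locations with a CNN before the model-fitting step).
PATTERN = (9, 6)            # inner-corner grid of the target (assumed)
SQUARE = 10.0               # square size in mm (assumed)

objp = np.zeros((PATTERN[0] * PATTERN[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:PATTERN[0], 0:PATTERN[1]].T.reshape(-1, 2) * SQUARE

obj_pts, img_pts, size = [], [], None
for path in glob.glob("calib_images/*.png"):   # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, PATTERN)
    if not found:
        continue
    # classical sub-pixel refinement; the thesis augments this with EfficientNet
    corners = cv2.cornerSubPix(
        gray, corners, (11, 11), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    obj_pts.append(objp)
    img_pts.append(corners)
    size = gray.shape[::-1]

assert obj_pts, "no usable calibration images found"
rms, K, dist, _, _ = cv2.calibrateCamera(obj_pts, img_pts, size, None, None)
print("RMS reprojection error of the fitted camera model:", rms)
```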

    Convolutional Neural Network in Pattern Recognition

    Since the convolutional neural network (CNN) was first implemented by Yann LeCun et al. in 1989, CNN and its variants have been widely applied to numerous topics in pattern recognition and are considered among the most crucial techniques in the fields of artificial intelligence and computer vision. This dissertation not only demonstrates the implementation aspect of CNN, but also lays emphasis on the methodology of neural network (NN) based classifiers. The general pipeline of an NN-based classifier can be recognized as three stages: pre-processing, inference by models, and post-processing. To demonstrate the importance of pre-processing techniques, this dissertation presents how to model actual problems in medical pattern recognition and image processing by introducing conceptual abstraction and fuzzification. In particular, a transformer based on the self-attention mechanism, namely the beat-rhythm transformer, greatly benefits from correct R-peak detection results and conceptual fuzzification. The recently proposed self-attention mechanism has proven to be a top performer in the fields of computer vision and natural language processing. Despite the accuracy and precision it achieves, it usually consumes huge computational resources to perform self-attention. Therefore, a real-time global attention network is proposed to make a better trade-off between efficiency and performance for the task of image segmentation. To illustrate more on the inference stage, we also propose models to detect polyps via Faster R-CNN, one of the most popular CNN-based 2D detectors, as well as a 3D object detection pipeline, powered by CNN, for regressing 3D bounding boxes from LiDAR points and stereo image pairs. The goal of the post-processing stage is to refine the artifacts inferred by the models. For the semantic segmentation task, the dilated continuous random field is proposed as a better fit for CNN-based models than the widely implemented fully-connected continuous random field. The proposed approaches can be further integrated into a reinforcement learning architecture for robotics.
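    To make the computational cost of global self-attention concrete, here is a minimal PyTorch sketch of scaled dot-product self-attention over an image feature map; the quadratic (HW x HW) attention matrix is the bottleneck that a real-time global attention network must mitigate. This is a generic sketch, not the dissertation's architecture:

```python
import torch
import torch.nn as nn

class SelfAttention2D(nn.Module):
    """Minimal scaled dot-product self-attention over an image feature map.
    The (HW x HW) attention matrix is what makes global attention expensive."""
    def __init__(self, channels):
        super().__init__()
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)        # (B, HW, C)
        k = self.k(x).flatten(2)                        # (B, C, HW)
        v = self.v(x).flatten(2).transpose(1, 2)        # (B, HW, C)
        attn = torch.softmax(q @ k / c ** 0.5, dim=-1)  # (B, HW, HW): quadratic cost
        out = (attn @ v).transpose(1, 2).reshape(b, c, h, w)
        return out + x                                  # residual connection

feat = torch.randn(1, 64, 32, 32)
print(SelfAttention2D(64)(feat).shape)  # torch.Size([1, 64, 32, 32])
```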

    Image-Based Rendering Of Real Environments For Virtual Reality


    Markerless Human Motion Analysis

    Measuring and understanding human motion is crucial in several domains, ranging from neuroscience to rehabilitation and sports biomechanics. Quantitative information about human motion is fundamental to study how our Central Nervous System controls and organizes movements and to functionally evaluate motor performance and deficits. In the last decades, the research in this field has made considerable progress. State-of-the-art technologies that provide useful and accurate quantitative measures rely on marker-based systems. Unfortunately, markers are intrusive and their number and location must be determined a priori. Also, marker-based systems require expensive laboratory settings with several infrared cameras. This can compromise the naturalness of a subject's movements and induce discomfort. Last but not least, they are computationally expensive in time and space. Recent advances in markerless pose estimation based on computer vision and deep neural networks are opening up the possibility of adopting efficient video-based methods for extracting movement information from RGB video data. In this context, this thesis presents original contributions towards the following objectives: (i) the implementation of a video-based markerless pipeline to quantitatively characterize human motion; (ii) the assessment of its accuracy compared with a gold standard marker-based system; (iii) the application of the pipeline to different domains in order to verify its versatility, with a special focus on the characterization of the motion of preterm infants and on gait analysis. With the proposed approach we highlight that, starting only from RGB videos and leveraging computer vision and machine learning techniques, it is possible to extract reliable information characterizing human motion that is comparable to that obtained with gold standard marker-based systems.
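    As a rough illustration of the first stage of such a video-based pipeline, the sketch below extracts 2D keypoints from an RGB video with an off-the-shelf pose estimator; MediaPipe and the file name are used purely as illustrative assumptions, not as the specific tools of this thesis:

```python
import cv2
import mediapipe as mp  # one off-the-shelf pose estimator; the thesis pipeline
                        # may rely on a different markerless backbone

def extract_keypoints(video_path):
    """Run a generic markerless pose estimator on an RGB video and return,
    per frame, the normalized (x, y) image coordinates of each landmark."""
    pose = mp.solutions.pose.Pose(static_image_mode=False)
    cap = cv2.VideoCapture(video_path)
    trajectory = []
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        if result.pose_landmarks:
            trajectory.append([(lm.x, lm.y) for lm in result.pose_landmarks.landmark])
        else:
            trajectory.append(None)   # frame where detection failed
    cap.release()
    return trajectory

# The resulting keypoint time series can then be filtered and converted into
# joint kinematics for comparison against a marker-based gold standard.
kps = extract_keypoints("gait_trial.mp4")  # hypothetical file name
```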

    Visual-Inertial State Estimation With Information Deficiency

    State estimation is an essential part of intelligent navigation and mapping systems where tracking the location of a smartphone, car, robot, or human-worn device is required. For autonomous systems such as micro aerial vehicles and self-driving cars, it is a prerequisite for control and motion planning. For AR/VR applications, it is the first step towards image rendering. Visual-inertial odometry (VIO) is the de facto standard algorithm for embedded platforms because it lends itself to lightweight sensors and processors and has reached maturity in research and industrial development. Various approaches have been proposed to achieve accurate real-time tracking, and numerous open-source software packages and datasets are available. However, errors and outliers are common due to the complexity of visual measurement processes and environmental changes, and in practice, estimation drift is inevitable. In this thesis, we introduce the concept of information deficiency in state estimation and show how to utilize this concept to develop and improve VIO systems. We look into the information deficiencies in visual-inertial state estimation, which are often present and ignored, causing system failures and drift. In particular, we investigate three critical cases of information deficiency in visual-inertial odometry: a low-texture environment with limited computation, monocular visual odometry, and inertial odometry. We consider these systems under three specific application settings: a lightweight quadrotor platform in autonomous flight, driving scenarios, and an AR/VR headset for pedestrians. We address the challenges in each application setting and explore how the tight fusion of deep learning and model-based VIO can improve state-of-the-art system performance and compensate for the lack of information in real time. We identify deep learning as a key technology in tackling the information deficiencies in state estimation. We argue that developing hybrid frameworks that leverage its advantages and enable supervision for performance guarantees provides the most accurate and robust solution to state estimation.
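    To make concrete why drift is inevitable when visual information is deficient, here is a minimal Python sketch of naive IMU dead reckoning (strapdown integration); the 200 Hz rate and the accelerometer bias value are illustrative assumptions:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

GRAVITY = np.array([0.0, 0.0, -9.81])

def propagate(state, gyro, accel, dt):
    """One step of naive IMU dead reckoning (strapdown integration).
    state = (R_wb, v_w, p_w); small bias/noise errors integrate into drift,
    which is why purely inertial odometry degrades without visual updates."""
    R_wb, v, p = state
    R_new = R_wb * R.from_rotvec(gyro * dt)          # attitude update
    a_world = R_wb.apply(accel) + GRAVITY            # remove gravity in world frame
    v_new = v + a_world * dt                         # velocity integration
    p_new = p + v * dt + 0.5 * a_world * dt ** 2     # position integration
    return R_new, v_new, p_new

# Stationary IMU with a tiny accelerometer bias: position error grows quadratically
state = (R.identity(), np.zeros(3), np.zeros(3))
bias = np.array([0.02, 0.0, 0.0])                    # 0.02 m/s^2 bias (assumed)
for _ in range(200 * 10):                            # 10 s of data at 200 Hz
    state = propagate(state, np.zeros(3), np.array([0, 0, 9.81]) + bias, 0.005)
print("drift after 10 s:", state[2])                 # roughly 1 m along x
```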

    Forum Bildverarbeitung 2020

    Image processing plays a key role for fast and contact-free data acquisition in many technical areas, e.g., in quality control or robotics. These conference proceedings of the “Forum Bildverarbeitung”, which took place on 26.-27.11.2020 in Karlsruhe as a joint event of the Karlsruhe Institute of Technology and the Fraunhofer Institute of Optronics, System Technologies and Image Exploitation, contain the articles of the contributions.