148 research outputs found

    Interactive energy minimizing segmentation frameworks

    [no abstract available]

    Human Shape-Motion Analysis in Athletics Videos for Coarse-to-Fine Action/Activity Recognition Using Transferable Belief Model

    We present an automatic human shape-motion analysis method based on a fusion architecture for human action and activity recognition in athletics videos. Robust shape and motion features are extracted from human detection and tracking. The features are combined within the Transferable Belief Model (TBM) framework for two levels of recognition. The TBM-based modelling of the fusion process takes into account the imprecision, uncertainty and conflict inherent to the features. First, in a coarse step, actions are roughly recognized. Then, in a fine step, an action sequence recognition method is used to discriminate activities. Beliefs on actions are smoothed by a Temporal Credal Filter, and action sequences, i.e. activities, are recognized using a TBM-based state machine called the belief scheduler. The belief scheduler is also exploited to extract feedback information in order to improve tracking results. The system is tested on real videos of athletics meetings to recognize four types of actions (running, jumping, falling and standing) and four types of activities (high jump, pole vault, triple jump and long jump). Results on actions, activities and feedback demonstrate the relevance of the proposed features as well as the efficiencyy of the proposed TBM-based recognition approach.
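    The core fusion step described above, the TBM's conjunctive combination, keeps mass on the empty set as an explicit measure of conflict between sources. A minimal sketch follows; the frame of four actions matches the abstract, but the mass values assigned to the shape and motion features are invented for illustration:

```python
from itertools import product

def conjunctive_combine(m1, m2):
    """Unnormalized conjunctive rule of the Transferable Belief Model:
    mass landing on the empty set quantifies conflict between sources."""
    out = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b  # frozenset intersection; empty set collects conflict
        out[inter] = out.get(inter, 0.0) + wa * wb
    return out

# The four actions recognized in the paper; mass values are made up.
R, J, F, S = "running", "jumping", "falling", "standing"
theta = frozenset({R, J, F, S})
shape = {frozenset({R, J}): 0.7, theta: 0.3}    # shape feature evidence
motion = {frozenset({J}): 0.6, theta: 0.4}      # motion feature evidence

fused = conjunctive_combine(shape, motion)
# Most of the mass now concentrates on "jumping".
```

Because the rule is unnormalized, total mass stays at 1 and any conflict would remain visible as mass on the empty set, which is what the Temporal Credal Filter and belief scheduler can then exploit.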

    Contributions to Robust Multi-view 3D Action Recognition

    This thesis focuses on human action recognition using volumetric reconstructions obtained from multiple monocular cameras. The problem of action recognition has been addressed using different approaches, both in the 2D and 3D domains, and using one or multiple views. However, the development of robust recognition methods, independent of the view employed, remains an open problem. Multi-view approaches make it possible to exploit 3D information to improve recognition performance. Nevertheless, manipulating the large amount of information in 3D representations poses a major problem. As a consequence, standard dimensionality reduction techniques must be applied prior to the use of machine learning approaches. The first contribution of this work is a new descriptor of volumetric information that can be further reduced using standard dimensionality reduction techniques in both holistic and sequential recognition approaches. The descriptor itself reduces the amount of data by up to an order of magnitude (compared to previous descriptors) without affecting classification performance. The descriptor represents the volumetric information obtained by Shape-from-Silhouette (SfS) techniques. However, this family of techniques is highly influenced by errors in the segmentation process (e.g., under-segmentation causes false negatives in the reconstructed volumes), so the recognition performance is highly affected by this first step. The second contribution of this work is a new SfS technique (named SfSDS) that employs the Dempster-Shafer theory to fuse evidence provided by multiple cameras. The central idea is to consider the relative position between cameras so as to deal with inconsistent silhouettes and obtain robust volumetric reconstructions. The basic SfS technique still has a major drawback: it requires the whole volume to be analyzed in order to obtain the reconstruction.
On the other hand, octree-based representations allow saving memory and time by employing a dynamic tree structure in which only occupied nodes are stored. Nevertheless, applying the SfS method to octree-based representations is not straightforward. The final contribution of this work is a method for generating octrees using our proposed SfSDS technique, so as to obtain robust and compact volumetric representations.
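    The fragility of basic SfS to segmentation errors is easy to see in code: a voxel survives only if every silhouette votes for it, so a single false negative deletes it. A toy sketch, where the projection function and silhouette images are invented stand-ins:

```python
import numpy as np

def basic_sfs(voxels, cameras):
    """Basic Shape-from-Silhouette: keep a voxel only if it projects inside
    the silhouette of every camera (strict intersection). One under-segmented
    view (a false negative) therefore deletes a truly occupied voxel, the
    failure mode that SfSDS addresses by fusing evidence instead."""
    occ = np.ones(len(voxels), dtype=bool)
    for silhouette, project in cameras:
        for i, v in enumerate(voxels):
            u, w = project(v)                 # toy projection to pixel coords
            occ[i] &= bool(silhouette[w, u])  # strict AND across all views
    return occ

# Two voxels observed by two cameras; camera 2 under-segments voxel 1.
voxels = [(0, 0, 0), (1, 0, 0)]
proj = lambda v: (v[0], 0)
cam1 = (np.array([[1, 1]]), proj)
cam2 = (np.array([[1, 0]]), proj)   # false negative at pixel (1, 0)
occupancy = basic_sfs(voxels, [cam1, cam2])
```

An evidence-fusing variant in the spirit of SfSDS would instead assign each camera a mass function over {occupied, empty} and combine them, letting one unreliable silhouette lower belief rather than veto the voxel outright.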

    Plausibility Verification for 3D Object Detectors Using Energy-Based Optimization

    Environmental perception obtained via object detectors has no predictable safety layer encoded into its model schema, which raises the question of the trustworthiness of the system's predictions. As recent adversarial attacks show, most current object detection networks are vulnerable to input tampering, which in the real world could compromise the safety of autonomous vehicles. The problem is amplified further when uncertainty errors cannot propagate into submodules that are not part of an end-to-end system design. To address these concerns, a parallel module is required that verifies the object proposals predicted by deep neural networks. This work aims to verify 3D object proposals from the MonoRUn model by proposing a plausibility framework that leverages cross-sensor streams to reduce false positives. The proposed verification metric uses prior knowledge in the form of four different energy functions, each utilizing a certain prior to output an energy value, leading to a plausibility justification for the hypothesis under consideration. We also employ a novel two-step schema to improve the optimization of the composite energy function representing the energy model.
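    The shape of such a composite energy can be sketched in a few lines. This is only an illustration of the general pattern, a weighted sum of prior-based terms compared against a threshold; the individual energy functions, the toy hypothesis fields, and the threshold `tau` are placeholders, not the paper's actual priors:

```python
def composite_energy(hypothesis, energy_terms, weights):
    """Weighted sum of prior-based energy terms; each term maps a 3D object
    proposal to a scalar energy (lower means more plausible)."""
    return sum(w * e(hypothesis) for e, w in zip(energy_terms, weights))

def is_plausible(hypothesis, energy_terms, weights, tau):
    """A proposal passes verification when its total energy stays below tau."""
    return composite_energy(hypothesis, energy_terms, weights) < tau

# Toy example: a hypothesis is a dict; both prior terms are invented.
box = {"volume": 8.0, "ground_offset": 0.1}
terms = [lambda h: abs(h["ground_offset"]),        # e.g. objects rest on the ground
         lambda h: max(0.0, h["volume"] - 40.0)]   # e.g. penalize implausibly large boxes
ok = is_plausible(box, terms, [1.0, 1.0], tau=0.5)
```

In the paper the weights themselves are tuned in a two-step optimization; here they are fixed constants purely for illustration.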

    Sensor fusion in smart camera networks for ambient intelligence

    This short report introduces the topics of PhD research conducted in 2008-2013 and defended in July 2013. The PhD thesis covers sensor fusion theory, gathers it into a framework with design rules for fusion-friendly design of vision networks, and elaborates on the rules through fusion experiments performed with four distinct applications of Ambient Intelligence.

    Information selection and fusion in vision systems

    Handling the enormous amounts of data produced by data-intensive imaging systems, such as multi-camera surveillance systems and microscopes, is technically challenging. While image and video compression help to manage the data volumes, they do not address the basic problem of information overflow. In this PhD we tackle the problem in a more drastic way. We select information of interest to a specific vision task, and discard the rest. We also combine data from different sources into a single output product, which presents the information of interest to end users in a suitable, summarized format. We treat two types of vision systems. The first type is conventional light microscopes. During this PhD, we have exploited for the first time the potential of the curvelet transform for image fusion for depth-of-field extension, allowing us to combine the advantages of multi-resolution image analysis for image fusion with increased directional sensitivity. As a result, the proposed technique clearly outperforms state-of-the-art methods, both on real microscopy data and on artificially generated images. The second type is camera networks with overlapping fields of view. To enable joint processing in such networks, inter-camera communication is essential. Because of infrastructure costs, power consumption for wireless transmission, etc., transmitting high-bandwidth video streams between cameras should be avoided. Fortunately, recently designed 'smart cameras', which have on-board processing and communication hardware, allow distributing the required image processing over the cameras. This permits compactly representing useful information from each camera. We focus on representing information for people localization and observation, which are important tools for statistical analysis of room usage, quick localization of people in case of building fires, etc. 
To further save bandwidth, we select which cameras should be involved in a vision task and transmit observations only from the selected cameras. We provide an information-theoretically founded framework for general-purpose camera selection based on the Dempster-Shafer theory of evidence. Applied to tracking, it allows tracking people with a dynamic selection of as few as three cameras with the same accuracy as when using up to ten cameras.
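    One plausible reading of such Dempster-Shafer-based selection is a greedy search that repeatedly adds the camera whose evidence most reduces the remaining uncertainty of the fused belief. The sketch below uses Hartley nonspecificity as a simple stand-in for the paper's information-theoretic criterion; the frame of room cells and the per-camera mass functions are invented:

```python
import math
from itertools import product

def conjunctive(m1, m2):
    """Unnormalized conjunctive combination of two mass functions."""
    out = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        out[a & b] = out.get(a & b, 0.0) + wa * wb
    return out

def nonspecificity(m):
    """Hartley-style nonspecificity: high when mass sits on large focal sets."""
    return sum(w * math.log2(len(a)) for a, w in m.items() if len(a) > 0)

def greedy_select(cams, k):
    """Greedily add the camera whose evidence most reduces nonspecificity."""
    theta = frozenset().union(*(a for m in cams.values() for a in m))
    fused, chosen = {theta: 1.0}, []
    for _ in range(k):
        best = min((c for c in cams if c not in chosen),
                   key=lambda c: nonspecificity(conjunctive(fused, cams[c])))
        chosen.append(best)
        fused = conjunctive(fused, cams[best])
    return chosen, fused

# Invented frame of room cells and per-camera beliefs about a person's cell.
theta = frozenset({"A", "B", "C", "D"})
cams = {
    "cam1": {frozenset({"A", "B"}): 0.8, theta: 0.2},
    "cam2": {frozenset({"B"}): 0.9, theta: 0.1},
    "cam3": {theta: 1.0},                 # uninformative view
}
chosen, fused = greedy_select(cams, k=2)
# The uninformative cam3 is never selected.
```

The precise camera poses, observation models, and selection metric in the thesis differ; the point here is only the structure of evidence-driven selection.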

    Mesh-based 3D Textured Urban Mapping

    In the era of autonomous driving, urban mapping represents a core step in letting vehicles interact with the urban context. Successful mapping algorithms proposed in the last decade build the map by leveraging data from a single sensor. The focus of the system presented in this paper is twofold: the joint estimation of a 3D map from lidar data and images, based on a 3D mesh, and its texturing. Indeed, even though most surveying vehicles for mapping are equipped with cameras and lidar, existing mapping algorithms usually rely on either images or lidar data; moreover, both image-based and lidar-based systems often represent the map as a point cloud, while a continuous textured mesh representation would be useful for visualization and navigation purposes. In the proposed framework, we join the accuracy of the 3D lidar data and the dense information and appearance carried by the images by estimating a visibility-consistent map upon the lidar measurements and refining it photometrically through the acquired images. We evaluate the proposed framework on the KITTI dataset and show the performance improvement with respect to two state-of-the-art urban mapping algorithms and two widely used surface reconstruction algorithms in computer graphics.
    Comment: accepted at IROS 201

    A Novel Multi-Stage Fusion based Approach for Gene Expression Profiling in Non-Small Cell Lung Cancer

    Background: Non-small cell lung cancer is defined at the molecular level by mutations and alterations to oncogenes, including AKT1, ALK, BRAF, EGFR, HER2, KRAS, MEK1, MET, NRAS, PIK3CA, RET, and ROS1. A better understanding of non-small cell lung cancer requires a thorough consideration of these oncogenes. However, the complexity of the problem arises from the high-dimensional gene vector space, which complicates the identification of cluster boundaries, and hence gene expression cluster membership. This paper aims to analyze potential biological biomarkers for tumorigenesis in lung cancer under different treatment solutions. Results: Genes BRAF, RET, and ROS1 show an overexpression transition by one cluster from the non-treatment to the treatment states, followed by a stabilization in the 3 treatment states at the same cluster. Genes MET, ALK, and PIK3CA show an overexpression transition by two clusters from the non-treatment to the treatment states, followed by a stabilization in the 3 treatment states at the same cluster. SME1 shows an under-expression transition by two clusters from the non-treatment to the treatment states, followed by a stabilization in the 3 treatment states at the same cluster. Conclusions: We present a novel fusion-based approach for gene expression profiling of non-small cell lung cancer under non-thermal plasma treatment. The main contribution of the proposed approach is to exploit Dempster-Shafer evidence theory-based data fusion to combine information from different samples in the considered dataset. This minimizes uncertainty and enhances the reliability and validity of decisions, leading to a better description of genes related to non-small cell lung cancer. We also propose the use of fuzzy c-means-with-range clustering to track changes in gene states under different non-thermal plasma treatments.
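    The "with-range" variant used in the paper is a modification not reproduced here, but the standard fuzzy c-means alternation it builds on, a soft-membership update followed by a weighted-centroid update, is short. The toy 1D expression profiles below are invented:

```python
import numpy as np

def fuzzy_cmeans(X, c, m=2.0, iters=50, seed=0):
    """Standard fuzzy c-means with fuzzifier m: alternate soft-membership
    and weighted-centroid updates. Rows of U are cluster memberships."""
    rng = np.random.default_rng(seed)
    U = rng.dirichlet(np.ones(c), size=len(X))      # memberships, rows sum to 1
    for _ in range(iters):
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Distances of every point to every center (small epsilon avoids /0).
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
    return U, centers

# Two well-separated groups of 1D profiles yield near-hard memberships,
# which is how cluster transitions between treatment states can be tracked.
X = np.array([[0.0], [0.1], [5.0], [5.1]])
U, centers = fuzzy_cmeans(X, c=2)
```

Tracking a gene's state across treatments then amounts to recording which cluster receives its maximal membership in each condition.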

    Combination of Evidence in Dempster-Shafer Theory

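    No abstract is available for the entry above, but the rule its title names is standard: Dempster's combination, i.e. the conjunctive rule followed by renormalization by the conflict mass. A minimal sketch, using Zadeh's classic near-conflict example as input:

```python
from itertools import product

def dempster_combine(m1, m2):
    """Dempster's rule: conjunctive combination of two mass functions,
    renormalized by 1 - K, where K is the conflict (mass on the empty set)."""
    out = {}
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        out[a & b] = out.get(a & b, 0.0) + wa * wb
    K = out.pop(frozenset(), 0.0)
    if K >= 1.0:
        raise ValueError("totally conflicting sources cannot be combined")
    return {a: w / (1.0 - K) for a, w in out.items()}

# Zadeh's example: two almost fully conflicting experts end up agreeing
# with certainty on the option both considered nearly impossible.
m1 = {frozenset({"A"}): 0.99, frozenset({"B"}): 0.01}
m2 = {frozenset({"C"}): 0.99, frozenset({"B"}): 0.01}
combined = dempster_combine(m1, m2)
```

The counter-intuitive result of this example is precisely why alternatives such as the unnormalized TBM rule, which keeps the conflict mass visible, are used in several of the works listed above.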