1,620 research outputs found

    3D Motion Reconstruction from 2D Motion Data Using Multimodal Conditional Deep Belief Network

    Get PDF
    In this paper, we propose a deep generative model named Multimodal Conditional Deep Belief Network (MCDBN) for cross-modal learning of 3D motion data and their non-injective 2D projections on the image plane. This model has a three sectional structure, which learns conditional probability distribution of 3D motion data given 2D projections. Two distinct Conditional Deep Belief Networks (CDBNs), encode the real-valued spatiotemporal patterns of 2D and 3D motion time series captured from subjects’ movements into the compact representations. The third part includes a Multimodal Restricted Boltzmann Machines (MRBMs) which in the training process, learns the relationship between the compact representations of data modalities by variation information criteria. As a result, conditioned on a 2D motion data obtained from a video, MCDBN can regenerate 3D motion data in generation phase. We introduce Pearson correlation coefficient of ground truth and regenerated motion signals as a new evaluation metric in motion reconstruction problems. The model is trained with human motion capture data and the results show that the real and the regenerated signals are highly correlated which means the model can reproduce dynamical patterns of motion accurately

    Deep Learning in Cardiology

    Full text link
    The medical field is creating large amount of data that physicians are unable to decipher and use efficiently. Moreover, rule-based expert systems are inefficient in solving complicated medical tasks or for creating insights using big data. Deep learning has emerged as a more accurate and effective technology in a wide range of medical problems such as diagnosis, prediction and intervention. Deep learning is a representation learning method that consists of layers that transform the data non-linearly, thus, revealing hierarchical relationships and structures. In this review we survey deep learning application papers that use structured data, signal and imaging modalities from cardiology. We discuss the advantages and limitations of applying deep learning in cardiology that also apply in medicine in general, while proposing certain directions as the most viable for clinical use.Comment: 27 pages, 2 figures, 10 table

    Development of a probabilistic perception system for camera-lidar sensor fusion

    Get PDF
    La estimación de profundidad usando diferentes sensores es uno de los desafíos clave para dotar a las máquinas autónomas de sólidas capacidades de percepción robótica. Ha habido un avance sobresaliente en el desarrollo de técnicas de estimación de profundidad unimodales basadas en cámaras monoculares, debido a su alta resolución o sensores LiDAR, debido a los datos geométricos precisos que proporcionan. Sin embargo, cada uno de ellos presenta inconvenientes inherentes, como la alta sensibilidad a los cambios en las condiciones de iluminación en el caso delas cámaras y la resolución limitada de los sensores LiDAR. La fusión de sensores se puede utilizar para combinar los méritos y compensar las desventajas de estos dos tipos de sensores. Sin embargo, los métodos de fusión actuales funcionan a un alto nivel. Procesan los flujos de datos de los sensores de forma independiente y combinan las estimaciones de alto nivel obtenidas para cada sensor. En este proyecto, abordamos el problema en un nivel bajo, fusionando los flujos de sensores sin procesar, obteniendo así estimaciones de profundidad que son densas y precisas, y pueden usarse como una fuente de datos multimodal unificada para problemas de estimación de nivel superior. Este trabajo propone un modelo de campo aleatorio condicional (CRF) con múltiples potenciales de geometría y apariencia que representa a la perfección el problema de estimar mapas de profundidad densos a partir de datos de cámara y LiDAR. El modelo se puede optimizar de manera eficiente utilizando el algoritmo Conjúgate Gradient Squared (CGS). El método propuesto se evalúa y compara utilizando el conjunto de datos proporcionado por KITTI Datset. Adicionalmente, se evalúa cualitativamente el modelo, usando datos adquiridos por el autor de esté trabajoMulti-modal depth estimation is one of the key challenges for endowing autonomous machines with robust robotic perception capabilities. There has been an outstanding advance in the development of uni-modal depth estimation techniques based on either monocular cameras, because of their rich resolution or LiDAR sensors due to the precise geometric data they provide. However, each of them suffers from some inherent drawbacks like high sensitivity to changes in illumination conditions in the case of cameras and limited resolution for the LiDARs. Sensor fusion can be used to combine the merits and compensate the downsides of these two kinds of sensors. Nevertheless, current fusion methods work at a high level. They processes sensor data streams independently and combine the high level estimates obtained for each sensor. In this thesis, I tackle the problem at a low level, fusing the raw sensor streams, thus obtaining depth estimates which are both dense and precise, and can be used as a unified multi-modal data source for higher level estimation problems. This work proposes a Conditional Random Field (CRF) model with multiple geometry and appearance potentials that seamlessly represents the problem of estimating dense depth maps from camera and LiDAR data. The model can be optimized efficiently using the Conjugate Gradient Squared (CGS) algorithm. The proposed method was evaluated and compared with the state-of-the-art using the commonly used KITTI benchmark dataset. In addition, the model is qualitatively evaluated using data acquired by the author of this work.MaestríaMagíster en Ingeniería de Desarrollo de Producto
    • …
    corecore