7,645 research outputs found

    Single camera pose estimation using Bayesian filtering and Kinect motion priors

    Traditional approaches to upper body pose estimation using monocular vision rely on complex body models and a large variety of geometric constraints. We argue that this is not ideal and somewhat inelegant as it results in large processing burdens, and instead attempt to incorporate these constraints through priors obtained directly from training data. A prior distribution covering the probability of a human pose occurring is used to incorporate likely human poses. This distribution is obtained offline, by fitting a Gaussian mixture model to a large dataset of recorded human body poses, tracked using a Kinect sensor. We combine this prior information with a random walk transition model to obtain an upper body model, suitable for use within a recursive Bayesian filtering framework. Our model can be viewed as a mixture of discrete Ornstein-Uhlenbeck processes, in that states behave as random walks, but drift towards a set of typically observed poses. This model is combined with measurements of the human head and hand positions, using recursive Bayesian estimation to incorporate temporal information. Measurements are obtained using face detection and a simple skin colour hand detector, trained using the detected face. The suggested model is designed with analytical tractability in mind and we show that the pose tracking can be Rao-Blackwellised using the mixture Kalman filter, allowing for computational efficiency while still incorporating bio-mechanical properties of the upper body. In addition, the use of the proposed upper body model allows reliable three-dimensional pose estimates to be obtained indirectly for a number of joints that are often difficult to detect using traditional object recognition strategies. Comparisons with Kinect sensor results and the state of the art in 2D pose estimation highlight the efficacy of the proposed approach. Comment: 25 pages, Technical report, related to Burke and Lasenby, AMDO 2014 conference paper.
Code sample: https://github.com/mgb45/SignerBodyPose Video: https://www.youtube.com/watch?v=dJMTSo7-uF
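
As a rough illustration of the transition model described in this abstract, the sketch below simulates a single-component discrete Ornstein-Uhlenbeck process: each state takes a random-walk step but is pulled towards a typical pose. The function names, the drift rate `theta`, the noise level `sigma`, and the two-dimensional "pose" are assumptions made for illustration only. The report uses a Gaussian mixture of such components learned from Kinect data, which is not reproduced here.

```python
import random


def ou_step(state, typical_pose, theta, sigma, rng):
    """One discrete Ornstein-Uhlenbeck step per coordinate.

    Each coordinate moves a fraction `theta` of the way towards the
    typical pose and then receives Gaussian random-walk noise.
    """
    return [
        x + theta * (mu - x) + rng.gauss(0.0, sigma)
        for x, mu in zip(state, typical_pose)
    ]


def simulate(start, typical_pose, theta=0.1, sigma=0.05, steps=200, seed=0):
    """Return the list of visited states, starting with `start`."""
    rng = random.Random(seed)  # seeded so that runs are repeatable
    state = list(start)
    path = [state]
    for _ in range(steps):
        state = ou_step(state, typical_pose, theta, sigma, rng)
        path.append(state)
    return path


if __name__ == "__main__":
    # Hypothetical 2D "pose" drifting towards a typical pose at (1, -1).
    path = simulate([0.0, 0.0], [1.0, -1.0])
    print(path[-1])
```

With `sigma=0.0` the process reduces to a deterministic decay towards the typical pose, which is a convenient way to check the drift term in isolation.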

    Efficient Scalable Accurate Regression Queries in In-DBMS Analytics

    Recent trends aim to incorporate advanced data analytics capabilities within DBMSs. Linear regression queries are fundamental to exploratory analytics and predictive modeling. However, computing their exact answers leaves a lot to be desired in terms of efficiency and scalability. We contribute a novel predictive analytics model and associated regression query processing algorithms, which are efficient, scalable and accurate. We focus on predicting the answers to two key query types that reveal dependencies between the values of different attributes: (i) mean-value queries and (ii) multivariate linear regression queries, both within specific data subspaces defined based on the values of other attributes. Our algorithms achieve many orders of magnitude improvement in query processing efficiency and near-perfect approximations of the underlying relationships among data attributes.
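
To make the two query types concrete, the sketch below computes their exact answers over a small in-memory table: a mean-value query and a simple (single-predictor) least-squares fit, both restricted to a data subspace defined by a range on another attribute. This is the expensive exact baseline that the abstract says the authors aim to approximate, not their predictive model. The attribute names and helper functions are hypothetical.

```python
def subspace(rows, attr, low, high):
    """Select the rows whose `attr` value lies in the closed range [low, high]."""
    return [row for row in rows if low <= row[attr] <= high]


def mean_value(rows, target):
    """Exact answer to a mean-value query over the selected rows."""
    return sum(row[target] for row in rows) / len(rows)


def linear_fit(rows, x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(row[x] for row in rows) / n
    mean_y = sum(row[y] for row in rows) / n
    sxx = sum((row[x] - mean_x) ** 2 for row in rows)
    sxy = sum((row[x] - mean_x) * (row[y] - mean_y) for row in rows)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


if __name__ == "__main__":
    # Hypothetical table: z defines the subspace, y depends linearly on x.
    table = [{"z": i, "x": float(i), "y": 2.0 * i + 1.0} for i in range(10)]
    selected = subspace(table, "z", 2, 6)
    print(mean_value(selected, "y"))
    print(linear_fit(selected, "x", "y"))
```

Both functions scan every selected row, which is why exact answers become costly on large tables.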

    Covariate dimension reduction for survival data via the Gaussian process latent variable model

    The analysis of high dimensional survival data is challenging, primarily due to the problem of overfitting which occurs when spurious relationships are inferred from data that subsequently fail to exist in test data. Here we propose a novel method of extracting a low dimensional representation of covariates in survival data by combining the popular Gaussian Process Latent Variable Model (GPLVM) with a Weibull Proportional Hazards Model (WPHM). The combined model offers a flexible non-linear probabilistic method of detecting and extracting any intrinsic low dimensional structure from high dimensional data. By reducing the covariate dimension we aim to diminish the risk of overfitting and increase the robustness and accuracy with which we infer relationships between covariates and survival outcomes. In addition, we can simultaneously combine information from multiple data sources by expressing multiple datasets in terms of the same low dimensional space. We present results from several simulation studies that illustrate a reduction in overfitting and an increase in predictive performance, as well as successful detection of intrinsic dimensionality. We provide evidence that it is advantageous to combine dimensionality reduction with survival outcomes rather than performing unsupervised dimensionality reduction on its own. Finally, we use our model to analyse experimental gene expression data and detect and extract a low dimensional representation that allows us to distinguish high and low risk groups with superior accuracy compared to doing regression on the original high dimensional data

    Applications in Monocular Computer Vision using Geometry and Learning : Map Merging, 3D Reconstruction and Detection of Geometric Primitives

    As the dream of autonomous vehicles moving around in our world comes closer, the problem of robust localization and mapping is essential to solve. In this inherently structured and geometric problem we also want the agents to learn from experience in a data driven fashion. How modern Neural Network models can be combined with Structure from Motion (SfM) is an interesting research question, and this thesis studies some related problems in 3D reconstruction, feature detection, SfM and map merging. In Paper I we study how a Bayesian Neural Network (BNN) performs in Semantic Scene Completion, where the task is to predict a semantic 3D voxel grid for the Field of View of a single RGBD image. We propose an extended task and evaluate the benefits of the BNN when encountering new classes at inference time. It is shown that the BNN outperforms the deterministic baseline. Papers II-III are about detection of points, lines and planes defining a Room Layout in an RGB image. Due to the repeated textures and homogeneous colours of indoor surfaces it is not ideal to only use point features for Structure from Motion. The idea is to complement the point features by detecting a Wireframe (a connected set of line segments) which marks the intersection of planes in the Room Layout. Paper II concerns a task for detecting a Semantic Room Wireframe and implements a Neural Network model utilizing a Graph Convolutional Network module. The experiments show that the method is more flexible than previous Room Layout Estimation methods and performs better than previous Wireframe Parsing methods. Paper III takes the task closer to Room Layout Estimation by detecting a connected set of semantic polygons in an RGB image. The end-to-end trainable model is a combination of a Wireframe Parsing model and a Heterogeneous Graph Neural Network. We show promising results by outperforming state of the art models for Room Layout Estimation using synthetic Wireframe detections.
However, the joint Wireframe and Polygon detector requires further research to compete with the state of the art models. In Paper IV we propose minimal solvers for SfM with parallel cylinders. The problem may be reduced to estimating circles in 2D, and the paper contributes theory for the two-view relative motion and two-circle relative structure problem. Fast solvers are derived and experiments show good performance both in simulation and on real data. Papers V-VII cover the task of map merging. That is, given a set of individually optimized point clouds with camera poses from an SfM pipeline, how can the solutions be effectively merged without completely re-solving the Structure from Motion problem? Papers V-VI introduce an effective method for merging and show its effectiveness through experiments on real and simulated data. Paper VII considers the matching problem for point clouds and proposes minimal solvers that allow for deformation of each point cloud. Experiments show that the method robustly matches point clouds with drift in the SfM solution.

    Fault Diagnosis and Failure Prognostics of Lithium-ion Battery based on Least Squares Support Vector Machine and Memory Particle Filter Framework

    A novel data driven approach is developed for fault diagnosis and remaining useful life (RUL) prognostics for lithium-ion batteries using Least Squares Support Vector Machine (LS-SVM) and Memory-Particle Filter (M-PF). Unlike traditional data-driven models for capacity fault diagnosis and failure prognosis, which require multidimensional physical characteristics, the proposed algorithm uses only two variables: Energy Efficiency (EE) and Work Temperature. The aim of this novel framework is to improve the accuracy of incipient and abrupt fault diagnosis and failure prognosis. First, the LS-SVM is used to generate a residual signal based on capacity fade trends of the Li-ion batteries. Second, an adaptive threshold model is developed based on several factors including input, output model error, disturbance, and drift parameter. The adaptive threshold is used to tackle the shortcoming of a fixed threshold. Third, the M-PF is proposed as the new method for failure prognostics to determine Remaining Useful Life (RUL). The M-PF is based on the assumption of the availability of real-time observation and historical data, where the historical failure data can be used instead of the physical failure model within the particle filter. The feasibility of the framework is validated using Li-ion battery prognostic data obtained from the National Aeronautics and Space Administration (NASA) Ames Prognostics Center of Excellence (PCoE). The experimental results show the following: (1) fewer data dimensions for the input data are required compared to traditional empirical models; (2) the proposed diagnostic approach provides an effective way of diagnosing Li-ion battery faults; (3) the proposed prognostic approach can predict the RUL of Li-ion batteries with small error, and has high prediction accuracy; and (4) the proposed prognostic approach shows that historical failure data can be used instead of a physical failure model in the particle filter.
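
For readers unfamiliar with particle filtering, the sketch below shows a generic bootstrap particle filter tracking a fading battery capacity, followed by a crude RUL estimate. It is not the M-PF of the abstract: the fade model, all parameter values, and the function names are assumptions, and the historical-failure-data component is not modelled.

```python
import math
import random


def track_capacity(observations, n_particles=500, fade_mean=0.01,
                   fade_sd=0.005, obs_sd=0.02, seed=0):
    """Generic bootstrap particle filter for a monotonically fading capacity.

    Returns the posterior-mean capacity estimate after each observation.
    """
    rng = random.Random(seed)
    particles = [observations[0] + rng.gauss(0.0, obs_sd)
                 for _ in range(n_particles)]
    estimates = []
    for z in observations:
        # Predict: each particle loses a random, non-negative amount of capacity.
        particles = [p - abs(rng.gauss(fade_mean, fade_sd)) for p in particles]
        # Update: weight particles by the Gaussian likelihood of the observation.
        weights = [math.exp(-0.5 * ((z - p) / obs_sd) ** 2) for p in particles]
        if sum(weights) == 0.0:
            weights = [1.0] * n_particles  # fall back to uniform weights
        # Resample in proportion to the weights.
        particles = rng.choices(particles, weights=weights, k=n_particles)
        estimates.append(sum(particles) / n_particles)
    return estimates


def remaining_cycles(capacity, failure_threshold, fade_mean=0.01):
    """Crude RUL: cycles until capacity reaches the threshold at the mean fade rate."""
    return max(0.0, (capacity - failure_threshold) / fade_mean)


if __name__ == "__main__":
    # Hypothetical normalised capacity readings, fading by 0.01 per cycle.
    readings = [1.0 - 0.01 * t for t in range(30)]
    estimates = track_capacity(readings)
    print(estimates[-1], remaining_cycles(estimates[-1], 0.5))
```

A fuller treatment would propagate each particle to the failure threshold to obtain an RUL distribution instead of a single number.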

    Unsupervised Learning from Shallow to Deep

    Machine learning plays a pivotal role in most state-of-the-art systems in many application research domains. With the rise of deep learning, massive labeled datasets have become the standard route to feature learning, enabling models to learn features automatically. Unfortunately, a trained deep learning model is hard to adapt to other datasets without fine-tuning, and the applicability of machine learning methods is limited by the amount of available labeled data. Therefore, the aim of this thesis is to alleviate the limitations of supervised learning by exploring algorithms to learn good internal representations and invariant feature hierarchies from unlabelled data. Firstly, we extend the traditional dictionary learning and sparse coding algorithms to hierarchical image representations in a principled way. So that dictionary atoms capture additional information from extended receptive fields and attain improved descriptive capacity, we present a two-pass multi-resolution cascade framework for dictionary learning and sparse coding. This cascade method allows collaborative reconstructions at different resolutions using only dictionary atoms of the same dimension. The jointly learned dictionary comprises atoms that adapt to the information available at the coarsest layer, where the support of atoms reaches a maximum range, and the residual images, where the supplementary details progressively refine a reconstruction objective. Our method generates flexible and accurate representations using only a small number of coefficients, and is efficient in computation. In the following work, we propose to incorporate the traditional self-expressiveness property into deep learning to explore better representations for subspace clustering. This architecture is built upon deep auto-encoders, which non-linearly map the input data into a latent space.
Our key idea is to introduce a novel self-expressive layer between the encoder and the decoder to mimic the "self-expressiveness" property that has proven effective in traditional subspace clustering. Being differentiable, our new self-expressive layer provides a simple but effective way to learn pairwise affinities between all data points through a standard back-propagation procedure. Being nonlinear, our neural-network based method is able to cluster data points having complex (often nonlinear) structures. However, subspace clustering algorithms are notorious for their scalability issues because building and processing large affinity matrices is demanding. We propose two methods to tackle this problem. One method is based on k-Subspace Clustering, where we introduce a method that simultaneously learns an embedding space along with subspaces within it to minimize a notion of reconstruction error, thus addressing the problem of subspace clustering in an end-to-end learning paradigm. This in turn frees us from the need for an affinity matrix to perform clustering. The other method uses a feed-forward network to replace the spectral clustering and learns the affinities of each data point from the "self-expressive" layer. We introduce Neural Collaborative Subspace Clustering, which benefits from a classifier that determines whether a pair of points lies on the same subspace under the supervision of the "self-expressive" layer. Essential to our model is the construction of two affinity matrices, one from the classifier and the other from a notion of subspace self-expressiveness, to supervise training in a collaborative scheme. In summary, this thesis makes contributions on how to perform unsupervised learning in several tasks. It starts from the traditional sparse coding and dictionary learning perspective in low-level vision.
Then, we explore how to incorporate unsupervised learning in convolutional neural networks without label information and scale subspace clustering to large datasets. Furthermore, we also extend clustering to a dense prediction task (saliency detection).

    Intelligent system for time series pattern identification and prediction

    Master's in Information Systems Management. The current growing volumes of data present a source of potentially valuable information for companies, but they also pose new challenges never faced before.
Despite their intrinsic complexity, time series are a notably relevant kind of data in the business context, especially for prediction tasks. The Autoregressive Integrated Moving Average (ARIMA) models have been the most popular approach for such tasks, but they do not scale well to the larger and more granular time series which are becoming increasingly common. Hence, newer research trends involve the application of data-driven models, such as Recurrent Neural Networks (RNNs), to forecasting. Given the difficulty of time series prediction and the need for improved tools, the purpose of this project was to implement the classical ARIMA models and the most prominent RNN architectures in an automated fashion and subsequently to use such models as the foundation for the development of a modular system capable of supporting the common user along the entire forecasting process. Design science research was the methodology adopted to achieve the proposed goals. It comprised goal definition, followed by a thorough literature review aimed at providing the theoretical background necessary for the subsequent step, the actual project execution, and, finally, the careful evaluation of the produced artifact. In general, all of the established goals were accomplished, and the main contributions of the project were the developed system itself, due to its practical usefulness, along with some empirical evidence supporting the suitability of RNNs for time series forecasting.
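
As a small illustration of the ARIMA idea mentioned in this abstract, the sketch below hand-rolls an ARIMA(1,1,0)-style forecast: it differences the series once, fits a single autoregressive coefficient by least squares (no intercept, no moving-average term), and integrates the predicted differences back. It is a simplified teaching sketch under those assumptions, not the system described in the thesis, and the function names are hypothetical.

```python
def difference(series):
    """First-order differencing (the 'I' in ARIMA with d = 1)."""
    return [later - earlier for earlier, later in zip(series, series[1:])]


def fit_ar1(diffs):
    """Least-squares estimate of phi in d[t] = phi * d[t-1] (no intercept)."""
    numerator = sum(cur * prev for prev, cur in zip(diffs, diffs[1:]))
    denominator = sum(prev * prev for prev in diffs[:-1])
    return numerator / denominator


def forecast(series, steps=1):
    """Forecast `steps` future values by integrating predicted differences."""
    diffs = difference(series)
    phi = fit_ar1(diffs)
    level, last_diff = series[-1], diffs[-1]
    predictions = []
    for _ in range(steps):
        last_diff = phi * last_diff  # predicted next difference
        level += last_diff           # undo the differencing
        predictions.append(level)
    return predictions


if __name__ == "__main__":
    # Hypothetical series whose increments halve at each step.
    print(forecast([0.0, 8.0, 12.0, 14.0, 15.0], steps=2))
```

On the sample series the increments are 8, 4, 2, 1, so the fitted coefficient is 0.5 and the two forecasts are 15.5 and 15.75.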

    Improving the fidelity of abstract camera network simulations

    This thesis studies the impact of augmenting an abstract target detection model with a higher degree of realism on the fidelity of the outcomes of camera network simulators in reflecting real-world results. The work is motivated by the identified trade-off between realistic but computationally expensive models and approximate but computationally cheap models. This trade-off opens the possibility of an alternative: augmenting abstract simulation tools with a higher degree of realism to capture both benefits, low computational expense and higher fidelity of the outcomes. For the task of target detection, we propose a novel decomposition method with an intermediate point of representation. This point is the core element of our model that decouples the architecture into two parts. Decoupling brings flexibility and modularity into the design. This empowers practitioners to select the model's features individually and independently according to their requirements and camera settings. To investigate the fidelity of our model's outcomes, we build models of three detectors and apply them to our lab-based image data set to create ground truth confidences. By incorporating only a few more properties of realism, the fidelity of our model's outcomes improved significantly when compared to the initial results in reflecting the ground truth confidences. Finally, to explore the implications of our high fidelity target detection model, we select a case study from coverage redundancy in smart camera networks. Highlighting the performance of a coverage approach relies strongly on the reliability of target detection results. Employing the standard abstract detection model is found to underestimate the performance of the studied coverage approaches when compared to the results of our model.
The identified underestimation in this study is one example of the general open concern in agent-based modelling about the unclear impact of simplified abstract models on the ability of the simulator to capture real-world behaviours