
    Simultaneous model-based clustering and visualization in the Fisher discriminative subspace

    Clustering in high-dimensional spaces is nowadays a recurrent problem in many scientific domains, but it remains a difficult task in terms of both clustering accuracy and interpretability of the results. This paper presents a discriminative latent mixture (DLM) model which fits the data in a latent orthonormal discriminative subspace with an intrinsic dimension lower than that of the original space. By constraining model parameters within and between groups, a family of 12 parsimonious DLM models is obtained, allowing the approach to adapt to various situations. An estimation algorithm, called the Fisher-EM algorithm, is also proposed for estimating both the mixture parameters and the discriminative subspace. Experiments on simulated and real datasets show that the proposed approach performs better than existing clustering methods while providing a useful representation of the clustered data. The method is also applied to the clustering of mass spectrometry data.
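
    As a hedged illustration of the alternating scheme behind Fisher-EM-style methods (a simplified sketch, not the authors' exact algorithm or the DLM parameterization), the snippet below alternates between estimating a Fisher discriminative subspace from the current partition and fitting a Gaussian mixture inside that subspace; the function name and defaults are our own choices and assume scikit-learn is available.

```python
# Simplified sketch only: alternate between (i) a Fisher discriminative subspace
# estimated from the current hard partition and (ii) a Gaussian mixture fitted in
# that subspace. The real Fisher-EM estimates both jointly under the DLM model.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.mixture import GaussianMixture

def fisher_em_sketch(X, n_clusters=3, n_iter=10, seed=0):
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(X)
    Z = X
    for _ in range(n_iter):
        # "F-step" (sketch): discriminative subspace of dimension <= n_clusters - 1
        Z = LinearDiscriminantAnalysis().fit_transform(X, labels)
        # "EM-step" (sketch): mixture model fitted in the latent subspace
        gmm = GaussianMixture(n_components=n_clusters, random_state=seed).fit(Z)
        labels = gmm.predict(Z)
    return labels, Z

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=m, size=(100, 25)) for m in (0.0, 3.0, 6.0)])
labels, Z = fisher_em_sketch(X)
print(np.bincount(labels), Z.shape)  # cluster sizes and latent dimensionality
```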

    An improved mixture of probabilistic PCA for nonlinear data-driven process monitoring

    This paper introduces an improved mixture of probabilistic principal component analysis (PPCA) models for nonlinear data-driven process monitoring. To this end, a mixture of probabilistic principal component analyzers is used to model the underlying nonlinear process with local PPCA models, and a novel composite monitoring statistic is proposed based on the integration of two monitoring statistics from a modified PPCA-based fault detection approach. In addition, the weighted mean of these monitoring statistics is used as a metric to detect potential abnormalities. The advantages of the proposed algorithm are discussed in comparison with several unsupervised algorithms. Finally, the Tennessee Eastman process and an auto-suspension model are employed to further demonstrate the effectiveness of the proposed scheme.
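
    The composite-statistic idea can be sketched with a single PCA model in place of the full PPCA mixture: Hotelling's T^2 in the retained subspace and the squared reconstruction error (SPE) are computed for a new sample and combined by a weighted mean. This is only an illustrative simplification; the weight and model choices below are assumptions, not the paper's settings.

```python
# Simplified, single-model illustration of combining two monitoring statistics
# (Hotelling's T^2 in the retained subspace and the squared reconstruction
# error SPE) into one composite index via a weighted mean. The paper instead
# uses a *mixture* of local PPCA models; only the combination idea is kept here.
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X_train = rng.normal(size=(500, 10))                  # "normal operation" data
x_new = rng.normal(size=10) + np.r_[np.zeros(9), 4]   # new sample with a fault on one variable

pca = PCA(n_components=3).fit(X_train)
t = pca.transform(x_new[None, :])[0]                  # scores in the retained subspace
t2 = np.sum(t**2 / pca.explained_variance_)           # Hotelling's T^2
x_hat = pca.inverse_transform(t[None, :])[0]
spe = np.sum((x_new - x_hat) ** 2)                    # squared prediction error (SPE)

w = 0.5                                               # illustrative weight only
composite = w * t2 + (1 - w) * spe                    # weighted mean of the two statistics
print(f"T2={t2:.2f}  SPE={spe:.2f}  composite={composite:.2f}")
```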

    A scalable, efficient, and accurate solution to non-rigid structure from motion

    Most Non-Rigid Structure from Motion (NRSfM) solutions are based on factorization approaches that allow reconstructing objects parameterized by a sparse set of 3D points. These solutions, however, are low resolution and generally do not scale well beyond a few tens of points. While there have been recent attempts at bringing NRSfM to a dense domain, using for instance variational formulations, these are computationally demanding alternatives that require a certain spatial continuity of the data, preventing their use for articulated shapes with large deformations or situations with multiple discontinuous objects. In this paper, we propose incorporating existing point trajectory low-rank models into a probabilistic framework for matrix normal distributions. With this formalism, we can simultaneously learn shape and pose parameters using expectation maximization, and easily exploit additional priors such as known point correlations. While similar frameworks have been used before to model distributions over shapes, here we show that formulating the problem in terms of distributions over trajectories brings remarkable improvements, especially in generality and efficiency. We evaluate the proposed approach in a variety of scenarios including one or multiple objects, sparse or dense reconstructions, missing observations, and mild or sharp deformations, in all cases with minimal prior knowledge and low computational cost.
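
    A minimal sketch of the low-rank point-trajectory assumption that underlies factorization-based NRSfM (the paper's matrix-normal EM formulation is not reproduced here): stack the 2D tracks of P points over F frames into a 2F x P measurement matrix and approximate it with a truncated SVD. All sizes and the rank below are illustrative.

```python
# Toy illustration of the low-rank trajectory idea behind factorization NRSfM,
# not the paper's probabilistic matrix-normal model.
import numpy as np

F, P, r = 50, 200, 5                        # frames, points, assumed trajectory rank
rng = np.random.default_rng(0)
basis = rng.normal(size=(2 * F, r))         # low-rank temporal basis (toy ground truth)
coeffs = rng.normal(size=(r, P))
W = basis @ coeffs + 0.01 * rng.normal(size=(2 * F, P))  # noisy 2F x P measurement matrix

U, s, Vt = np.linalg.svd(W, full_matrices=False)
W_r = U[:, :r] * s[:r] @ Vt[:r]             # best rank-r approximation (Eckart-Young)
print("relative error:", np.linalg.norm(W - W_r) / np.linalg.norm(W))
```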

    A Novel Hybrid Dimensionality Reduction Method using Support Vector Machines and Independent Component Analysis

    Due to the increasing demand for high-dimensional data analysis in applications such as electrocardiogram signal analysis and gene expression analysis for cancer detection, dimensionality reduction has become a viable process for extracting essential information from data, so that high-dimensional data can be represented in a much more condensed, lower-dimensional form to both improve classification accuracy and reduce computational complexity. Conventional dimensionality reduction methods can be categorized into stand-alone and hybrid approaches. The stand-alone method utilizes a single criterion, from either a supervised or an unsupervised perspective; the hybrid method integrates both criteria. Compared with a variety of stand-alone dimensionality reduction methods, the hybrid approach is promising as it simultaneously takes advantage of the supervised criterion for better classification accuracy and the unsupervised criterion for better data representation. However, several issues challenge the efficiency of the hybrid approach, including (1) the difficulty of finding a subspace that seamlessly integrates both criteria in a single hybrid framework, (2) the robustness of the performance on noisy data, and (3) nonlinear data representation capability. This dissertation presents a new hybrid dimensionality reduction method that seeks a projection by optimizing both structural risk (the supervised criterion, from the Support Vector Machine, SVM) and data independence (the unsupervised criterion, from Independent Component Analysis, ICA). The projection from the SVM directly improves classification performance from a supervised perspective, whereas the projection constructed by ICA to maximize independence among features improves classification accuracy indirectly, through a better intrinsic data representation, from an unsupervised perspective. For the linear dimensionality reduction model, I introduce orthogonality to interrelate the projections from SVM and ICA, while a redundancy removal process eliminates part of the SVM projection vectors, leading to more effective dimensionality reduction. The orthogonality-based linear hybrid dimensionality reduction method is then extended to an uncorrelatedness-based algorithm with nonlinear data representation capability. In the proposed approach, SVM and ICA are integrated into a single framework via an uncorrelated subspace based on a kernel implementation. Experimental results show that the proposed approaches achieve higher classification performance, with better robustness, at lower dimensionality than conventional methods on high-dimensional datasets.
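
    One possible toy reading of the orthogonality-based linear hybrid (a sketch under our own assumptions, not the dissertation's algorithm) is to take the SVM hyperplane normal as the supervised direction, orthogonalize the ICA directions against it, and stack both into a single projection matrix:

```python
# Rough sketch of the orthogonality idea: keep the SVM normal as a supervised
# direction, remove its component from each ICA direction, and stack both as
# the final projection. A toy interpretation only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import FastICA
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=300, n_features=20, n_informative=5, random_state=0)

w_svm = LinearSVC(C=1.0, max_iter=5000).fit(X, y).coef_               # (1, 20) supervised direction
w_svm = w_svm / np.linalg.norm(w_svm)

ica_dirs = FastICA(n_components=4, random_state=0).fit(X).components_  # (4, 20) unsupervised directions
ica_orth = ica_dirs - (ica_dirs @ w_svm.T) @ w_svm                     # orthogonalize against the SVM normal

projection = np.vstack([w_svm, ica_orth])        # (5, 20) hybrid projection matrix
X_reduced = X @ projection.T                     # data in the 5-D hybrid subspace
print(X_reduced.shape)
```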

    Latent Space Reinforcement Learning

    We often have to handle high-dimensional spaces when learning motor skills for robots. In policy search tasks we have to find many parameters to learn a desired movement. This high parameter dimensionality can be challenging for reinforcement learning algorithms, since every additional dimension requires more samples to find an optimal solution. On the other hand, if the robot has a large number of actuators, an inherent correlation between them can often be found for a specific motor task, which we can exploit for faster convergence. One possibility is to use dimensionality reduction techniques, which in most applications are applied as a pre-processing step or as an independent process. In this thesis we present a novel algorithm which combines the theory of policy search and probabilistic dimensionality reduction to uncover the hidden structure of high-dimensional action spaces. Evaluations on an inverse kinematics task indicate that the presented algorithm is able to outperform the reference algorithms PoWER and CMA-ES, especially in high-dimensional spaces. Furthermore, we evaluate our algorithm on a real-world task in which a NAO robot learns to lift its leg while keeping its balance. The issue of collecting samples for learning on a real robot, which is often very time-consuming and costly, is addressed here by using a small number of samples in each iteration.
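
    A toy sketch of policy search in a latent parameter space is given below; the linear decoder here is fixed and random, whereas the thesis learns it with probabilistic dimensionality reduction, and the reward-weighted update is only loosely PoWER-like. Everything in the snippet is an illustrative assumption.

```python
# Toy sketch: search in a small latent space z and map to the full actuator
# parameters through a linear decoder, so few samples suffice despite the high
# ambient dimensionality. The thesis learns the decoder; here it is fixed.
import numpy as np

rng = np.random.default_rng(0)
n_params, n_latent, n_samples, n_iters = 40, 3, 50, 30
decoder = rng.normal(size=(n_params, n_latent))      # latent -> full parameters
target = decoder @ rng.normal(size=n_latent)         # reachable goal (toy "IK" task)

def reward(theta):
    return -np.sum((theta - target) ** 2)            # higher is better

mu, sigma = np.zeros(n_latent), 1.0
for _ in range(n_iters):
    Z = mu + sigma * rng.normal(size=(n_samples, n_latent))   # explore in latent space
    R = np.array([reward(decoder @ z) for z in Z])
    w = np.exp((R - R.max()) / (np.abs(R.max() - R.mean()) + 1e-9))  # reward weights
    mu = (w[:, None] * Z).sum(0) / w.sum()            # reward-weighted update (PoWER-like)
    sigma *= 0.95                                     # shrink exploration over iterations
print("final cost:", -reward(decoder @ mu))
```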

    Mining Text and Time Series Data with Applications in Finance

    Finance is a field extremely rich in data, and it has great need of methods for summarizing and understanding these data. Existing methods of multivariate analysis allow the discovery of structure in time series data but can be difficult to interpret. Often there exists a wealth of text data directly related to the time series. In this thesis it is shown that this text can be exploited to aid interpretation of, and even to improve, the structure uncovered. To this end, two approaches are described and tested. Both serve to uncover structure in the relationship between text and time series data, but do so in very different ways. The first model comes from the field of topic modelling. A novel topic model is developed, closely related to an existing topic model for mixed data. Improved held-out likelihood is demonstrated for this model on a corpus of UK equity market data, and the discovered structure is qualitatively examined. To the author's knowledge this is the first attempt to combine text and time series data in a single generative topic model. The second method is a simpler, discriminative method based on a low-rank decomposition of time series data with constraints determined by word frequencies in the text data. This is compared to topic modelling using both the equity data and a second corpus comprising foreign exchange rate time series and text describing global macroeconomic sentiment, showing further improvements in held-out likelihood. One example of an application of the inferred structure is also demonstrated: the construction of carry trade portfolios. The superior results of this second method serve as a reminder that methodological complexity does not guarantee performance gains.
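
    As a loose, hypothetical reading of the second method (our interpretation only, not the thesis' code), one can constrain the loadings of a low-rank decomposition of the time series matrix to lie in the span of per-asset word-frequency features and fit the factorization by alternating least squares:

```python
# Hypothetical toy version of a "low-rank decomposition constrained by word
# frequencies": asset loadings are forced into the span of the asset x vocab
# word-frequency matrix W, i.e. R ~ F (W C)^T, fitted by alternating least squares.
import numpy as np

rng = np.random.default_rng(0)
T, N, V, k = 250, 30, 15, 3                        # time steps, assets, vocabulary size, rank
W = rng.poisson(1.0, size=(N, V)).astype(float)    # word frequencies per asset (toy)
R = rng.normal(size=(T, N))                        # asset return series (toy)

F = rng.normal(size=(T, k))
for _ in range(20):
    A = np.linalg.lstsq(F, R, rcond=None)[0].T     # (N, k) unconstrained loadings
    C = np.linalg.lstsq(W, A, rcond=None)[0]       # (V, k) project loadings onto text span
    L = W @ C                                      # (N, k) text-constrained loadings
    F = np.linalg.lstsq(L, R.T, rcond=None)[0].T   # (T, k) update the factors
print("reconstruction error:", np.linalg.norm(R - F @ L.T))
```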

    Machine Learning Developments in Dependency Modelling and Feature Extraction

    Three complementary feature extraction approaches are developed in this thesis, addressing the challenge of dimensionality reduction in the presence of multivariate heavy-tailed and asymmetric distributions. First, we demonstrate how to improve the robustness of standard Probabilistic Principal Component Analysis by adapting the concept of robust mean and covariance estimation within the standard framework. We then introduce feature extraction methods that extend standard Principal Component Analysis by exploring distribution-based robustification. This is achieved via Probabilistic Principal Component Analysis (PPCA), for which new, statistically robust variants are derived that also handle missing data. We propose a novel generalisation of the Student-t Probabilistic Principal Component methodology which (1) accounts for asymmetric distribution of the observation data, (2) provides a framework for grouped and generalised multiple-degree-of-freedom structures, offering more flexibility to model groups of marginal tail dependence in the observation data, and (3) separates the tail effect of the error terms and factors. The new feature extraction methods are derived in an incomplete-data setting to efficiently handle the presence of missing values in the observation vector, and we discuss the statistical properties of their robustness. In the next part of this thesis, we demonstrate the applicability of feature extraction methods to the statistical analysis of multidimensional dynamics. We introduce the class of Hybrid Factor models, which combines classical state-space model formulations with the incorporation of exogenous factors. We show how to utilize the information obtained from features extracted using the introduced robust PPCA in a modelling framework in a meaningful and parsimonious manner. In the first application study, we show the applicability of robust feature extraction methods in the real data environment of financial markets and combine the obtained results with a stochastic multi-factor panel-regression-based state-space model in order to model the dynamics of yield curves whilst incorporating regression factors. We embed the rank-reduced feature extractions into a stochastic representation of state-space models for yield curve dynamics and compare the results to classical multi-factor dynamic Nelson-Siegel state-space models. This leads to important new representations of yield curve models that can have practical importance for addressing questions of financial stress testing and monetary policy intervention, and that can efficiently incorporate financial big data. We illustrate our results on various financial and macroeconomic data sets from the Eurozone and international markets. In the second study, we develop a multi-factor extension of the family of Lee-Carter stochastic mortality models. We build upon the time, period and cohort stochastic model structure to include exogenous observable demographic features that can be used as additional factors to improve model fit and forecasting accuracy.
    We develop a framework in which (a) we employ projection-based dimensionality reduction techniques that are amenable to different structures of demographic data; (b) we analyse demographic data sets in terms of their patterns of missingness and the impact of such missingness on the feature extraction; (c) we introduce a class of multi-factor stochastic mortality models incorporating time, period, cohort and demographic features, which we develop within a Bayesian state-space estimation framework; and finally (d) we develop an efficient combined Markov chain and filtering framework for sampling the posterior and forecasting. We undertake a detailed case study on Human Mortality Database demographic data from European countries, and we use the extracted features to better explain the term structure of mortality in the UK over time for male and female populations. This is compared to a pure Lee-Carter stochastic mortality model, demonstrating that our feature extraction framework and the consequent multi-factor mortality model improve both in-sample fit and, importantly, out-of-sample mortality forecasts by a non-trivial margin.
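
    For reference, the classical single-factor Lee-Carter model that the thesis extends can be fitted by an SVD of centred log mortality rates; the sketch below uses synthetic data in place of the Human Mortality Database and applies the standard identification constraints.

```python
# Baseline Lee-Carter fit via SVD: log m_{x,t} = a_x + b_x k_t + error.
# Synthetic rates stand in for Human Mortality Database data.
import numpy as np

rng = np.random.default_rng(0)
ages, years = 90, 40
a_true = np.linspace(-8, -1, ages)                 # age pattern of log mortality
b_true = np.linspace(0.5, 1.5, ages) / ages
k_true = np.linspace(5, -5, years)                 # declining mortality trend
log_m = a_true[:, None] + np.outer(b_true, k_true) + 0.02 * rng.normal(size=(ages, years))

a = log_m.mean(axis=1)                             # a_x: average log rate per age
U, s, Vt = np.linalg.svd(log_m - a[:, None], full_matrices=False)
b, k = U[:, 0], s[0] * Vt[0]                       # first singular pair
k = k * b.sum(); b = b / b.sum()                   # scale so that sum(b) = 1
a = a + b * k.mean(); k = k - k.mean()             # shift so that sum(k) = 0
print("corr(k, true trend):", np.corrcoef(k, k_true)[0, 1])
```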

    Traffic State Estimation Using Probe Vehicle Data

    Traffic problems are becoming a burden on cities across the world. To prevent traffic accidents, mitigate congestion, and reduce fuel consumption, a critical step is to have a good understanding of traffic. Traditionally, traffic conditions are monitored primarily by fixed-location sensors. However, fixed-location sensors only provide information about specific locations, and their installation and maintenance costs are very high. Advances in GPS-based technologies, such as connected vehicles and ride-hailing services, provide an alternative approach to traffic monitoring. While these GPS-equipped probe vehicles travel on the road, a vast amount of trajectory data is collected. As probe vehicle data contain rich information about traffic conditions, they have drawn much attention from both researchers and practitioners in the field of traffic management and control. An extensive literature has studied the estimation of traffic speeds and travel times using probe vehicle data. However, for queue lengths and traffic volumes, which are critical for traffic signal control and performance measurement, most existing probe-vehicle-based estimation methods can hardly be implemented in practice. The main obstacle is the low market penetration of probe vehicles. Therefore, in this dissertation, we aim to develop probe-vehicle-based traffic state estimation methods that are suitable for low-penetration-rate environments and can potentially be implemented in the real world. First, we treat the traffic state at each location and each time point independently. We focus on estimating the queues forming at isolated intersections under light or moderate traffic. Existing methods often require prior knowledge of the queue length distribution or the probe vehicle penetration rate; however, these parameters are not available beforehand in real life. We therefore propose a series of methods to estimate these parameters from historical probe vehicle data, some of which have been validated using real-world probe vehicle data. Second, we study traffic state estimation considering temporal correlations. The correlation of queue lengths across different traffic signal cycles is often ignored by existing studies, although the phenomenon is commonly observed in real life, for example in the overflow queues induced by oversaturated traffic. To fill this gap, we model such queueing processes and observation processes using a hidden Markov model (HMM). Based on the HMM, we develop two cycle-by-cycle queue length estimation methods and an algorithm that can estimate the HMM parameters from historical probe vehicle data. Lastly, we consider the spatiotemporal correlations of traffic states, with a focus on the estimation of traffic volumes. With limited probe vehicle data, it is difficult to estimate traffic volumes accurately if we treat each location and each time slot independently. Noticing that traffic volumes in different locations and different time slots are correlated, we propose to find a low-rank representation of traffic volumes and then reconstruct the unknown values by fusing probe vehicle data and fixed-location sensor data. Test results show that the proposed methods can reconstruct the traffic volumes accurately and that they have great potential for real-world applications. In summary, this dissertation systematically studies traffic state estimation based on probe vehicle data. Some of the proposed methods have been implemented in practice.
    We expect the methods to be implemented on an even larger scale and to help transportation agencies solve more real-world traffic problems.
    PhD, Mechanical Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/155289/1/zhaoyann_1.pd
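
    The low-rank reconstruction step can be sketched with a basic hard-impute loop: keep the observed volume entries, replace the missing ones with a truncated-SVD approximation, and iterate. This is a minimal illustration under assumed sizes and rank, not the dissertation's data-fusion method.

```python
# Minimal sketch of low-rank reconstruction of a location x time-slot traffic
# volume matrix from partial observations (basic "hard-impute" iteration).
import numpy as np

rng = np.random.default_rng(0)
n_links, n_slots, rank = 60, 96, 4
V_true = rng.uniform(0, 1, (n_links, rank)) @ rng.uniform(0, 1, (rank, n_slots)) * 100
mask = rng.random((n_links, n_slots)) < 0.3        # only 30% of entries observed

V = np.where(mask, V_true, V_true[mask].mean())    # initialize missing entries with the mean
for _ in range(50):
    U, s, Wt = np.linalg.svd(V, full_matrices=False)
    V_lowrank = U[:, :rank] * s[:rank] @ Wt[:rank] # truncated-SVD approximation
    V = np.where(mask, V_true, V_lowrank)          # keep observed entries, update missing ones

err = np.abs(V_lowrank - V_true)[~mask].mean()
print("mean absolute error on unobserved entries:", round(err, 2))
```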