162,917 research outputs found

    Tensor-variate machine learning on graphs

    Get PDF
    Traditional machine learning algorithms are facing significant challenges as the world enters the era of big data, with a dramatic expansion in volume and range of applications and an increase in the variety of data sources. The large- and multi-dimensional nature of data often increases the computational costs associated with their processing and raises the risks of model over-fitting - a phenomenon known as the curse of dimensionality. To this end, tensors have become a subject of great interest in the data analytics community, owing to their remarkable ability to super-compress high-dimensional data into a low-rank format, while retaining the original data structure and interpretability. This leads to a significant reduction in computational costs, from an exponential complexity to a linear one in the data dimensions. An additional challenge when processing modern big data is that they often reside on irregular domains and exhibit relational structures, which violates the regular grid assumptions of traditional machine learning models. To this end, there has been an increasing amount of research in generalizing traditional learning algorithms to graph data. This allows for the processing of graph signals while accounting for the underlying relational structure, such as user interactions in social networks, vehicle flows in traffic networks, transactions in supply chains, chemical bonds in proteins, and trading data in financial networks, to name a few. Although promising results have been achieved in these fields, there is a void in literature when it comes to the conjoint treatment of tensors and graphs for data analytics. Solutions in this area are increasingly urgent, as modern big data is both large-dimensional and irregular in structure. To this end, the goal of this thesis is to explore machine learning methods that can fully exploit the advantages of both tensors and graphs. In particular, the following approaches are introduced: (i) Graph-regularized tensor regression framework for modelling high-dimensional data while accounting for the underlying graph structure; (ii) Tensor-algebraic approach for computing efficient convolution on graphs; (iii) Graph tensor network framework for designing neural learning systems which is both general enough to describe most existing neural network architectures and flexible enough to model large-dimensional data on any and many irregular domains. The considered frameworks were employed in several real-world applications, including air quality forecasting, protein classification, and financial modelling. Experimental results validate the advantages of the proposed methods, which achieved better or comparable performance against state-of-the-art models. Additionally, these methods benefit from increased interpretability and reduced computational costs, which are crucial for tackling the challenges posed by the era of big data.Open Acces

    Functional graphical models

    Get PDF
    Graphical models have attracted increasing attention in recent years, especially in settings involving high dimensional data. In particular Gaussian graphical models are used to model the conditional dependence structure among multiple Gaussian random variables. As a result of its computational efficiency the graphical lasso (glasso) has become one of the most popular approaches for fitting high dimensional graphical models. In this article we extend the graphical models concept to model the conditional dependence structure among p random functions. In this setting, not only is p large, but each function is itself a high dimensional object, posing an additional level of statistical and computational complexity. We develop an extension of the glasso criterion (fglasso), which estimates the functional graphical model by imposing a block sparsity constraint on the precision matrix, via a group lasso penalty. The fglasso criterion can be optimized using an efficient block coordinate descent algorithm. We establish the concentration inequalities of the estimates, which guarantee the desirable graph sup- port recovery property, i.e. with probability tending to one, the fglasso will correctly identify the true conditional dependence structure. Finally we show that the fglasso significantly outperforms possible competing methods through both simulations and an analysis of a real world EEG data set comparing alcoholic and non-alcoholic patients

    Spectral embedding finds meaningful (relevant) structure in image and microarray data

    Get PDF
    BACKGROUND: Accurate methods for extraction of meaningful patterns in high dimensional data have become increasingly important with the recent generation of data types containing measurements across thousands of variables. Principal components analysis (PCA) is a linear dimensionality reduction (DR) method that is unsupervised in that it relies only on the data; projections are calculated in Euclidean or a similar linear space and do not use tuning parameters for optimizing the fit to the data. However, relationships within sets of nonlinear data types, such as biological networks or images, are frequently mis-rendered into a low dimensional space by linear methods. Nonlinear methods, in contrast, attempt to model important aspects of the underlying data structure, often requiring parameter(s) fitting to the data type of interest. In many cases, the optimal parameter values vary when different classification algorithms are applied on the same rendered subspace, making the results of such methods highly dependent upon the type of classifier implemented. RESULTS: We present the results of applying the spectral method of Lafon, a nonlinear DR method based on the weighted graph Laplacian, that minimizes the requirements for such parameter optimization for two biological data types. We demonstrate that it is successful in determining implicit ordering of brain slice image data and in classifying separate species in microarray data, as compared to two conventional linear methods and three nonlinear methods (one of which is an alternative spectral method). This spectral implementation is shown to provide more meaningful information, by preserving important relationships, than the methods of DR presented for comparison. Tuning parameter fitting is simple and is a general, rather than data type or experiment specific approach, for the two datasets analyzed here. Tuning parameter optimization is minimized in the DR step to each subsequent classification method, enabling the possibility of valid cross-experiment comparisons. CONCLUSION: Results from the spectral method presented here exhibit the desirable properties of preserving meaningful nonlinear relationships in lower dimensional space and requiring minimal parameter fitting, providing a useful algorithm for purposes of visualization and classification across diverse datasets, a common challenge in systems biology

    Multi-Scale Process Modelling and Distributed Computation for Spatial Data

    Full text link
    Recent years have seen a huge development in spatial modelling and prediction methodology, driven by the increased availability of remote-sensing data and the reduced cost of distributed-processing technology. It is well known that modelling and prediction using infinite-dimensional process models is not possible with large data sets, and that both approximate models and, often, approximate-inference methods, are needed. The problem of fitting simple global spatial models to large data sets has been solved through the likes of multi-resolution approximations and nearest-neighbour techniques. Here we tackle the next challenge, that of fitting complex, nonstationary, multi-scale models to large data sets. We propose doing this through the use of superpositions of spatial processes with increasing spatial scale and increasing degrees of nonstationarity. Computation is facilitated through the use of Gaussian Markov random fields and parallel Markov chain Monte Carlo based on graph colouring. The resulting model allows for both distributed computing and distributed data. Importantly, it provides opportunities for genuine model and data scaleability and yet is still able to borrow strength across large spatial scales. We illustrate a two-scale version on a data set of sea-surface temperature containing on the order of one million observations, and compare our approach to state-of-the-art spatial modelling and prediction methods.Comment: 33 pages, 10 figures, 1 tabl

    Subspace discovery for video anomaly detection

    Get PDF
    PhDIn automated video surveillance anomaly detection is a challenging task. We address this task as a novelty detection problem where pattern description is limited and labelling information is available only for a small sample of normal instances. Classification under these conditions is prone to over-fitting. The contribution of this work is to propose a novel video abnormality detection method that does not need object detection and tracking. The method is based on subspace learning to discover a subspace where abnormality detection is easier to perform, without the need of detailed annotation and description of these patterns. The problem is formulated as one-class classification utilising a low dimensional subspace, where a novelty classifier is used to learn normal actions automatically and then to detect abnormal actions from low-level features extracted from a region of interest. The subspace is discovered (using both labelled and unlabelled data) by a locality preserving graph-based algorithm that utilises the Graph Laplacian of a specially designed parameter-less nearest neighbour graph. The methodology compares favourably with alternative subspace learning algorithms (both linear and non-linear) and direct one-class classification schemes commonly used for off-line abnormality detection in synthetic and real data. Based on these findings, the framework is extended to on-line abnormality detection in video sequences, utilising multiple independent detectors deployed over the image frame to learn the local normal patterns and infer abnormality for the complete scene. The method is compared with an alternative linear method to establish advantages and limitations in on-line abnormality detection scenarios. Analysis shows that the alternative approach is better suited for cases where the subspace learning is restricted on the labelled samples, while in the presence of additional unlabelled data the proposed approach using graph-based subspace learning is more appropriate

    Time-dependent mode structure for Lyapunov vectors as a collective movement in quasi-one-dimensional systems

    Full text link
    Time dependent mode structure for the Lyapunov vectors associated with the stepwise structure of the Lyapunov spectra and its relation to the momentum auto-correlation function are discussed in quasi-one-dimensional many-hard-disk systems. We demonstrate mode structures (Lyapunov modes) for all components of the Lyapunov vectors, which include the longitudinal and transverse components of their spatial and momentum parts, and their phase relations are specified. These mode structures are suggested from the form of the Lyapunov vectors corresponding to the zero-Lyapunov exponents. Spatial node structures of these modes are explained by the reflection properties of the hard-walls used in the models. Our main interest is the time-oscillating behavior of Lyapunov modes. It is shown that the largest time-oscillating period of the Lyapunov modes is twice as long as the time-oscillating period of the longitudinal momentum auto-correlation function. This relation is satisfied irrespective of the particle number and boundary conditions. A simple explanation for this relation is given based on the form of the Lyapunov vector.Comment: 39 pages, 21 figures, Manuscript including the figures of better quality is available from http://www.phys.unsw.edu.au/~gary/Research.htm
    corecore