
    Unsupervised Shift-invariant Feature Learning from Time-series Data

    Unsupervised feature learning is one of the key components of machine learning and artificial intelligence. Learning features from high-dimensional streaming data is an important and difficult problem that comes with a number of challenges. Moreover, feature learning algorithms need to be evaluated and generalized for time series with different patterns and components. A detailed study is needed to clarify when simple algorithms fail to learn features and whether more complicated methods are required. In this thesis, we show that the systematic way to learn meaningful features from time series is to use convolutional, or shift-invariant, versions of unsupervised feature learning. We experimentally compare the shift-invariant versions of clustering, sparse coding and non-negative matrix factorization algorithms for reconstruction, noise separation, prediction, classification and simulating auditory filters from acoustic signals. The results show that the most efficient and highly scalable clustering algorithm, with a simple modification in the inference and learning phases, is able to produce meaningful results. Clustering features are also comparable with sparse coding and non-negative matrix factorization in most of the tasks (e.g. classification) and even more successful in some (e.g. prediction). Shift-invariant sparse coding is also applied to a novel application, inferring hearing loss from speech signals, and produces promising results. Performance of the algorithms with regard to important factors such as time-series components, number of features and size of the receptive field is also analyzed. The results show a significant positive correlation between clustering performance and the degree of trend, frequency skewness, frequency kurtosis and serial correlation of the data, whereas the correlation is negative for the dataset's average bandwidth. Performance of shift-invariant sparse coding is affected by frequency skewness, frequency kurtosis and serial correlation of the data. Non-negative matrix factorization is influenced by the same data characteristics as clustering.
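
    The shift-invariant clustering referred to here can be thought of as a k-means variant whose centroids are short filters matched to each signal at its best time shift. Below is a minimal NumPy sketch of that idea; the filter length, number of features, and single-best-shift inference are illustrative assumptions, not the thesis implementation.

```python
# Minimal sketch of shift-invariant k-means on 1-D time series (illustrative only).
import numpy as np

def shift_invariant_kmeans(signals, n_features=8, filter_len=32, n_iter=20, seed=0):
    """signals: array of shape (n_signals, signal_len). Returns learned filters."""
    rng = np.random.default_rng(seed)
    # Initialise each filter from a random segment of the data.
    filters = np.empty((n_features, filter_len))
    for k in range(n_features):
        s = signals[rng.integers(len(signals))]
        j = rng.integers(0, len(s) - filter_len)
        filters[k] = s[j:j + filter_len]
    filters /= np.linalg.norm(filters, axis=1, keepdims=True) + 1e-12

    for _ in range(n_iter):
        sums = np.zeros_like(filters)
        counts = np.zeros(n_features)
        for s in signals:
            # Inference: pick the (feature, shift) pair with the highest cross-correlation.
            scores = np.array([np.correlate(s, f, mode='valid') for f in filters])
            k, t = np.unravel_index(np.argmax(scores), scores.shape)
            sums[k] += s[t:t + filter_len]        # accumulate the aligned segment
            counts[k] += 1
        updated = counts > 0
        filters[updated] = sums[updated] / counts[updated][:, None]
        filters /= np.linalg.norm(filters, axis=1, keepdims=True) + 1e-12
    return filters

# Example with synthetic data: 100 series of length 256.
# filters = shift_invariant_kmeans(np.random.randn(100, 256))
```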

    Crime incidents embedding using restricted Boltzmann machines

    We present a new approach for detecting related crime series by unsupervised learning of latent feature embeddings from the narratives of crime records via Gaussian-Bernoulli Restricted Boltzmann Machines (RBMs). This is a drastically different approach from prior work on crime analysis, which typically considers only time and location and, at most, category information. After the embedding, related cases are closer to each other in the Euclidean feature space and unrelated cases are far apart, a property that enables subsequent analysis such as detection and clustering of related cases. Experiments over several series of related crime incidents hand-labeled by the Atlanta Police Department reveal the promise of our embedding methods.
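
    As a rough illustration of the embedding step, the sketch below trains a toy Gaussian-Bernoulli RBM with one-step contrastive divergence and uses the hidden-unit probabilities as the latent features; the hyperparameters, unit-variance visible units, and standardized input vectors are my assumptions, not the paper's configuration.

```python
# Toy Gaussian-Bernoulli RBM trained with one-step contrastive divergence (CD-1).
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class GaussianBernoulliRBM:
    def __init__(self, n_visible, n_hidden, lr=1e-3, seed=0):
        rng = np.random.default_rng(seed)
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases (Gaussian units, unit variance)
        self.c = np.zeros(n_hidden)    # hidden biases (Bernoulli units)
        self.lr = lr
        self.rng = rng

    def hidden_probs(self, v):
        return sigmoid(v @ self.W + self.c)

    def fit(self, X, n_epochs=10, batch_size=64):
        """X: standardized feature vectors (e.g. text features of narratives), one row per case."""
        for _ in range(n_epochs):
            for i in range(0, len(X), batch_size):
                v0 = X[i:i + batch_size]
                h0 = self.hidden_probs(v0)
                h0_s = (self.rng.random(h0.shape) < h0).astype(float)   # sample hidden units
                v1 = h0_s @ self.W.T + self.b       # mean-field Gaussian reconstruction
                h1 = self.hidden_probs(v1)
                # CD-1 gradient estimates.
                self.W += self.lr * ((v0.T @ h0) - (v1.T @ h1)) / len(v0)
                self.b += self.lr * (v0 - v1).mean(axis=0)
                self.c += self.lr * (h0 - h1).mean(axis=0)
        return self

    def transform(self, X):
        # Hidden-unit probabilities serve as the latent embedding; Euclidean distances
        # between embeddings can then be used to group related cases.
        return self.hidden_probs(X)
```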

    Unsupervised feature selection for sensor time-series in pervasive computing applications

    The paper introduces an efficient feature selection approach for multivariate time series of heterogeneous sensor data within a pervasive computing scenario. An iterative filtering procedure is devised to reduce information redundancy measured in terms of time-series cross-correlation. The algorithm is capable of identifying non-redundant sensor sources in an unsupervised fashion even in the presence of a large proportion of noisy features. In particular, the proposed feature selection process does not require expert intervention to determine the number of selected features, which is a key advancement with respect to time-series filters in the literature. The characteristics of the proposed algorithm allow enriching learning systems, in pervasive computing applications, with a fully automated feature selection mechanism which can be triggered and performed at run time during system operation. A comparative experimental analysis on real-world data from three pervasive computing applications is provided, showing that the algorithm addresses major limitations of unsupervised filters in the literature when dealing with sensor time series. Specifically, an assessment is presented both in terms of reduction of time-series redundancy and in terms of preservation of informative features with respect to associated supervised learning tasks.
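
    The general idea of cross-correlation-based redundancy filtering can be sketched as follows: iteratively drop the series most correlated with the rest until no pair exceeds a threshold, so the number of retained features falls out of the threshold rather than being fixed by the user. The lag window, threshold, and drop rule below are assumptions for illustration, not the paper's exact procedure.

```python
# Illustrative redundancy filter for multivariate sensor time series.
import numpy as np

def max_abs_xcorr(x, y, max_lag=10):
    """Maximum absolute normalized cross-correlation between two series over a lag window."""
    x = (x - x.mean()) / (x.std() + 1e-12)
    y = (y - y.mean()) / (y.std() + 1e-12)
    n = len(x)
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            c = np.dot(x[lag:], y[:n - lag]) / (n - lag)
        else:
            c = np.dot(x[:n + lag], y[-lag:]) / (n + lag)
        best = max(best, abs(c))
    return best

def select_nonredundant(series, threshold=0.9, max_lag=10):
    """series: array of shape (n_series, length); returns indices of retained series."""
    keep = list(range(len(series)))
    while len(keep) > 1:
        # Pairwise redundancy among the currently retained series.
        C = np.zeros((len(keep), len(keep)))
        for a in range(len(keep)):
            for b in range(a + 1, len(keep)):
                C[a, b] = C[b, a] = max_abs_xcorr(series[keep[a]], series[keep[b]], max_lag)
        if C.max() < threshold:
            break
        # Drop the series with the largest total redundancy.
        worst = int(np.argmax(C.sum(axis=1)))
        keep.pop(worst)
    return keep

# Usage: idx = select_nonredundant(sensor_matrix); reduced = sensor_matrix[idx]
```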

    Learning Interpretable Features of Graphs and Time Series Data

    Graphs and time series are two of the most ubiquitous representations of modern data. Representation learning of real-world graphs and time-series data is a key component of downstream supervised and unsupervised machine learning tasks such as classification, clustering, and visualization. Because of the inherent high dimensionality, representation learning, i.e., low-dimensional vector-based embedding of graphs and time-series data, is very challenging. Learning interpretable features makes the role of each feature transparent and facilitates downstream analytics tasks, in addition to maximizing the performance of the downstream machine learning models. In this thesis, we leveraged tensor (multidimensional array) decomposition to generate interpretable and low-dimensional feature spaces for graphs and time-series data from three domains: social networks, neuroscience, and heliophysics. We present theoretical models and empirical results on node embedding of social networks, biomarker embedding on fMRI-based brain networks, and prediction and visualization of multivariate time-series-based flaring and non-flaring solar events.
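
    One common form of tensor decomposition used for such embeddings is CP (PARAFAC). The sketch below is a minimal alternating-least-squares implementation for a 3-way tensor; the rank, the node-by-node-by-time layout, and the use of factor rows as embeddings are assumptions for illustration, not the setup used in the thesis.

```python
# Minimal CP (PARAFAC) decomposition by alternating least squares in NumPy.
import numpy as np

def cp_als(X, rank=8, n_iter=50, seed=0):
    """X: 3-way array. Returns A, B, C with X[i,j,k] ~ sum_r A[i,r] * B[j,r] * C[k,r]."""
    rng = np.random.default_rng(seed)
    I, J, K = X.shape
    A = rng.standard_normal((I, rank))
    B = rng.standard_normal((J, rank))
    C = rng.standard_normal((K, rank))
    for _ in range(n_iter):
        # Each update solves a linear least-squares problem with the other factors fixed.
        A = np.einsum('ijk,jr,kr->ir', X, B, C) @ np.linalg.pinv((B.T @ B) * (C.T @ C))
        B = np.einsum('ijk,ir,kr->jr', X, A, C) @ np.linalg.pinv((A.T @ A) * (C.T @ C))
        C = np.einsum('ijk,ir,jr->kr', X, A, B) @ np.linalg.pinv((A.T @ A) * (B.T @ B))
    return A, B, C

# Hypothetical example: X of shape (n_nodes, n_nodes, n_time) holding adjacency snapshots
# of a temporal graph; the rows of A then act as low-dimensional node embeddings whose
# columns (rank-1 components) are individually interpretable.
# A, B, C = cp_als(X, rank=8)
```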

    Clustering: Methodology, hybrid systems, visualization, validation and implementation

    Unsupervised learning is one of the most important steps of machine learning applications. Besides its ability to provide insight into the data distribution, unsupervised learning is used as a preprocessing step for other machine learning algorithms. This dissertation investigates the application of unsupervised learning to various types of data for machine learning tasks such as clustering, regression and classification. The dissertation is organized into three papers. In the first paper, unsupervised learning is applied to data with mixed categorical and numerical features to transform the data objects from the mixed-type feature domain into a new, sparser numerical domain. By making use of the data fusion capacity of adaptive resonance theory clustering, the approach is able to reduce the distinction between the numerical and categorical features. The second paper presents a novel method to improve wind forecasting by grouping the time series of the surrounding wind mills into similar clusters using hidden Markov model clustering and using the clustering information to enhance the forecast. A fast forecasting method is also introduced using an extreme learning machine, which can be trained in analytic form, to choose the optimal number of past samples for prediction and an appropriate size of the neural network. In the third paper, unsupervised learning is used to automatically learn features from the dataset itself without human design of sophisticated feature extractors. The paper shows that, by using unsupervised feature learning with a multiquadric radial basis function extreme learning machine, the classifier performs better than several other supervised learning methods. The paper further improves the speed of training the neural network by presenting an algorithm that runs in parallel on the GPU.
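
    What makes an extreme learning machine fast is that the hidden layer is random and only the output weights are solved analytically. A minimal sketch with a multiquadric radial-basis hidden layer is given below; the hidden-layer size and the random center/width scheme are illustrative assumptions, not the dissertation's configuration.

```python
# Toy extreme learning machine with a multiquadric RBF hidden layer.
import numpy as np

class MultiquadricELM:
    def __init__(self, n_hidden=200, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Multiquadric activation: sqrt(||x - center||^2 + width^2).
        d2 = ((X[:, None, :] - self.centers[None, :, :]) ** 2).sum(axis=2)
        return np.sqrt(d2 + self.widths[None, :] ** 2)

    def fit(self, X, y):
        """X: (n_samples, n_features); y: integer class labels."""
        n_classes = int(y.max()) + 1
        T = np.eye(n_classes)[y]                              # one-hot targets
        idx = self.rng.choice(len(X), self.n_hidden, replace=len(X) < self.n_hidden)
        self.centers = X[idx] + 0.01 * self.rng.standard_normal((self.n_hidden, X.shape[1]))
        self.widths = self.rng.uniform(0.5, 2.0, self.n_hidden)
        self.beta = np.linalg.pinv(self._hidden(X)) @ T       # analytic output weights
        return self

    def predict(self, X):
        return (self._hidden(X) @ self.beta).argmax(axis=1)
```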

    Synaptic state matching: a dynamical architecture for predictive internal representation and feature perception

    Here we consider the possibility that a fundamental function of sensory cortex is the generation of an internal simulation of the sensory environment in real time. A logical elaboration of this idea leads to a dynamical neural architecture that oscillates between two fundamental network states, one driven by external input, and the other by recurrent synaptic drive in the absence of sensory input. Synaptic strength is modified by a proposed synaptic state matching (SSM) process that ensures equivalence of spike statistics between the two network states. Remarkably, SSM, operating locally at individual synapses, generates accurate and stable network-level predictive internal representations, enabling pattern completion and unsupervised feature detection from noisy sensory input. SSM is a biologically plausible substrate for learning and memory because it brings together sequence learning, feature detection, synaptic homeostasis, and network oscillations under a single parsimonious computational framework. Beyond its utility as a potential model of cortical computation, artificial networks based on this principle have a remarkable capacity for internalizing dynamical systems, making them useful in a variety of application domains including time-series prediction and machine intelligence.
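
    A crude rate-based caricature of the two-phase idea is sketched below: the network alternates between an input-driven phase and a free-running phase, and the recurrent weights are nudged so that second-order activity statistics in the free phase match those of the driven phase (a contrastive-Hebbian-style rule). All details here, including the rate dynamics and the specific update, are illustrative assumptions and not the paper's spiking SSM formulation.

```python
# Toy two-phase recurrent network: match activity statistics of a free-running phase
# to those of an input-driven phase by adjusting recurrent weights.
import numpy as np

def run_phase(W, steps, rng, inputs=None, leak=0.1):
    n = W.shape[0]
    r = 0.1 * rng.standard_normal(n)              # small random initial state
    rates = []
    for t in range(steps):
        drive = W @ r + (inputs[t] if inputs is not None else 0.0)
        r = (1 - leak) * r + leak * np.tanh(drive)
        rates.append(r.copy())
    return np.array(rates)

def synaptic_matching(n=50, steps=200, epochs=100, lr=0.01, seed=0):
    rng = np.random.default_rng(seed)
    W = 0.1 * rng.standard_normal((n, n))
    np.fill_diagonal(W, 0.0)
    # Fixed external drive pattern for the driven phase.
    inputs = 0.5 * np.sin(np.linspace(0, 8 * np.pi, steps))[:, None] * rng.standard_normal(n)
    for _ in range(epochs):
        driven = run_phase(W, steps, rng, inputs=inputs)   # externally driven state
        free = run_phase(W, steps, rng, inputs=None)       # recurrent-only state
        # Nudge weights so pairwise co-activation in the free phase matches the driven phase.
        dW = (driven.T @ driven - free.T @ free) / steps
        W += lr * dW
        np.fill_diagonal(W, 0.0)
    return W
```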