
    Time series kernel similarities for predicting Paroxysmal Atrial Fibrillation from ECGs

    We tackle the problem of classifying electrocardiography (ECG) signals with the aim of predicting the onset of Paroxysmal Atrial Fibrillation (PAF). Atrial fibrillation is the most common type of arrhythmia, but in many cases PAF episodes are asymptomatic. Therefore, to help diagnose PAF, it is important to design procedures for detecting and, more importantly, predicting PAF episodes. We propose a method for predicting PAF events whose first step consists of a feature extraction procedure that represents each ECG as a multivariate time series. We then design a classification framework based on kernel similarities for multivariate time series, capable of handling missing data. We consider different approaches to perform classification in the original space of the multivariate time series and in an embedding space defined by the kernel similarity measure. We achieve a classification accuracy comparable with state-of-the-art methods, with the additional advantage of detecting the PAF onset up to 15 minutes in advance.

    Learning Interpretable Features of Graphs and Time Series Data

    Graphs and time series are two of the most ubiquitous representations of data in modern times. Representation learning of real-world graphs and time-series data is a key component of downstream supervised and unsupervised machine learning tasks such as classification, clustering, and visualization. Because of the inherent high dimensionality, representation learning, i.e., low-dimensional vector-based embedding of graphs and time-series data, is very challenging. Learning interpretable features makes the roles of the features transparent and facilitates downstream analytics tasks, in addition to maximizing the performance of the downstream machine learning models. In this thesis, we leverage tensor (multidimensional array) decomposition to generate interpretable, low-dimensional feature spaces for graphs and time-series data from three domains: social networks, neuroscience, and heliophysics. We present the theoretical models and empirical results on node embedding of social networks, biomarker embedding on fMRI-based brain networks, and prediction and visualization of multivariate time-series-based flaring and non-flaring solar events.

    Generative Pre-Training of Time-Series Data for Unsupervised Fault Detection in Semiconductor Manufacturing

    This paper introduces TRACE-GPT, which stands for Time-seRies Anomaly-detection with Convolutional Embedding and Generative Pre-trained Transformers. TRACE-GPT is designed to pre-train on univariate time-series sensor data and detect faults on unlabeled datasets in semiconductor manufacturing. In the semiconductor industry, distinguishing abnormal time-series sensor data from normal data is important because it is directly related to wafer defects. However, small, unlabeled, and even mixed training data without enough anomalies make classification tasks difficult. In this research, we capture features of time-series data with temporal convolutional embedding and a Generative Pre-trained Transformer (GPT) to classify abnormal sequences from normal sequences using cross-entropy loss. We show that our model performs better than previous unsupervised models on both an open dataset, the University of California Riverside (UCR) time-series classification archive, and the process log of our Chemical Vapor Deposition (CVD) equipment. Our model has the highest F1 score at Equal Error Rate (EER) across all datasets and is only 0.026 below the supervised state-of-the-art baseline on the open dataset.

    Exact multi-parameter persistent homology of time-series data: one-dimensional reduction of multi-parameter persistence theory

    In various applications of data classification and clustering problems, multi-parameter analysis is effective and crucial because data are usually defined in a multi-parametric space. Multi-parameter persistent homology, an extension of the persistent homology of one-parameter data analysis, has been developed for topological data analysis (TDA). Although it is conceptually attractive, multi-parameter persistent homology still faces challenges in theory and practical applications. In this study, we consider time-series data and its classification and clustering problems using multi-parameter persistent homology. We develop a multi-parameter filtration method based on Fourier decomposition and provide an exact formula for, and an interpretation of, the one-dimensional reduction of multi-parameter persistent homology. The exact formula implies that the one-dimensional reduction of multi-parameter persistent homology of the given time-series data is equivalent to choosing the diagonal ray (standard ray) in the multi-parameter filtration space. For this, we first consider the continuousization of time-series data based on Fourier decomposition towards the construction of the exact persistent barcode formula for the Vietoris-Rips complex of the point cloud generated by sliding window embedding. The proposed method is highly efficient even if the sliding window embedding dimension and the length of the time-series data are large, because the method precomputes the exact barcode and the computational complexity is as low as that of the fast Fourier transform, $O(N \log N)$. Further, the proposed method provides a way of finding different topological inferences by trying different rays in the filtration space in no time. Comment: 29 pages
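
The sliding window embedding mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `sliding_window_embedding` and the `dim` and `delay` parameters are hypothetical, and the sketch covers only the point-cloud construction, not the Fourier-based filtration or the barcode formula.

```python
import numpy as np

def sliding_window_embedding(series, dim, delay=1):
    """Map a 1-D series to a point cloud of delay vectors (illustrative only).

    Point i is (x[i], x[i + delay], ..., x[i + (dim - 1) * delay]).
    """
    x = np.asarray(series, dtype=float)
    n_points = len(x) - (dim - 1) * delay
    if n_points <= 0:
        raise ValueError("series is too short for this dim and delay")
    # Row i of idx holds the indices of the samples that form point i.
    idx = np.arange(n_points)[:, None] + delay * np.arange(dim)[None, :]
    return x[idx]

# Assumed toy input: a short ramp embedded in 3 dimensions.
cloud = sliding_window_embedding([0, 1, 2, 3, 4], dim=3)
```

Each row of `cloud` is one point; a Vietoris-Rips complex would then be built on these points by a separate TDA library.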

    Image Embedding of PMU Data for Deep Learning towards Transient Disturbance Classification

    This paper presents a study on power grid disturbance classification by Deep Learning (DL). A real synchrophasor data set comprising three different types of disturbance events from the Frequency Monitoring Network (FNET) is used. An image embedding technique called Gramian Angular Field is applied to transform each time series of event data into a two-dimensional image for learning. Two main DL algorithms, i.e., CNN (Convolutional Neural Network) and RNN (Recurrent Neural Network), are tested and compared with two widely used data mining tools, the Support Vector Machine and the Decision Tree. The test results demonstrate the superiority of both DL algorithms over the other methods in the application of power system transient disturbance classification. Comment: An updated version of this manuscript has been accepted by the 2018 IEEE International Conference on Energy Internet (ICEI), Beijing, China
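
The Gramian Angular Field transform named in this abstract can be sketched as follows. This is a minimal illustration of the summation variant (GASF), assuming min-max rescaling to [-1, 1]; the abstract does not say which variant or scaling the authors used, and the function name `gramian_angular_field` is hypothetical.

```python
import numpy as np

def gramian_angular_field(series):
    """Encode a 1-D series as a 2-D image (summation variant, illustrative only)."""
    x = np.asarray(series, dtype=float)
    lo, hi = x.min(), x.max()
    if hi == lo:
        # Constant series: map every sample to 0 to avoid dividing by zero.
        scaled = np.zeros_like(x)
    else:
        # Min-max rescale into [-1, 1] so that arccos is defined.
        scaled = 2.0 * (x - lo) / (hi - lo) - 1.0
    scaled = np.clip(scaled, -1.0, 1.0)
    # Polar encoding: each sample becomes an angle.
    phi = np.arccos(scaled)
    # Pixel (i, j) is cos(phi_i + phi_j).
    return np.cos(phi[:, None] + phi[None, :])

# Assumed toy input: four samples give a 4 x 4 image.
image = gramian_angular_field([0.0, 1.0, 2.0, 3.0])
```

The resulting square array can be fed to an image classifier such as a CNN; the image is symmetric, and its diagonal depends only on the individual rescaled samples.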