Time series kernel similarities for predicting Paroxysmal Atrial Fibrillation from ECGs
We tackle the problem of classifying Electrocardiography (ECG) signals with
the aim of predicting the onset of Paroxysmal Atrial Fibrillation (PAF). Atrial
fibrillation is the most common type of arrhythmia, but in many cases PAF
episodes are asymptomatic. Therefore, to help diagnose PAF, it is
important to design procedures for detecting and, more importantly, predicting
PAF episodes. We propose a method for predicting PAF events whose first step
consists of a feature extraction procedure that represents each ECG as a
multi-variate time series. We then design a classification framework
based on kernel similarities for multi-variate time series, capable of handling
missing data. We consider different approaches to perform classification in the
original space of the multi-variate time series and in an embedding space,
defined by the kernel similarity measure. We achieve a classification accuracy
comparable with state-of-the-art methods, with the additional advantage of
detecting the PAF onset up to 15 minutes in advance.
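To make the idea of similarity-based classification with missing data more concrete, here is a minimal sketch. It is not the kernel used in the paper: `masked_rbf_similarity` is a hypothetical stand-in that compares only the time steps observed in both series (missing values are `None`), it handles univariate series for brevity, and it is not guaranteed to be a positive semi-definite kernel. The nearest-neighbour rule and the toy data are likewise assumptions.

```python
import math


def masked_rbf_similarity(a, b, gamma=1.0):
    """RBF-style similarity over the time steps observed in BOTH series.

    Hypothetical stand-in for a time-series kernel; missing values are None.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    if not pairs:
        return 0.0  # nothing to compare, treat as dissimilar
    mean_sq = sum((x - y) ** 2 for x, y in pairs) / len(pairs)
    return math.exp(-gamma * mean_sq)


def nearest_neighbour_label(query, train_series, train_labels, similarity):
    """Return the label of the training series most similar to the query."""
    scores = [similarity(query, s) for s in train_series]
    best = max(range(len(scores)), key=scores.__getitem__)
    return train_labels[best]


# Toy data (assumed): two training series with gaps, one query with a gap.
train = [[0.0, 0.0, None, 0.0], [1.0, 1.0, 1.0, None]]
labels = ["normal", "pre-PAF"]
query = [0.9, None, 1.0, 1.1]
predicted = nearest_neighbour_label(query, train, labels, masked_rbf_similarity)
```

In practice the similarity values would be collected into a matrix and passed to any classifier that accepts precomputed kernels, provided the chosen measure is a valid kernel.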
Learning Interpretable Features of Graphs and Time Series Data
Graphs and time series are two of the most ubiquitous representations of data in modern times. Representation learning of real-world graphs and time-series data is a key component of downstream supervised and unsupervised machine learning tasks such as classification, clustering, and visualization. Because of the inherent high dimensionality, representation learning, i.e., low-dimensional vector-based embedding of graphs and time-series data, is very challenging. Learning interpretable features makes the roles of the features transparent and facilitates downstream analytics tasks, in addition to maximizing the performance of the downstream machine learning models. In this thesis, we leverage tensor (multidimensional array) decomposition to generate interpretable, low-dimensional feature spaces for graphs and time-series data from three domains: social networks, neuroscience, and heliophysics. We present the theoretical models and empirical results on node embedding of social networks, biomarker embedding on fMRI-based brain networks, and prediction and visualization of multivariate time-series-based flaring and non-flaring solar events.
Generative Pre-Training of Time-Series Data for Unsupervised Fault Detection in Semiconductor Manufacturing
This paper introduces TRACE-GPT, which stands for Time-seRies
Anomaly-detection with Convolutional Embedding and Generative Pre-trained
Transformers. TRACE-GPT is designed to pre-train univariate time-series sensor
data and detect faults on unlabeled datasets in semiconductor manufacturing. In
the semiconductor industry, distinguishing abnormal time-series sensor data from
normal data is important because it is directly related to wafer defects.
However, small, unlabeled, and even mixed training data without enough
anomalies make classification tasks difficult. In this research, we capture
features of time-series data with temporal convolutional embedding and
Generative Pre-trained Transformer (GPT) to classify abnormal sequences from
normal sequences using cross-entropy loss. We show that our model performs better
than previous unsupervised models on both an open dataset, the
University of California Riverside (UCR) time-series classification archive,
and the process log of our Chemical Vapor Deposition (CVD) equipment. Our model
has the highest F1 score at Equal Error Rate (EER) across all datasets and is
only 0.026 below the supervised state-of-the-art baseline on the open dataset.
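The metric quoted above, the F1 score at the Equal Error Rate, can be illustrated with a short sketch. The abstract does not say how the EER operating point is located, so the approach here is an assumption: scan every observed score as a threshold, pick the one where the false positive rate and false negative rate are closest, and report F1 there. The function name and toy scores are hypothetical.

```python
def f1_at_equal_error_rate(scores, labels):
    """Return (f1, threshold) at the threshold where FPR and FNR are closest.

    scores: anomaly scores, higher means more anomalous.
    labels: 1 for anomalous sequences, 0 for normal ones.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one normal and one anomalous sample")

    best = None  # (|FPR - FNR|, threshold, f1)
    for threshold in sorted(set(scores)):
        # A sample is flagged as anomalous when its score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        fn = positives - tp
        gap = abs(fp / negatives - fn / positives)
        f1 = 2 * tp / (2 * tp + fp + fn) if tp else 0.0
        if best is None or gap < best[0]:
            best = (gap, threshold, f1)
    return best[2], best[1]


# Toy example (assumed data): one normal sequence scores above one anomaly.
f1, threshold = f1_at_equal_error_rate([0.1, 0.6, 0.4, 0.9], [0, 0, 1, 1])
```

With finitely many samples the two error rates rarely match exactly, which is why the sketch takes the closest point rather than interpolating.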
Exact multi-parameter persistent homology of time-series data: one-dimensional reduction of multi-parameter persistence theory
In various applications of data classification and clustering problems,
multi-parameter analysis is effective and crucial because data are usually
defined in multi-parametric space. Multi-parameter persistent homology, an
extension of persistent homology of one-parameter data analysis, has been
developed for topological data analysis (TDA). Although it is conceptually
attractive, multi-parameter persistent homology still has challenges in theory
and practical applications. In this study, we consider time-series data and its
classification and clustering problems using multi-parameter persistent
homology. We develop a multi-parameter filtration method based on Fourier
decomposition and provide an exact formula for, and an interpretation of, the
one-dimensional reduction of multi-parameter persistent homology. The exact
formula implies that the one-dimensional reduction of multi-parameter
persistent homology of the given time-series data is equivalent to choosing
the diagonal ray (standard ray) in the multi-parameter filtration space. For this,
we first consider the continuousization of time-series data based on Fourier
decomposition towards the construction of the exact persistent barcode formula
for the Vietoris-Rips complex of the point cloud generated by sliding window
embedding. The proposed method is highly efficient even if the sliding window
embedding dimension and the length of time-series data are large because the
method precomputes the exact barcode and the computational complexity is as low
as that of the fast Fourier transform, $O(N \log N)$. Further, the proposed
method provides a way of finding different topological inferences by trying
different rays in the filtration space in no time.
Comment: 29 pages
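The sliding window embedding mentioned in the abstract above turns a time series into a point cloud, on which a Vietoris-Rips complex can then be built. The sketch below shows only the plain discrete embedding, not the paper's Fourier-based exact barcode formula; the function name and the `delay` parameter are assumptions made for illustration.

```python
def sliding_window_embedding(series, dimension, delay=1):
    """Map a 1-D series to points (x[t], x[t+delay], ..., x[t+(dimension-1)*delay])."""
    span = (dimension - 1) * delay
    if dimension < 1 or delay < 1 or span >= len(series):
        raise ValueError("series too short for this dimension and delay")
    return [
        tuple(series[start + k * delay] for k in range(dimension))
        for start in range(len(series) - span)
    ]


# A series of length 5 embedded in 3 dimensions yields 3 points.
cloud = sliding_window_embedding([0, 1, 2, 3, 4], dimension=3)
```

Each tuple in `cloud` is one point; a larger `dimension` or longer series grows the cloud, which is the regime where the abstract claims the precomputed barcode pays off.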
Image Embedding of PMU Data for Deep Learning towards Transient Disturbance Classification
This paper presents a study on power grid disturbance classification by Deep
Learning (DL). A real synchrophasor set consisting of three different types of
disturbance events from the Frequency Monitoring Network (FNET) is used. An
image embedding technique called Gramian Angular Field is applied to transform
each time series of event data to a two-dimensional image for learning. Two
main DL algorithms, i.e., CNN (Convolutional Neural Network) and RNN (Recurrent
Neural Network), are tested and compared with two widely used data mining tools,
the Support Vector Machine and Decision Tree. The test results demonstrate the
superiority of both DL algorithms over the other methods in the application of
power system transient disturbance classification.
Comment: An updated version of this manuscript has been accepted by the 2018
IEEE International Conference on Energy Internet (ICEI), Beijing, China
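As a rough illustration of the Gramian Angular Field step named in the abstract above, the sketch below computes the summation variant (GASF) for a short series. The abstract does not say which variant or preprocessing was applied to the FNET data, so the min-max rescaling to [-1, 1], the function name, and the toy input are all assumptions.

```python
import math


def gramian_angular_summation_field(series):
    """Return the GASF image of a 1-D series as a list of rows."""
    lo, hi = min(series), max(series)
    if hi == lo:
        raise ValueError("a constant series cannot be rescaled")
    # Rescale to [-1, 1] so that arccos is defined for every sample.
    scaled = [2.0 * (x - lo) / (hi - lo) - 1.0 for x in series]
    # Clamp to guard against floating-point drift just outside [-1, 1].
    scaled = [max(-1.0, min(1.0, s)) for s in scaled]
    # Polar encoding: each sample becomes an angle.
    phi = [math.acos(s) for s in scaled]
    # Entry (i, j) is cos(phi_i + phi_j), giving a symmetric 2-D image.
    return [[math.cos(a + b) for b in phi] for a in phi]


# A length-4 series becomes a 4x4 image.
image = gramian_angular_summation_field([0.0, 1.0, 2.0, 1.0])
```

The resulting matrix can be fed to an image classifier such as a CNN; its side length equals the series length, so long event records are usually downsampled first.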