Discovering Potential Correlations via Hypercontractivity
Discovering a correlation from one variable to another variable is of
fundamental scientific and practical interest. While existing correlation
measures are suitable for discovering average correlation, they fail to
discover hidden or potential correlations. To bridge this gap, (i) we postulate
a set of natural axioms that we expect a measure of potential correlation to
satisfy; (ii) we show that the rate of information bottleneck, i.e., the
hypercontractivity coefficient, satisfies all the proposed axioms; (iii) we
provide a novel estimator to estimate the hypercontractivity coefficient from
samples; and (iv) we provide numerical experiments demonstrating that this
proposed estimator discovers potential correlations among various indicators of
WHO datasets, is robust in discovering gene interactions from gene expression
time series data, and is statistically more powerful than the estimators for
other correlation measures in binary hypothesis testing of canonical examples
of potential correlations.
Comment: 30 pages, 19 figures, accepted for publication in the 31st Conference on Neural Information Processing Systems (NIPS 2017).
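The phenomenon this abstract targets can be illustrated numerically. The sketch below is a hypothetical toy setup, not the paper's hypercontractivity estimator: Y depends on X only in a rare regime of X's domain, so the average (Pearson) correlation over the whole sample is weak while the relationship inside the rare regime is nearly deterministic — exactly the kind of "potential correlation" an average measure misses.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "potential correlation" example (illustrative assumption, not from the
# paper): Y tracks X only when X exceeds 0.9; elsewhere Y is pure noise.
n = 10_000
x = rng.uniform(0.0, 1.0, n)
rare = x > 0.9                                  # rare regime where the correlation lives
y = np.where(rare, x + 0.01 * rng.normal(size=n), rng.normal(size=n))

# Average correlation over the whole sample is weak...
r_all = np.corrcoef(x, y)[0, 1]
# ...but restricted to the rare regime it is close to 1.
r_rare = np.corrcoef(x[rare], y[rare])[0, 1]

print(f"overall r = {r_all:.2f}, rare-regime r = {r_rare:.2f}")
```

A measure of potential correlation, as axiomatized in the paper, should score this pair highly even though the overall Pearson coefficient does not.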
Generative Pre-Training of Time-Series Data for Unsupervised Fault Detection in Semiconductor Manufacturing
This paper introduces TRACE-GPT, which stands for Time-seRies
Anomaly-detection with Convolutional Embedding and Generative Pre-trained
Transformers. TRACE-GPT is designed to pre-train univariate time-series sensor
data and detect faults on unlabeled datasets in semiconductor manufacturing. In
the semiconductor industry, classifying abnormal time-series sensor data from
normal data is important because it is directly related to wafer defects.
However, small, unlabeled, and even mixed training data without enough
anomalies make classification tasks difficult. In this research, we capture
features of time-series data with temporal convolutional embedding and
Generative Pre-trained Transformer (GPT) to classify abnormal sequences from
normal sequences using cross-entropy loss. We show that our model outperforms
previous unsupervised models on both an open dataset, the
University of California Riverside (UCR) time-series classification archive,
and the process log of our Chemical Vapor Deposition (CVD) equipment. Our model
has the highest F1 score at Equal Error Rate (EER) across all datasets and is
only 0.026 below the supervised state-of-the-art baseline on the open dataset.
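The scoring idea behind this approach can be sketched compactly: fit a generative model on (mostly normal) sequences, then score each test sequence by its average next-step cross-entropy — abnormal sequences incur higher loss. In the sketch below, a smoothed bigram count model stands in for the Transformer and uniform quantization stands in for the convolutional embedding; both are illustrative stand-ins, not TRACE-GPT itself.

```python
import numpy as np

def quantize(x, n_bins=8):
    """Map a real-valued sensor trace to integer tokens (crude embedding stand-in)."""
    edges = np.linspace(-3.0, 3.0, n_bins - 1)
    return np.digitize(x, edges)

def fit_bigram(sequences, n_bins=8, alpha=1.0):
    """Next-token transition probabilities with Laplace smoothing (GPT stand-in)."""
    counts = np.full((n_bins, n_bins), alpha)
    for seq in sequences:
        t = quantize(seq, n_bins)
        np.add.at(counts, (t[:-1], t[1:]), 1)
    return counts / counts.sum(axis=1, keepdims=True)

def score(seq, probs, n_bins=8):
    """Average next-token cross-entropy in nats; higher means more anomalous."""
    t = quantize(seq, n_bins)
    return -np.mean(np.log(probs[t[:-1], t[1:]]))

rng = np.random.default_rng(0)
# Hypothetical "normal" sensor traces: noisy sinusoids.
normal = [np.sin(np.linspace(0, 4 * np.pi, 200)) + 0.05 * rng.normal(size=200)
          for _ in range(50)]
probs = fit_bigram(normal)

smooth = np.sin(np.linspace(0, 4 * np.pi, 200))   # in-distribution trace
jumpy = rng.normal(size=200)                      # faulty, erratic trace
print(score(smooth, probs), score(jumpy, probs))
```

Thresholding this score (e.g., at the Equal Error Rate point) turns it into the binary normal/abnormal classifier the abstract evaluates with F1.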
Domain Generalization Strategy to Train Classifiers Robust to Spatial-Temporal Shift
Deep learning-based weather prediction models have advanced significantly in
recent years. However, data-driven models based on deep learning are difficult
to apply to real-world applications because they are vulnerable to
spatial-temporal shifts. A weather prediction task is especially susceptible to
spatial-temporal shifts when the model is overfitted to locality and
seasonality. In this paper, we propose a training strategy to make the weather
prediction model robust to spatial-temporal shifts. We first analyze the effect
of hyperparameters and augmentations of the existing training strategy on the
spatial-temporal shift robustness of the model. Next, we propose an optimal
combination of hyperparameters and augmentation based on the analysis results
and a test-time augmentation. We performed all experiments on the W4C22
Transfer dataset and achieved first place.
Comment: Core Transfer Track 1st place solution in the Weather4Cast competition at NeurIPS 2022.
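The test-time augmentation component mentioned above can be sketched generically: run the model on several geometric transforms of the input, undo each transform on the output, and average. The `predict` function below is a hypothetical stand-in, and the flip-based augmentation set is an assumption for illustration; the actual model and augmentations of the W4C22 solution are not reproduced here.

```python
import numpy as np

def predict(field):
    # Stand-in "model": simple neighborhood averaging, just to exercise the
    # TTA mechanics. Replace with the real weather prediction model.
    return (field + np.roll(field, 1, axis=0) + np.roll(field, 1, axis=1)) / 3

def tta_predict(field):
    """Average model outputs over horizontal/vertical flips of the input."""
    outs = []
    for flip_h in (False, True):
        for flip_v in (False, True):
            x = field
            if flip_h:
                x = x[:, ::-1]
            if flip_v:
                x = x[::-1, :]
            y = predict(x)
            # Undo the transforms on the output before averaging.
            if flip_v:
                y = y[::-1, :]
            if flip_h:
                y = y[:, ::-1]
            outs.append(y)
    return np.mean(outs, axis=0)

rng = np.random.default_rng(0)
field = rng.normal(size=(32, 32))
print(tta_predict(field).shape)   # (32, 32)
```

Averaging over transforms reduces the model's sensitivity to spatial orientation, which is one way to blunt the locality overfitting the abstract describes.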
Simple Baseline for Weather Forecasting Using Spatiotemporal Context Aggregation Network
Traditional weather forecasting relies on domain expertise and
computationally intensive numerical simulation systems. Recently, with the
development of a data-driven approach, weather forecasting based on deep
learning has been receiving attention. Deep learning-based weather forecasting
has made stunning progress, from various backbone studies using CNN, RNN, and
Transformer to training strategies using weather observations datasets with
auxiliary inputs. All of this progress has contributed to the field of weather
forecasting; however, many elements and complex structures of deep learning
models prevent us from reaching physical interpretations. This paper proposes a
SImple baseline with a spatiotemporal context Aggregation Network (SIANet) that
achieves state-of-the-art results on 4 of the 5 W4C22 benchmarks. This simple but
efficient structure uses only satellite images and CNNs in an end-to-end
fashion without using a multi-model ensemble or fine-tuning. This simplicity
makes SIANet a solid baseline that can be easily applied to deep-learning-based
weather forecasting.
Comment: 1st place solution for stage 1 and Core Transfer in the Weather4Cast competition at NeurIPS 2022.
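One common way a plain 2D CNN can aggregate spatiotemporal context — and a plausible reading of "end-to-end with only satellite images and CNNs" — is to stack the T past frames along the channel axis so a single 2D convolution mixes space and time at once. The sketch below shows only that input-handling idea with a hand-rolled convolution; the shapes and kernel are illustrative assumptions, and SIANet's actual architecture is not reproduced here.

```python
import numpy as np

def stack_frames(frames):
    """frames: (T, H, W) -> same array, interpreted as T channels of one 2D input."""
    return np.asarray(frames)

def conv2d_multichannel(x, kernel):
    """Valid 2D convolution: x (C, H, W), kernel (C, kh, kw) -> (H', W')."""
    c, h, w = x.shape
    _, kh, kw = kernel.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Each output pixel sums over all time steps (channels) at once.
            out[i, j] = np.sum(x[:, i:i + kh, j:j + kw] * kernel)
    return out

rng = np.random.default_rng(0)
frames = rng.normal(size=(4, 16, 16))   # 4 hypothetical past satellite frames
kernel = rng.normal(size=(4, 3, 3))     # one filter spanning all 4 time steps
feat = conv2d_multichannel(stack_frames(frames), kernel)
print(feat.shape)   # (14, 14)
```

Because the temporal dimension is folded into channels, no recurrent or attention machinery is needed — consistent with the abstract's emphasis on a simple, interpretable structure.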