525 research outputs found
Adversarial Dictionary Learning Algorithm for Outlier Detection
Thesis (Master's) -- Seoul National University Graduate School: College of Engineering, Department of Mechanical Engineering, August 2020. Advisor: 박종우.
In this thesis, we propose a semi-supervised dictionary learning algorithm that learns representations of only non-outlier data. The presence of outliers in a dataset is a major drawback for dictionary learning, resulting in less than desirable performance in real-world applications. Our adversarial dictionary learning (ADL) algorithm exploits a supervision dataset composed of known outliers: it penalizes a dictionary that expresses the known outliers well, which makes dictionary learning robust to the outliers present in the training dataset. The proposed method can handle highly corrupted datasets that cannot be dealt with effectively by conventional robust dictionary learning algorithms. We empirically show the usefulness of our algorithm with extensive experiments on anomaly detection, using both synthetic univariate time-series data and multivariate point data.
1 Introduction
1.1 Related Works
1.2 Contributions of This Thesis
1.3 Organization
2 Sparse Representation and Dictionary Learning
2.1 Sparse Representation
2.1.1 Problem Definition of Sparse Representation
2.1.2 Sparse Representation with l0-Norm Regularization
2.1.3 Sparse Representation with l1-Norm Regularization
2.1.4 Sparse Representation with lp-Norm Regularization (0 < p < 1)
2.2 Dictionary Learning
2.2.1 Problem Definition of Dictionary Learning
2.2.2 Dictionary Learning Methods
3 Adversarial Dictionary Learning
3.1 Problem Formulation
3.2 Adversarial Loss
3.3 Optimization Algorithm
4 Experiments
4.1 Data Description
4.1.1 Univariate Time-series Data
4.1.2 Multivariate Point Data
4.2 Evaluation Process
4.2.1 A Baseline of Anomaly Detection
4.2.2 ROC Curve and AUC
4.3 Experiment Setting
4.4 Results
5 Conclusion
Bibliography
Abstract (in Korean)
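The penalty scheme the abstract describes (fit the inliers well while penalizing a dictionary that also reconstructs the known outliers) can be sketched as below. This is a minimal illustration, not the thesis's exact formulation: the squared-error adversarial term, the ISTA sparse coder, and all hyperparameter values are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def sparse_code(Y, D, lam, n_iter=50):
    """ISTA for min_X 0.5*||Y - D X||_F^2 + lam*||X||_1 (illustrative coder)."""
    step = 1.0 / (np.linalg.norm(D, 2) ** 2 + 1e-12)
    X = np.zeros((D.shape[1], Y.shape[1]))
    for _ in range(n_iter):
        G = X - step * (D.T @ (D @ X - Y))
        X = np.sign(G) * np.maximum(np.abs(G) - step * lam, 0.0)  # soft threshold
    return X

def adl(Y, Y_out, n_atoms=8, lam=0.1, gamma=0.5, n_iter=30, lr=0.1):
    """Adversarial dictionary learning sketch: decrease the reconstruction
    error on inliers Y while *increasing* it on the supervision set of known
    outliers Y_out (assumed loss form)."""
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        X = sparse_code(Y, D, lam)        # codes for inlier data
        Xo = sparse_code(Y_out, D, lam)   # codes for known outliers
        # gradient of ||Y - D X||^2 - gamma * ||Y_out - D Xo||^2 w.r.t. D
        grad = (D @ X - Y) @ X.T - gamma * (D @ Xo - Y_out) @ Xo.T
        D -= lr * grad / Y.shape[1]
        D /= np.linalg.norm(D, axis=0, keepdims=True) + 1e-12  # unit-norm atoms
    return D

def anomaly_score(y, D, lam=0.1):
    """Reconstruction error of one sample under the learned dictionary."""
    x = sparse_code(y[:, None], D, lam)
    return float(np.linalg.norm(y[:, None] - D @ x))
```

A sample with a large reconstruction error under the learned dictionary would then be flagged as an outlier, e.g. by thresholding `anomaly_score` and sweeping the threshold to trace an ROC curve as in Chapter 4.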
DC Proximal Newton for Non-Convex Optimization Problems
We introduce a novel algorithm for solving learning problems where both the
loss function and the regularizer are non-convex but belong to the class of
difference of convex (DC) functions. Our contribution is a new general purpose
proximal Newton algorithm that is able to deal with such a situation. The
algorithm consists of obtaining a descent direction from an approximation of
the loss function and then performing a line search to ensure sufficient
descent. A theoretical analysis shows that the limit points of the iterates of
the proposed algorithm are stationary points of the DC objective function.
Numerical experiments show that our approach is more efficient than the current
state of the art on a problem with a convex loss function and a non-convex
regularizer. We also illustrate the benefit of our algorithm on a
high-dimensional transductive learning problem where both the loss function
and the regularizer are non-convex.
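A toy instance of this descent-direction-plus-line-search scheme can be sketched as follows. To stay self-contained it uses a first-order surrogate (an identity-Hessian proximal step rather than the paper's proximal Newton direction) and, as an assumed example of a DC objective, least squares with a capped-l1 penalty lam*min(|x|, theta), whose DC decomposition is lam*|x| - lam*max(|x| - theta, 0).

```python
import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def dc_prox_descent(A, b, lam=0.1, theta=0.5, n_iter=100):
    """Sketch of a DC proximal scheme (first-order stand-in, not the paper's
    exact proximal Newton variant): linearize the concave part, take a
    proximal step, then backtrack on the true objective."""
    def obj(z):  # the true DC objective
        return 0.5 * np.sum((A @ z - b) ** 2) + lam * np.sum(np.minimum(np.abs(z), theta))

    x = np.zeros(A.shape[1])
    step = 1.0 / (np.linalg.norm(A, 2) ** 2)  # 1/L for the smooth part
    for _ in range(n_iter):
        # subgradient of the convex part h(x) = lam * sum(max(|x| - theta, 0))
        v = lam * np.sign(x) * (np.abs(x) > theta)
        # gradient of the smooth loss plus the linearized concave part -h
        g = A.T @ (A @ x - b) - v
        # proximal (soft-threshold) step handles the remaining lam*|x| term
        x_new = soft_threshold(x - step * g, step * lam)
        # line search on the true objective to ensure sufficient descent
        d = x_new - x
        t = 1.0
        while t > 1e-8 and obj(x + t * d) > obj(x) - 1e-4 * t * np.dot(d, d):
            t *= 0.5
        x = x + t * d
    return x
```

Because linearizing the concave part gives an upper bound on the objective, the full step already tends to decrease it; the backtracking loop only enforces the sufficient-descent condition the abstract mentions.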
Shakeout: A New Approach to Regularized Deep Neural Network Training
Recent years have witnessed the success of deep neural networks in dealing
with a wide range of practical problems. Dropout has played an essential role in
many successful deep neural networks, by inducing regularization in the model
training. In this paper, we present a new regularized training approach:
Shakeout. Instead of randomly discarding units as Dropout does at the training
stage, Shakeout randomly chooses to enhance or reverse each unit's contribution
to the next layer. This minor modification of Dropout has the statistical
trait that the regularizer induced by Shakeout adaptively combines L0, L1 and
L2 regularization terms. Our classification experiments with representative
deep architectures on image datasets MNIST, CIFAR-10 and ImageNet show that
Shakeout deals with over-fitting effectively and outperforms Dropout. We
empirically demonstrate that Shakeout leads to sparser weights under both
unsupervised and supervised settings. Shakeout also leads to the grouping
effect of the input units in a layer. Since the weights reflect the importance
of connections, Shakeout is superior to Dropout in this respect, which is
valuable for deep model compression. Moreover, we demonstrate that Shakeout can
effectively reduce the instability of the training process of the deep
architecture.
Comment: Appears at T-PAMI 201
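The enhance-or-reverse rule can be illustrated with a toy mask. The parameterization below is an assumption for illustration, not the paper's exact formula: each unit is reversed (factor -c, probability tau) or enhanced, with the enhancement factor chosen so the mask has expectation 1.

```python
import numpy as np

rng = np.random.default_rng(0)

def shakeout_mask(shape, tau=0.3, c=0.5):
    """Illustrative Shakeout-style mask (hypothetical parameterization).
    Reversal: factor -c with probability tau.
    Enhancement: factor (1 + c*tau) / (1 - tau) with probability 1 - tau,
    so that E[mask] = (1-tau)*(1+c*tau)/(1-tau) + tau*(-c) = 1."""
    reverse = rng.random(shape) < tau
    enhance = (1.0 + c * tau) / (1.0 - tau)
    return np.where(reverse, -c, enhance)

def shakeout_forward(h, tau=0.3, c=0.5, training=True):
    """Apply the mask to a layer's activations during training only."""
    if not training:
        return h  # unit-expectation mask: no rescaling needed at test time
    return h * shakeout_mask(h.shape, tau, c)
```

Because the mask has unit expectation, no test-time rescaling is needed, analogous to inverted dropout; setting the reversal probability or c to zero recovers plain scaling.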
- โฆ