Automatic large-scale classification of bird sounds is strongly improved by unsupervised feature learning
This is an Open Access article distributed in accordance with the terms of the Creative Commons Attribution (CC BY 4.0) license, which permits others to distribute, remix, adapt and build upon this work, for commercial use, provided the original work is properly cited. See: http://creativecommons.org/licenses/by/4.0
Gaussian process classification for prediction of in-hospital mortality among preterm infants
We present a method for predicting preterm infant in-hospital mortality using Bayesian Gaussian process classification. We combined features extracted from sensor measurements, made during the first 72 h of care for 598 Very Low Birth Weight infants of birth weight <1500 g, with standard clinical features calculated on arrival at the Neonatal Intensive Care Unit. Time periods of 12, 18, 24, 36, 48, and 72 h were evaluated. We achieved a classification result with area under the receiver operating characteristic curve of 0.948, which exceeds the results achieved with the clinical standard SNAP-II and SNAPPE-II scores. © 2018 Elsevier B.V. All rights reserved. Peer reviewed.
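The area under the ROC curve quoted above can be read as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way in plain Python; the function name `roc_auc` and the six example scores are illustrative assumptions, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) identity."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Count positive/negative pairs ranked correctly; ties count one half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical predicted risks for six infants (1 = in-hospital death).
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(labels, scores)
print(round(auc, 3))
```

The double loop is quadratic in the number of cases, which is acceptable for a cohort of a few hundred infants; a rank-based version would be preferable for larger data.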
Data-Efficient Classification of Birdcall Through Convolutional Neural Networks Transfer Learning
Deep learning Convolutional Neural Network (CNN) models are powerful
classification models but require a large amount of training data. In niche
domains such as bird acoustics, it is expensive and difficult to obtain a large
number of training samples. One method of classifying data with a limited
number of training samples is to employ transfer learning. In this research, we
evaluated the effectiveness of birdcall classification using transfer learning
from a larger base dataset (2814 samples in 46 classes) to a smaller target
dataset (351 samples in 10 classes) using the ResNet-50 CNN. We obtained 79%
average validation accuracy on the target dataset in 5-fold cross-validation.
The methodology of transfer learning from an ImageNet-trained CNN to a
project-specific and a much smaller set of classes and images was extended to
the domain of spectrogram images, where the base dataset effectively played the
role of ImageNet.
Comment: Accepted for IEEE Digital Image Computing: Techniques and
Applications, 2019 (DICTA 2019), 2-4 December 2019 in Perth, Australia,
http://dicta2019.dictaconference.org/index.htm
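The 5-fold cross-validation protocol mentioned in this abstract can be sketched as follows for a target set of 351 samples. This is a minimal illustration only: the helper names, the fixed seed, and the plain (non-stratified) shuffling are assumptions, since the abstract does not say how the folds were drawn.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Split range(n_samples) into k shuffled, near-equal validation folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def train_validation_splits(n_samples, k=5, seed=0):
    """Yield (train, validation) index lists, one pair per fold."""
    folds = k_fold_indices(n_samples, k, seed)
    for i, validation in enumerate(folds):
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, validation


splits = list(train_validation_splits(351, k=5))
fold_sizes = [len(validation) for _, validation in splits]
print(fold_sizes)
```

An average validation accuracy such as the reported 79% would then be the mean of the five per-fold accuracies, each measured on the fold held out from training.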
Audio-only bird classification using unsupervised feature learning
We describe our method for automatic bird species classification, which uses raw audio without segmentation and without using any auxiliary metadata. It successfully classifies among 501 bird categories, and was by far the highest scoring audio-only bird recognition algorithm submitted to BirdCLEF 2014. Our method uses unsupervised feature learning, a technique which learns regularities in spectro-temporal content without reference to the training labels, which helps a classifier to generalise to further content of the same type. Our strongest submission uses two layers of feature learning to capture regularities at two different time scales
Automatic Detection of Depression in Speech Using Ensemble Convolutional Neural Networks
This paper proposes a speech-based method for automatic depression classification.
The system is based on ensemble learning for Convolutional Neural Networks (CNNs) and is
evaluated using the data and the experimental protocol provided in the Depression Classification Sub-Challenge (DCC) at the 2016 Audio-Visual Emotion Challenge (AVEC-2016). In the pre-processing phase, speech files are represented as a sequence of log-spectrograms and randomly sampled to balance positive and negative samples. For the classification task itself, first, a more suitable architecture for this task, based on One-Dimensional Convolutional Neural Networks, is built.
Secondly, several of these CNN-based models are trained with different initializations and then the corresponding individual predictions are fused by using an Ensemble Averaging algorithm and combined per speaker to get an appropriate final decision. The proposed ensemble system achieves satisfactory results on the DCC at the AVEC-2016 in comparison with a reference system based on Support Vector Machines and hand-crafted features, with a CNN+LSTM-based system called DepAudionet, and with the case of a single CNN-based classifier. This research was partly funded by Spanish Government grant TEC2017-84395-P
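The two fusion steps described above (averaging across models, then combining per speaker) might look like the sketch below. The abstract does not state the per-speaker rule, so taking the mean of the fused segment probabilities and thresholding at 0.5 is an assumption, as are the function names and the example numbers.

```python
from collections import defaultdict


def ensemble_average(model_probs):
    """Average per-segment probabilities across models (ensemble averaging)."""
    n_models = len(model_probs)
    return [sum(column) / n_models for column in zip(*model_probs)]


def per_speaker_decision(segment_probs, speakers, threshold=0.5):
    """Pool fused segment probabilities by speaker and threshold the mean.

    The mean-then-threshold rule is an assumed combination strategy.
    """
    pooled = defaultdict(list)
    for prob, speaker in zip(segment_probs, speakers):
        pooled[speaker].append(prob)
    return {s: sum(p) / len(p) >= threshold for s, p in pooled.items()}


# Hypothetical outputs of three CNNs on four log-spectrogram segments.
model_probs = [
    [0.8, 0.6, 0.2, 0.3],
    [0.7, 0.9, 0.1, 0.4],
    [0.9, 0.6, 0.3, 0.2],
]
speakers = ["spk1", "spk1", "spk2", "spk2"]
fused = ensemble_average(model_probs)
decisions = per_speaker_decision(fused, speakers)
print(decisions)
```

Averaging models trained from different initializations mainly reduces the variance of the individual predictions, which is the usual motivation for this kind of ensemble.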
Development of privacy-preserving machine learning techniques for protecting sensitive information
Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, 2022. 8. Jaewook Lee.
Recent development of artificial intelligence systems has been driven by various factors such as the development of new algorithms and the explosive increase in the amount of available data. In real-world scenarios, individuals or corporations benefit by providing either the data used to train a machine learning model or the trained model itself. However, it has been revealed that sharing data or a model can lead to invasion of personal privacy by leaking sensitive personal information.
In this dissertation, we focus on developing privacy-preserving machine learning methods that can protect sensitive information. Homomorphic encryption can protect the privacy of data and models because machine learning algorithms can be applied to encrypted data, but it requires much more computation time than conventional operations. For efficient computation, we take two approaches. The first is to reduce the amount of computation in the training phase. We present an efficient training algorithm that encrypts only the few most important pieces of information. Specifically, we develop a ridge regression algorithm that greatly reduces the amount of computation when one or two sensitive variables are encrypted. Furthermore, we extend the method to classification problems by developing a new logistic regression algorithm that avoids, as far as possible, the hyper-parameter search that is poorly suited to machine learning with homomorphic encryption.
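For orientation, the standard (unencrypted) ridge regression estimator that such a method starts from can be written as below. This is the textbook formulation, not the dissertation's encrypted algorithm; the symbols $X$ (design matrix), $y$ (response vector), $w$ (coefficients) and $\lambda$ (regularization strength) are notation assumed here.

```latex
\hat{w}
  = \arg\min_{w} \; \lVert y - Xw \rVert_2^2 + \lambda \lVert w \rVert_2^2
  = \left( X^\top X + \lambda I \right)^{-1} X^\top y ,
  \qquad \lambda > 0 .
```

Only additions and multiplications are needed to form $X^\top X$ and $X^\top y$, which are the operations homomorphic schemes support natively; the matrix inversion is the step that is costly under encryption.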
The second approach is to apply homomorphic encryption only when the trained model is used for inference, which prevents direct exposure of the test data and the model information. We propose a homomorphic-encryption-friendly algorithm for inference with support vector clustering.
Though homomorphic encryption can prevent various threats to data and model information, it cannot defend against secondary attacks through inference APIs. It has been reported that an adversary can extract information about the training data using only his or her own inputs and the corresponding outputs of the model. For instance, the adversary can determine whether a specific data sample is included in the training data. Differential privacy is a mathematical concept that guarantees defense against such attacks by reducing the impact of any specific data sample on the trained model. Differential privacy has the advantage of expressing the degree of privacy quantitatively, but it reduces the utility of the model by adding randomness to the algorithm. Therefore, we propose a novel method that uses Morse theory to improve the utility of differentially private clustering algorithms while maintaining their privacy.
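The trade-off between privacy and utility described above can be seen in the simplest differentially private primitive, the Laplace mechanism applied to a counting query. The sketch below is a generic illustration and not the clustering method proposed in the dissertation; the function names, the example records, and the choice of epsilon are assumptions.

```python
import random


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a count with epsilon-differential privacy (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for r in records if predicate(r))
    # Adding or removing one record changes a count by at most 1.
    sensitivity = 1.0
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(0)
ages = [34, 51, 29, 62, 45, 70, 38]  # hypothetical records
noisy = private_count(ages, lambda a: a >= 50, epsilon=1.0, rng=rng)
print(noisy)
```

A smaller epsilon gives a stronger privacy guarantee but a larger noise scale, so the released count strays further from the true value of 3; this is the loss of utility the abstract refers to.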
The privacy-preserving machine learning methods proposed in this paper can complement each other to prevent different levels of attacks. We expect that our methods can construct an integrated system and be applied to various domains where machine learning involves sensitive personal information.
Chapter 1 Introduction
1.1 Motivation of the Dissertation
1.2 Aims of the Dissertation
1.3 Organization of the Dissertation
Chapter 2 Preliminaries
2.1 Homomorphic Encryption
2.2 Differential Privacy
Chapter 3 Efficient Homomorphic Encryption Framework for Ridge Regression
3.1 Problem Statement
3.2 Framework
3.3 Proposed Method
3.3.1 Regression with one Encrypted Sensitive Variable
3.3.2 Regression with two Encrypted Sensitive Variables
3.3.3 Adversarial Perturbation Against Attribute Inference Attack
3.3.4 Algorithm for Ridge Regression
3.3.5 Algorithm for Adversarial Perturbation
3.4 Experiments
3.4.1 Experimental Setting
3.4.2 Experimental Results
3.5 Chapter Summary
Chapter 4 Parameter-free Homomorphic-encryption-friendly Logistic Regression
4.1 Problem Statement
4.2 Proposed Method
4.2.1 Motivation
4.2.2 Framework
4.3 Theoretical Results
4.4 Experiments
4.4.1 Experimental Setting
4.4.2 Experimental Results
4.5 Chapter Summary
Chapter 5 Homomorphic-encryption-friendly Evaluation for Support Vector Clustering
5.1 Problem Statement
5.2 Background
5.2.1 CKKS scheme
5.2.2 SVC
5.3 Proposed Method
5.4 Experiments
5.4.1 Experimental Setting
5.4.2 Experimental Results
5.5 Chapter Summary
Chapter 6 Differentially Private Mixture of Gaussians Clustering with Morse Theory
6.1 Problem Statement
6.2 Background
6.2.1 Mixture of Gaussians
6.2.2 Morse Theory
6.2.3 Dynamical System Perspective
6.3 Proposed Method
6.3.1 Differentially private clustering
6.3.2 Transition equilibrium vectors and the weighted graph
6.3.3 Hierarchical merging of sub-clusters
6.4 Theoretical Results
6.5 Experiments
6.5.1 Experimental Setting
6.5.2 Experimental Results
6.6 Chapter Summary
Chapter 7 Conclusion
7.1 Conclusion
7.2 Future Direction
Bibliography
Abstract in Korean
Strategic ambivalence above, selective implementation below: an institutional analysis of Kazakhstan's policy toward skilled labor
Aiming to enter the top thirty most competitive economies in the world by 2050,
Kazakhstan faces a problem of inadequate human capital. However, an objective demand for
foreign skilled workers notwithstanding, Kazakhstan fails to attract as many of them as its
labor market needs. Driven by this puzzle, the given study analyzes labor migration policy of
Kazakhstan regarding skilled workers. It attempts to explain what factors make Kazakhstani
labor migration policy ineffective under the condition when skilled foreign workers are
needed. Two main factors influence the outcomes of labor migration policy implementation:
decentralized decision-making and strategic ambiguity. Transferring the function of policy
implementation to local-level bureaucratic institutions, the state not only shifts its
responsibilities to bureaucrats but also provides them with a certain degree of autonomy and
discretion. However, the state and its institutions have no coherent vision of the national
interest in labor migration. Bureaucrats concerned with their professional duties have a more
protectionist stance on foreign specialists' inflows. Meanwhile, aimed at increasing these
inflows the state ensures its interest through strategic ambiguity in its discourse and practices.
It allows the state to reconcile an economic need for more foreign skilled workers with a
political demand for a more restrictive labor migration policy. Thus, ambiguity that starts
from above is manipulated by local bureaucrats to meet their professional and, occasionally,
personal interests when implementing the policy. As a result, the state fails to attract the
needed numbers of foreign specialists. In other words, the policy through which the state
aims to achieve its goals turns out to be ineffective. This thesis demonstrates that an institutional approach with an emphasis on the
bureaucratic model of decision-making is a better way to understand the reasons for labor
migration policy ineffectiveness in Kazakhstan. However, it also shows that when
bureaucrats are involved in the policy-making process the findings from this case can be
applied to countries other than Kazakhstan and public policies other than migration.