60 research outputs found

    A Study on Cross-modal Representation Learning Methods Using Joint Embedding and Distributed Embedding

    Thesis (Ph.D.) -- Seoul National University Graduate School: College of Engineering, Department of Electrical and Computer Engineering, 2022.2. 최진영.

    In this dissertation, we propose two methods to overcome problems that may arise in cross-modal representation learning. First, because existing joint-embedding-based models have difficulty learning relations among data from heterogeneous modalities, we propose a cross-modal representation learning model that adopts a distributed embedding method. The proposed model first learns intra-modal association by training a specialized embedding space for each modality with single-modal representation learning. It then learns cross-modal association by introducing an associator, which connects the embedding spaces of the modalities. To separate the two learning processes, the model parameters involved in intra-modal association are not updated while cross-modal association is trained. Through this two-step process, the proposed model performs cross-modal representation learning well even among heterogeneous modalities, and it can also use unpaired data for learning. We validated the method on cross-modal data generation between the visual and auditory modalities, one example of a heterogeneous modal relationship, where it achieves improved performance over existing joint-embedding-based models.

    Second, although paired cross-modal data is essential for cross-modal representation learning, securing a sufficient number of pairs is difficult in practical applications. To mitigate this data shortage, we propose an active learning method for cross-modal representation learning, specifically for image-text retrieval (ITR), one of its most popular applications. Since the existing active learning scenario for image-text retrieval cannot be applied to recent image-text retrieval benchmarks, we first propose a scenario that is feasible for them. In the existing scenario, human experts are queried for a category label for a given image-text pair; in the proposed scenario, unpaired image or text data are given and human experts are asked to supply the missing modality so that a pair is formed. We also propose an active learning algorithm for this scenario. The algorithm selects the data expected to have the most influence on the max-hinge triplet loss, the loss function most commonly adopted in recent image-text retrieval methods. To this end, we define the condition under which a data point can influence the loss function and, based on that condition, estimate an influence score (referred to as HN-Score) for each data point. The algorithm selects data in descending order of score. Experiments on recent image-text retrieval benchmarks show that the proposed algorithm achieves better performance for a given amount of training data than acquiring paired data at random.

    Contents: 1 Introduction; 2 Preliminary (2.1 Associative Learning in Human Brain, 2.2 Cross-modal Representation Learning, 2.3 Active Learning); 3 Distributed Embedding Model (3.1 Contribution, 3.2 Motivation, 3.3 Graphical Modeling, 3.4 Realization, 3.5 Experiment, 3.6 Summary); 4 Cross-modal Active Learning (4.1 Contribution, 4.2 Proposed Active Learning for ITR, 4.3 Experiments, 4.4 Summary, 4.5 Appendix); 5 Conclusion
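
The max-hinge triplet loss named in this abstract can be illustrated with a small sketch. The code below assumes the common hardest-negative formulation used in image-text retrieval: a square similarity matrix with matched pairs on the diagonal and a margin hyperparameter. The function name, the margin value and the toy matrix are illustrative assumptions. It does not reproduce the dissertation's HN-Score, which the abstract does not specify.

```python
def max_hinge_triplet_loss(sim, margin=0.2):
    """Hardest-negative hinge loss summed over both retrieval directions.

    sim[i][j] is the similarity between image i and text j; matched pairs
    are assumed to sit on the diagonal (an assumption of this sketch).
    """
    n = len(sim)
    loss = 0.0
    for i in range(n):
        pos = sim[i][i]
        # Image-to-text: largest margin violation among non-matching texts.
        hardest_text = max(
            (max(0.0, margin + sim[i][j] - pos) for j in range(n) if j != i),
            default=0.0,
        )
        # Text-to-image: largest margin violation among non-matching images.
        hardest_image = max(
            (max(0.0, margin + sim[j][i] - pos) for j in range(n) if j != i),
            default=0.0,
        )
        loss += hardest_text + hardest_image
    return loss


# Toy similarity matrix (hypothetical values).
sim = [
    [0.9, 0.8, 0.1],
    [0.2, 0.7, 0.3],
    [0.3, 0.1, 0.6],
]
loss = max_hinge_triplet_loss(sim, margin=0.2)
print(f"max-hinge triplet loss: {loss:.3f}")
```

In this toy matrix only the negatives that violate the margin contribute. Under the abstract's description, such margin-violating samples are the ones that can influence the loss.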

    A machine learning approach to Structural Health Monitoring with a view towards wind turbines

    The work of this thesis is centred around Structural Health Monitoring (SHM) and is divided into three main parts. The thesis starts by exploring different architectures of auto-association. These are evaluated to demonstrate the capacity for nonlinear auto-association of neural networks with a single nonlinear hidden layer, which is of great interest because of its reduced computational complexity. It is shown that linear PCA lacks performance for novelty detection. The key novel result emphasises that single-hidden-layer auto-associators do not perform in a similar fashion to PCA. The second part of this study concerns formulating pattern recognition algorithms for SHM purposes that could be used in the wind energy sector, where SHM is still at an embryonic level compared to civil and aerospace engineering. The purpose of this part is to investigate the effectiveness and performance of such methods in structural damage detection. Experimental measurements such as high-frequency response functions (FRFs) were extracted from a 9 m wind turbine (WT) blade throughout a full-scale continuous fatigue test. A preliminary analysis of a model regression of virtual SCADA data from an offshore wind farm is also proposed, using Gaussian processes and neural network regression techniques. The third part of this work introduces robust multivariate statistical methods into SHM by inclusively revealing how the influence of environmental and operational variation affects features that are sensitive to damage. The algorithms described are the Minimum Covariance Determinant (MCD) estimator and the Minimum Volume Enclosing Ellipsoid (MVEE). These robust outlier methods are inclusive, so there is no need to pre-determine an undamaged-condition data set, which offers an important advantage over other multivariate methodologies. Two real-life experimental applications, to the Z24 bridge and to an aircraft wing, are analysed. Furthermore, with the use of the robust measures, the data variable correlation reveals linear or nonlinear connections

    Active symbols and internal models: Towards a cognitive connectionism

    In the first section of the article, we examine some recent criticisms of the connectionist enterprise: first, that connectionist models are fundamentally behaviorist in nature (and, therefore, non-cognitive), and second, that connectionist models are fundamentally associationist in nature (and, therefore, cognitively weak). We argue that, for a limited class of connectionist models (feed-forward, pattern-associator models), the first criticism is unavoidable. With respect to the second criticism, we propose that connectionist models are fundamentally associationist but that this is appropriate for building models of human cognition. However, we do accept the point that there are cognitive capacities for which any purely associative model cannot provide a satisfactory account. The implication that we draw from this is not that associationist models and mechanisms should be scrapped, but rather that they should be enhanced. Peer Reviewed. http://deepblue.lib.umich.edu/bitstream/2027.42/45877/1/146_2005_Article_BF01889764.pd

    Neural network models: their theoretical capabilities and relevance to biology


    Deep Active Learning Explored Across Diverse Label Spaces

    Deep learning architectures have been widely explored in computer vision and have demonstrated commendable performance in a variety of applications. A fundamental challenge in training deep networks is the requirement of large amounts of labeled training data. While gathering large quantities of unlabeled data is cheap and easy, annotating the data is an expensive process in terms of time, labor and human expertise. Thus, developing algorithms that minimize the human effort in training deep models is of immense practical importance. Active learning algorithms automatically identify salient and exemplar samples from large amounts of unlabeled data and can add maximal information to supervised learning models, thereby reducing the human annotation effort in training machine learning models. The goal of this dissertation is to fuse ideas from deep learning and active learning and design novel deep active learning algorithms. The proposed learning methodologies explore diverse label spaces to solve different computer vision applications. Three major contributions have emerged from this work: (i) a deep active framework for multi-class image classification, (ii) a deep active model with and without label correlation for multi-label image classification and (iii) a deep active paradigm for regression. Extensive empirical studies on a variety of multi-class, multi-label and regression vision datasets corroborate the potential of the proposed methods for real-world applications. Additional contributions include: (i) a multimodal emotion database consisting of recordings of facial expressions, body gestures, vocal expressions and physiological signals of actors enacting various emotions, (ii) four multimodal deep belief network models and (iii) an in-depth analysis of the effect of transfer of multimodal emotion features between source and target networks on classification accuracy and training time. These related contributions help comprehend the challenges involved in training deep learning models and motivate the main goal of this dissertation. Dissertation/Thesis. Doctoral Dissertation, Electrical Engineering, 201
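
The idea of selecting informative samples from an unlabeled pool, as described in this abstract, can be sketched with entropy-based uncertainty sampling. This is a generic active learning baseline chosen for illustration, not the dissertation's deep active learning method. The function names, the budget and the toy probabilities are all assumptions.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_queries(pool_probs, budget):
    """Return indices of the `budget` most uncertain unlabeled samples."""
    ranked = sorted(
        range(len(pool_probs)),
        key=lambda i: entropy(pool_probs[i]),
        reverse=True,  # highest entropy (most uncertain) first
    )
    return ranked[:budget]


# Hypothetical model predictions for four unlabeled images, three classes.
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction
    [0.34, 0.33, 0.33],  # nearly uniform: most uncertain
    [0.60, 0.30, 0.10],
    [0.50, 0.50, 0.00],
]
queries = select_queries(pool_probs, budget=2)
print("indices to send for annotation:", queries)
```

In a full loop, the selected samples would be labeled by annotators, added to the training set, and the model retrained before the next selection round.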

    Understanding deep architectures and the effect of unsupervised pre-training

    This thesis studies a class of learning algorithms called deep architectures. We argue that models based on a shallow composition of local features are not appropriate for the set of real-world functions and datasets that are of interest to us, namely data with many factors of variation. Modelling such functions and datasets is important if we hope to create an intelligent agent that can learn from complicated data. Deep architectures are hypothesized to be a step in the right direction, as they are compositions of nonlinearities and can learn compact distributed representations of data with many factors of variation. Training fully-connected artificial neural networks, the most common form of a deep architecture, was not possible before Hinton (2006) showed that one can use stacks of unsupervised Restricted Boltzmann Machines to initialize or pre-train a supervised multi-layer network. This breakthrough has been influential, as the basic idea of using unsupervised learning to improve generalization in deep networks has been reproduced in a multitude of other settings and models. In this thesis, we cast the deep learning ideas and techniques as defining a special kind of inductive bias. This bias is defined not only by the kind of functions that are eventually represented by such deep models, but also by the learning process that is commonly used for them.
    This work is a study of the reasons why this class of functions generalizes well, the situations where they should work well, and the qualitative statements that one could make about such functions. This thesis is thus an attempt to understand why deep architectures work. In the first of the articles presented, we study how well our intuitions about the need for deep models correspond to functions that they can actually model well. In the second article, we perform an in-depth study of why unsupervised pre-training helps deep learning and explore a variety of hypotheses that give us an intuition for the dynamics of learning in such architectures. Finally, in the third article, we want to better understand what a deep architecture models, qualitatively speaking. Our visualization approach enables us to understand the representations and invariances modelled and learned by deeper layers

    Developing process control and operation of wastewater treatment plants with data analytics: examples from industry and international utilities

    Instrumentation, control and automation are central to the operation of municipal wastewater treatment plants (WWTPs). Treatment performance can be further improved and secured by processing and analyzing the collected process and equipment data. New challenges from resource efficiency, climate change and aging infrastructure increase the demand for understanding and controlling plant-wide interactions. This study reviews the needs, barriers, incentives and opportunities that Finnish wastewater treatment plants have for developing current process control and operation systems with data analytics. The study is conducted through interviews, thematic analysis and case studies of real-life applications in process industries and international utilities. Results indicate that for many utilities, additional measures for quality assurance of instruments, equipment and controllers are necessary before advanced control strategies can be applied. Readily available data could be used to improve the operational reliability of the process. Fourteen case studies of advanced data processing, analysis and visualization methods used in Finnish and international wastewater treatment plants as well as Finnish process industries are reviewed. Examples include process optimization and quality assurance solutions that have proven benefits in operational use. The applicability of these solutions to the identified development needs is initially evaluated. Some of the examples are estimated to have direct potential for application in Finnish WWTPs. For other case studies, further piloting or research efforts to assess the feasibility and cost-benefits for WWTPs are suggested. As plant operation becomes more centralized and outsourced in the future, the need for applying data analytics is expected to increase.