444 research outputs found

    Adversarial Data Programming: Using GANs to Relax the Bottleneck of Curated Labeled Data

    Paucity of large curated hand-labeled training data for every domain of interest forms a major bottleneck in the deployment of machine learning models in computer vision and other fields. Recent work (Data Programming) has shown how distant-supervision signals in the form of labeling functions can be used to obtain labels for given data in near-constant time. In this work, we present Adversarial Data Programming (ADP), an adversarial methodology that generates data together with a curated, aggregated label, given a set of weak labeling functions. We validated our method on the MNIST, Fashion-MNIST, CIFAR-10 and SVHN datasets, where it outperformed many state-of-the-art models. We conducted extensive experiments to study its usefulness and showed how the proposed ADP framework can be used for transfer learning as well as multi-task learning, where data from two domains are generated simultaneously along with the label information. Our future work will involve understanding the theoretical implications of this new framework from a game-theoretic perspective, as well as exploring the performance of the method on more complex datasets. Comment: CVPR 2018 main conference paper
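    The weak-supervision setup that ADP builds on can be sketched in a few lines. Below is a minimal, hypothetical Python example of labeling functions whose votes are aggregated by simple majority; data programming proper fits a generative model over the votes, and ADP replaces the aggregation with an adversarially trained generator, so the function names and rules here are illustrative only.

```python
# Three illustrative weak labeling functions for a binary text task.
# Each returns +1, -1, or 0 (abstain); names and rules are hypothetical.
def lf_keyword(x):
    return 1 if "urgent" in x else 0

def lf_length(x):
    return -1 if len(x) < 20 else 0

def lf_exclaim(x):
    return 1 if x.count("!") >= 2 else 0

LFS = [lf_keyword, lf_length, lf_exclaim]

def weak_label(x):
    """Aggregate labeling-function votes by simple majority.
    Majority vote is a minimal stand-in for the generative model
    that data programming actually fits over the votes."""
    s = sum(lf(x) for lf in LFS)
    return 1 if s > 0 else (-1 if s < 0 else 0)  # 0 = no consensus

print(weak_label("urgent!! please respond"))  # -> 1
```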

    Blind Multiclass Ensemble Classification

    The rising interest in pattern recognition and data analytics has spurred the development of innovative machine learning algorithms and tools. However, as each algorithm has its strengths and limitations, one is motivated to judiciously fuse multiple algorithms in order to find the "best" performing one for a given dataset. Ensemble learning aims at such a high-performance meta-algorithm by combining the outputs of multiple algorithms. The present work introduces a blind scheme for learning from ensembles of classifiers, using a moment-matching method that leverages joint tensor and matrix factorization. "Blind" refers to the combiner having no knowledge of the ground-truth labels on which each classifier was trained. A rigorous performance analysis is derived, and the proposed scheme is evaluated on synthetic and real datasets. Comment: To appear in IEEE Transactions on Signal Processing
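    To make the moment-matching idea concrete, the sketch below builds the empirical second- and third-order cross moments of three classifiers' one-hot responses. Under the usual conditional-independence assumption these moments factor jointly in terms of the confusion matrices and class priors, which is what a coupled matrix/tensor factorization can recover; this illustrates the statistics being matched, not the paper's factorization algorithm, and all names are ours.

```python
import numpy as np

def one_hot(labels, K):
    """N x K one-hot encoding of integer class predictions."""
    return np.eye(K)[labels]

def empirical_moments(preds, K):
    """Cross moments of three classifiers' responses.
    preds: list of three length-N integer prediction arrays.
    For conditionally independent classifiers with confusion matrices
    C_i and class prior p, E[M2] = C_0 diag(p) C_1^T and
    E[M3] = sum_c p_c (C_0[:,c] x C_1[:,c] x C_2[:,c]),
    the structure a joint matrix/tensor factorization exploits."""
    F = [one_hot(p, K) for p in preds]
    N = F[0].shape[0]
    M2 = F[0].T @ F[1] / N                                   # K x K matrix
    M3 = np.einsum('ni,nj,nk->ijk', F[0], F[1], F[2]) / N    # K x K x K tensor
    return M2, M3
```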

    Overcoming Data Noise and Interference through Parameter Learning

    Doctoral dissertation (Ph.D.), Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2021. Advisor: Kyomin Jung. Data-driven approaches based on neural networks have emerged as a new paradigm for solving problems in computer vision and natural language processing. They achieve better performance than hand-designed (heuristic) approaches, but the gains rely heavily on large amounts of high-quality labeled data. Accordingly, to train a neural network well it is important both to collect a large amount of data and to identify and mitigate the factors that degrade its quality. This dissertation addresses two factors regarded as decisive for the quality of labeled data: noise and interference. First, researchers generally collect data through web-based crowdsourcing systems that aggregate answers from many people; however, annotators' responses introduce noise between input and target due to misconceptions of task instructions, lack of responsibility, and inherent error. To relieve this noise, I propose novel inference algorithms for discrete multiple-choice tasks and for real-valued vector regression tasks. Modeling the crowdsourcing system as a graphical model, the algorithms estimate the true answer of each task and the reliability of each worker by iteratively exchanging two types of messages between tasks and workers. Their average performance is analyzed and proved under a probabilistic crowd model; the resulting error bounds depend on the number of workers assigned per task and the average reliability of the workers, and once the average reliability exceeds a certain level the algorithms converge to an oracle estimator that knows the reliability of every worker (the theoretical upper bound). Extensive experiments on both real-world and synthetic datasets show that the practical performance of the proposed algorithms is superior to previous state-of-the-art algorithms. Second, when a model learns a sequence of tasks one by one (continual learning), previously learned knowledge may conflict with new knowledge, a well-known phenomenon called "catastrophic forgetting" or "semantic drift"; since it arises between two bodies of knowledge drawn from labeled data separated in time, this dissertation calls it interference. To approach human-level intelligence, a single model must handle several sequentially encountered problems rather than a single one, so controlling the amount of noise and interference is essential for a neural network to be well trained. To resolve the interference among labeled data from consecutive tasks, a homeostasis-inspired meta-learning architecture (HM) is proposed. HM identifies parameters that were important for previous tasks, applies regularization to them selectively, and automatically controls the intensity of regularization (IoR) from these importances and the current learning direction; by adjusting the IoR, a learner can balance the amount of interference against the degrees of freedom needed for its current learning. Experiments on various types of continual learning tasks show that the proposed method notably outperforms conventional methods in terms of average accuracy and amount of interference, and that it is relatively stable and robust to change compared to existing synaptic-plasticity-based methods. Interestingly, the IoR generated by HM is proactively controlled within a certain range, resembling the negative-feedback mechanism of homeostasis in synapses.
    Contents: 1 Introduction. 2 Reliable multiple-choice iterative algorithm for crowdsourcing systems (Setup; Algorithm: Task Allocation, Multiple Iterative Algorithm, Task Allocation for General Setting; Applications; Analysis of algorithms: Quality of workers, Bound on the Average Error Probability, Proof of Theorem 1, Proof of Sub-Gaussianity; Experiments; Related Literature). 3 Reliable Aggregation Method for Vector Regression in Crowdsourcing (Preliminaries; Inference Algorithm: Task Message, Worker Message; Experimental Results: Real crowdsourcing data; Performance Analysis: Dirichlet crowd model, Error Bound, Optimality of Oracle Estimator, Performance Proofs; Related Literature). 4 Homeostasis-Inspired Meta Continual Learning (Preliminaries: Continual Learning, Meta Learning; Homeostatic Meta-Model; Preliminary Experiments and Findings: Block-wise Permutation, Performance Metrics; Experiment: Datasets, Methods, Overall Performance; Related Literature; Discussion). 5 Conclusion. Abstract (In Korean).
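    The two-message iterative scheme described above can be sketched for the simplest binary case. The following is a Karger-Oh-Shah-style sketch consistent with the task/worker message passing the dissertation describes, not the exact proposed algorithms (which also cover multiple-choice and vector regression tasks); all names here are ours.

```python
import numpy as np

def iterative_inference(A, n_iter=10, rng=None):
    """Two-message iterative estimator for binary crowdsourcing.
    A: tasks x workers matrix with entries +1/-1 (answers) or 0 (unasked).
    Returns an estimated +1/-1 label per task (0 only on exact ties)."""
    rng = rng or np.random.default_rng(0)
    mask = (A != 0).astype(float)
    # Worker messages, initialized around 1 (all workers equally trusted).
    Y = mask * (1.0 + rng.standard_normal(A.shape))
    for _ in range(n_iter):
        # Task -> worker: a task's current belief, excluding that
        # worker's own vote.
        X = mask * ((A * Y).sum(axis=1, keepdims=True) - A * Y)
        # Worker -> task: a worker's reliability estimate, excluding
        # the task currently being estimated.
        Y = mask * ((A * X).sum(axis=0, keepdims=True) - A * X)
    # Final estimate: reliability-weighted vote per task.
    return np.sign((A * Y).sum(axis=1))
```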
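    The homeostatic meta-model's role can likewise be sketched: it outputs an intensity of regularization (IoR) that scales an importance-weighted quadratic penalty anchoring parameters to their values after previous tasks. In this sketch the IoR is just a scalar argument; in the dissertation it is produced and actively controlled by HM, so the signature below is an assumption, not the HM itself.

```python
import numpy as np

def regularized_loss(task_loss, params, anchor, importance, ior):
    """EWC-style penalty whose strength `ior` stands in for the
    HM-generated intensity of regularization (an assumed interface).
    params, anchor, importance: lists of arrays of matching shapes."""
    penalty = sum(np.sum(w * (p - a) ** 2)
                  for p, a, w in zip(params, anchor, importance))
    return task_loss + ior * penalty
```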

    Statistical Analysis and Design of Crowdsourcing Applications

    This thesis develops methods for the analysis and design of crowdsourced experiments and crowdsourced labeling tasks. Much of the document focuses on applications, including running natural field experiments, estimating the number of objects in images, and collecting labels for word sense disambiguation. Observed shortcomings of the crowdsourced experiments inspired the development of methodology for running more powerful experiments via matching on-the-fly. Using the label data to estimate response functions inspired work on non-parametric function estimation using Bayesian Additive Regression Trees (BART), which in turn inspired extensions to BART such as the incorporation of missing data, as well as a user-friendly R package.
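    The matching-on-the-fly idea admits a short sketch: subjects arrive sequentially, and each new subject is either paired with the closest unmatched prior subject (receiving the opposite treatment arm) or, if no close match exists, randomized and held in reserve. The Euclidean distance and fixed threshold below are assumptions, not the thesis's exact design.

```python
import numpy as np

def assign_on_the_fly(x, reservoir, threshold=1.0, rng=None):
    """Sequential matched-pair treatment assignment (sketch).
    x: covariate vector of the newly arriving subject.
    reservoir: mutable list of (covariates, arm) for unmatched subjects.
    Returns the assigned arm (0 or 1)."""
    rng = rng or np.random.default_rng()
    if reservoir:
        dists = [np.linalg.norm(x - r[0]) for r in reservoir]
        i = int(np.argmin(dists))
        if dists[i] < threshold:
            _, prior_arm = reservoir.pop(i)  # form a matched pair
            return 1 - prior_arm             # give the opposite arm
    arm = int(rng.integers(2))               # no close match: randomize
    reservoir.append((x, arm))               # hold subject in reserve
    return arm
```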

    Statistical and Machine Learning Models for Remote Sensing Data Mining - Recent Advancements

    This book is a reprint of the Special Issue entitled "Statistical and Machine Learning Models for Remote Sensing Data Mining - Recent Advancements", published in Remote Sensing (MDPI). It provides insights into both core technical challenges and selected critical applications of satellite remote sensing image analytics.

    Understanding Visual Content through Joint Analysis of Content and Usage

    This thesis focuses on the problem of understanding visual content, which can be images, videos or 3D content. Understanding means that we aim at inferring semantic information about the visual content. The goal of our work is to study methods that combine two types of approaches: 1) automatic content analysis and 2) analysis of how humans interact with the content (in other words, usage analysis). We start by reviewing the state of the art from both the Computer Vision and Multimedia communities. Twenty years ago, the dominant approach aimed at a fully automatic understanding of images. Today this approach gives way to different forms of human intervention, whether through the constitution of annotated training datasets, through interactive problem solving (e.g. detection or segmentation), or through the implicit collection of information gathered from content usage. There are rich and complex links between the human supervision of automatic algorithms and the adaptation of human contributions through automatic algorithms, and these links are at the heart of modern research questions: how to motivate human contributors? How to design interactive scenarios whose interactions contribute to understanding the manipulated content? How to check or ensure the quality of human contributions? How to aggregate usage data? How to fuse inputs obtained from usage analysis with traditional outputs from content analysis? Our literature review addresses these questions and positions the contributions of this thesis, which fall into two main parts. In the first part, we revisit the detection of important (or salient) regions through implicit feedback from users who either view or capture visual content. In 2D, we develop several interactive video interfaces (in particular zoomable video) to coordinate content-based analysis with usage-based analysis. We generalize these results to 3D by introducing a new detector of salient regions that builds on simultaneous video recordings of the same public artistic performance (dance shows, concerts, etc.) by many users. The second contribution of our work aims at a semantic understanding of still images. To this end, we use data gathered through a game, Ask'nSeek, that we created. Elementary interactions (such as clicks) together with textual input from players are, as before, combined with automatic analyses of the images. In particular, we show the usefulness of interactions that reveal spatial relations between different detectable objects in the same scene. After detecting the objects of interest in a scene, we also address the more ambitious problem of segmentation.
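    The usage-based saliency idea from the first part admits a compact sketch: accumulate interaction traces (e.g. clicks collected through Ask'nSeek or a zoomable-video player) on an image-sized grid and smooth them into a heatmap. This is a minimal stand-in for the thesis's detectors, assuming click coordinates as the only trace and an assumed Gaussian smoothing bandwidth `sigma`.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def usage_saliency(clicks, shape, sigma=15.0):
    """Aggregate user clicks into a normalized saliency map.
    clicks: iterable of (row, col) pixel coordinates.
    shape: (height, width) of the target image."""
    heat = np.zeros(shape, dtype=float)
    for r, c in clicks:
        if 0 <= r < shape[0] and 0 <= c < shape[1]:
            heat[r, c] += 1.0                 # vote for this location
    heat = gaussian_filter(heat, sigma)       # spatial smoothing
    return heat / heat.max() if heat.max() > 0 else heat
```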