Active learning with committees: an approach to efficient learning in text categorization using linear threshold algorithms
We developed and investigated machine learning methods that require
minimal preprocessing of the input data, use few training examples, run fast, and
still obtain high levels of accuracy.
Most approaches to designing machine learning programs are based on the
supervised learning paradigm – training examples are chosen randomly and given
to the learner. We explore the "active learning" paradigm – the learner
automatically selects the most informative training examples. Our domain of
interest is text categorization, but most of the methods developed are quite general.
The purpose of text categorization is to assign each document in a collection
to appropriate categories. Most existing text categorization methods require large
amounts of time to prepare the documents for learning and large numbers of
examples for training. Humans must assign correct categories to documents before
they can be used for training; this costs time and money. Our goal is to develop
machine learning methods that, when compared to other methods currently available, are more efficient in time and space, use fewer training documents, and
are as accurate.
We developed the Active Learning with Committees (ALC) framework –
inspired by the Query by Committee approach of Freund, Seung, et al. A
"committee" is a group of learners that jointly participate in learning and in
predicting the classes of new examples. We perform minimal preprocessing of the
documents and thus the domain is noisy, high dimensional, and has large numbers
of irrelevant attributes. We use linear threshold learning algorithms to obtain
computational efficiency with respect to these large numbers of attributes, with
specific algorithms being chosen because they also generalize well when large
numbers of attributes are irrelevant.
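As an illustrative sketch (the particular learner and selection criterion here are stand-ins, not the thesis's exact algorithms), a committee of perceptrons, which are simple linear threshold learners, can query the unlabeled example on which its members disagree most:

```python
import numpy as np

def train_perceptron(X, y, epochs=10, seed=0):
    """Train a linear threshold unit (perceptron) on {-1, +1} labels."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            if y[i] * (X[i] @ w) <= 0:   # misclassified (or on the boundary)
                w += y[i] * X[i]
    return w

def select_query(committee, pool):
    """Return the index of the pool example the committee disagrees on most."""
    votes = np.array([np.sign(pool @ w) for w in committee])  # (members, n)
    agreement = np.abs(votes.sum(axis=0))   # small |sum| = split vote
    return int(np.argmin(agreement))

# Hypothetical usage: a 4-member committee trained with different shuffles
rng = np.random.default_rng(1)
X = rng.standard_normal((200, 10))
y = np.where(X[:, 0] > 0, 1, -1)
labeled = rng.choice(200, 20, replace=False)
committee = [train_perceptron(X[labeled], y[labeled], seed=s) for s in range(4)]
query = select_query(committee, X)   # next example to send to a human labeler
```

The selected example would then be labeled and added to the training set, and the committee retrained, which is the loop that lets active learners reach supervised-level accuracy with far fewer labeled documents.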
We developed and analyzed several ALC systems. Our results show that it is
possible to design active learning systems that scale up to large numbers of features
and obtain accuracies comparable to the supervised learning methods while using
an order of magnitude fewer examples and an order of magnitude less time. The
ALC methods developed have run times on the order of seconds, typically use only
5–7% of the training documents, and are as accurate as their supervised
counterparts.
Nonparametric weighted stochastic block models
We present a Bayesian formulation of weighted stochastic block models that
can be used to infer the large-scale modular structure of weighted networks,
including their hierarchical organization. Our method is nonparametric, and
thus does not require prior knowledge of the number of groups or other
dimensions of the model, which are instead inferred from data. We give a
comprehensive treatment of different kinds of edge weights (i.e. continuous or
discrete, signed or unsigned, bounded or unbounded), as well as arbitrary
weight transformations, and describe an unsupervised model selection approach
to choose the best network description. We illustrate the application of our
method to a variety of empirical weighted networks, such as global migrations,
voting patterns in congress, and neural connections in the human brain.
Comment: 19 pages, 11 figures. Code is freely available as part of graph-tool
at https://graph-tool.skewed.de . See also the HOWTO at
https://graph-tool.skewed.de/static/doc/demos/inference/inference.htm
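As a toy illustration of the modelling idea only (not the paper's nonparametric Bayesian inference, which is implemented in graph-tool at the URL above), one can score a candidate partition of a weighted network by the likelihood of its edge weights under per-block distributions; here exponential weights, one rate per pair of groups, are assumed:

```python
import numpy as np

def block_exponential_loglik(W, b, B):
    """Log-likelihood of edge weights W (symmetric, zero = no edge) under a
    simple weighted SBM: weights between groups r and s are modelled as
    i.i.d. Exponential(lambda_rs), each rate fit by maximum likelihood."""
    ll = 0.0
    for r in range(B):
        for s in range(r, B):
            mask = np.outer(b == r, b == s)
            if r == s:
                mask = np.triu(mask, 1)   # count each within-group pair once
            w = W[mask]
            w = w[w > 0]                  # only existing edges carry weights
            if len(w) == 0:
                continue
            lam = 1.0 / w.mean()          # exponential-rate MLE
            ll += len(w) * np.log(lam) - lam * w.sum()
    return ll

# Toy check: the true two-block partition scores higher than a shuffled one
rng = np.random.default_rng(0)
n = 40
b_true = np.array([0] * 20 + [1] * 20)
W = np.zeros((n, n))
for i in range(n):
    for j in range(i + 1, n):
        mean = 5.0 if b_true[i] == b_true[j] else 0.5   # heavier within groups
        W[i, j] = W[j, i] = rng.exponential(mean)
ll_true = block_exponential_loglik(W, b_true, B=2)
ll_shuffled = block_exponential_loglik(W, rng.permutation(b_true), B=2)
```

The paper's method goes well beyond this sketch: it selects the number of groups, the weight model, and the hierarchical structure automatically via Bayesian model selection.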
Deep Structured Models for Large Scale Object Co-detection and Segmentation
Structured decisions are required for a wide variety of
image and scene understanding tasks in computer vision, among
them object detection, localization, and semantic
segmentation. Structured prediction addresses such tasks by
learning the inherent structure of the output, incorporating
contextual information from several images and multiple tasks.
However, it becomes very challenging on large-scale image
datasets, where performance is limited by high computational
costs and by the expressive power of the underlying
representation learning techniques. In this thesis,
we present efficient and effective deep structured models for
context-aware object detection, co-localization and
instance-level semantic segmentation.
First, we introduce a principled formulation for object
co-detection using a fully-connected conditional random field
(CRF). We build an explicit graph whose vertices represent object
candidates (instead of pixel values) and edges encode the object
similarity via simple, yet effective pairwise potentials. More
specifically, we design a weighted mixture of Gaussian kernels
for class-specific object similarity, and formulate kernel
weights estimation as a least-squares regression problem. Its
solution can therefore be obtained in closed-form. Furthermore,
in contrast with traditional co-detection approaches, it has been
shown that inference in such fully-connected CRFs can be
performed efficiently using an approximate mean-field method with
high-dimensional Gaussian filtering. This lets us effectively
leverage information in multiple images.
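The closed-form kernel-weight estimation described above can be sketched as follows (the bandwidths, feature dimensions, and target similarity here are illustrative assumptions, not the thesis's configuration): stacking the Gaussian kernels into a design matrix turns weight estimation into an ordinary least-squares problem.

```python
import numpy as np

def gaussian_kernels(F, bandwidths):
    """Stack of Gaussian kernels over pairwise distances of candidate
    features F (n, d); returns an (m, n, n) array, one kernel per bandwidth."""
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
    return np.stack([np.exp(-d2 / (2.0 * h ** 2)) for h in bandwidths])

def fit_kernel_weights(K, S):
    """Least-squares fit of mixture weights w with sum_m w_m K_m ~ S,
    where S is a target similarity matrix; the solution is closed-form."""
    A = K.reshape(K.shape[0], -1).T        # (n*n, m) design matrix
    w, *_ = np.linalg.lstsq(A, S.ravel(), rcond=None)
    return w

# Toy recovery: a target built from known mixture weights is recovered
rng = np.random.default_rng(0)
F = rng.standard_normal((15, 4))
K = gaussian_kernels(F, bandwidths=[0.5, 2.0])
S = 0.3 * K[0] + 0.7 * K[1]
w = fit_kernel_weights(K, S)
```

Because the model is linear in the kernel weights, the normal equations give the exact minimizer in one solve, which is what makes the estimation step cheap inside the co-detection pipeline.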
Next, we extend our class-specific co-detection framework to
multiple object categories. We model object candidates with rich,
high-dimensional features learned using a deep convolutional
neural network. In particular, our max-margin and direct-loss
structural boosting algorithms enable us to learn the most
suitable features that best encode pairwise similarity
relationships within our CRF framework. Furthermore, this
formulation guarantees that the time and space complexity is
O(nt), where n is the total number of candidate boxes in the
pool and t is the number of mean-field iterations.
Moreover, our experiments evidence the importance of learning
rich similarity measures to account for the contextual relations
across object classes and instances. However, all these methods
rely on precomputed object candidates (or proposals), so
localization performance is limited by the quality of the
bounding boxes.
To address this, we present an efficient object proposal
co-generation technique that leverages the collective power of
multiple images. In particular, we design a deep neural network
layer that takes unary and pairwise features as input, builds a
fully-connected CRF and produces mean-field marginals as output.
It also lets us backpropagate the gradient through the entire network
by unrolling the iterations of CRF inference. Furthermore, this
layer simplifies the end-to-end learning, thus effectively
benefiting from multiple candidates to co-generate high-quality
object proposals.
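A minimal sketch of such an unrolled mean-field loop, assuming a simple attractive identity compatibility between labels (the thesis's layer learns its potentials and runs inside a deep network, so the concrete numbers and update rule below are placeholder assumptions):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def meanfield_crf(unary, pairwise, iters=5):
    """Unrolled mean-field inference for a fully-connected CRF.
    unary: (n, L) log-potentials; pairwise: (n, n) candidate similarity.
    Similar candidates pull each other toward the same label; because the
    loop has a fixed length, gradients can flow through every iteration
    when it is embedded as a network layer."""
    q = softmax(unary)
    for _ in range(iters):
        msg = pairwise @ q           # aggregate beliefs of similar candidates
        q = softmax(unary + msg)     # re-normalize per candidate
    return q

# Toy demo: weak unaries are pulled toward their cluster's confident label
unary = np.array([[2., 0.], [2., 0.], [.1, 0.], [0., 2.], [0., 2.], [0., .1]])
pairwise = np.zeros((6, 6))
pairwise[:3, :3] = 2.0               # candidates 0-2 look alike
pairwise[3:, 3:] = 2.0               # candidates 3-5 look alike
np.fill_diagonal(pairwise, 0.0)
q = meanfield_crf(unary, pairwise)
```

After a few iterations, the uncertain candidates (indices 2 and 5) adopt the label their similar, confident neighbours agree on, which is the effect that lets multiple images co-generate consistent proposals.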
Finally, we develop a multi-task strategy to jointly learn object
detection, localization and instance-level semantic segmentation
in a single network. In particular, we introduce a novel
representation based on the distance transform of the object
masks. To this end, we design a new residual-deconvolution
architecture that infers such a representation and decodes it
into the final binary object mask. We show that the predicted
masks can go beyond the scope of the bounding boxes and that the
multiple tasks can benefit from each other.
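The distance-transform representation can be illustrated with a brute-force encode/decode round trip (in the thesis the distance map is predicted by the residual-deconvolution network rather than computed from a known mask; this sketch only shows why the representation is decodable):

```python
import numpy as np

def distance_transform(mask):
    """Brute-force Euclidean distance transform: each foreground pixel gets
    its distance to the nearest background pixel (background stays 0)."""
    fg = np.argwhere(mask)
    bg = np.argwhere(~mask)
    dt = np.zeros(mask.shape)
    for i, j in fg:
        dt[i, j] = np.sqrt(((bg - (i, j)) ** 2).sum(axis=1)).min()
    return dt

def decode_mask(dt, thresh=0.0):
    """Decode a (possibly predicted, noisy) distance map to a binary mask."""
    return dt > thresh

# Round trip on a square object: decoding recovers the mask exactly, and
# interior pixels carry larger values than pixels near the boundary
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True
dt = distance_transform(mask)
```

Thresholding makes decoding robust to small prediction errors in the map, and since the map is defined on the whole image grid, the decoded mask is not confined to a candidate bounding box.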
In summary, in this thesis, we exploit the joint power of
multiple images as well as multiple tasks to improve
generalization performance of structured learning. Our novel deep
structured models, similarity learning techniques and
residual-deconvolution architecture can be used to make accurate
and reliable inference for key vision tasks. Furthermore, our
quantitative and qualitative experiments on large scale
challenging image datasets demonstrate the superiority of the
proposed approaches over the state-of-the-art methods.