Efficient, noise-tolerant, and private learning via boosting
We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension.
We give two sample complexity bounds for our large-margin halfspace learner. One bound is based only on differential privacy, and uses this guarantee as an asset for ensuring generalization.
This first bound illustrates a general methodology for obtaining PAC learners from privacy, which may be of independent interest. The second bound uses standard techniques
from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while
additionally tolerating random label noise.
https://arxiv.org/pdf/2002.01100.pd
Scalable Greedy Algorithms for Transfer Learning
In this paper we consider the binary transfer learning problem, focusing on
how to select and combine sources from a large pool to yield a good performance
on a target task. Restricting ourselves to a realistic scenario, we do not
assume direct access to the source data, but instead employ the source
hypotheses trained on them. We propose an efficient algorithm that selects relevant
source hypotheses and feature dimensions simultaneously, building on the
literature on the best subset selection problem. Our algorithm achieves
state-of-the-art results on three computer vision datasets, substantially
outperforming both transfer learning and popular feature selection baselines in
a small-sample setting. We also present a randomized variant that achieves the
same results at a computational cost independent of the number of source
hypotheses and feature dimensions. We also prove theoretically that, under
reasonable assumptions on the source hypotheses, our algorithm can learn
effectively from few examples.
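The selection idea in the abstract above can be illustrated with a small sketch. It is not the paper's algorithm: it is a generic greedy selection loop in the matching-pursuit style, assuming each candidate column holds either a source hypothesis's predictions on the target training set or a raw feature dimension. The name `greedy_select` and the toy data are hypothetical.

```python
from typing import List, Sequence, Tuple


def greedy_select(columns: Sequence[Sequence[float]],
                  target: Sequence[float],
                  budget: int) -> Tuple[List[int], List[float]]:
    """Pick up to `budget` columns greedily (matching-pursuit style).

    Assumption: each column is either a source hypothesis's output on the
    target training examples or one feature dimension.
    """
    residual = list(target)
    chosen: List[int] = []
    weights: List[float] = []
    for _ in range(budget):
        best_idx, best_gain, best_w = -1, 0.0, 0.0
        for j, col in enumerate(columns):
            if j in chosen:
                continue
            norm_sq = sum(c * c for c in col)
            if norm_sq == 0.0:
                continue
            dot = sum(c * r for c, r in zip(col, residual))
            gain = dot * dot / norm_sq  # squared-error drop from a 1-D fit
            if gain > best_gain:
                best_idx, best_gain, best_w = j, gain, dot / norm_sq
        if best_idx < 0:
            break  # no remaining column reduces the error
        chosen.append(best_idx)
        weights.append(best_w)
        residual = [r - best_w * c
                    for r, c in zip(residual, columns[best_idx])]
    return chosen, weights


# Toy data: column 0 reproduces the labels exactly, column 1 is
# uncorrelated with them, column 2 is a noisy copy of column 0.
columns = [
    [1.0, 1.0, -1.0, -1.0],
    [1.0, -1.0, 1.0, -1.0],
    [0.9, 1.1, -1.0, -0.8],
]
target = [1.0, 1.0, -1.0, -1.0]
chosen, weights = greedy_select(columns, target, budget=2)
```

On this toy data the loop selects column 0 and then stops early, because no other column reduces the remaining error. A best-subset method of the kind the abstract refers to would typically refit all selected weights jointly at each step; the one-dimensional fit here is a simplification.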
Nonparametric Bayesian Topic Modelling with Auxiliary Data
The intent of this dissertation in computer science is to study
topic models for text analytics. The first objective of this
dissertation is to incorporate auxiliary information present in
text corpora to improve topic modelling for natural language
processing (NLP) applications. The second objective of this
dissertation is to extend existing topic models to employ
state-of-the-art nonparametric Bayesian techniques for better
modelling of text data. In particular, this dissertation focusses
on:
- incorporating hashtags, mentions, emoticons, and target-opinion
dependency present in tweets, together with an external sentiment
lexicon, to perform opinion mining or sentiment analysis on
products and services;
- leveraging abstracts, titles, authors, keywords, categorical
labels, and the citation network to perform bibliographic
analysis on research publications, using a supervised or
semi-supervised topic model; and
- employing the hierarchical Pitman-Yor process (HPYP) and the
Gaussian process (GP) to jointly model text, hashtags, authors,
and the follower network in tweets for corpora exploration and
summarisation.
In addition, we provide a framework for implementing arbitrary
HPYP topic models to ease the development of our proposed topic
models, made possible by modularising the Pitman-Yor processes.
Through extensive experiments and qualitative assessment, we find
that the topic models fit the data better as we utilise more
auxiliary information and employ Bayesian nonparametric
methods.
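As background for the Pitman-Yor processes mentioned above, the following minimal sketch simulates the seating rule of a single-level Pitman-Yor Chinese restaurant process. It is not the hierarchical model or the framework from the dissertation; the function name and parameter values are hypothetical and chosen only for illustration.

```python
import random
from typing import List


def pitman_yor_crp(n_customers: int, discount: float,
                   concentration: float, seed: int = 0) -> List[int]:
    """Simulate table sizes under a single-level Pitman-Yor process.

    With n customers seated at K tables, the next customer joins table k
    with probability (size_k - discount) / (n + concentration) and opens
    a new table with probability
    (concentration + discount * K) / (n + concentration).
    """
    if not 0.0 <= discount < 1.0 or concentration <= -discount:
        raise ValueError("need 0 <= discount < 1 and concentration > -discount")
    rng = random.Random(seed)
    table_sizes: List[int] = []
    for _ in range(n_customers):
        k = len(table_sizes)
        # Unnormalised weights: existing tables first, new table last.
        weights = [size - discount for size in table_sizes]
        weights.append(concentration + discount * k)
        choice = rng.choices(range(k + 1), weights=weights)[0]
        if choice == k:
            table_sizes.append(1)
        else:
            table_sizes[choice] += 1
    return table_sizes


sizes = pitman_yor_crp(n_customers=200, discount=0.5, concentration=1.0)
```

A larger discount tends to produce many small tables alongside a few large ones, the heavy-tailed behaviour that makes this family of priors a common choice for word frequencies in text.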
Technology for large space systems: A bibliography with indexes (supplement 16)
This bibliography lists 673 reports, articles and other documents introduced into the NASA scientific and technical information system between July 1, 1986 and December 31, 1986. Its purpose is to provide helpful information to the researcher, manager, and designer in technology development and mission design according to system interactive analysis and design, structural and thermal analysis and design, structural concepts and control systems, electronics, advanced materials, assembly concepts, propulsion, and solar power satellite systems.