155 research outputs found
Explicit Learning Curves for Transduction and Application to Clustering and Compression Algorithms
Inductive learning is based on inferring a general rule from a finite data
set and using it to label new data. In transduction one attempts to solve the
problem of using a labeled training set to label a set of unlabeled points,
which are given to the learner prior to learning. Although transduction seems
at the outset to be an easier task than induction, there have not been many
provably useful algorithms for transduction. Moreover, the precise relation
between induction and transduction has not yet been determined. The main
theoretical developments related to transduction were presented by Vapnik more
than twenty years ago. One of Vapnik's basic results is a rather tight error
bound for transductive classification based on an exact computation of the
hypergeometric tail. While tight, this bound is given implicitly via a
computational routine. Our first contribution is a somewhat looser but explicit
characterization of a slightly extended PAC-Bayesian version of Vapnik's
transductive bound. This characterization is obtained using concentration
inequalities for the tail of sums of random variables obtained by sampling
without replacement. We then derive error bounds for compression schemes such
as (transductive) support vector machines and for transduction algorithms based
on clustering. The main observation used for deriving these new error bounds
and algorithms is that the unlabeled test points, which in the transductive
setting are known in advance, can be used to construct useful
data-dependent prior distributions over the hypothesis space.
Nonmyopic ϵ-Bayes-Optimal Active Learning of Gaussian Processes
A fundamental issue in active learning of Gaussian processes is the exploration-exploitation trade-off. This paper presents a novel nonmyopic ϵ-Bayes-optimal active learning (ϵ-BAL) approach that jointly and naturally optimizes the trade-off. In contrast, existing works have primarily developed myopic/greedy algorithms or performed exploration and exploitation separately. To perform active learning in real time, we then propose an anytime algorithm based on ϵ-BAL with a performance guarantee and empirically demonstrate, using synthetic and real-world datasets, that with a limited budget it outperforms the state-of-the-art algorithms. Singapore. National Research Foundation (Singapore-MIT Alliance for Research and Technology Center)
Study of the co-authorship of scientific publications between UPC and five universities in the United States: Caltech, Stanford University, UC Davis, UC Irvine and UCLA
The co-authorship of UPC with authors affiliated with academic institutions in the United States is analyzed, across all subject areas, from January 2009 to June 2014. Postprint (published version)
On relational learning and discovery in social networks: a survey
The social networking scene has evolved tremendously over the years. It has grown in relational complexity and now has a vast presence on popular social media platforms on the internet. With the advance of sentimental computing and social complexity, relationships which were once thought to be simple have now become multi-dimensional and widespread in the online scene. This explosion in the online social scene has attracted much research attention. The main aims of this work revolve around the knowledge discovery and data mining processes of these feature-rich relations. In this paper, we provide a survey of relational learning and discovery through popular social analysis of different structure types which are integral to applications within the emerging field of sentimental and affective computing. It is hoped that this contribution will clarify how social networks are analyzed with the latest methods and provide directions for future improvements.