Large-Margin Determinantal Point Processes
Determinantal point processes (DPPs) offer a powerful approach to modeling
diversity in many applications where the goal is to select a diverse subset. We
study the problem of learning the parameters (the kernel matrix) of a DPP from
labeled training data. We make two contributions. First, we show how to
reparameterize a DPP's kernel matrix with multiple kernel functions, thus
enhancing modeling flexibility. Second, we propose a novel parameter estimation
technique based on the principle of large margin separation. In contrast to the
state-of-the-art method of maximum likelihood estimation, our large-margin loss
function explicitly models errors in selecting the target subsets, and it can
be customized to trade off different types of errors (precision vs. recall).
Extensive empirical studies validate our contributions, including applications
on challenging document and video summarization, where flexibility in modeling
the kernel matrix and balancing different errors is indispensable.
Comment: 15 pages
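As a hedged illustration of the object being learned above: in an L-ensemble DPP, the probability of selecting a subset S from n items is det(L_S)/det(L+I), where L is the kernel matrix whose parameters the paper estimates. The sketch below only evaluates this probability on a toy kernel; it does not implement the paper's large-margin estimator.

```python
import numpy as np

# Minimal L-ensemble DPP sketch: P(S) = det(L_S) / det(L + I), where L is
# a positive semi-definite kernel matrix over the n ground-set items.
def dpp_subset_probability(L, S):
    n = L.shape[0]
    L_S = L[np.ix_(S, S)]                     # principal submatrix for S
    return np.linalg.det(L_S) / np.linalg.det(L + np.eye(n))

# Toy kernel built from random features (illustrative only).
rng = np.random.default_rng(0)
B = rng.normal(size=(3, 5))
L = B @ B.T                                   # PSD by construction

p = dpp_subset_probability(L, [0, 2])
print(p)
```

Because det(L + I) equals the sum of det(L_S) over all subsets, these probabilities sum to one over the power set, which is a quick sanity check on any implementation.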
Component separation methods for the Planck mission
The Planck satellite will map the full sky at nine frequencies from 30 to 857
GHz. The CMB intensity and polarization that are its prime targets are
contaminated by foreground emission. The goal of this paper is to compare
proposed methods for separating CMB from foregrounds based on their different
spectral and spatial characteristics, and to separate the foregrounds into
components of different physical origin. A component separation challenge has
been organized, based on a set of realistically complex simulations of sky
emission. Several methods including those based on internal template
subtraction, maximum entropy method, parametric method, spatial and harmonic
cross correlation methods, and independent component analysis have been tested.
Different methods proved to be effective in cleaning the CMB maps from
foreground contamination, in reconstructing maps of diffuse Galactic emissions,
and in detecting point sources and thermal Sunyaev-Zeldovich signals. The power
spectrum of the residuals is, on the largest scales, four orders of magnitude
lower than that of the input Galaxy power spectrum at the foreground minimum.
The CMB power spectrum was accurately recovered up to the sixth acoustic peak.
The point source detection limit reaches 100 mJy, and about 2300 clusters are
detected via the thermal SZ effect on two thirds of the sky. We have found that
no single method performs best for all scientific objectives. We foresee that
the final component separation pipeline for Planck will involve a combination
of methods and iterations between processing steps targeted at different
objectives such as diffuse component separation, spectral estimation and
compact source extraction.
Comment: Matches version accepted by A&A. A version with high resolution
figures is available at http://people.sissa.it/~leach/compsepcomp.pd
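One of the simplest members of the method families compared above is the internal linear combination (ILC), which exploits the CMB's known frequency spectrum. The sketch below is a generic ILC on fully synthetic maps, not the Planck pipeline: the weights w = C^{-1}a / (a^T C^{-1} a) minimize output variance while preserving a signal with spectral response a.

```python
import numpy as np

# Generic internal-linear-combination (ILC) sketch on synthetic data.
# Multifrequency maps: d = a*cmb + foreground + noise, with a = 1 at all
# frequencies (CMB in thermodynamic units). The ILC weights minimize the
# variance of w @ d subject to w @ a = 1, suppressing components whose
# frequency scaling differs from a. All numbers here are made up.
rng = np.random.default_rng(1)
n_freq, n_pix = 5, 10_000
a = np.ones(n_freq)                          # CMB spectral response
cmb = rng.normal(0.0, 1.0, n_pix)            # synthetic CMB map
fg_template = rng.normal(0.0, 1.0, n_pix)    # one synthetic foreground
fg_scaling = np.linspace(2.0, 0.5, n_freq)   # hypothetical foreground SED
maps = np.outer(a, cmb) + np.outer(fg_scaling, fg_template)
maps += 0.1 * rng.normal(size=(n_freq, n_pix))   # instrumental noise

C = np.cov(maps)                             # empirical frequency covariance
w = np.linalg.solve(C, a)
w /= a @ w                                   # enforce w @ a = 1
cmb_hat = w @ maps                           # cleaned CMB estimate

print(np.corrcoef(cmb_hat, cmb)[0, 1])      # should be close to 1
```

Real pipelines are far more involved (spatially varying foreground spectra, ILC bias from empirical covariances, harmonic-space weighting), which is why the challenge compared many methods rather than relying on one.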
Robust Large-Margin Learning in Hyperbolic Space
Recently, there has been a surge of interest in representation learning in
hyperbolic spaces, driven by their ability to represent hierarchical data with
significantly fewer dimensions than standard Euclidean spaces. However, the
viability and benefits of hyperbolic spaces for downstream machine learning
tasks have received less attention. In this paper, we present, to our
knowledge, the first theoretical guarantees for learning a classifier in
hyperbolic rather than Euclidean space. Specifically, we consider the problem
of learning a large-margin classifier for data possessing a hierarchical
structure. Our first contribution is a hyperbolic perceptron algorithm, which
provably converges to a separating hyperplane. We then provide an algorithm to
efficiently learn a large-margin hyperplane, relying on the careful injection
of adversarial examples. Finally, we prove that for hierarchical data that
embeds well into hyperbolic space, the low embedding dimension ensures superior
guarantees when learning the classifier directly in hyperbolic space.
Comment: Accepted to NeurIPS 202
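To make the geometry concrete, here is a minimal sketch of the hyperboloid (Lorentz) model in which such classifiers are typically defined: points satisfy ⟨x, x⟩_L = -1 under the Minkowski inner product, geodesic distance is arccosh(-⟨x, y⟩_L), and a linear decision rule takes the form sign(⟨w, x⟩_L). The paper's perceptron and margin algorithms themselves are not reproduced here.

```python
import numpy as np

# Hyperboloid (Lorentz) model basics. Minkowski inner product:
# <u, v>_L = -u_0*v_0 + u_1*v_1 + ... ; points on the manifold satisfy
# <x, x>_L = -1 with x_0 > 0, and d(x, y) = arccosh(-<x, y>_L).
def minkowski_dot(u, v):
    return -u[..., 0] * v[..., 0] + np.sum(u[..., 1:] * v[..., 1:], axis=-1)

def lift(z):
    """Lift a Euclidean point z to the hyperboloid: x_0 = sqrt(1 + |z|^2)."""
    x0 = np.sqrt(1.0 + np.sum(z * z, axis=-1, keepdims=True))
    return np.concatenate([x0, z], axis=-1)

def hyperbolic_distance(x, y):
    # Clip guards against arguments slightly below 1 from rounding.
    return np.arccosh(np.clip(-minkowski_dot(x, y), 1.0, None))

z = np.array([[0.0, 0.0], [3.0, 4.0]])   # two planar points
x = lift(z)                              # their hyperboloid embeddings
print(hyperbolic_distance(x[0], x[1]))
```

Distances grow only logarithmically in Euclidean norm here, which is the property that lets trees and other hierarchies embed with low distortion in few dimensions.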
Hashing for Similarity Search: A Survey
Similarity search (nearest neighbor search) is a problem of pursuing the data
items whose distances to a query item are the smallest from a large database.
Various methods have been developed to address this problem, and recently a lot
of efforts have been devoted to approximate search. In this paper, we present a
survey on one of the main solutions, hashing, which has been widely studied
since the pioneering work on locality-sensitive hashing. We divide hashing
algorithms into two main categories: locality-sensitive hashing, which designs
hash functions without exploring the data distribution, and learning to hash,
which learns hash functions according to the data distribution. We review them
from various aspects, including hash function design, distance measures, and
search schemes in the hash coding space.
Manifold Based Deep Learning: Advances and Machine Learning Applications
Manifolds are topological spaces that are locally Euclidean and find applications in dimensionality reduction, subspace learning, visual domain adaptation, clustering, and more. In this dissertation, we propose a framework for linear dimensionality reduction called proxy matrix optimization (PMO) that uses the Grassmann manifold for optimizing over orthogonal matrix manifolds. PMO is an iterative and flexible method that finds the lower-dimensional projections for various linear dimensionality reduction methods by changing the objective function. PMO is suitable for Principal Component Analysis (PCA), Linear Discriminant Analysis (LDA), Canonical Correlation Analysis (CCA), Maximum Autocorrelation Factors (MAF), and Locality Preserving Projections (LPP). We extend PMO to incorporate robust Lp-norm versions of PCA and LDA, which use fractional p-norms, making them more robust to noisy data and outliers.

The PMO method is designed to be realized as a layer in a neural network for maximum benefit. To that end, incremental versions of PCA, LDA, and LPP are included in the PMO framework for problems where the data is not all available at once.

Next, we explore the topic of domain shift in visual domain adaptation by combining concepts from spherical manifolds and deep learning. We investigate domain shift, which quantifies how well a model trained on a source domain adapts to a similar target domain, with a metric called Spherical Optimal Transport (SpOT). We adopt the spherical manifold along with an orthogonal projection loss to obtain features from the source and target domains. We then use optimal transport with the cosine distance between the features as a way to measure the gap between the domains. Our experiments on domain adaptation datasets show that SpOT outperforms existing measures for quantifying domain shift and correlates better with the gain of transfer across domains.
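The general idea behind proxy-matrix-style methods can be sketched generically: optimize an objective over orthonormal projection matrices, pulling the iterate back onto the manifold after each gradient step. The example below maximizes the PCA objective tr(Wᵀ C W) with a QR retraction on synthetic data; it is an illustration of manifold optimization, not the dissertation's PMO algorithm.

```python
import numpy as np

# Gradient ascent on tr(W^T C W) over orthonormal W (the PCA objective),
# with a QR retraction back onto the manifold after each step. Swapping
# the objective would target LDA, CCA, etc., in the same framework.
rng = np.random.default_rng(7)
X = rng.normal(size=(200, 5)) @ np.diag([3.0, 2.0, 1.0, 0.5, 0.1])
X -= X.mean(axis=0)
C = X.T @ X / len(X)                          # sample covariance

W = np.linalg.qr(rng.normal(size=(5, 2)))[0]  # random orthonormal start
for _ in range(200):
    W = W + 0.1 * (C @ W)                     # Euclidean gradient step
    W, _ = np.linalg.qr(W)                    # retraction: re-orthonormalize

captured = np.trace(W.T @ C @ W) / np.trace(C)
print(captured)   # fraction of variance captured by the 2-D projection
```

This particular update reduces to orthogonal iteration and converges to the top-2 eigenspace; the appeal of the manifold formulation is that the same machinery applies once the objective is changed, including to Lp-norm variants that lack a closed-form eigendecomposition.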