Two weight norm inequalities for fractional integral operators and commutators
In these lecture notes we describe some recent work on two weight norm
inequalities for fractional integral operators, also known as Riesz potentials,
and for commutators of fractional integrals. These notes are based on three
lectures delivered at the 6th International Course of Mathematical Analysis in
Andalucía, held in Antequera, Spain, September 8-12, 2014. They are, however,
greatly expanded to include both new results and many details that I did not
present in my lectures due to time constraints.
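For orientation (a standard formulation; normalization conventions vary
across the literature), the operators and the type of estimate in question
are the following:

```latex
% Riesz potential (fractional integral) of order \alpha, 0 < \alpha < n:
I_\alpha f(x) = \int_{\mathbb{R}^n} \frac{f(y)}{|x-y|^{n-\alpha}}\,dy.
% A two-weight norm inequality asks for which pairs of weights (u, v)
% and exponents 1 < p \le q < \infty one has
\left( \int_{\mathbb{R}^n} |I_\alpha f|^{q}\, u\,dx \right)^{1/q}
  \le C \left( \int_{\mathbb{R}^n} |f|^{p}\, v\,dx \right)^{1/p}.
```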
A Survey on Multi-Task Learning
Multi-Task Learning (MTL) is a learning paradigm in machine learning and its
aim is to leverage useful information contained in multiple related tasks to
help improve the generalization performance of all the tasks. In this paper, we
give a survey of MTL. First, we classify different MTL algorithms into several
categories, including feature learning approach, low-rank approach, task
clustering approach, task relation learning approach, and decomposition
approach, and then discuss the characteristics of each approach. In order to
improve the performance of learning tasks further, MTL can be combined with
other learning paradigms including semi-supervised learning, active learning,
unsupervised learning, reinforcement learning, multi-view learning and
graphical models. When the number of tasks is large or the data dimensionality
is high, batch MTL models struggle to handle this situation, and online,
parallel, and distributed MTL models, as well as dimensionality reduction and
feature hashing are reviewed to reveal their computational and storage
advantages. Many real-world applications use MTL to boost their performance and
we review representative works. Finally, we present theoretical analyses and
discuss several future directions for MTL.
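As an illustration of the feature-learning approach named above, here is a
minimal sketch assuming a toy setup of two linear regression tasks with a
shared feature map; it is not an algorithm from the survey itself. The shared
map U pools information across tasks, while each head a_t stays task-specific.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 20, 5, 100                  # input dim, shared feature dim, samples/task

# Two related regression tasks generated from one common feature map.
U_true = rng.normal(size=(d, k))
tasks = []
for _ in range(2):
    X = rng.normal(size=(n, d))
    a = rng.normal(size=k)            # task-specific weights in the shared space
    tasks.append((X, X @ U_true @ a + 0.01 * rng.normal(size=n)))

# Alternating gradient steps on the squared loss.
U = rng.normal(size=(d, k))
A = [rng.normal(size=k) for _ in tasks]
lr = 0.01
for _ in range(1000):
    grad_U = np.zeros_like(U)
    for t, (X, y) in enumerate(tasks):
        r = X @ U @ A[t] - y                      # residual for task t
        A[t] = A[t] - lr * (U.T @ (X.T @ r)) / n  # private, per-task update
        grad_U += np.outer(X.T @ r, A[t]) / n
    U -= lr * grad_U                              # shared update uses both tasks

for t, (X, y) in enumerate(tasks):
    print(f"task {t} MSE:", np.mean((X @ U @ A[t] - y) ** 2))
```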
Representing Sets as Summed Semantic Vectors
Representing meaning in the form of high dimensional vectors is a common and
powerful tool in biologically inspired architectures. While the meaning of a
set of concepts can be summarized by taking a (possibly weighted) sum of their
associated vectors, this has generally been treated as a one-way operation. In
this paper we show how a technique built to aid sparse vector decomposition
allows in many cases the exact recovery of the inputs and weights to such a
sum, allowing a single vector to represent an entire set of vectors from a
dictionary. We characterize the number of vectors that can be recovered under
various conditions, and explore several ways such a tool can be used for
vector-based reasoning.
Comment: In Biologically Inspired Cognitive Architectures 201
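A minimal sketch of this kind of recovery, using plain least-squares decoding
against a random dictionary rather than the paper's sparse-decomposition
technique; all sizes are illustrative. With vectors this high-dimensional and
a small summed set, the inputs and weights come back exactly.

```python
import numpy as np

rng = np.random.default_rng(1)
d, vocab = 1024, 50                            # vector dimension, dictionary size
D = rng.normal(size=(vocab, d)) / np.sqrt(d)   # random "semantic" vectors

# Encode the weighted set {(7, 2.0), (13, 0.5), (42, 1.5)} as one vector.
idx, w = np.array([7, 13, 42]), np.array([2.0, 0.5, 1.5])
s = w @ D[idx]

# Decode: least-squares coefficients against the whole dictionary, then
# keep the large ones.
coeffs, *_ = np.linalg.lstsq(D.T, s, rcond=None)
recovered = {i: round(float(c), 2) for i, c in enumerate(coeffs) if abs(c) > 0.1}
print(recovered)   # {7: 2.0, 13: 0.5, 42: 1.5}
```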
A survey of dimensionality reduction techniques
Experimental life sciences like biology and chemistry have seen an explosion
in the amount of data available from experiments in recent decades. Laboratory
instruments have become ever more complex, reporting hundreds or thousands of
measurements for a single experiment, so statistical methods face challenging
tasks when dealing with such high-dimensional data. However, much of the data
is highly redundant and can be efficiently reduced to a much smaller number of
variables without significant loss of information. The mathematical procedures
that make this reduction possible are called dimensionality reduction
techniques; they have been widely developed in fields such as statistics and
machine learning, and are currently a hot research topic. In this review we
categorize the plethora of available dimension reduction techniques and give
the mathematical insight behind them.
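As a concrete instance of the simplest such technique, a standard PCA sketch
(not tied to any particular method from the review): project centered data
onto the top singular directions and track the variance captured.

```python
import numpy as np

def pca(X, k):
    """Project rows of X onto the top-k principal components."""
    Xc = X - X.mean(axis=0)                    # center each variable
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = (S[:k] ** 2) / (S ** 2).sum()  # fraction of variance per component
    return Xc @ Vt[:k].T, explained

rng = np.random.default_rng(0)
# 500 samples, 100 variables, but only 3 underlying degrees of freedom.
Z = rng.normal(size=(500, 3))
X = Z @ rng.normal(size=(3, 100)) + 0.05 * rng.normal(size=(500, 100))
Y, var = pca(X, 3)
print(Y.shape, var.round(3))   # (500, 3); nearly all variance in 3 components
```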
The boundedness of multilinear Calder\'on-Zygmund operators on weighted and variable Hardy spaces
We establish the boundedness of the multilinear Calder\'on-Zygmund operators
from a product of weighted Hardy spaces into a weighted Hardy or Lebesgue
space. Our results generalize to the weighted setting results obtained by
Grafakos and Kalton (Collect. Math. 2001) and recent work by the third author,
Grafakos, Nakamura, and Sawano. As part of our proof we provide a finite atomic
decomposition theorem for weighted Hardy spaces, which is interesting in its
own right. As a consequence of our weighted results, we prove the corresponding
estimates on variable Hardy spaces. Our main tool is a multilinear
extrapolation theorem that generalizes a result of the first author and Naibo
(Differential Integral Equations 2016).
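To fix notation, a hedged sketch of the setting (the precise kernel
hypotheses and the way the target weight w is built from w_1 and w_2 are as
in the paper, not spelled out here): a bilinear Calder\'on-Zygmund operator
and the estimates in question have the shape

```latex
T(f_1, f_2)(x) = \int_{\mathbb{R}^n} \int_{\mathbb{R}^n}
    K(x, y_1, y_2)\, f_1(y_1)\, f_2(y_2)\, dy_1\, dy_2,
\qquad
\| T(f_1, f_2) \|_{H^p(w)} \le C\, \| f_1 \|_{H^{p_1}(w_1)}\,
    \| f_2 \|_{H^{p_2}(w_2)},
\qquad \frac{1}{p} = \frac{1}{p_1} + \frac{1}{p_2},
```

with H^p replaced by a weighted Lebesgue space L^p(w) in the target for part
of the range of exponents.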
Data-Dependent Coresets for Compressing Neural Networks with Applications to Generalization Bounds
We present an efficient coresets-based neural network compression algorithm
that sparsifies the parameters of a trained fully-connected neural network in a
manner that provably approximates the network's output. Our approach is based
on an importance sampling scheme that judiciously defines a sampling
distribution over the neural network parameters, and as a result, retains
parameters of high importance while discarding redundant ones. We leverage a
novel, empirical notion of sensitivity and extend traditional coreset
constructions to the application of compressing parameters. Our theoretical
analysis establishes guarantees on the size and accuracy of the resulting
compressed network and gives rise to generalization bounds that may provide new
insights into the generalization properties of neural networks. We demonstrate
the practical effectiveness of our algorithm on a variety of neural network
configurations and real-world data sets.
Comment: First two authors contributed equally.
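A minimal sketch of the importance-sampling step, with plain weight
magnitudes standing in for the paper's data-dependent empirical
sensitivities; sizes and scores are illustrative. Rescaling each kept
parameter by 1/(m p) makes the sparse matrix an unbiased estimator of the
original layer.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(size=(256, 512))   # one dense layer's weight matrix
m = 20_000                        # number of sampled entries

# Importance scores (here: magnitudes) define the sampling distribution.
scores = np.abs(W).ravel()
p = scores / scores.sum()

# Sample with replacement; rescale so E[W_sparse] = W.
idx = rng.choice(W.size, size=m, p=p)
W_sparse = np.zeros(W.size)
np.add.at(W_sparse, idx, W.ravel()[idx] / (m * p[idx]))
W_sparse = W_sparse.reshape(W.shape)

x = rng.normal(size=512)
print("kept fraction:", len(np.unique(idx)) / W.size)
print("relative output error:",
      np.linalg.norm((W_sparse - W) @ x) / np.linalg.norm(W @ x))
```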
Spectral Sparsification of Simplicial Complexes for Clustering and Label Propagation
As a generalization of the use of graphs to describe pairwise interactions,
simplicial complexes can be used to model higher-order interactions between
three or more objects in complex systems. There has been a recent surge in
activity for the development of data analysis methods applicable to simplicial
complexes, including techniques based on computational topology, higher-order
random processes, generalized Cheeger inequalities, isoperimetric inequalities,
and spectral methods. In particular, spectral learning methods (e.g. label
propagation and clustering) that directly operate on simplicial complexes
represent a new direction for analyzing such complex datasets.
To apply spectral learning methods to massive datasets modeled as simplicial
complexes, we develop a method for sparsifying simplicial complexes that
preserves the spectrum of the associated Laplacian matrices. We show that the
theory of Spielman and Srivastava for the sparsification of graphs extends to
simplicial complexes via the up Laplacian. In particular, we introduce a
generalized effective resistance for simplices, provide an algorithm for
sparsifying simplicial complexes at a fixed dimension, and give a specific
version of the generalized Cheeger inequality for weighted simplicial
complexes. Finally, we introduce higher-order generalizations of spectral
clustering and label propagation for simplicial complexes and demonstrate via
experiments the utility of the proposed spectral sparsification method for
these applications.
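For the graph case that this work generalizes, a compact sketch of
Spielman-Srivastava-style sampling (dense pseudoinverse for clarity, not
speed): sample edges with probability proportional to weight times effective
resistance, then reweight so the sparsifier's Laplacian matches the original
in expectation.

```python
import numpy as np

rng = np.random.default_rng(0)

def laplacian(n, edges, w):
    L = np.zeros((n, n))
    for (u, v), we in zip(edges, w):
        L[u, u] += we; L[v, v] += we
        L[u, v] -= we; L[v, u] -= we
    return L

n = 40
edges = [(i, j) for i in range(n) for j in range(i + 1, n)]   # complete graph
w = rng.uniform(0.5, 1.5, size=len(edges))
L = laplacian(n, edges, w)

# Effective resistance of (u, v): (e_u - e_v)^T L^+ (e_u - e_v).
Lp = np.linalg.pinv(L)
R = np.array([Lp[u, u] + Lp[v, v] - 2 * Lp[u, v] for u, v in edges])

# Sample q edges with prob ~ w_e R_e; reweight kept edges by w_e / (q p_e).
p = (w * R) / (w * R).sum()
q = len(edges) // 3
idx = rng.choice(len(edges), size=q, p=p)
w_new = np.zeros(len(edges))
np.add.at(w_new, idx, w[idx] / (q * p[idx]))

L_sparse = laplacian(n, edges, w_new)
ev = np.linalg.eigvalsh
print("kept edges:", np.count_nonzero(w_new), "of", len(edges))
print("largest-eigenvalue ratio:", ev(L_sparse)[-1] / ev(L)[-1])
```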
Sparsity Within and Across Overlapping Groups
Recently, penalties promoting signals that are sparse within and across
groups have been proposed. In this letter, we propose a generalization that
makes it possible to encode more intricate dependencies within groups. However,
this complicates the realization of the threshold function associated with the
penalty, which hinders the use of the penalty in energy minimization. We
discuss how to sidestep this problem, and demonstrate the use of the modified
penalty in an energy minimization formulation for an inverse problem.
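For context, the baseline operation that such penalties generalize: the block
soft-threshold, i.e. the proximal map of the plain group-sparsity penalty
lam * sum_g ||x_g||_2 (the letter's penalty is more intricate, which is
exactly what complicates its threshold function).

```python
import numpy as np

def group_soft_threshold(x, groups, lam):
    """Prox of lam * sum_g ||x_g||_2: shrink each group's norm by lam."""
    out = np.zeros_like(x)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > lam:
            out[g] = (1 - lam / norm) * x[g]   # group survives, shrunk
        # otherwise the whole group is set to zero
    return out

x = np.array([3.0, 4.0, 0.1, -0.2, 1.0, -1.0])
groups = [slice(0, 2), slice(2, 4), slice(4, 6)]
print(group_soft_threshold(x, groups, lam=0.5))
# Groups 1 and 3 survive (norms 5.0 and ~1.41); group 2 is zeroed.
```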
Extrapolation and Factorization
A modestly revised version of lecture notes that were distributed to
accompany my four lectures at the 2017 Spring School of Analysis at Paseky,
sponsored by Charles University, Prague. They are an introductory survey of
Rubio de Francia extrapolation, Jones factorization, and applications.
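The two classical results surveyed there, in their usual forms (stated for
Muckenhoupt A_p weights on \mathbb{R}^n):

```latex
% Jones factorization: for 1 < p < \infty,
w \in A_p \iff w = w_1\, w_2^{\,1-p} \quad \text{with } w_1, w_2 \in A_1.
% Rubio de Francia extrapolation: if T satisfies, for some p_0 and
% every w \in A_{p_0},
\int_{\mathbb{R}^n} |Tf|^{p_0}\, w\,dx \le C_{[w]_{A_{p_0}}}
    \int_{\mathbb{R}^n} |f|^{p_0}\, w\,dx,
% then the same estimate holds for every 1 < p < \infty and every w \in A_p.
```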
Decomposition-Based Transfer Distance Metric Learning for Image Classification
Distance metric learning (DML) is a critical factor for image analysis and
pattern recognition. To learn a robust distance metric for a target task, we
need abundant side information (i.e., the similarity/dissimilarity pairwise
constraints over the labeled data), which is usually unavailable in practice
due to the high labeling cost. This paper considers the transfer learning
setting by exploiting the large quantity of side information from certain
related, but different source tasks to help with target metric learning (with
only a little side information). The state-of-the-art metric learning
algorithms usually fail in this setting because the data distributions of the
source task and target task are often quite different. We address this problem
by assuming that the target distance metric lies in the space spanned by the
eigenvectors of the source metrics (or other randomly generated bases). The
target metric is represented as a combination of the base metrics, which are
computed using the decomposed components of the source metrics (or simply a set
of random bases); we call the proposed method decomposition-based transfer DML
(DTDML). In particular, DTDML learns a sparse combination of the base metrics
to construct the target metric by forcing the target metric to be close to an
integration of the source metrics. The main advantage of the proposed method
compared with existing transfer metric learning approaches is that we directly
learn the base metric coefficients instead of the target metric itself, so far
fewer variables need to be learned. We therefore obtain more reliable solutions
given the limited side information, and the optimization tends to be
faster. Experiments on popular handwritten image (digit, letter)
classification and challenging natural image annotation tasks demonstrate the
effectiveness of the proposed method.
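A toy sketch of the central idea under simplified assumptions (random data, a
hinge-style loss, rank-one bases from source-metric eigenvectors); it is not
the authors' optimization procedure. Only the small coefficient vector theta
is learned, never the full d x d target metric.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 10

# Source metrics: random PSD matrices from related tasks.
sources = []
for _ in range(3):
    A = rng.normal(size=(d, d))
    sources.append(A @ A.T)

# Base metrics: rank-one matrices u u^T from each source's top eigenvectors.
bases = []
for M in sources:
    _, vecs = np.linalg.eigh(M)
    for u in vecs[:, -3:].T:              # top-3 eigenvectors per source
        bases.append(np.outer(u, u))

# Scarce target side information: (x, y, +1) similar / (x, y, -1) dissimilar.
pairs = [(rng.normal(size=d), rng.normal(size=d), s) for s in (1, -1, 1, -1, 1)]

# Projected gradient on a hinge loss with an L1 term for sparsity; the
# nonnegativity projection keeps the combined metric PSD.
theta, lr, lam = np.full(len(bases), 0.1), 0.01, 0.01
for _ in range(300):
    grad = lam * np.ones_like(theta)
    for x, y, s in pairs:
        diff = x - y
        dists = np.array([diff @ B @ diff for B in bases])
        if s * (theta @ dists - 1.0) > 0:  # push similar below 1, dissimilar above
            grad += s * dists
    theta = np.maximum(theta - lr * grad, 0.0)

M_target = sum(t * B for t, B in zip(theta, bases))
print("nonzero coefficients:", np.count_nonzero(theta), "of", len(bases))
```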