Testing and Learning on Distributions with Symmetric Noise Invariance
Kernel embeddings of distributions and the Maximum Mean Discrepancy (MMD),
the resulting distance between distributions, are useful tools for fully
nonparametric two-sample testing and learning on distributions. However, it is
rarely the case that all possible differences between samples are of interest --
discovered differences can be due to different types of measurement noise, data
collection artefacts or other irrelevant sources of variability. We propose
distances between distributions which encode invariance to additive symmetric
noise, aimed at testing whether the assumed true underlying processes differ.
Moreover, we construct invariant features of distributions, leading to learning
algorithms robust to the impairment of the input distributions with symmetric
additive noise.
Comment: 22 pages
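As background for the kernel machinery this abstract builds on, the following is a minimal sketch of the standard unbiased MMD² estimator with a Gaussian kernel — plain MMD, not the paper's noise-invariant variant; the scalar-sample setting and the bandwidth value are illustrative assumptions:

```python
import math

def gaussian_kernel(x, y, bandwidth=1.0):
    """RBF kernel k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2)) for scalars."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))

def mmd_squared(xs, ys, kernel=gaussian_kernel):
    """Unbiased estimate of MMD^2 between two scalar samples xs and ys."""
    m, n = len(xs), len(ys)
    # within-sample terms exclude the diagonal (unbiased estimator)
    xx = sum(kernel(a, b) for i, a in enumerate(xs)
             for j, b in enumerate(xs) if i != j) / (m * (m - 1))
    yy = sum(kernel(a, b) for i, a in enumerate(ys)
             for j, b in enumerate(ys) if i != j) / (n * (n - 1))
    xy = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return xx + yy - 2.0 * xy
```

Two samples from the same distribution give an MMD² estimate near zero (it can be slightly negative, being unbiased), while a shifted sample gives a clearly positive value — the signal a two-sample test thresholds.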
Sticky Seeding in Discrete-Time Reversible-Threshold Networks
When nodes can repeatedly update their behavior (as in agent-based models
from computational social science or repeated-game play settings) the problem
of optimal network seeding becomes very complex. For a popular
spreading-phenomena model of binary-behavior updating based on thresholds of
adoption among neighbors, we consider several planning problems in the design
of \textit{Sticky Interventions}: when adoption decisions are reversible, the
planner aims to find a Seed Set where temporary intervention leads to long-term
behavior change. We prove that completely converting a network at minimum cost
is $\Omega(\ln(n))$-hard to approximate and that maximizing conversion
subject to a budget is $(1-\frac{1}{e})$-hard to approximate. Optimization
heuristics which rely on many objective function evaluations may still be
practical, particularly in relatively-sparse networks: we prove that the
long-term impact of a Seed Set can be evaluated in $O(|E|)$ operations. For a
more descriptive model variant in which some neighbors may be more influential
than others, we show that under integer edge weights from $\{1,2,\dots,k\}$,
objective function evaluation requires only $O(k|E|)$ operations. These
operation bounds are based on improvements we give for bounds on
time-steps-to-convergence under discrete-time reversible-threshold updates in
networks.
Comment: 19 pages, 2 figures
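The reversible-threshold dynamics analyzed above can be sketched as a synchronous simulation. This toy version (the graph encoding, the fixed step count, and forcing seeds only at step 0 are illustrative assumptions; it simulates directly rather than using the paper's faster evaluation) shows a temporary seed either converting the network long-term or dying out:

```python
def threshold_dynamics(neighbors, thresholds, seeds, steps=50):
    """Synchronous binary-threshold updates: a node is active next step
    iff its number of currently active neighbors meets its threshold.
    Seeds are forced active only at step 0 (a temporary intervention);
    returns the set of nodes active after `steps` rounds."""
    state = {v: (v in seeds) for v in neighbors}
    for _ in range(steps):
        state = {v: sum(state[u] for u in neighbors[v]) >= thresholds[v]
                 for v in neighbors}
    return {v for v, on in state.items() if on}
```

On a triangle where every node adopts once one neighbor is active, seeding a single node converts the whole network even though the seed itself briefly reverts; raising every threshold to 2 makes the same seed fizzle.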
Prediction of microbial communities for urban metagenomics using neural network approach.
BACKGROUND: Microbes are closely associated with human health and disease, especially in densely populated cities. Understanding the microbial ecosystem in an urban environment is essential for cities to monitor the transmission of infectious diseases and to detect potentially urgent threats. To this end, DNA samples have been collected and analyzed at subway stations in major cities. However, city-scale sampling with fine-grained geo-spatial resolution is expensive and laborious. In this paper, we introduce MetaMLAnn, a neural network based approach to infer microbial communities at unsampled locations given information reflecting different factors, including subway line networks, sampling material types, and microbial composition patterns.
RESULTS: We evaluate the effectiveness of MetaMLAnn on the public metagenomics dataset collected from multiple locations in the New York and Boston subway systems. The experimental results suggest that MetaMLAnn consistently performs better than five other conventional classifiers under different taxonomic ranks. At the genus level, MetaMLAnn achieves F1 scores of 0.63 and 0.72 on the New York and Boston datasets, respectively.
CONCLUSIONS: By exploiting heterogeneous features, MetaMLAnn captures the hidden interactions between microbial compositions and the urban environment, which enables precise predictions of microbial communities at unmeasured locations.
Training Gaussian Mixture Models at Scale via Coresets
How can we train a statistical mixture model on a massive data set? In this
work we show how to construct coresets for mixtures of Gaussians. A coreset is
a weighted subset of the data, which guarantees that models fitting the coreset
also provide a good fit for the original data set. We show that, perhaps
surprisingly, Gaussian mixtures admit coresets of size polynomial in dimension
and the number of mixture components, while being independent of the data set
size. Hence, one can harness computationally intensive algorithms to compute a
good approximation on a significantly smaller data set. More importantly, such
coresets can be efficiently constructed both in distributed and streaming
settings and do not impose restrictions on the data generating process. Our
results rely on a novel reduction of statistical estimation to problems in
computational geometry and new combinatorial complexity results for mixtures of
Gaussians. Empirical evaluation on several real-world datasets suggests that
our coreset-based approach enables significant reductions in training time with
negligible approximation error.
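The core idea — sample a small weighted subset so that rare, influential points are kept and down-weighted — can be sketched in a few lines. This is a simplified importance-sampling stand-in for sensitivity-based coreset construction on 1-D data, not the paper's construction or guarantees; the uniform/distance mixing ratio is an arbitrary assumption:

```python
import random

def coreset(points, m, seed=0):
    """Importance-sample a weighted coreset of size m: points far from the
    mean are picked with higher probability and down-weighted by 1/(m*p),
    so weighted sums remain unbiased estimates of full-data sums."""
    rng = random.Random(seed)
    n = len(points)
    mean = sum(points) / n
    total = sum((p - mean) ** 2 for p in points) or 1.0
    # blend uniform and squared-distance-to-mean sampling probabilities;
    # the uniform part bounds every weight by 2n/m
    probs = [0.5 / n + 0.5 * ((p - mean) ** 2) / total for p in points]
    idx = rng.choices(range(n), weights=probs, k=m)
    return [(points[i], 1.0 / (m * probs[i])) for i in idx]
```

A mixture-model fitter can then be run on the m weighted pairs instead of the n raw points, which is where the training-time reduction comes from.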
Learning Sparse High Dimensional Filters: Image Filtering, Dense CRFs and Bilateral Neural Networks
Bilateral filters have widespread use due to their edge-preserving
properties. The common use case is to manually choose a parametric filter type,
usually a Gaussian filter. In this paper, we generalize the
parametrization and, in particular, derive a gradient descent algorithm so that
the filter parameters can be learned from data. This derivation allows us to learn
high dimensional linear filters that operate in sparsely populated feature
spaces. We build on the permutohedral lattice construction for efficient
filtering. The ability to learn more general forms of high-dimensional filters
can be used in several diverse applications. First, we demonstrate the use in
applications where single filter applications are desired for runtime reasons.
Further, we show how this algorithm can be used to learn the pairwise
potentials in densely connected conditional random fields and apply these to
different image segmentation tasks. Finally, we introduce layers of bilateral
filters in CNNs and propose bilateral neural networks for use on
high-dimensional sparse data. This view provides new ways to encode model
structure into network architectures. A diverse set of experiments empirically
validates the use of general forms of filters.
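The fixed Gaussian case that the paper generalizes is easy to illustrate in 1-D. In this sketch (parameter values and the 1-D setting are illustrative assumptions) the weight decays with both spatial distance and difference in value, so smoothing does not blur across a sharp edge:

```python
import math

def bilateral_1d(signal, sigma_s=2.0, sigma_r=0.2, radius=4):
    """Edge-preserving smoothing: each output sample is a weighted mean
    of nearby samples, weighted by a spatial Gaussian (sigma_s) times a
    range Gaussian on value differences (sigma_r)."""
    out = []
    for i, v in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2)
                         - ((signal[j] - v) ** 2) / (2 * sigma_r ** 2))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out
```

On a clean step signal the output stays flat on each side of the step: samples across the edge differ in value by 1, so their range weight is essentially zero. The paper's contribution is learning such high-dimensional filters from data rather than fixing the Gaussian form.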
One-class classifiers based on entropic spanning graphs
One-class classifiers offer valuable tools to assess the presence of outliers
in data. In this paper, we propose a design methodology for one-class
classifiers based on entropic spanning graphs. Our approach can also process
non-numeric data by means of an embedding procedure. The spanning graph is
learned on the embedded input data and the resulting partition of vertices
defines the classifier. The final partition is derived by exploiting a
criterion based on mutual information minimization. Here, we compute the
mutual information by using a convenient formulation provided in terms of
the $\alpha$-Jensen difference. Once training is
completed, in order to associate a confidence level with the classifier
decision, a graph-based fuzzy model is constructed. The fuzzification process
is based only on topological information of the vertices of the entropic
spanning graph. As such, the proposed one-class classifier is also suitable for
data characterized by complex geometric structures. We provide experiments on
well-known benchmarks containing both feature vectors and labeled graphs. In
addition, we apply the method to the protein solubility recognition problem by
considering several representations for the input samples. Experimental results
demonstrate the effectiveness and versatility of the proposed method with
respect to other state-of-the-art approaches.
Comment: Extended and revised version of the paper "One-Class Classification
Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN,
Vancouver, Canada
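The flavor of a spanning-graph one-class rule can be shown with a much-simplified sketch: build a minimum spanning tree over the training sample and flag a test point whose gap to the training set is large relative to the tree's typical edge length. This is not the paper's mutual-information or fuzzy construction — the 1-D setting, Prim's algorithm, and the `slack` threshold are all illustrative assumptions:

```python
def mst_edges(points):
    """Prim's algorithm on the complete graph over 1-D points with
    Euclidean weights; returns the list of MST edge lengths."""
    dist = {i: abs(points[i] - points[0]) for i in range(1, len(points))}
    edges = []
    while dist:
        j = min(dist, key=dist.get)   # closest point not yet in the tree
        edges.append(dist.pop(j))
        for i in dist:                # relax distances through j
            dist[i] = min(dist[i], abs(points[i] - points[j]))
    return edges

def is_outlier(x, train, slack=3.0):
    """Flag x when its gap to the training set exceeds `slack` times the
    mean MST edge length -- a crude graph-based one-class rule."""
    scale = sum(mst_edges(train)) / (len(train) - 1)
    return min(abs(x - t) for t in train) > slack * scale
```

Because the scale is read off the graph itself, the rule adapts to how tightly the nominal class is packed, which is the intuition behind using entropic spanning graphs for one-class decisions.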
Completeness results for quasi-categories of algebras, homotopy limits, and related general constructions
Consider a diagram of quasi-categories that admit, and functors that preserve,
limits or colimits of a fixed shape. We show that any weighted limit whose
weight is a projective cofibrant simplicial functor is again a quasi-category
admitting these (co)limits and that they are preserved by the functors in the
limit cone. In particular, the Bousfield-Kan homotopy limit of a diagram of
quasi-categories admits any limits or colimits existing in and preserved by the
functors in that diagram. In previous work, we demonstrated that the
quasi-category of algebras for a homotopy coherent monad could be described as
a weighted limit with projective cofibrant weight, so these results immediately
provide us with important (co)completeness results for quasi-categories of
algebras. These generalise most of the classical categorical results, except
for a well known theorem which shows that limits lift to the category of
algebras for any monad, regardless of whether its functor part preserves those
limits. The second half of this paper establishes this more general result in
the quasi-categorical setting: showing that the monadic forgetful functor of
the quasi-category of algebras for a homotopy coherent monad creates all limits
that exist in the base quasi-category, without further assumption on the monad.
This proof relies upon a more delicate and explicit analysis of the particular
weight used to define quasi-categories of algebras.
Comment: 33 pages; a sequel to arXiv:1306.5144 and arXiv:1310.8279; v3: final
journal version with updated internal references to the new version of
"Homotopy coherent adjunctions and the formal theory of monads"
Low Rank Approximation of Binary Matrices: Column Subset Selection and Generalizations
Low rank matrix approximation is an important tool in machine learning. Given
a data matrix, low rank approximation helps to find factors, patterns and
provides concise representations for the data. Research on low rank
approximation usually focuses on real matrices. However, in many applications
data are binary (categorical) rather than continuous. This leads to the problem
of low rank approximation of binary matrices. Here we are given a $d \times n$
binary matrix $A$ and a small integer $k$. The goal is to find two binary
matrices $B$ and $C$ of sizes $d \times k$ and $k \times n$ respectively, so
that the Frobenius norm of $A - BC$ is minimized. There are two models of this
problem, depending on the definition of the dot product of binary vectors: the
$GF(2)$ model and the Boolean semiring model. Unlike low rank approximation of
real matrices, which can be efficiently solved by Singular Value Decomposition,
approximation of binary matrices is NP-hard even for $k=1$.
In this paper, we consider the problem of Column Subset Selection (CSS), in
which one low rank matrix must be formed by columns of the data matrix. We
characterize the approximation ratio of CSS for binary matrices. For the
$GF(2)$ model, we show the approximation ratio of CSS is bounded by
$\frac{k}{2}+1+\frac{k}{2(2^k-1)}$ and this bound is asymptotically tight. For the
Boolean model, it turns out that CSS is no longer sufficient to obtain a bound.
We then develop a Generalized CSS (GCSS) procedure in which the columns of one
low rank matrix are generated from Boolean formulas operating bitwise on
columns of the data matrix. We show the approximation ratio of GCSS is bounded
by $2^{k-1}+1$, and the exponential dependency on $k$ is inherent.
Comment: 38 pages
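Column Subset Selection over GF(2) can be illustrated by brute force on a tiny matrix: try every set of $k$ columns of the data matrix as the left factor and every GF(2) coefficient vector for each target column. This exponential-time sketch is for intuition only and is not the paper's algorithmic contribution; the variable names mirror the factorization notation used above as an assumption:

```python
from itertools import combinations, product

def gf2_css_error(A, k):
    """Brute-force CSS over GF(2): choose k columns of A as the left
    factor, approximate every column of A by the best GF(2) linear
    combination of the chosen columns, and return the minimum
    squared Frobenius error (i.e. total number of mismatched entries)."""
    m, n = len(A), len(A[0])
    cols = [[A[i][j] for i in range(m)] for j in range(n)]
    best = m * n
    for subset in combinations(range(n), k):
        chosen = [cols[j] for j in subset]
        err = 0
        for j in range(n):
            # best GF(2) combination for this target column
            err += min(
                sum(cols[j][i] ^ (sum(c * chosen[t][i]
                                      for t, c in enumerate(coef)) % 2)
                    for i in range(m))
                for coef in product([0, 1], repeat=k))
        best = min(best, err)
    return best
```

The 2x2 identity matrix already shows why CSS cannot always be exact at low rank: no single column of it reproduces both columns, so the best rank-1 column-subset error is 1, while rank 2 recovers the matrix exactly.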