Fast and exact search for the partition with minimal information loss
In analysis of multi-component complex systems, such as neural systems,
identifying groups of units that share similar functionality will aid
understanding of the underlying structures of the system. To find such a
grouping, it is useful to evaluate to what extent the units of the system are
separable. Separability or inseparability can be evaluated by quantifying how
much information would be lost if the system were partitioned into subsystems,
and the interactions between the subsystems were hypothetically removed. A
system of two independent subsystems is completely separable without any loss
of information, while a system of strongly interacting subsystems cannot be
separated without a large loss of information. Among all the possible
partitions of a system, the partition that minimizes the loss of information,
called the Minimum Information Partition (MIP), can be considered as the
optimal partition for characterizing the underlying structures of the system.
Although the MIP would reveal novel characteristics of the neural system, an
exhaustive search for the MIP is numerically intractable due to the
combinatorial explosion of possible partitions. Here, we propose a
computationally efficient search to precisely identify the MIP among all
possible partitions by exploiting the submodularity of the measure of
information loss. Mutual information is one such submodular information loss
function, and is a natural choice for measuring the degree of statistical
dependence between paired sets of random variables. By using mutual information
as a loss function, we show that the search for MIP can be performed in a
practical order of computational time for a reasonably large system. We also
demonstrate that MIP search allows for the detection of underlying global
structures in a network of nonlinear oscillators.
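To make the idea of a minimum information partition concrete, the following is a minimal sketch of the exhaustive baseline that the abstract describes as intractable for large systems. It is not the authors' submodularity-based search. It assumes jointly Gaussian units described by a covariance matrix, restricts the search to bipartitions, and uses the standard Gaussian expression for mutual information, I(A;B) = 0.5 * log(det(S_A) * det(S_B) / det(S)). The function names (`det`, `mutual_information`, `brute_force_mip`) are hypothetical.

```python
import math
from itertools import combinations


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    d = 1.0
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(a[r][i]))
        if a[p][i] == 0.0:
            return 0.0
        if p != i:
            a[i], a[p] = a[p], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, n):
            f = a[r][i] / a[i][i]
            for c in range(i, n):
                a[r][c] -= f * a[i][c]
    return d


def sub(cov, idx):
    """Sub-matrix of cov restricted to the units in idx."""
    return [[cov[i][j] for j in idx] for i in idx]


def mutual_information(cov, part_a, part_b):
    """I(A;B) in nats for jointly Gaussian units (assumption: Gaussian model)."""
    whole = sorted(list(part_a) + list(part_b))
    ratio = det(sub(cov, part_a)) * det(sub(cov, part_b)) / det(sub(cov, whole))
    return 0.5 * math.log(ratio)


def brute_force_mip(cov):
    """Try every bipartition and return (loss, part_a, part_b) with minimal loss.

    The number of bipartitions grows exponentially with the number of units,
    so this is only usable for very small systems.
    """
    n = len(cov)
    best = None
    for size in range(1, n // 2 + 1):
        for part_a in combinations(range(n), size):
            part_b = tuple(i for i in range(n) if i not in part_a)
            loss = mutual_information(cov, part_a, part_b)
            if best is None or loss < best[0]:
                best = (loss, part_a, part_b)
    return best
```

For a four-unit system made of two strongly coupled pairs with weak coupling between the pairs, the search returns the cut between the pairs, since that cut discards the least mutual information.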
Clustering clinical departments for wards to achieve a prespecified blocking probability
When the number of available beds in a hospital is limited and fixed, it can be beneficial to cluster several clinical departments such that the probability of not being able to admit a patient is acceptably small. The clusters are then assigned to the available wards such that enough beds are available to guarantee a blocking probability below a prespecified value. We first give an exact formulation of the problem to be able to achieve optimal solutions. To reduce computation times, we also introduce two heuristic solution methods. The first heuristic is similar to the exact solution method; however, the number of beds needed is approximated by a linear function. The second heuristic uses a local search approach to determine the assignment of clinical departments to clusters and a restricted version of the exact solution method to determine the assignment of clusters to wards.
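The abstract does not state which queueing model underlies the blocking probability. As an illustration only, the sketch below assumes the classical Erlang loss (Erlang B) model, in which patients arrive as a Poisson process and are turned away when all beds are occupied. The function names `erlang_b` and `beds_needed` and the example loads are hypothetical.

```python
def erlang_b(offered_load: float, beds: int) -> float:
    """Blocking probability under the Erlang B model (an assumed model).

    offered_load is the arrival rate multiplied by the mean length of stay.
    Uses the numerically stable recurrence
    B(0) = 1,  B(k) = a * B(k-1) / (k + a * B(k-1)).
    """
    b = 1.0
    for k in range(1, beds + 1):
        b = offered_load * b / (k + offered_load * b)
    return b


def beds_needed(offered_load: float, max_blocking: float) -> int:
    """Smallest number of beds whose blocking probability is at most max_blocking."""
    beds = 0
    while erlang_b(offered_load, beds) > max_blocking:
        beds += 1
    return beds
```

Under this assumed model, comparing `beds_needed(10.0, 0.05)` with `2 * beds_needed(5.0, 0.05)` shows why clustering can help: one pooled cluster needs fewer beds than two separate departments with the same total load.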
Visual Search at eBay
In this paper, we propose a novel end-to-end approach for scalable visual
search infrastructure. We discuss the challenges we faced with a massive,
volatile inventory like eBay's and present our solution to overcome them. We
harness the large image collection of eBay listings and
state-of-the-art deep learning techniques to perform visual search at scale.
A supervised approach to optimized search, limited to the top predicted categories,
and a supervised approach to compact binary signatures are key to scaling up without compromising
accuracy and precision. Both use a common deep neural network requiring only a
single forward inference. The system architecture is presented with in-depth
discussions of its basic components and optimizations for a trade-off between
search relevance and latency. This solution is currently deployed in a
distributed cloud infrastructure and fuels visual search in eBay ShopBot and
Close5. We show benchmarks on the ImageNet dataset, on which our approach is faster
and more accurate than several unsupervised baselines. We share our learnings
with the hope that visual search becomes a first-class citizen for all large-scale
search engines rather than an afterthought.
Comment: To appear in 23rd SIGKDD Conference on Knowledge Discovery and Data
Mining (KDD), 2017. A demonstration video can be found at
https://youtu.be/iYtjs32vh4
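The two ideas named in the abstract, restricting the search to the top predicted categories and comparing compact binary signatures, can be sketched roughly as follows. This is an illustrative toy and not eBay's system: the index layout, the use of Hamming distance as the ranking score, and the names `hamming` and `search` are all assumptions.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary signatures stored as ints."""
    return bin(a ^ b).count("1")


def search(query_signature, top_categories, index, limit=3):
    """Rank items from the predicted categories by Hamming distance.

    index is a hypothetical dict: category -> list of (item_id, signature).
    Only the categories in top_categories are scanned, which is what keeps
    the candidate set small.
    """
    candidates = []
    for category in top_categories:
        for item_id, signature in index.get(category, []):
            candidates.append((hamming(query_signature, signature), item_id))
    candidates.sort()
    return [item_id for _, item_id in candidates[:limit]]
```

In this sketch the category prediction and the signature would both come from the same network pass, as the abstract describes; here they are simply passed in as arguments.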
Segmentation of the Poisson and negative binomial rate models: a penalized estimator
We consider the segmentation problem of Poisson and negative binomial (i.e.
overdispersed Poisson) rate distributions. In segmentation, an important issue
remains the choice of the number of segments. To this end, we propose a
penalized log-likelihood estimator where the penalty function is constructed in
a non-asymptotic context following the works of L. Birgé and P. Massart. The
resulting estimator is proved to satisfy an oracle inequality. The performance
of our criterion is assessed using simulated and real datasets in the RNA-seq
data analysis context.
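The general shape of a penalized log-likelihood segmentation can be sketched as below for the Poisson case. The penalty used here, a fixed cost per segment, is a deliberately simple stand-in and is not the non-asymptotic penalty constructed in the paper. The function names `segment_cost` and `segment` and the parameter `penalty_per_segment` are hypothetical.

```python
import math


def segment_cost(prefix, i, j):
    """Negative Poisson log-likelihood of y[i:j] at its maximum-likelihood rate.

    The log(y!) terms do not depend on the segmentation and are dropped.
    """
    total = prefix[j] - prefix[i]
    if total == 0:
        return 0.0
    rate = total / (j - i)
    return total - total * math.log(rate)


def segment(y, max_segments, penalty_per_segment):
    """Return the change points minimizing cost + penalty_per_segment * K.

    penalty_per_segment is a placeholder penalty (an assumption), not the
    penalty function derived in the paper.
    """
    n = len(y)
    prefix = [0]
    for value in y:
        prefix.append(prefix[-1] + value)
    inf = float("inf")
    # cost[k][j]: best cost of splitting y[:j] into exactly k segments
    cost = [[inf] * (n + 1) for _ in range(max_segments + 1)]
    back = [[0] * (n + 1) for _ in range(max_segments + 1)]
    cost[0][0] = 0.0
    for k in range(1, max_segments + 1):
        for j in range(k, n + 1):
            for i in range(k - 1, j):
                c = cost[k - 1][i] + segment_cost(prefix, i, j)
                if c < cost[k][j]:
                    cost[k][j] = c
                    back[k][j] = i
    best_k = min(
        range(1, max_segments + 1),
        key=lambda k: cost[k][n] + penalty_per_segment * k,
    )
    # Walk the back-pointers to recover the start index of each segment.
    starts = []
    j = n
    for k in range(best_k, 0, -1):
        j = back[k][j]
        starts.append(j)
    return sorted(starts)[1:]  # drop the leading 0, keep the change points
```

The dynamic program finds the best segmentation for each number of segments, and the penalty then decides how many segments to keep; a larger penalty yields fewer change points.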