Incremental Entity Resolution from Linked Documents
In many government applications we often find that information about
entities, such as persons, is available in disparate data sources such as
passports, driving licences, bank accounts, and income tax records. Similar
scenarios are commonplace in large enterprises having multiple customer,
supplier, or partner databases. Each data source maintains different aspects of
an entity, and resolving entities based on these attributes is a well-studied
problem. However, in many cases documents in one source reference those in
others; e.g., a person may provide his driving-licence number while applying
for a passport, or vice-versa. These links define relationships between
documents of the same entity (as opposed to inter-entity relationships, which
are also often used for resolution). In this paper we describe an algorithm to
cluster documents that are highly likely to belong to the same entity by
exploiting inter-document references in addition to attribute similarity. Our
technique uses a combination of iterative graph-traversal, locality-sensitive
hashing, iterative match-merge, and graph-clustering to discover unique
entities based on a document corpus. A unique feature of our technique is that
new sets of documents can be added incrementally while having to re-resolve
only a small subset of a previously resolved entity-document collection. We
present performance and quality results on two data-sets: a real-world database
of companies and a large synthetically generated `population' database. We also
demonstrate the benefit of using inter-document references for clustering in the
form of enhanced recall of documents for resolution.
Comment: 15 pages, 8 figures, patented wor
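To illustrate why inter-document references can raise recall, here is a minimal sketch that merges documents either when one references the other or when their attribute sets are similar enough. It is not the paper's algorithm: it replaces locality-sensitive hashing and iterative match-merge with an all-pairs Jaccard comparison and a union-find structure, and every name in it (`cluster_documents`, the sample documents, the 0.6 threshold) is a hypothetical chosen for the example.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two attribute-token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


class UnionFind:
    """Disjoint-set structure used to merge documents into entities."""

    def __init__(self, items):
        self.parent = {x: x for x in items}

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[rb] = ra


def cluster_documents(docs, references, threshold=0.6):
    """Group document ids that are linked by a reference or by similarity.

    docs: dict mapping document id -> set of attribute tokens (hypothetical).
    references: iterable of (doc_id, doc_id) explicit cross-references.
    """
    uf = UnionFind(docs)
    # Explicit references join documents regardless of attribute overlap.
    for a, b in references:
        uf.union(a, b)
    # All-pairs comparison: an assumption for brevity, quadratic in corpus size.
    for a, b in combinations(docs, 2):
        if jaccard(docs[a], docs[b]) >= threshold:
            uf.union(a, b)
    clusters = {}
    for d in docs:
        clusters.setdefault(uf.find(d), set()).add(d)
    return sorted(sorted(c) for c in clusters.values())


docs = {
    "passport_1": {"john", "smith", "1980", "london"},
    "licence_7": {"j", "smith", "1980"},
    "bank_3": {"john", "smith", "1980", "london"},
    "passport_2": {"mary", "jones", "1975", "leeds"},
}
with_refs = cluster_documents(docs, [("passport_1", "licence_7")])
without_refs = cluster_documents(docs, [])
```

In this toy corpus the abbreviated licence record falls below the similarity threshold, so it joins the correct entity only when the reference from the passport is used.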
Analysing randomised controlled trials with missing data : Choice of approach affects conclusions
Copyright © 2012 Elsevier Inc. All rights reserved. PMID: 22265924 [PubMed - indexed for MEDLINE]. Peer reviewed. Postprin
Maximizing Welfare in Social Networks under a Utility Driven Influence Diffusion Model
Motivated by applications such as viral marketing, the problem of influence
maximization (IM) has been extensively studied in the literature. The goal is
to select a small number of users to adopt an item such that it results in a
large cascade of adoptions by others. Existing works have three key
limitations. (1) They do not account for economic considerations of a user in
buying/adopting items. (2) Most studies on multiple items focus on competition,
with complementary items receiving limited attention. (3) For the network
owner, maximizing social welfare is important to ensure customer loyalty, which
is not addressed in prior work in the IM literature. In this paper, we address
all three limitations and propose a novel model called UIC that combines
utility-driven item adoption with influence propagation over networks. Focusing
on the mutually complementary setting, we formulate the problem of social
welfare maximization in this novel setting. We show that while the objective
function is neither submodular nor supermodular, surprisingly a simple greedy
allocation algorithm achieves a factor of $(1-1/e-\epsilon)$ of the optimum
expected social welfare. We develop \textsf{bundleGRD}, a scalable version of
this approximation algorithm, and demonstrate, with comprehensive experiments
on real and synthetic datasets, that it significantly outperforms all
baselines.
Comment: 33 page
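For readers unfamiliar with greedy seed selection, the sketch below shows the classic greedy heuristic for influence maximization under the independent cascade model, with spread estimated by Monte Carlo simulation. It is not UIC or \textsf{bundleGRD}: it ignores item utilities and complementarity entirely, and the function names, graph encoding, and run count are assumptions made for illustration.

```python
import random


def simulate_cascade(graph, seeds, rng):
    """Run one independent-cascade simulation and return the number of adopters.

    graph: dict mapping node -> list of (neighbour, activation probability).
    """
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, p in graph.get(u, []):
                # Each newly active node gets one chance to activate a neighbour.
                if v not in active and rng.random() < p:
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def expected_spread(graph, seeds, runs, rng):
    """Monte Carlo estimate of the expected cascade size for a seed set."""
    return sum(simulate_cascade(graph, seeds, rng) for _ in range(runs)) / runs


def greedy_seeds(graph, nodes, k, runs=200, seed=0):
    """Pick up to k seeds, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    chosen = []
    for _ in range(k):
        best, best_value = None, -1.0
        for v in nodes:
            if v in chosen:
                continue
            value = expected_spread(graph, chosen + [v], runs, rng)
            if value > best_value:
                best, best_value = v, value
        if best is None:  # fewer than k candidate nodes
            break
        chosen.append(best)
    return chosen


# Toy graph with certain activation (p = 1.0), so the outcome is deterministic.
graph = {"a": [("b", 1.0)], "b": [("c", 1.0)], "d": [("e", 1.0)]}
seeds = greedy_seeds(graph, ["a", "b", "c", "d", "e"], k=2, runs=5)
```

On the toy graph the first pick is `a`, which reaches three nodes, and the second is `d`, which adds two more.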
A statistical mechanics approach to autopoietic immune networks
This work aims to bridge theoretical immunology and disordered statistical
mechanics. Our long-term hope is to contribute to the development of a
quantitative theoretical immunology from which practical applications may
stem. To make theoretical immunology appealing to statistical physicists, we
present a research article that, on the one hand, may serve as a benchmark
for future improvements and developments and, on the other hand, is written
in a pedagogical way from the viewpoints of both theoretical physics and
theoretical immunology.
Furthermore, we have chosen to test our model by describing a wide range of
features of the adaptive immune response in a single paper; this was
necessary to emphasize the benefit of using disordered statistical mechanics
as an investigative tool. As a consequence, however, no section is exhaustive,
and each would deserve deeper investigation: for the sake of completeness, we
restricted the details in the analysis of each feature with the aim of
introducing a self-consistent model.
Comment: 22 pages, 14 figur
Comparative analysis of imaging configurations and objectives for Fourier microscopy
Fourier microscopy is becoming an increasingly important tool for the
analysis of optical nanostructures and quantum emitters. However, achieving
quantitative Fourier space measurements requires a thorough understanding of
the impact of aberrations introduced by optical microscopes, which have been
optimized for conventional real-space imaging. Here, we present a detailed
framework for analyzing the performance of microscope objectives for several
common Fourier imaging configurations. To this end, we model objectives from
Nikon, Olympus, and Zeiss using parameters that were inferred from patent
literature and confirmed, where possible, by physical disassembly. We then
examine the aberrations most relevant to Fourier microscopy, including the
alignment tolerances of apodization factors for different objective classes,
the effect of magnification on the modulation transfer function, and
vignetting-induced reductions of the effective numerical aperture for
wide-field measurements. Based on this analysis, we identify an optimal
objective class and imaging configuration for Fourier microscopy. In addition,
as a resource for future studies, the Zemax files for the objectives and setups
used in this analysis have been made publicly available.
Comment: For related figshare fileset with complete Zemax models of microscope
objectives, tube lenses, and Fourier imaging configurations, see Ref. [41]
(available at http://dx.doi.org/10.6084/m9.figshare.1481270