
    Incremental Entity Resolution from Linked Documents

    In many government applications, information about entities such as persons is spread across disparate data sources such as passports, driving licences, bank accounts, and income tax records. Similar scenarios are commonplace in large enterprises with multiple customer, supplier, or partner databases. Each data source maintains different aspects of an entity, and resolving entities based on these attributes is a well-studied problem. However, in many cases documents in one source reference those in others; e.g., a person may provide a driving-licence number while applying for a passport, or vice versa. These links define relationships between documents of the same entity (as opposed to inter-entity relationships, which are also often used for resolution). In this paper we describe an algorithm to cluster documents that are highly likely to belong to the same entity by exploiting inter-document references in addition to attribute similarity. Our technique uses a combination of iterative graph traversal, locality-sensitive hashing, iterative match-merge, and graph clustering to discover unique entities in a document corpus. A unique feature of our technique is that new sets of documents can be added incrementally while re-resolving only a small subset of a previously resolved entity-document collection. We present performance and quality results on two data sets: a real-world database of companies and a large synthetically generated 'population' database. We also demonstrate the benefit of using inter-document references for clustering in the form of enhanced recall of documents for resolution. Comment: 15 pages, 8 figures, patented work

    Analysing randomised controlled trials with missing data : Choice of approach affects conclusions

    Copyright © 2012 Elsevier Inc. All rights reserved. PMID: 22265924 [PubMed - indexed for MEDLINE]. Peer reviewed. Postprint.

    Maximizing Welfare in Social Networks under a Utility Driven Influence Diffusion Model

    Motivated by applications such as viral marketing, the problem of influence maximization (IM) has been extensively studied in the literature. The goal is to select a small number of users to adopt an item such that it results in a large cascade of adoptions by others. Existing works have three key limitations. (1) They do not account for economic considerations of a user in buying/adopting items. (2) Most studies on multiple items focus on competition, with complementary items receiving limited attention. (3) For the network owner, maximizing social welfare is important to ensure customer loyalty, which is not addressed in prior work in the IM literature. In this paper, we address all three limitations and propose a novel model called UIC that combines utility-driven item adoption with influence propagation over networks. Focusing on the mutually complementary setting, we formulate the problem of social welfare maximization in this novel setting. We show that while the objective function is neither submodular nor supermodular, surprisingly a simple greedy allocation algorithm achieves a factor of (1 − 1/e − ϵ) of the optimum expected social welfare. We develop bundleGRD, a scalable version of this approximation algorithm, and demonstrate, with comprehensive experiments on real and synthetic datasets, that it significantly outperforms all baselines. Comment: 33 pages

    A statistical mechanics approach to autopoietic immune networks

    The aim of this work is to bridge theoretical immunology and disordered statistical mechanics. Our long-term hope is to contribute to the development of a quantitative theoretical immunology from which practical applications may stem. To make theoretical immunology appealing to an audience of statistical physicists, we present a research article that, on the one hand, may act as a benchmark for future improvements and developments and, on the other hand, is written in a pedagogical way from both the theoretical physics and the theoretical immunology viewpoints. Furthermore, we have chosen to test our model by describing a wide range of features of the adaptive immune response in a single paper; this was necessary to emphasize the benefit of using disordered statistical mechanics as a tool for the investigation. As a consequence, no section is exhaustive, and each would deserve deeper investigation: we restricted the details in the analysis of each feature with the aim of introducing a self-consistent model. Comment: 22 pages, 14 figures

    Comparative analysis of imaging configurations and objectives for Fourier microscopy

    Fourier microscopy is becoming an increasingly important tool for the analysis of optical nanostructures and quantum emitters. However, achieving quantitative Fourier space measurements requires a thorough understanding of the impact of aberrations introduced by optical microscopes, which have been optimized for conventional real-space imaging. Here, we present a detailed framework for analyzing the performance of microscope objectives for several common Fourier imaging configurations. To this end, we model objectives from Nikon, Olympus, and Zeiss using parameters that were inferred from patent literature and confirmed, where possible, by physical disassembly. We then examine the aberrations most relevant to Fourier microscopy, including the alignment tolerances of apodization factors for different objective classes, the effect of magnification on the modulation transfer function, and vignetting-induced reductions of the effective numerical aperture for wide-field measurements. Based on this analysis, we identify an optimal objective class and imaging configuration for Fourier microscopy. In addition, as a resource for future studies, the Zemax files for the objectives and setups used in this analysis have been made publicly available. Comment: For related figshare fileset with complete Zemax models of microscope objectives, tube lenses, and Fourier imaging configurations, see Ref. [41] (available at http://dx.doi.org/10.6084/m9.figshare.1481270)