    Enhancing Domain Word Embedding via Latent Semantic Imputation

    We present a novel method named Latent Semantic Imputation (LSI) to transfer external knowledge into semantic space for enhancing word embedding. The method integrates graph theory to extract the latent manifold structure of the entities in the affinity space and leverages non-negative least squares with standard simplex constraints and the power iteration method to derive spectral embeddings. It provides an effective and efficient approach to combining entity representations defined in different Euclidean spaces. Specifically, our approach generates and imputes reliable embedding vectors for low-frequency words in the semantic space and benefits downstream language tasks that depend on word embedding. We conduct comprehensive experiments on a carefully designed classification problem and language modeling and demonstrate the superiority of the enhanced embedding via LSI over several well-known benchmark embeddings. We also confirm the consistency of the results under different parameter settings of our method. Comment: ACM SIGKDD 201

    Vacuum structure and string tension in Yang-Mills dimeron ensembles

    We numerically simulate ensembles of SU(2) Yang-Mills dimeron solutions with a statistical weight determined by the classical action and perform a comprehensive analysis of their properties. In particular, we examine the extent to which these ensembles capture topological and confinement properties of the Yang-Mills vacuum. This further allows us to test the classic picture of meron-induced quark confinement as triggered by dimeron dissociation. At small bare couplings, spatial, topological-charge and color correlations among the dimerons generate a short-range order which screens topological charges. With increasing coupling this order weakens rapidly, however, in part because the dimerons gradually dissociate into their meron constituents. Monitoring confinement properties by evaluating Wilson-loop expectation values, we find that the growing disorder due to these progressively liberated merons generates a finite string tension that increases with the coupling. The short-distance behavior of the static quark-antiquark potential, on the other hand, is dominated by small, "instanton-like" dimerons. String tension, action density and topological susceptibility of the dimeron ensembles in the physical coupling region turn out to be of the order of standard values. Hence the above results demonstrate without reliance on weak-coupling or low-density approximations that the dissociating dimeron component in the Yang-Mills vacuum can indeed produce a meron-populated confining phase. The density of coexisting, hardly dissociated and thus instanton-like dimerons seems to remain large enough, on the other hand, to reproduce much of the additional phenomenology successfully accounted for by non-confining instanton vacuum models. Hence dimeron ensembles should provide an efficient basis for a rather complete description of the Yang-Mills vacuum. Comment: 36 pages, 17 figures

    Matching Methods for Causal Inference: A Review and a Look Forward

    When estimating causal effects using observational data, it is desirable to replicate a randomized experiment as closely as possible by obtaining treated and control groups with similar covariate distributions. This goal can often be achieved by choosing well-matched samples of the original treated and control groups, thereby reducing bias due to the covariates. Since the 1970s, work on matching methods has examined how to best choose treated and control subjects for comparison. Matching methods are gaining popularity in fields such as economics, epidemiology, medicine and political science. However, until now the literature and related advice have been scattered across disciplines. Researchers who are interested in using matching methods, or in developing methods related to matching, do not have a single place to turn to learn about past and current research. This paper provides a structure for thinking about matching methods and guidance on their use, coalescing the existing research (both old and new) and providing a summary of where the literature on matching methods is now and where it should be headed. Comment: Published at http://dx.doi.org/10.1214/09-STS313 in Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Semantic Visual Localization

    Robust visual localization under a wide range of viewing conditions is a fundamental problem in computer vision. Handling the difficult cases of this problem is not only very challenging but also of high practical relevance, e.g., in the context of life-long localization for augmented reality or autonomous robots. In this paper, we propose a novel approach based on a joint 3D geometric and semantic understanding of the world, enabling it to succeed under conditions where previous approaches failed. Our method leverages a novel generative model for descriptor learning, trained on semantic scene completion as an auxiliary task. The resulting 3D descriptors are robust to missing observations by encoding high-level 3D geometric and semantic information. Experiments on several challenging large-scale localization datasets demonstrate reliable localization under extreme viewpoint, illumination, and geometry changes.

    Terminology mining in social media

    The highly variable and dynamic word usage in social media presents serious challenges for both research and those commercial applications that are geared towards blogs or other user-generated non-editorial texts. This paper discusses and exemplifies a terminology mining approach for dealing with the productive character of the textual environment in social media. We explore the challenges of practically acquiring new terminology, and of modeling similarity and relatedness of terms from observing realistic amounts of data. We also discuss semantic evolution and density, and investigate novel measures for characterizing the preconditions for terminology mining.