3,315 research outputs found

    Type-Constrained Representation Learning in Knowledge Graphs

    Large knowledge graphs increasingly add value to various applications that require machines to recognize and understand queries and their semantics, as in search or question answering systems. Latent variable models have increasingly gained attention for the statistical modeling of knowledge graphs, showing promising results in tasks related to knowledge graph completion and cleaning. Besides storing facts about the world, schema-based knowledge graphs are backed by rich semantic descriptions of entities and relation-types that allow machines to understand the notion of things and their semantic relationships. In this work, we study how type-constraints can generally support the statistical modeling with latent variable models. More precisely, we integrated prior knowledge in the form of type-constraints into various state-of-the-art latent variable approaches. Our experimental results show that prior knowledge on relation-types significantly improves these models, by up to 77% in link-prediction tasks. The achieved improvements are especially prominent when a low model complexity is enforced, a crucial requirement when these models are applied to very large datasets. Unfortunately, type-constraints are neither always available nor always complete; for example, they can become fuzzy when entities lack proper typing. We show that in these cases, it can be beneficial to apply a local closed-world assumption that approximates the semantics of relation-types based on observations made in the data.
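
    The local closed-world assumption mentioned in this abstract can be illustrated with a minimal sketch. It assumes that a relation-type's domain and range are approximated by the entities observed as its subjects and objects; the function names and the toy triples are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def approximate_type_constraints(triples):
    """Approximate each relation-type's domain and range from observed triples.

    Assumption for this sketch: the domain of a relation is the set of entities
    seen as its subject, and the range is the set of entities seen as its object.
    """
    domain = defaultdict(set)
    range_ = defaultdict(set)
    for subj, rel, obj in triples:
        domain[rel].add(subj)
        range_[rel].add(obj)
    return dict(domain), dict(range_)


def is_admissible(triple, domain, range_):
    """Return True if a candidate triple respects the approximated constraints."""
    subj, rel, obj = triple
    return subj in domain.get(rel, set()) and obj in range_.get(rel, set())


# Hypothetical toy data, for illustration only.
observed = [
    ("alice", "bornIn", "paris"),
    ("bob", "bornIn", "berlin"),
    ("paris", "capitalOf", "france"),
]
domain, range_ = approximate_type_constraints(observed)
```

    Under these assumptions, a candidate such as ("alice", "bornIn", "berlin") remains admissible for link prediction, whereas ("paris", "bornIn", "alice") is excluded because "paris" was never observed as a subject of "bornIn".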

    The Wordsmith As Worldsmith in Shakespeare's As You Like It

    N/

    Mechanisms of base selection by the E. coli mispaired uracil glycosylase

    The repair of the multitude of single-base lesions formed daily in the cells of all living organisms is accomplished primarily by the base-excision repair (BER) pathway that initiates repair through a series of lesion-selective glycosylases. In this paper, single-turnover kinetics have been measured on a series of oligonucleotide substrates containing both uracil and purine analogs for the E. coli mispaired uracil glycosylase, MUG. The relative rates of glycosylase cleavage have been correlated with the free energy of helix formation, and with the size and electronic inductive properties of a series of uracil 5-substituents. Data are presented showing that MUG can exploit the reduced thermodynamic stability of mispairs to distinguish U:A from U:G pairs. Discrimination against the removal of thymine results primarily from the electron-donating property of the thymine 5-methyl substituent, while the size of the methyl group relative to a hydrogen atom is a secondary factor. A series of parameters has been obtained that allows prediction of relative MUG cleavage rates; the predictions correlate well with observed relative rates, which vary over five orders of magnitude for the series of base analogs examined. We propose that these parameters may be common among DNA glycosylases; however, specific glycosylases may focus more or less on each of the parameters identified. The presence of a series of glycosylases which focus on different lesion properties, all coexisting within the same cell, would provide a robust and partially redundant repair system necessary for the maintenance of the genome.

    BIRD CLASSIFICATION IN NOISY ENVIRONMENTS: THEORY, RESULTS AND COMPARATIVE STUDIES

    Bird classification plays an important role in minimizing collisions between birds and aircraft. It is a challenging task to perform the sound-based classification correctly in a noisy environment. This paper addresses robust techniques that can improve the classification of birds in noisy environments. A complete recognition system is described and evaluated on a bird sound database containing 1547 bird sound files, with 11 bird species. Two types of features were extracted from the sound files: Mel Frequency Cepstral Coefficient (MFCC) and RelAtive SpecTrAl (RASTA). Also, two statistical classifiers were developed using Gaussian Mixture Models (GMM) and Hidden Markov Models (HMM), respectively. The performance of these features and models is compared. Very good recognition rates (97% for clean data and 92% for 5 dB signal-to-noise ratios) have been achieved when the proper features and models were selected.

    Real time plasma equilibrium reconstruction in a Tokamak

    The problem of equilibrium of a plasma in a Tokamak is a free boundary problem described by the Grad-Shafranov equation in axisymmetric configurations. The right-hand side of this equation is a nonlinear source, which represents the toroidal component of the plasma current density. This paper deals with the real-time identification of this nonlinear source from experimental measurements. The proposed method is based on a fixed point algorithm, a finite element resolution, a reduced basis method and a least-squares optimization formulation.
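
    For orientation, the Grad-Shafranov equation named in this abstract is commonly written as below. This is the standard textbook form in cylindrical coordinates $(R, Z)$ and not necessarily the notation or normalization used in the paper; the symbols $\psi$ (poloidal flux), $p$ (pressure), $F$ (diamagnetic function) and $\mu_0$ (vacuum permeability) are assumptions introduced here.

```latex
\Delta^{*}\psi
  \;\equiv\; R\,\frac{\partial}{\partial R}\!\left(\frac{1}{R}\,\frac{\partial \psi}{\partial R}\right)
  + \frac{\partial^{2}\psi}{\partial Z^{2}}
  \;=\; -\mu_0 R\, j_{\phi},
\qquad
j_{\phi} \;=\; R\,p'(\psi) + \frac{1}{\mu_0 R}\,F(\psi)\,F'(\psi).
```

    In this form the nonlinear source is determined by the two functions $p'(\psi)$ and $F F'(\psi)$, which is why identifying the right-hand side from measurements amounts to estimating functions of $\psi$ rather than a fixed set of constants.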

    iMap4: An Open Source Toolbox for the Statistical Fixation Mapping of Eye Movement data with Linear Mixed Modeling.

    A major challenge in modern eye movement research is to statistically map where observers are looking, by isolating the significant differences between groups and conditions. Compared to signals of contemporary neuroscience measures, such as M/EEG and fMRI, eye movement data are sparser with much larger variations in space across trials and participants. As a result, the implementation of a conventional linear modeling approach on two-dimensional fixation distributions often returns unstable estimations and underpowered results, leaving this statistical problem unresolved (Liversedge, Gilchrist, & Everling, 2011). Here, we present a new version of the iMap toolbox (Caldara and Miellet, 2011) which tackles this issue by implementing a statistical framework comparable to those developed in state-of-the-art neuroimaging data processing toolboxes. iMap4 uses univariate, pixel-wise Linear Mixed Models (LMM) on the smoothed fixation data, with the flexibility of coding for multiple between- and within-subject comparisons and performing all the possible linear contrasts for the fixed effects (main effects, interactions, etc.). Importantly, we also introduced novel nonparametric tests based on resampling to assess statistical significance. Finally, we validated this approach by using both experimental and Monte Carlo simulation data. iMap4 is a freely available MATLAB open source toolbox for the statistical fixation mapping of eye movement data, with a user-friendly interface providing straightforward, easy to interpret statistical graphical outputs. iMap4 matches the standards of robust statistical neuroimaging methods and represents an important step in the data-driven processing of eye movement fixation data, an important field of vision sciences.