
    Biomedical ontology alignment: An approach based on representation learning

    While representation learning techniques have shown great promise in application to a number of different NLP tasks, they have had little impact on the problem of ontology matching. Unlike past work that has focused on feature engineering, we present a novel representation learning approach that is tailored to the ontology matching task. Our approach is based on embedding ontological terms in a high-dimensional Euclidean space. This embedding is derived on the basis of a novel phrase retrofitting strategy through which semantic similarity information becomes inscribed onto fields of pre-trained word vectors. The resulting framework also incorporates a novel outlier detection mechanism based on a denoising autoencoder that is shown to improve performance. An ontology matching system derived using the proposed framework achieved an F-score of 94% on an alignment scenario involving the Adult Mouse Anatomical Dictionary and the Foundational Model of Anatomy ontology (FMA) as targets. This compares favorably with the best performing systems on the Ontology Alignment Evaluation Initiative anatomy challenge. We performed additional experiments on aligning FMA to NCI Thesaurus and to SNOMED CT based on a reference alignment extracted from the UMLS Metathesaurus. Our system obtained overall F-scores of 93.2% and 89.2% for these experiments, thus achieving state-of-the-art results.

    Abstract Syntax Networks for Code Generation and Semantic Parsing

    Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. We introduce abstract syntax networks, a modeling framework for these problems. The outputs are represented as abstract syntax trees (ASTs) and constructed by a decoder with a dynamically-determined modular structure paralleling the structure of the output tree. On the benchmark Hearthstone dataset for code generation, our model obtains 79.2 BLEU and 22.7% exact match accuracy, compared to previous state-of-the-art values of 67.1 and 6.1%. Furthermore, we perform competitively on the Atis, Jobs, and Geo semantic parsing datasets with no task-specific engineering. Comment: ACL 2017. MR and MS contributed equally.

    Knowledge discovery through ontology matching: An approach based on an Artificial Neural Network model

    The fundamental principle of the Semantic Web is the creation and use of semantic annotations connected to formal descriptions, such as domain ontologies. The lack of an integrated view of all web nodes and the existence of heterogeneous domain ontologies pose new challenges in the discovery of knowledge resources that are relevant to a user's request. New efficient approaches for developing web intelligence and helping users avoid irrelevant search results on the web have recently appeared, with Artificial Neural Networks (ANNs) among the most recent. However, a lot of work remains to be done in this area. This work contributes to the field of knowledge-resource discovery and ontology matching techniques for the Semantic Web by presenting an approach based on an ANN classifier. Experimental results show that the ANN-based ontology matching model provided satisfactory responses to the test cases.
    Affiliation: Rubiolo, Mariano. Universidad Tecnológica Nacional. Facultad Regional Santa Fe. Centro de Investigación y Desarrollo de Ingeniería en Sistemas de Información; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
    Affiliation: Caliusco, Maria Laura. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
    Affiliation: Stegmayer, Georgina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
    Affiliation: Coronel, M.. Universidad Tecnológica Nacional; Argentina
    Affiliation: Gareli Fabrizi, M.. Universidad Tecnológica Nacional; Argentina

    A novel neural response algorithm for protein function prediction

    BACKGROUND: Large amounts of data are being generated by high-throughput genome sequencing methods, but the rate of experimental functional characterization falls far behind. To fill the gap between the number of sequences and their annotations, fast and accurate automated annotation methods are required. Many methods, such as GOblet, GOFigure, and Gotcha, are designed based on the BLAST search. Unfortunately, the sequence coverage of these methods is low as they cannot detect remote homologues. In addition, the lack of annotation specificity underlines the need to improve automated protein function prediction. RESULTS: We designed a novel automated protein functional assignment method based on the neural response algorithm, which simulates the neuronal behavior of the visual cortex in the human brain. First, we predict the most similar target protein for a given query protein and thereby assign its GO term to the query sequence. When assessed on the test set, our method ranked the actual leaf GO term among the top 5 probable GO terms with an accuracy of 86.93%. CONCLUSIONS: The proposed algorithm is the first instance of the neural response algorithm being used in the biological domain. The use of HMM profiles along with secondary structure information to define the neural response gives our method an edge over other available methods in annotation accuracy. Results of the 5-fold cross validation and the comparison with the PFP and FFPred servers indicate the strong performance of our method. The program, the dataset, and help files are available at http://www.jjwanglab.org/NRProF/.

    From Text to Knowledge with Graphs: modelling, querying and exploiting textual content

    This paper highlights the challenges, current trends, and open issues related to the representation, querying and analytics of content extracted from texts. The internet contains vast text-based information on various subjects, including commercial documents, medical records, scientific experiments, engineering tests, and events that impact urban and natural environments. Extracting knowledge from this text involves understanding the nuances of natural language and accurately representing the content without losing information. This allows knowledge to be accessed, inferred, or discovered. To achieve this, combining results from various fields, such as linguistics, natural language processing, knowledge representation, data storage, querying, and analytics, is necessary. The vision in this paper is that graphs can be a well-suited text content representation once annotated and once the right querying and analytics techniques are applied. This paper discusses this hypothesis from the perspectives of linguistics, natural language processing, graph models and databases, and artificial intelligence, as provided by the panellists of the DOING session in the MADICS Symposium 2022.