883 research outputs found

Biomedical ontology alignment: An approach based on representation learning

Author: Kalousis Alexandros
Kiritsis Dimitris
Kolyvakis Prodromos
Smith Barry
Publication date: 01/01/2018

While representation learning techniques have shown great promise in application to a number of different NLP tasks, they have had little impact on the problem of ontology matching. Unlike past work that has focused on feature engineering, we present a novel representation learning approach that is tailored to the ontology matching task. Our approach is based on embedding ontological terms in a high-dimensional Euclidean space. This embedding is derived on the basis of a novel phrase retrofitting strategy through which semantic similarity information becomes inscribed onto fields of pre-trained word vectors. The resulting framework also incorporates a novel outlier detection mechanism based on a denoising autoencoder that is shown to improve performance. An ontology matching system derived using the proposed framework achieved an F-score of 94% on an alignment scenario involving the Adult Mouse Anatomical Dictionary and the Foundational Model of Anatomy ontology (FMA) as targets. This compares favorably with the best performing systems on the Ontology Alignment Evaluation Initiative anatomy challenge. We performed additional experiments on aligning FMA to NCI Thesaurus and to SNOMED CT based on a reference alignment extracted from the UMLS Metathesaurus. Our system obtained overall F-scores of 93.2% and 89.2% for these experiments, thus achieving state-of-the-art results.
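The F-scores quoted in this abstract are the harmonic mean of precision and recall over predicted term correspondences. The following is a minimal sketch of that metric only, not the authors' matching system. It assumes an alignment is represented as a set of (source term, target term) pairs; the function name and the term identifiers are hypothetical.

```python
def alignment_f_score(predicted, reference):
    """Return (precision, recall, F1) for a predicted alignment.

    Both arguments are iterables of (source_term, target_term) pairs.
    This representation is an assumption made for illustration.
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical identifiers, not taken from any real ontology.
    reference = {("mouse:heart", "fma:heart"), ("mouse:lung", "fma:lung"),
                 ("mouse:liver", "fma:liver"), ("mouse:tail", "fma:tail")}
    predicted = {("mouse:heart", "fma:heart"), ("mouse:lung", "fma:lung"),
                 ("mouse:liver", "fma:kidney")}
    p, r, f = alignment_f_score(predicted, reference)
    print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

An empty predicted or reference set yields zero for the affected ratio rather than raising a division error, which is a design choice of this sketch.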

PhilPapers

Hes-so: ArODES Open Archive (University of Applied Sciences and Arts Western Switzerland / Haute école spécialisée de Suisse occidentale / FH Westschweiz)

Directory of Open Access Journals

Abstract Syntax Networks for Code Generation and Semantic Parsing

Author: Klein Dan
Rabinovich Maxim
Stern Mitchell
Publication date: 01/01/2017

Tasks like code generation and semantic parsing require mapping unstructured (or partially structured) inputs to well-formed, executable outputs. We introduce abstract syntax networks, a modeling framework for these problems. The outputs are represented as abstract syntax trees (ASTs) and constructed by a decoder with a dynamically determined modular structure paralleling the structure of the output tree. On the benchmark Hearthstone dataset for code generation, our model obtains 79.2 BLEU and 22.7% exact match accuracy, compared to previous state-of-the-art values of 67.1 and 6.1%. Furthermore, we perform competitively on the Atis, Jobs, and Geo semantic parsing datasets with no task-specific engineering.

Comment: ACL 2017. MR and MS contributed equally.

arXiv.org e-Print Archive

Crossref

Knowledge discovery through ontology matching: An approach based on an Artificial Neural Network model

Author: Caliusco Maria Laura
Coronel M.
Gareli Fabrizi M.
Rubiolo Mariano
Stegmayer Georgina
Publication venue: Elsevier Science Inc.
Publication date: 01/07/2012

The fundamental principle of the Semantic Web is the creation and use of semantic annotations connected to formal descriptions, such as domain ontologies. The lack of an integrated view of all web nodes and the existence of heterogeneous domain ontologies drive new challenges in the discovery of knowledge resources that are relevant to a user's request. New efficient approaches for developing web intelligence and helping users avoid irrelevant search results on the web have recently appeared, Artificial Neural Networks (ANN) being one of the most recent. However, a lot of work remains to be done in this area. This work contributes to the field of knowledge-resource discovery and ontology matching techniques for the Semantic Web by presenting an approach based on an ANN classifier. Experimental results show that the ANN-based ontology matching model provided satisfactory responses to the test cases.

Affiliation: Rubiolo, Mariano. Universidad Tecnológica Nacional. Facultad Regional Santa Fe. Centro de Investigación y Desarrollo de Ingeniería en Sistemas de Información; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
Affiliation: Caliusco, Maria Laura. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
Affiliation: Stegmayer, Georgina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Conicet - Santa Fe; Argentina
Affiliation: Coronel, M. Universidad Tecnológica Nacional; Argentina
Affiliation: Gareli Fabrizi, M. Universidad Tecnológica Nacional; Argentina

CONICET Digital

A novel neural response algorithm for protein function prediction

Author: Wang J
Xiao QW
Yalamanchili HK
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2012

BACKGROUND: Large amounts of data are being generated by high-throughput genome sequencing methods. However, the rate of experimental functional characterization falls far behind. To fill the gap between the number of sequences and their annotations, fast and accurate automated annotation methods are required. Many methods, such as GOblet, GOFigure, and Gotcha, are designed based on the BLAST search. Unfortunately, the sequence coverage of these methods is low as they cannot detect remote homologues. In addition, the lack of annotation specificity underscores the need to improve automated protein function prediction. RESULTS: We designed a novel automated protein functional assignment method based on the neural response algorithm, which simulates the neuronal behavior of the visual cortex in the human brain. Firstly, we predict the most similar target protein for a given query protein and thereby assign its GO term to the query sequence. When assessed on the test set, our method ranked the actual leaf GO term among the top 5 probable GO terms with an accuracy of 86.93%. CONCLUSIONS: The proposed algorithm is the first instance of the neural response algorithm being used in the biological domain. The use of HMM profiles along with the secondary structure information to define the neural response gives our method an edge over other available methods on annotation accuracy. Results of the 5-fold cross validation and the comparison with PFP and FFPred servers indicate the prominent performance of our method. The program, the dataset, and help files are available at http://www.jjwanglab.org/NRProF/.
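The 86.93% figure in this abstract is a top-5 accuracy: a prediction counts as correct when the true leaf GO term appears among the five highest-ranked candidates. The following is a minimal sketch of that evaluation measure only, not of the neural response algorithm itself. The function name, the input layout (one best-first ranked list per query) and the term labels are assumptions made for illustration.

```python
def top_k_accuracy(ranked_predictions, true_labels, k=5):
    """Fraction of queries whose true label is among the first k predictions.

    ranked_predictions: list of lists, each ordered from most to least probable.
    true_labels: list with one true label per query.
    """
    if len(ranked_predictions) != len(true_labels):
        raise ValueError("one ranked list is required per true label")
    if not true_labels:
        return 0.0
    hits = sum(
        1
        for ranked, truth in zip(ranked_predictions, true_labels)
        if truth in ranked[:k]
    )
    return hits / len(true_labels)


if __name__ == "__main__":
    # Placeholder term labels, not real GO identifiers.
    ranked = [
        ["term_a", "term_b", "term_c"],
        ["term_d", "term_e", "term_f"],
        ["term_g", "term_h", "term_i"],
    ]
    truth = ["term_b", "term_f", "term_z"]
    print(top_k_accuracy(ranked, truth, k=2))
    print(top_k_accuracy(ranked, truth, k=3))
```

Raising `k` can only keep the score the same or increase it, so a top-5 figure is not directly comparable with a top-1 accuracy reported elsewhere.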

HKU Scholars Hub

Multi-class protein fold classification using a new ensemble machine learning approach.

Author: Deville Y
Gilbert D
Tan A
Publication venue: GIW
Publication date: 01/01/2003

Protein structure classification represents an important process in understanding the associations between sequence and structure as well as possible functional and evolutionary relationships. Recent structural genomics initiatives and other high-throughput experiments have populated the biological databases at a rapid pace. The amount of structural data has made traditional methods such as manual inspection of protein structures impossible. Machine learning has been widely applied to bioinformatics and has had considerable success in this research area. This work proposes a novel ensemble machine learning method that improves the coverage of the classifiers under the multi-class imbalanced sample sets by integrating knowledge induced from different base classifiers, and we illustrate this idea in classifying multi-class SCOP protein fold data. We have compared our approach with PART and show that our method improves the sensitivity of the classifier in protein fold classification. Furthermore, we have extended this method to learning over multiple data types, preserving the independence of their corresponding data sources, and show that our new approach performs at least as well as the traditional technique over a single joined data source. These experimental results are encouraging, and the method can be applied to other bioinformatics problems similarly characterised by multi-class imbalanced data sets held in multiple data sources.

Brunel University Research Archive

From Text to Knowledge with Graphs: modelling, querying and exploiting textual content

Author: Alves Mirian Halfeld Ferrari
Forst Anne-Lyse Minard
Vargas-Solar Genoveva
Publication date: 09/10/2023

This paper highlights the challenges, current trends, and open issues related to the representation, querying and analytics of content extracted from texts. The internet contains vast text-based information on various subjects, including commercial documents, medical records, scientific experiments, engineering tests, and events that impact urban and natural environments. Extracting knowledge from this text involves understanding the nuances of natural language and accurately representing the content without losing information. This allows knowledge to be accessed, inferred, or discovered. To achieve this, combining results from various fields, such as linguistics, natural language processing, knowledge representation, data storage, querying, and analytics, is necessary. The vision in this paper is that graphs can be a well-suited representation of text content once they are annotated and the right querying and analytics techniques are applied. This paper discusses this hypothesis from the perspective of linguistics, natural language processing, graph models and databases, and artificial intelligence, as provided by the panellists of the DOING session in the MADICS Symposium 2022.

arXiv.org e-Print Archive