
Ontology of core data mining entities

Author: A Bernstein, A Golbraikh, A Karalic, B Smith, C Silla, C Vens, D Demšar, D Kocev, D Qi, D Young, DJ Hand, F Serban, G Madjarov, G Tsoumakas, GH Bakir, H Mannila, HP Kriegel, I Slavkov, J Vanschoren, K Button, Larisa Soldatova, LN Soldatova, M Courtot, M Ford, M Žáková, MA Avery, MF López, O Spjuth, P Robinson, Panče Panov, Q Yang, R Caruana, R Guha, RD King, RR Brinkman, Sašo Džeroski, T Dietterich, V Podpečan
Publication venue: Springer Science and Business Media LLC
Publication date: 05/07/2014

In this article, we present OntoDM-core, an ontology of core data mining entities. OntoDM-core defines the most essential data mining entities in a three-layered ontological structure comprising a specification, an implementation and an application layer. It provides a representational framework for the description of mining structured data, and in addition provides taxonomies of datasets, data mining tasks, generalizations, data mining algorithms and constraints, based on the type of data. OntoDM-core is designed to support a wide range of applications and use cases, such as semantic annotation of data mining algorithms, datasets and results; annotation of QSAR studies in the context of drug discovery investigations; and disambiguation of terms in text mining. The ontology has been thoroughly assessed following the practices in ontology engineering, is fully interoperable with many domain resources and is easy to extend.

Crossref

Brunel University Research Archive

A UIMA wrapper for the NCBO annotator

Author: Baumgartner
C. Jonquet
C. Roeder
Hunter
K. Verspoor
L. Hunter
N. H. Shah
W. A. Baumgartner
Publication venue: Oxford University Press
Publication date: 01/01/2010

Summary: The Unstructured Information Management Architecture (UIMA) framework and web services are emerging as useful tools for integrating biomedical text mining tools. This note describes our work, which wraps the National Center for Biomedical Ontology (NCBO) Annotator, an ontology-based annotation service, to make it available as a component in UIMA workflows.

Crossref

PubMed Central

HAL Descartes

University of Melbourne Institutional Repository

Text Mining Method to Develop D-Matrix for Fault Diagnosis

Author: Ms. Amruta Kulkarni, Prof. Jyoti Nighot
Publication venue: Auricle Technologies, Pvt., Ltd.
Publication date: 31/03/2016

The dependency or diagnosis matrix (D-matrix) is one of the diagnostic models specified by an IEEE standard. It captures the underlying dependencies between symptoms and failure modes in a structured form. The proposed system describes an ontology-based text mining method that develops a D-matrix by mining repair verbatims written in unstructured text. The repair verbatims are collected during fault diagnosis, and mining algorithms are then applied to find dependencies. A D-matrix is constructed for each dataset; a combined D-matrix is then generated by taking the common parameters from each D-matrix, and a graph is formed from the combined D-matrix.
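
The construction described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the text mining step has already produced (failure mode, symptom) pairs, it reads "taking common parameters" as keeping only the dependencies present in every D-matrix, and all function names and sample data are hypothetical.

```python
from collections import defaultdict


def build_d_matrix(pairs):
    """Build a D-matrix as {failure_mode: {symptom: 1}} from mined pairs.

    `pairs` is assumed to be the output of an upstream text mining step
    (not shown): an iterable of (failure_mode, symptom) tuples.
    """
    matrix = defaultdict(dict)
    for failure_mode, symptom in pairs:
        matrix[failure_mode][symptom] = 1
    return dict(matrix)


def combine_common(first, second):
    """Keep only the dependencies present in both D-matrices.

    Assumption: "common parameters" is interpreted as an intersection.
    """
    combined = {}
    for failure_mode in first.keys() & second.keys():
        shared = first[failure_mode].keys() & second[failure_mode].keys()
        if shared:
            combined[failure_mode] = {symptom: 1 for symptom in shared}
    return combined


def to_edges(matrix):
    """Express a D-matrix as a sorted list of graph edges."""
    return sorted((fm, s) for fm, row in matrix.items() for s in row)


# Hypothetical pairs mined from two repair-verbatim datasets.
dataset_a = [
    ("battery failure", "engine does not crank"),
    ("battery failure", "dim lights"),
    ("fuel pump failure", "engine stalls"),
]
dataset_b = [
    ("battery failure", "engine does not crank"),
    ("fuel pump failure", "loss of power"),
]

combined = combine_common(build_d_matrix(dataset_a), build_d_matrix(dataset_b))
edges = to_edges(combined)
print(edges)
```

With this sample data, only the dependency shared by both datasets survives in the combined matrix, and the edge list is the graph formed from it.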

International Journal on Recent and Innovation Trends in Computing and Communication

EXACT2: the semantics of biomedical protocols

Author: A Maccagnan, A Pease, A Sackmann, A Sujathaa, Brian B Rudkin, CJ Mungall, Daniel Nadis, Doi, Emma Haddi, Grunwald, H Obokata, I Mura, J Taubert, K Wolstencroft, Larisa N Soldatova, LN Soldatova, M Courtot, M Hilario, M Schilling, Nigel J Saunders, Piyali S Basu, R Garside, RD King, Ross D King, RR Brinkman, S Mitchell, S Rune, S Shapin, T Bittner, T Klingström, Th Paul, V Rätzel, Véronique Baumlé, W Ceusters, Wolfgang Marwan, Z Xiang
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

© 2014 Soldatova et al.; licensee BioMed Central. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated. This article has been made available through the Brunel Open Access Publishing Fund.

Background: The reliability and reproducibility of experimental procedures is a cornerstone of scientific practice. There is a pressing technological need for the better representation of biomedical protocols to enable other agents (human or machine) to better reproduce results. A framework that ensures that all information required for the replication of experimental protocols is captured is essential to achieve reproducibility. Methods: We have developed the ontology EXACT2 (EXperimental ACTions) that is designed to capture the full semantics of biomedical protocols required for their reproducibility. To construct EXACT2 we manually inspected hundreds of published and commercial biomedical protocols from several areas of biomedicine. After establishing a clear pattern for extracting the required information we utilized text-mining tools to translate the protocols into a machine-amenable format. We have verified the utility of EXACT2 through the successful processing of previously ‘unseen’ (not used for the construction of EXACT2) protocols. Results: The paper reports on a fundamentally new version, EXACT2, that supports the semantically-defined representation of biomedical protocols. The ability of EXACT2 to capture the semantics of biomedical procedures was verified through a text mining use case. In this use case, EXACT2 is used as a reference model for text mining tools to identify terms pertinent to experimental actions, and their properties, in biomedical protocols expressed in natural language. An EXACT2-based framework for the translation of biomedical protocols to a machine-amenable format is proposed. Conclusions: The EXACT2 ontology is sufficient to record, in a machine-processable form, the essential information about biomedical protocols. EXACT2 defines explicit semantics of experimental actions, and can be used by various computer applications. It can serve as a reference model for the translation of biomedical protocols in natural language into a semantically-defined format. This work has been partially funded by the Brunel University BRIEF award and a grant from Occams Resources.

Goldsmiths Research Online

Crossref

Springer - Publisher Connector

PubMed Central

Brunel University Research Archive

Mapping semantic knowledge for unsupervised text categorisation

Author: Li Yuefeng
Tao Xiaohui
Yong Jianming
Zhang Ji
Publication venue: Australian Computer Society Inc.
Publication date: 01/02/2013

Text categorisation is challenging because documents have a complex structure with heterogeneous, changing topics. The performance of text categorisation relies on the quality of samples, the effectiveness of document features, and the topic coverage of categories, depending on the strategy employed: supervised or unsupervised, single-labelled or multi-labelled. To deal with these reliability issues, we propose an unsupervised multi-labelled text categorisation approach that maps the local knowledge in documents to global knowledge in a world ontology to optimise categorisation results. The conceptual framework of the approach consists of three modules: pattern mining for feature extraction, feature-subject mapping for categorisation, and concept generalisation for optimised categorisation. The approach was evaluated by comparison with typical text categorisation methods, based on ground truth encoded by human experts, with promising results.

University of Southern Queensland ePrints

Deploying mutation impact text-mining software with the SADI Semantic Web Services framework

Author: Baker Christopher JO
Laurila Jonas Bergman
Riazanov Alexandre
Publication venue: BioMed Central
Publication date: 01/01/2011

Background: Mutation impact extraction is an important task designed to harvest relevant annotations from scientific documents for reuse in multiple contexts. Our previous work on text mining for mutation impacts resulted in (i) the development of a GATE-based pipeline that mines texts for information about impacts of mutations on proteins, (ii) the population of this information into our OWL DL mutation impact ontology, and (iii) establishing an experimental semantic database for storing the results of text mining. Results: This article explores the possibility of using the SADI framework as a medium for publishing our mutation impact software and data. SADI is a set of conventions for creating web services with semantic descriptions that facilitate automatic discovery and orchestration. We describe a case study exploring and demonstrating the utility of the SADI approach in our context. We describe several SADI services we created based on our text mining API and data, and demonstrate how they can be used in a number of biologically meaningful scenarios through a SPARQL interface (SHARE) to SADI services. In all cases we pay special attention to the integration of mutation impact services with external SADI services providing information about related biological entities, such as proteins, pathways, and drugs. Conclusion: We have identified that SADI provides an effective way of exposing our mutation impact data suc

CiteSeerX

Crossref

Springer - Publisher Connector

PubMed Central

A Machine Learning Based Analytical Framework for Semantic Annotation Requirements

Author: Hassanzadeh Hamed
Keyvanpour MohammadReza
Publication venue: Academy and Industry Research Collaboration Center (AIRCC)
Publication date: 26/04/2011

The Semantic Web is an extension of the current web in which information is given well-defined meaning. The aim of the Semantic Web is to improve the quality and intelligence of the current web by converting its contents into a machine-understandable form. Semantic-level information is therefore one of the cornerstones of the Semantic Web. The process of adding semantic metadata to web resources is called Semantic Annotation. Semantic Annotation faces many obstacles, such as multilinguality, scalability, and issues related to the diversity and inconsistency of content across web pages. Because Semantic Annotation systems must operate across a wide range of domains and in dynamic environments, automating the annotation process is one of the significant challenges in this domain. To overcome this problem, different machine learning approaches have been utilized, such as supervised learning, unsupervised learning, and more recent ones like semi-supervised learning and active learning. In this paper we present an inclusive layered classification of Semantic Annotation challenges and discuss the most important issues in this field. We also review and analyze machine learning applications for solving semantic annotation problems. To this end, the article closely studies and categorizes related research, both for better understanding and to reach a framework that maps machine learning techniques onto Semantic Annotation challenges and requirements.

arXiv.org e-Print Archive

Crossref