Search CORE

5,162 research outputs found

Sequential pattern mining for discovering gene interactions and their contextual information from biomedical texts

Author: Cellier Peggy
Charnois Thierry
Crémilleux Bruno
Gandrillon Olivier
Klema Jiri
Manguin Jean-Luc
Plantevit Marc
Rigotti Christophe
Publication venue: BioMed Central
Publication date: 01/01/2015
Field of study

International audienceBackgroundDiscovering gene interactions and their characterizations from biological text collections is a crucial issue in bioinformatics. Indeed, text collections are large and it is very difficult for biologists to fully take benefit from this amount of knowledge. Natural Language Processing (NLP) methods have been applied to extract background knowledge from biomedical texts. Some of existing NLP approaches are based on handcrafted rules and thus are time consuming and often devoted to a specific corpus. Machine learning based NLP methods, give good results but generate outcomes that are not really understandable by a user.ResultsWe take advantage of an hybridization of data mining and natural language processing to propose an original symbolic method to automatically produce patterns conveying gene interactions and their characterizations. Therefore, our method not only allows gene interactions but also semantics information on the extracted interactions (e.g., modalities, biological contexts, interaction types) to be detected. Only limited resource is required: the text collection that is used as a training corpus. Our approach gives results comparable to the results given by state-of-the-art methods and is even better for the gene interaction detection in AIMed.ConclusionsExperiments show how our approach enables to discover interactions and their characterizations. To the best of our knowledge, there is few methods that automatically extract the interactions and also associated semantics information. The extracted gene interactions from PubMed are available through a simple web interface at https://bingotexte.greyc.fr/ webcite. The software is available at https://bingo2.greyc.fr/?q=node/22 webcite

HAL - Normandie Université

HAL-CentraleSupelec

HAL-UJM

INRIA a CCSD electronic archive server

Biomedical Literature Mining for Biological Databases Annotation

Author: Donato Malerba
Gaetano Scioscia
Marcella Attimonelli
Margherita Berardi
Pietro Leo
Roberta Piredda
Publication venue: 'IntechOpen'
Publication date: 01/01/2008
Field of study

IntechOpen

Crossref

Archivio istituzionale della ricerca - Università di Bari

Artifact Lifecycle Discovery

Author: Dumas Marlon
Fahland Dirk
Popova Viara
Publication venue
Publication date: 01/01/2013
Field of study

Artifact-centric modeling is a promising approach for modeling business processes based on the so-called business artifacts - key entities driving the company's operations and whose lifecycles define the overall business process. While artifact-centric modeling shows significant advantages, the overwhelming majority of existing process mining methods cannot be applied (directly) as they are tailored to discover monolithic process models. This paper addresses the problem by proposing a chain of methods that can be applied to discover artifact lifecycle models in Guard-Stage-Milestone notation. We decompose the problem in such a way that a wide range of existing (non-artifact-centric) process discovery and analysis methods can be reused in a flexible manner. The methods presented in this paper are implemented as software plug-ins for ProM, a generic open-source framework and architecture for implementing process mining tools

arXiv.org e-Print Archive

Repository TU/e

Pure OAI Repository

Knowledge-based Biomedical Data Science 2019

Author: Callahan Tiffany J.
Hunter Lawrence E.
Pielke-Lombardo Harrison
Tripodi Ignacio J.
Publication venue
Publication date: 08/10/2019
Field of study

Knowledge-based biomedical data science (KBDS) involves the design and implementation of computer systems that act as if they knew about biomedicine. Such systems depend on formally represented knowledge in computer systems, often in the form of knowledge graphs. Here we survey the progress in the last year in systems that use formally represented knowledge to address data science problems in both clinical and biological domains, as well as on approaches for creating knowledge graphs. Major themes include the relationships between knowledge graphs and machine learning, the use of natural language processing, and the expansion of knowledge-based approaches to novel domains, such as Chinese Traditional Medicine and biodiversity.Comment: Manuscript 43 pages with 3 tables; Supplemental material 43 pages with 3 table

arXiv.org e-Print Archive

Large Graph Analysis in the GMine System

Author: Faloutsos Christos
Pan Jia-Yu
Rodrigues Jr. Jose F.
Tong Hanghang
Traina Jr. Caetano
Traina Agma J. M.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/05/2015
Field of study

Current applications have produced graphs on the order of hundreds of thousands of nodes and millions of edges. To take advantage of such graphs, one must be able to find patterns, outliers and communities. These tasks are better performed in an interactive environment, where human expertise can guide the process. For large graphs, though, there are some challenges: the excessive processing requirements are prohibitive, and drawing hundred-thousand nodes results in cluttered images hard to comprehend. To cope with these problems, we propose an innovative framework suited for any kind of tree-like graph visual design. GMine integrates (a) a representation for graphs organized as hierarchies of partitions - the concepts of SuperGraph and Graph-Tree; and (b) a graph summarization methodology - CEPS. Our graph representation deals with the problem of tracing the connection aspects of a graph hierarchy with sub linear complexity, allowing one to grasp the neighborhood of a single node or of a group of nodes in a single click. As a proof of concept, the visual environment of GMine is instantiated as a system in which large graphs can be investigated globally and locally

arXiv.org e-Print Archive

CiteSeerX