The potential of text mining in data integration and network biology for plant research : a case study on Arabidopsis
Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology research in general and for network biology in particular, using a state-of-the-art text mining system applied to all PubMed abstracts and PubMed Central full texts. We present an extensive evaluation of the textual data for Arabidopsis thaliana, assessing the overall accuracy of this new resource for use in plant network analyses. Furthermore, we combine text mining information with both protein-protein and regulatory interactions from experimental databases. Clusters of tightly connected genes are delineated from the resulting network, illustrating how such an integrative approach is essential to grasp the current knowledge available for Arabidopsis and to uncover gene information through guilt by association. All large-scale data sets, as well as the manually curated textual data, are made publicly available, thereby stimulating the application of text mining data in future plant biology studies.
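The integration step described in this abstract (merging text-mined links with protein-protein and regulatory interactions, then inferring gene information through guilt by association) can be sketched roughly as follows. This is a minimal illustration only, not the authors' pipeline: the function names, the source labels, the gene identifiers `g1` to `g4`, and the annotation labels are all hypothetical, and simple majority voting over direct neighbours stands in for whatever clustering method the study actually used.

```python
from collections import Counter, defaultdict


def merge_networks(**edge_lists):
    """Union several undirected edge lists, keeping track of which
    source (keyword name) supports each edge."""
    evidence = defaultdict(set)
    for source, edges in edge_lists.items():
        for a, b in edges:
            if a != b:  # ignore self-loops
                evidence[frozenset((a, b))].add(source)
    return dict(evidence)


def adjacency(evidence):
    """Build a neighbour map from the merged edge set."""
    adj = defaultdict(set)
    for edge in evidence:
        a, b = tuple(edge)
        adj[a].add(b)
        adj[b].add(a)
    return adj


def guilt_by_association(gene, adj, annotations):
    """Assumed stand-in: return the most frequent annotation among the
    annotated direct neighbours of `gene`, or None if there are none."""
    counts = Counter(annotations[n] for n in adj[gene] if n in annotations)
    return counts.most_common(1)[0][0] if counts else None


# Toy data with placeholder identifiers (not real Arabidopsis genes).
merged = merge_networks(
    text_mining=[("g1", "g2"), ("g2", "g3")],
    ppi=[("g2", "g1"), ("g3", "g4")],
    regulatory=[("g4", "g2")],
)
adj = adjacency(merged)
known = {"g1": "process_B", "g2": "process_A", "g3": "process_A"}
print(guilt_by_association("g4", adj, known))
```

Keeping the set of supporting sources per edge makes it possible to filter later, for example to edges backed by both literature and experimental evidence.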
Statistical data mining for symbol associations in genomic databases
A methodology is proposed to automatically detect significant symbol
associations in genomic databases. A new statistical test is proposed to assess
the significance of a group of symbols when found in several genesets of a
given database. Applied to symbol pairs, the thresholded p-values of the test
define a graph structure on the set of symbols. The cliques of that graph are
significant symbol associations, linked to a set of genesets where they can be
found. The method can be applied to any database, and is illustrated on the
MSigDB C2 database. Many of the symbol associations detected in C2 or in
non-specific selections corresponded to already known interactions. On more
specific selections of C2, many previously unknown symbol associations have
been detected. These associations unveil new candidates for gene or protein
interactions, which need further investigation for biological evidence.
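The pipeline outlined in this abstract (a pairwise significance test, thresholded p-values defining a graph, cliques read off as associations) could be sketched as below. The abstract does not specify its new statistical test, so this sketch substitutes an upper-tail hypergeometric test on geneset co-occurrence as an assumed stand-in. The function names, the `alpha` threshold, and the toy genesets are likewise hypothetical, and clique enumeration uses plain Bron-Kerbosch without pivoting.

```python
from itertools import combinations
from math import comb


def cooccurrence_pvalue(n_total, n_a, n_b, n_both):
    """P(X >= n_both) when the n_b genesets containing symbol b are drawn
    at random from n_total genesets, n_a of which contain symbol a.
    Assumed stand-in for the (unspecified) test of the abstract."""
    upper = min(n_a, n_b)
    tail = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_both, upper + 1))
    return tail / comb(n_total, n_b)


def association_graph(genesets, alpha=0.05):
    """Link two symbols when their co-occurrence p-value is <= alpha."""
    membership = {}
    for idx, symbols in enumerate(genesets):
        for s in symbols:
            membership.setdefault(s, set()).add(idx)
    graph = {s: set() for s in membership}
    for a, b in combinations(sorted(membership), 2):
        both = len(membership[a] & membership[b])
        if both == 0:
            continue  # never seen together, so no association to test
        p = cooccurrence_pvalue(len(genesets), len(membership[a]),
                                len(membership[b]), both)
        if p <= alpha:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def maximal_cliques(graph):
    """Bron-Kerbosch (no pivoting); returns maximal cliques of size >= 2."""
    cliques = []

    def expand(r, p, x):
        if not p and not x:
            if len(r) > 1:
                cliques.append(sorted(r))
            return
        for v in list(p):
            expand(r | {v}, p & graph[v], x & graph[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(graph), set())
    return sorted(cliques)


# Toy database of 10 genesets with placeholder symbols.
genesets = [{"A", "B", "C", "N"}, {"A", "B", "C"}, {"A", "B", "C"},
            {"D"}, {"D"}, {"D", "N"}, {"D"}, {"D"}, {"D"}, {"D"}]
print(maximal_cliques(association_graph(genesets)))
```

In the toy data, symbols A, B and C always occur together in the same three genesets out of ten, giving each pair a p-value of 1/120, so they form the only clique; D and N co-occur with other symbols no more often than chance would suggest.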
Assembly of an interactive correlation network for the Arabidopsis genome using a novel heuristic clustering algorithm
Peer reviewed. Publisher PD
Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?
The organization and mining of malaria genomic and post-genomic data is
highly motivated by the necessity to predict and characterize new biological
targets and new drugs. Biological targets are sought in a biological space
designed from the genomic data of Plasmodium falciparum, but also drawing on
the millions of genomic data from other species. Drug candidates are sought in a
chemical space containing the millions of small molecules stored in public and
private chemolibraries. Data management should therefore be as reliable and
versatile as possible. In this context, we examined five aspects of the
organization and mining of malaria genomic and post-genomic data: 1) the
comparison of protein sequences including compositionally atypical malaria
sequences, 2) the high throughput reconstruction of molecular phylogenies, 3)
the representation of biological processes particularly metabolic pathways, 4)
the versatile methods to integrate genomic data, biological representations and
functional profiling obtained from X-omic experiments after drug treatments and
5) the determination and prediction of protein structures and their molecular
docking with drug candidate structures. Progress toward a grid-enabled
chemogenomic knowledge space is discussed.
Comment: 43 pages, 4 figures, to appear in Malaria Journa