
Advances in protein ontology project

Author: Chang Elizabeth
Dillon Tharam S.
Sidhu Amandeep
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2006
Field of study

Advances in proteomics and protein expression techniques have led to the elucidation of large amounts of protein data. Various data mining algorithms and mathematical models provide methods for analyzing this data; however, two issues need to be addressed: (1) the need for standards for protein data description and exchange formats, so that data can be exchanged across the World Wide Web and read into data mining software in a consistent format, and (2) the elimination of errors that arise from the data integration methodologies used for complex queries. Protein Ontology is designed to meet these needs by providing a structured protein data specification for Protein Data Representation. Protein Ontology is a standard for representing protein data in a way that helps in defining data integration and data mining models for Protein Structure and Function. In this paper we summarize the structure of the Protein Ontology we developed earlier, its current applications to various protein families, and its future development.

OPUS - University of Technology Sydney

espace@Curtin

Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

Author: Bastien Olivier
Birkholtz Lyn-Marie
Breton Vincent
Grando Delphine
Hofmann-Apitius Martin
Jacq Nicolas
Joubert Fourie
Kasam Vinod
Louw Abraham I
Maréchal Eric
Ortet Philippe
Roy Sylvaine
Saïdani Nadia
Wells Gordon
Zimmermann Marc
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2006
Field of study

The organization and mining of malaria genomic and post-genomic data is strongly motivated by the need to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space built from the genomic data of Plasmodium falciparum, but also from the millions of genomic data items available for other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences, including compositionally atypical malaria sequences, 2) the high-throughput reconstruction of molecular phylogenies, 3) the representation of biological processes, particularly metabolic pathways, 4) versatile methods to integrate genomic data, biological representations and functional profiling obtained from X-omic experiments after drug treatments and 5) the determination and prediction of protein structures and their molecular docking with drug candidate structures. Progress toward a grid-enabled chemogenomic knowledge space is discussed.
Comment: 43 pages, 4 figures, to appear in Malaria Journal

Hal - Université Grenoble Alpes

HAL AMU

Fraunhofer-ePrints

HAL Clermont Université

HAL Descartes

HAL-CEA

ProdInra

arXiv.org e-Print Archive

HAL-IN2P3

Springer - Publisher Connector

PubMed Central

UPSpace at the University of Pretoria

ImmPort, toward repurposing of open access immunological assay data for translational and clinical research

Author: Bhattacharya Sanchita
Butte Atul
Chen Jieming
Dunn Patrick
Hu Zicheng
Schaefer Henry
Shankar Ravi
Shen-Orr Shai
Smith Barry
Thomas Cristel
Thomson Elizabeth
Wiser Jeffrey
Zalocusky Kelly
Publication venue
Publication date: 01/01/2018
Field of study

Immunology researchers are beginning to explore the possibilities of reproducibility, reuse and secondary analyses of immunology data. Open-access datasets are being used to validate the methods of the original studies, to support meta-analysis, and to generate new hypotheses. To promote these goals, the ImmPort data repository was created for the broader research community to explore the wide spectrum of clinical and basic research data and associated findings. The ImmPort ecosystem consists of four components (Private Data, Shared Data, Data Analysis, and Resources) for data archiving, dissemination, analyses, and reuse. To date, more than 300 studies have been made freely available through the ImmPort Shared Data portal, which allows research data to be repurposed to accelerate the translation of new insights into discoveries.

PhilPapers

eScholarship - University of California

Applicability of semi-supervised learning assumptions for gene ontology terms prediction

Author: Castellanos Cesar German
Jaramillo-Garzón Jorge Alberto
Perera Lluna Alexandre
Publication venue
Publication date: 01/01/2016
Field of study

Gene Ontology (GO) is one of the most important resources in bioinformatics, aiming to provide a unified framework for the biological annotation of genes and proteins across all species. Predicting GO terms is an essential task for bioinformatics, but the number of available labelled proteins is in several cases insufficient for training reliable machine learning classifiers. Semi-supervised learning methods arise as a powerful solution that exploits the information contained in unlabelled data in order to improve the estimations of traditional supervised approaches. However, semi-supervised learning methods have to make strong assumptions about the nature of the training data, and thus the performance of the predictor is highly dependent on these assumptions. This paper presents an analysis of the applicability of semi-supervised learning assumptions to the specific task of GO term prediction, focused on providing criteria for choosing the most suitable tools for specific GO terms. The results show that semi-supervised approaches significantly outperform the traditional supervised methods and that the highest performances are reached when applying the cluster assumption. In addition, it is experimentally demonstrated that the cluster and manifold assumptions are complementary to each other, and an analysis is provided of which GO terms are more likely to be correctly predicted under each assumption.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

ZENODO

Directory of Open Access Journals

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Repositorio Institucional ITM

The Non-Coding RNA Ontology (NCRO): A comprehensive resource for the unification of non-coding RNA biology

Author: Wang Xiaowei
et al.
Publication venue: Digital Commons@Becker
Publication date: 01/01/2016
Field of study

Digital Commons@Becker

The Blood Ontology: An ontology in the domain of hematology

Author: Barcellos Almeida Mauricio
Carneiro Proietti Anna Barbara de Freitas
Jiye Ai
Smith Barry
Publication venue
Publication date: 01/01/2011
Field of study

Despite the importance of human blood to clinical practice and research, hematology and blood transfusion data remain scattered throughout a range of disparate sources. This lack of systematization in the use and definition of terms poses problems for physicians and biomedical professionals. We introduce here the Blood Ontology, an ongoing initiative designed to serve as a controlled vocabulary for use in organizing information about blood. The paper describes the scope of the Blood Ontology, its stage of development and some of its anticipated uses.

PhilPapers

Yeast Features: Identifying Significant Features Shared Among Yeast Proteins for Functional Genomics

Author: Ashkan Golshani
Frank Dehne
James J. Cheetham
James R. Green
Md Alamgir
Michel Dumontier
Myron L. Smith
Nadereh Mir-Rashed
Veronika Eroukova
Publication venue
Publication date: 18/09/2008
Field of study

Background
High throughput yeast functional genomics experiments are revealing associations among tens to hundreds of genes using numerous experimental conditions. To fully understand how the identified genes might be involved in the observed system, it is essential to consider the widest range of biological annotation possible. Biologists often start their search by collating the annotation provided for each protein within databases such as the Saccharomyces Genome Database, manually comparing them for similar features, and empirically assessing their significance. Such tasks can be automated, and more precise calculations of the significance can be determined using established probability measures. 
Results
We developed Yeast Features, an intuitive online tool to help establish the significance of finding a diverse set of shared features among a collection of yeast proteins. A total of 18,786 features from the Saccharomyces Genome Database are considered, including annotation based on the Gene Ontology’s molecular function, biological process and cellular compartment, as well as conserved domains, protein-protein and genetic interactions, complexes, metabolic pathways, phenotypes and publications. The significance of shared features is estimated using a hypergeometric probability, but novel options exist to improve the significance by adding background knowledge of the experimental system. For instance, increased statistical significance is achieved in gene deletion experiments because interactions with essential genes will never be observed. We further demonstrate the utility by suggesting the functional roles of the indirect targets of an aminoglycoside with a known mechanism of action, and also the targets of an herbal extract with a previously unknown mode of action. The identification of shared functional features may also be used to propose novel roles for proteins of unknown function, including a role in protein synthesis for YKL075C.
Conclusions
Yeast Features (YF) is an easy-to-use web-based application (http://software.dumontierlab.com/yeastfeatures/) which can identify and prioritize features that are shared among a set of yeast proteins. This approach is shown to be valuable in the analysis of complex data sets, in which the extracted associations revealed significant functional relationships among the gene products.
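
The abstract above says that the significance of shared features is estimated with a hypergeometric probability but does not give the formula the tool uses. The following is a minimal sketch of the standard upper-tail hypergeometric test, which is one common way to score such enrichment. The function name, the parameter names, and the example numbers are all hypothetical. The sketch does not attempt the background-knowledge options mentioned in the abstract.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, shared):
    """Probability of seeing at least `shared` annotated proteins when
    `sample` proteins are drawn without replacement from `population`
    proteins, of which `annotated` carry the feature of interest.

    This is the generic upper-tail hypergeometric test, not necessarily
    the exact computation performed by Yeast Features.
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probability mass from `shared` up to the maximum.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(shared, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Hypothetical numbers: 6000 proteins in total, 100 carrying the feature,
    # a query set of 20 proteins, 5 of which share the feature.
    print(hypergeom_enrichment_p(6000, 100, 20, 5))
```

A smaller value means the shared feature is less likely to appear in the query set by chance. In practice a correction for testing many features at once would also be needed.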

Nature Precedings

Genome-wide signatures of complex introgression and adaptive evolution in the big cats.

Author: Antunes Agostinho
Assis Juliana
Azevedo Fernando CC
Bi Ke
Brassaloti Ricardo A
Coutinho Luiz L
Eizirik Eduardo
Fernandes Gabriel
Figueiró Henrique V
Gabaldón Toni
Hughes Graham M
Kantek Daniel
Komissarov Aleksey
Li Gang
Linderoth Tyler
Loska Damian
Morato Ronaldo G
Murphy William J
Nielsen Rasmus
Nunes Adauto LV
O'Brien Stephen J
Oliveira Guilherme
Pais Fabiano
Ramalho Emiliano
Rodrigues Maíra R
Santos Sarah HD
Saragüeta Patricia
Silveira Leandro
Teeling Emma C
Teixeira Rodrigo HF
Trinca Cristine S
Trindade Fernanda J
Villela Priscilla MS
Publication venue: eScholarship, University of California
Publication date: 01/01/2017
Field of study

The great cats of the genus Panthera comprise a recent radiation whose evolutionary history is poorly understood. Their rapid diversification poses challenges to resolving their phylogeny while offering opportunities to investigate the historical dynamics of adaptive divergence. We report the sequence, de novo assembly, and annotation of the jaguar (Panthera onca) genome, a novel genome sequence for the leopard (Panthera pardus), and comparative analyses encompassing all living Panthera species. Demographic reconstructions indicated that all of these species have experienced variable episodes of population decline during the Pleistocene, ultimately leading to small effective sizes in present-day genomes. We observed pervasive genealogical discordance across Panthera genomes, caused by both incomplete lineage sorting and complex patterns of historical interspecific hybridization. We identified multiple signatures of species-specific positive selection, affecting genes involved in craniofacial and limb development, protein metabolism, hypoxia, reproduction, pigmentation, and sensory perception. There was remarkable concordance in pathways enriched in genomic segments implicated in interspecies introgression and in positive selection, suggesting that these processes were connected. We tested this hypothesis by developing exome capture probes targeting ~19,000 Panthera genes and applying them to 30 wild-caught jaguars. We found at least two genes (DOCK3 and COL4A5, both related to optic nerve development) bearing significant signatures of interspecies introgression and within-species positive selection. These findings indicate that post-speciation admixture has contributed genetic material that facilitated the adaptive evolution of big cat lineages.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

CONICET Digital

eScholarship - University of California

NSU Works