823 research outputs found

The Trypanosoma brucei MitoCarta and its regulation and splicing pattern during development

Author: Astrid Chanfon
Daniel Nilsson
Huinan Wang
Juan Cui
Kapila Gunasekera
Torsten Ochsenreiter
Xiaobai Zhang
Xiaofeng Song
Ying Xu
Publication venue: Oxford University Press
Publication date: 01/01/2010

It has long been known that trypanosomes regulate mitochondrial biogenesis during the life cycle of the parasite; however, the mitochondrial protein inventory (MitoCarta) and its regulation remain unknown. We present a novel computational method for genome-wide prediction of mitochondrial proteins using a support vector machine-based classifier with ∼90% prediction accuracy. Using this method, we predicted the mitochondrial localization of 468 proteins with high confidence and experimentally verified the localization of a subset of these proteins. We then applied a recently developed parallel sequencing technology to determine the expression profiles and splicing patterns of a total of 1065 predicted MitoCarta transcripts during the development of the parasite, and showed that 435 of the transcripts changed their expression significantly, while 630 remained unchanged across the three life stages analyzed. Furthermore, we identified 298 alternative splicing events, a small subset of which could lead to dual localization of the corresponding proteins.

Crossref

PubMed Central

Bern Open Repository and Information System (BORIS)

The Trypanosoma brucei MitoCarta and its regulation and splicing pattern during development

Author: Chanfon Astrid
Cui Juan
Gunasekera Kapila
Nilsson Daniel
Ochsenreiter Torsten
Song Xiaofeng
Wang Huinan
Xu Ying
Zhang Xiaobai
Publication venue: DigitalCommons@University of Nebraska - Lincoln
Publication date: 01/01/2010

DigitalCommons@University of Nebraska

The Trypanosoma brucei MitoCarta and its regulation and splicing pattern during development

Author: Chanfon Astrid
Cui Juan
Gunasekera Kapila
Nilsson Daniel
Ochsenreiter Torsten
Song Xiaofeng
Wang Huinan
Xu Ying
Zhang Xiaobai
Publication date: 02/08/2017

RERO DOC Digital Library

Large-scale automated protein function prediction

Author: Kahanda Indika
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2016

Includes bibliographical references. 2016 Summer. Proteins are the workhorses of life, and identifying their functions is a very important biological problem. The function of a protein can be loosely defined as everything it performs or that happens to it. The Gene Ontology (GO) is a structured vocabulary that captures protein function in a hierarchical manner and contains thousands of terms. Through various wet-lab experiments over the years, scientists have been able to annotate a large number of proteins with GO categories that reflect their functionality. However, experimentally determining protein functions is a highly resource-intensive task, and a large fraction of proteins remain unannotated. Recently, a plethora of automated methods have emerged, and their reasonable success in computationally determining the functions of proteins from a variety of data sources (sequence/structure similarity or various biological network data) has led to establishing automated function prediction (AFP) as an important problem in bioinformatics. In a typical machine learning problem, cross-validation is the protocol of choice for evaluating the accuracy of a classifier. However, because annotations accumulate over time, we identify AFP as a combination of two sub-tasks: making predictions on annotated proteins and making predictions on previously unannotated proteins. In our first project, we analyze the performance of several protein function prediction methods in these two scenarios. Our results show that GOstruct, an AFP method that our lab has previously developed, and two other popular methods, binary SVMs and guilt by association, find it hard to achieve the same level of accuracy on these two tasks as the performance evaluated through cross-validation, and that predicting novel annotations for previously annotated proteins is a harder problem than predicting annotations for uncharacterized proteins.
We develop GOstruct 2.0 by proposing improvements that allow the model to use information about a protein's current annotations to better handle the task of predicting novel annotations for previously annotated proteins. Experimental results on yeast and human data show that GOstruct 2.0 outperforms the original GOstruct, demonstrating the effectiveness of the proposed improvements. Although the biomedical literature is a very informative resource for identifying protein function, most AFP methods do not take advantage of the large amount of information contained in it. In our second project, we conduct the first-ever comprehensive evaluation of the effectiveness of literature data for AFP. Specifically, we extract co-mentions of protein-GO term pairs and bag-of-words features from the literature and explore their effectiveness in predicting protein function. Our results show that literature features are very informative of protein function, with further room for improvement. In order to improve the quality of automatically extracted co-mentions, we formulate the classification of co-mentions as a supervised learning problem and propose a novel method based on graph kernels. Experimental results indicate the feasibility of using this co-mention classifier as a complementary method that aids the bio-curators who are responsible for maintaining databases such as Gene Ontology. This is the first study of the problem of protein-function relation extraction from biomedical text. The recently developed human phenotype ontology (HPO), which is very similar to GO, is a standardized vocabulary for describing the phenotype abnormalities associated with human diseases. At present, only a small fraction of human protein-coding genes have HPO annotations. However, researchers believe that a large portion of currently unannotated genes are related to disease phenotypes. Therefore, it is important to predict gene-HPO term associations using accurate computational methods.
In our third project, we introduce PHENOstruct, a computational method that directly predicts the set of HPO terms for a given gene. We compare PHENOstruct with several baseline methods and show that it outperforms them in every respect. Furthermore, we highlight a collection of informative data sources suitable for the problem of predicting gene-HPO associations, including large-scale literature mining data.
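The two AFP sub-tasks named in this abstract (novel annotations for already annotated proteins, and annotations for previously unannotated proteins) can be illustrated with a small sketch. It is only an illustration under stated assumptions: the function name `split_evaluation_targets`, the two-snapshot input format, and the toy protein and term identifiers are all hypothetical, and the evaluation protocol actually used in the thesis may differ.

```python
def split_evaluation_targets(before, after):
    """Split proteins into two evaluation sub-tasks from two annotation snapshots.

    before, after: dicts mapping a protein ID to the set of GO term IDs
    known at an earlier and a later time point (hypothetical input format).
    Returns (novel, no_knowledge); each maps a protein ID to the set of
    terms it gained between the two snapshots.
    """
    novel, no_knowledge = {}, {}
    for protein, later_terms in after.items():
        earlier_terms = before.get(protein, set())
        gained = later_terms - earlier_terms
        if not gained:
            continue  # nothing new to predict for this protein
        if earlier_terms:
            # Already annotated earlier: "novel annotation" sub-task.
            novel[protein] = gained
        else:
            # No earlier annotations: "previously unannotated" sub-task.
            no_knowledge[protein] = gained
    return novel, no_knowledge


# Toy snapshots; the identifiers are placeholders for illustration only.
before = {"P1": {"GO:0005739"}, "P2": set(), "P3": {"GO:0003824"}}
after = {
    "P1": {"GO:0005739", "GO:0006412"},
    "P2": {"GO:0005634"},
    "P3": {"GO:0003824"},
}
novel, no_knowledge = split_evaluation_targets(before, after)
print(novel)
print(no_knowledge)
```

In this toy case P1 falls into the first sub-task, P2 into the second, and P3 is skipped because it gained no terms.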

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Computational Methods for the Analysis of Genomic Data and Biological Processes

Publication venue: MDPI AG
Publication date: 01/05/2021

In recent decades, new technologies have made remarkable progress in helping to understand biological systems. Rapid advances in genomic profiling techniques such as microarrays or high-performance sequencing have brought new opportunities and challenges in the fields of computational biology and bioinformatics. Such genetic sequencing techniques allow large amounts of data to be produced, whose analysis and cross-integration could provide a complete view of organisms. As a result, it is necessary to develop new techniques and algorithms that analyze these data reliably and efficiently. This Special Issue collected the latest advances in the field of computational methods for the analysis of gene expression data and, in particular, the modeling of biological processes. Here we present eleven works selected to be published in this Special Issue due to their interest, quality, and originality.

Directory of Open Access Books (DOAB)

'Unite and conquer': enhanced prediction of protein subcellular localization by integrating multiple specialized tools

Author: Yao Qing Shen
Gertraud Burger
Publication venue: BioMed Central
Publication date: 01/01/2007

Background: Knowing the subcellular location of proteins provides clues to their function as well as the interconnectivity of biological processes. Dozens of tools are available for predicting protein location in the eukaryotic cell. Each tool performs well on certain data sets, but their predictions often disagree for a given protein. Since the individual tools each have particular strengths, we set out to integrate them in a way that optimally exploits their potential. The method we present here is applicable to various subcellular locations, but tailored for predicting whether or not a protein is localized in mitochondria. Knowledge of the mitochondrial proteome is relevant to understanding the role of this organelle in global cellular processes. Results: In order to develop a method for enhanced prediction of subcellular localization, we integrated the outputs of available localization prediction tools by several strategies, and tested the performance of each strategy with known mitochondrial proteins. The accuracy obtained (up to 92%) far surpasses that of the individual tools. The method of integration proved crucial to the performance. For the prediction of mitochondrion-located proteins, integration via a two-layer decision tree clearly outperforms simpler methods, as it allows emphasis of biologically relevant features such as the mitochondrial targeting peptide and transmembrane domains. Conclusion: We developed an approach that enhances the prediction accuracy of mitochondrial proteins by uniting the strengths of specialized tools. The combination of machine-learning-based integration with biological expert knowledge leads to improved performance. This approach also alleviates the conundrum of how to choose between conflicting predictions. Our approach is easy to implement, and applicable to predicting subcellular locations other than mitochondria, as well as other biological features.
For a trial of our approach, we provide a web service for mitochondrial protein prediction (named YimLOC), which can be accessed through the AnaBench suite at http://anabench.bcm.umontreal.ca/anabench/. The source code is provided in Additional file 2 (1471-2105-8-420-S2.pdf). This file contains scripts for the online server YimLOC. Please note that these scripts only contain code for the ready-to-use STACK-mem-DT described in the main text. The scripts do not provide the training process.
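As a rough illustration of why the integration strategy matters, the sketch below contrasts a plain majority vote over tool outputs with a hand-written two-layer rule that brings in a targeting-peptide flag and a transmembrane-domain count. It is not the published YimLOC or STACK-mem-DT model: the function names, the vote labels, and all thresholds are assumptions chosen only for the example, whereas the real decision trees were learned from training data.

```python
from collections import Counter


def majority_vote(votes):
    """Baseline integration: the label predicted by most tools wins."""
    counts = Counter(votes)
    # Ties are resolved conservatively as "other".
    return "mito" if counts["mito"] > counts["other"] else "other"


def two_layer_decision(votes, has_targeting_peptide, n_tm_domains):
    """Illustrative two-layer rule set (hypothetical thresholds).

    Layer 1 summarises the tool votes; layer 2 refines ambiguous cases
    using biologically motivated features.
    """
    if not votes:
        raise ValueError("at least one tool vote is required")
    mito_fraction = votes.count("mito") / len(votes)
    # Layer 1: a clear consensus among tools is accepted directly.
    if mito_fraction >= 0.75:
        return "mito"
    if mito_fraction <= 0.25:
        return "other"
    # Layer 2: ambiguous votes are resolved with the extra features.
    if has_targeting_peptide:
        return "mito"
    # Arbitrary example rule: membrane proteins need at least half the votes.
    if n_tm_domains >= 1 and mito_fraction >= 0.5:
        return "mito"
    return "other"


# Four hypothetical tools disagree evenly on one protein.
votes = ["mito", "other", "other", "mito"]
print(majority_vote(votes))
print(two_layer_decision(votes, has_targeting_peptide=True, n_tm_domains=0))
```

With the tied votes above, the majority vote falls back to "other", while the two-layer rule uses the targeting-peptide flag to decide "mito". This shows how a second layer can settle conflicting predictions that a simple vote cannot.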

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Bioinformatics analysis of mitochondrial disease

Author: Lythgow Kieren
Publication venue: Newcastle University
Publication date: 01/01/2011

PhD thesis. Several bioinformatic methods have been developed to aid the identification of novel nuclear-mitochondrial genes involved in disease. Previous research has aimed to increase the sensitivity and specificity of these predictions through a combination of available techniques. This investigation shows that the optimum sensitivity and specificity can be achieved by carefully selecting seven specific classifiers in combination. The results also show that increasing the number of classifiers even further can paradoxically decrease the sensitivity and specificity of a prediction. Additionally, text mining applications are playing a major role in disease candidate gene identification, providing resources for interpreting the vast quantities of biomedical literature currently available. A workflow resource was developed that identified a number of genes potentially associated with Leber's Hereditary Optic Neuropathy (LHON). These included specific orthologues in mouse displaying a potential association with LHON that are not annotated as such in humans. Mitochondrial DNA (mtDNA) fragments have been transferred to the human nuclear genome over evolutionary time. These insertions (NUMTs) were compared to an existing database of 263 mtDNA deletions to highlight any associated mechanisms governing DNA loss from mitochondria. The regions of the nuclear genome flanking these insertions were also screened for transposable elements, GC content and mitochondrial genes. No obvious association was found relating NUMTs to mtDNA deletions. NUMTs do not appear to be distributed throughout the genome via transposition, and they integrate predominantly in areas of low %GC with low gene content. These areas also lacked evidence of an elevated number of surrounding nuclear-mitochondrial genes, but a further genome-wide study is required.

Newcastle University eTheses

Towards AI-driven longevity research: An overview

Author: Bischof Evelyne
Calabrese Giuliana
Cappilli Simone
Chersoni Emmanuele
Marino Nicola
Mazzotta Alessandro
Putignano Guido
Santuccione Antonella
Santus Enrico
Scarano Bryan
Vanhaelen Quentin
Zhavoronkov Alex
Publication date: 01/01/2023

Archivio della ricerca della Scuola Superiore Sant'Anna