17 research outputs found

Active machine learning for transmembrane helix prediction

Author: Carbonell Jaime G
Ganapathiraju Madhavi K
Osmanbeyoglu Hatice U
Wehner Jessica A
Publication venue: BioMed Central
Publication date: 01/01/2010

Background: About 30% of genes code for membrane proteins, which are involved in a wide variety of crucial biological functions. Despite their importance, experimentally determined structures correspond to only about 1.7% of protein structures deposited in the Protein Data Bank due to the difficulty in crystallizing membrane proteins. Algorithms that can identify proteins whose high-resolution structure can aid in predicting the structure of many previously unresolved proteins are therefore of potentially high value. Active machine learning is a supervised machine learning approach which is suitable for this domain where there are a large number of sequences but only very few have known corresponding structures. In essence, active learning seeks to identify proteins whose structure, if revealed experimentally, is maximally predictive of others. Results: An active learning approach is presented for selection of a minimal set of proteins whose structures can aid in the determination of transmembrane helices for the remaining proteins. TMpro, an algorithm for high accuracy TM helix prediction we previously developed, is coupled with active learning. We show that with a well-designed selection procedure, high accuracy can be achieved with only few proteins. TMpro, trained with a single protein achieved an F-score of 94% on benchmark evaluation and 91% on MPtopo dataset, which correspond to the state-of-the-art accuracies on TM helix prediction that are achieved usually by training with over 100 training proteins. Conclusion: Active learning is suitable for bioinformatics applications, where manually characterized data are not a comprehensive representation of all possible data, and in fact can be a very sparse subset thereof. It aids in selection of data instances which when characterized experimentally can improve the accuracy of computational characterization of remaining raw data.
The results presented here also demonstrate that the feature extraction method of TMpro is well designed, achieving a very good separation between TM and non TM segments.

Crossref

Springer - Publisher Connector

PubMed Central

Carolina Digital Repository

A Pan-cancer analysis reveals high-frequency genetic alterations in mediators of signaling by the TGF-β superfamily

We present an integromic analysis of gene alterations that modulate transforming growth factor β (TGF-β)-Smad-mediated signaling in 9,125 tumor samples across 33 cancer types in The Cancer Genome Atlas (TCGA). Focusing on genes that encode mediators and regulators of TGF-β signaling, we found at least one genomic alteration (mutation, homozygous deletion, or amplification) in 39% of samples, with highest frequencies in gastrointestinal cancers. We identified mutation hotspots in genes that encode TGF-β ligands (BMP5), receptors (TGFBR2, AVCR2A, and BMPR2), and Smads (SMAD2 and SMAD4). Alterations in the TGF-β superfamily correlated positively with expression of metastasis-associated genes and with decreased survival. Correlation analyses showed the contributions of mutation, amplification, deletion, DNA methylation, and miRNA expression to transcriptional activity of TGF-β signaling in each cancer type. This study provides a broad molecular perspective relevant for future functional and therapeutic studies of the diverse cancer pathways mediated by the TGF-β superfamily

Diposit Digital de la Universitat de Barcelona

SWI/SNF tumor suppressor gene PBRM1/BAF180 in human clear cell kidney cancer

Author: Amrita M. Nargund
Emily H. Cheng
Hatice U. Osmanbeyoglu
James J. Hsieh
Publication venue: Taylor & Francis Group
Publication date: 01/07/2017

Mutations within chromatin-modulating protein complexes have dominated the novel cancer gene landscape. However, little is known about how individual aberrations contribute to cancer formation. A novel Pbrm1 kidney cancer mouse model examining the role of Pbrm1 provides a much-needed clue as to how SWI/SNF complexes might function as tumor suppressors.

Directory of Open Access Journals

Active Learning for Membrane Protein Structure Prediction

Author: Hatice U. Osmanbeyoglu
Jaime G. Carbonell
Jessica A. Wehner
Madhavi K. Ganapathiraju
Publication date: 30/06/2018

Background: About 30% of genes code for membrane proteins, which are involved in a wide variety of crucial biological functions. Despite their importance, experimentally determined structures correspond to only about 1.7% of protein structures deposited in the Protein Data Bank due to the difficulty in crystallizing membrane proteins. Algorithms that can identify proteins whose high-resolution structure can aid in predicting the structure of many previously unresolved proteins are therefore of potentially high value. Active machine learning is a supervised machine learning approach which is suitable for this domain where there are a large number of sequences but only very few have known corresponding structures. In essence, active learning seeks to identify proteins whose structure, if revealed experimentally, is maximally predictive of others. Results: An active learning approach is presented for selection of a minimal set of proteins whose structures can aid in the determination of transmembrane helices for the remaining proteins. TMpro, an algorithm for high accuracy TM helix prediction we previously developed, is coupled with active learning. We show that with a well-designed selection procedure, high accuracy can be achieved with only few proteins. TMpro, trained with a single protein achieved an F-score of 94% on benchmark evaluation and 91% on MPtopo dataset, which correspond to the state-of-the-art accuracies on TM helix prediction that are achieved usually by training with over 100 training proteins. Conclusion: Active learning is suitable for bioinformatics applications, where manually characterized data are not a comprehensive representation of all possible data, and in fact can be a very sparse subset thereof. It aids in selection of data instances which when characterized experimentally can improve the accuracy of computational characterization of remaining raw data.
The results presented here also demonstrate that the feature extraction method of TMpro is well designed, achieving a very good separation between TM and non TM segments.

Linking signaling pathways to transcriptional programs in breast cancer

Author: Bartholomeusz
Boutsikou
Christina S. Leslie
Gagliardi
Hatice U. Osmanbeyoglu
Jacqueline F. Bromberg
Li
Pegoraro
Raphael Pelossof
Therneau
Ueno
Yosef
Publication venue: Cold Spring Harbor Laboratory

Crossref

An ATR and CHK1 kinase signaling mechanism that limits origin firing during unperturbed DNA replication

Author: Chenao Qian
Christopher J. Bakkenist
Eli Rothenberg
Hatice U. Osmanbeyoglu
Lob
Marchal
Michael J. Calderon
Moiseeva
Norie Sugitani
O’Donnell
Sandra Schamus-Haynes
Simon C. Watkins
Sokka
Tatiana N. Moiseeva
Yandong Yin
Publication venue: Proceedings of the National Academy of Sciences

Crossref

A Pan-Cancer Analysis Reveals High-Frequency Genetic Alterations in Mediators of Signaling by the TGF-β Superfamily

We present an integromic analysis of gene alterations that modulate transforming growth factor β (TGF-β)-Smad-mediated signaling in 9,125 tumor samples across 33 cancer types in The Cancer Genome Atlas (TCGA). Focusing on genes that encode mediators and regulators of TGF-β signaling, we found at least one genomic alteration (mutation, homozygous deletion, or amplification) in 39% of samples, with highest frequencies in gastrointestinal cancers. We identified mutation hotspots in genes that encode TGF-β ligands (BMP5), receptors (TGFBR2, AVCR2A, and BMPR2), and Smads (SMAD2 and SMAD4). Alterations in the TGF-β superfamily correlated positively with expression of metastasis-associated genes and with decreased survival. Correlation analyses showed the contributions of mutation, amplification, deletion, DNA methylation, and miRNA expression to transcriptional activity of TGF-β signaling in each cancer type. This study provides a broad molecular perspective relevant for future functional and therapeutic studies of the diverse cancer pathways mediated by the TGF-β superfamily

George Washington University: Health Sciences Research Commons (HSRC)

University of Queensland eSpace

A Pan-Cancer Analysis Reveals High-Frequency Genetic Alterations in Mediators of Signaling by the TGF-β Superfamily

Author: Ajani Jaffer A.
Andersen Jesper B.
Berger Ashton C.
Datto Mike
de Velasco Guillermo
Gough Nancy R.
Hansel Donna
Houseman Andres
Jogunoori Wilma
Ju Zhenlin
Kanchi Rupa S.
Korkut Anil
Kwong Lawrence N.
Li Shulin
Li Xubin
Ling Shiyun
Liu Yuexin
Lorenzi Philip L.
Mani Sendurai A.
Manyam Ganiraju
Nguyen Bao Ngoc
O'Rourke Colm J.
Ohshiro Kazufumi
Osmanbeyoglu Hatice U.
Pennathur Arjun
Rao Arvind
Rao Shuyun
Ravikumar Visweswaran
Robertson Gordon
Roszik Jason
Schultz Andre
Shelley Simon
Zaidi Sobia
Publication venue: Health Sciences Research Commons
Publication date: 24/10/2018

© 2018 Elsevier Inc. We present an integromic analysis of gene alterations that modulate transforming growth factor β (TGF-β)-Smad-mediated signaling in 9,125 tumor samples across 33 cancer types in The Cancer Genome Atlas (TCGA). Focusing on genes that encode mediators and regulators of TGF-β signaling, we found at least one genomic alteration (mutation, homozygous deletion, or amplification) in 39% of samples, with highest frequencies in gastrointestinal cancers. We identified mutation hotspots in genes that encode TGF-β ligands (BMP5), receptors (TGFBR2, AVCR2A, and BMPR2), and Smads (SMAD2 and SMAD4). Alterations in the TGF-β superfamily correlated positively with expression of metastasis-associated genes and with decreased survival. Correlation analyses showed the contributions of mutation, amplification, deletion, DNA methylation, and miRNA expression to transcriptional activity of TGF-β signaling in each cancer type. This study provides a broad molecular perspective relevant for future functional and therapeutic studies of the diverse cancer pathways mediated by the TGF-β superfamily. To date, there are no studies of the TGF-β superfamily of signaling pathways across multiple cancers. This study represents a key starting point for unraveling the role of this complex superfamily in 33 divergent cancer types from over 9,000 patients

George Washington University: Health Sciences Research Commons (HSRC)