De novo prediction of PTBP1 binding and splicing targets reveals unexpected features of its RNA recognition and function.

Author: Black Douglas L
Fu Xiang-Dong
Han Areum
Linares Anthony J
Stoilov Peter
Zhou Yu
Publication venue: eScholarship, University of California
Publication date: 01/01/2014

The splicing regulator Polypyrimidine Tract Binding Protein (PTBP1) has four RNA binding domains that each bind a short pyrimidine element, allowing recognition of diverse pyrimidine-rich sequences. This variation makes it difficult to evaluate PTBP1 binding to particular sites based on sequence alone and thus to identify target RNAs. Conversely, transcriptome-wide binding assays such as CLIP identify many in vivo targets, but do not provide a quantitative assessment of binding and are informative only for the cells where the analysis is performed. A general method of predicting PTBP1 binding and possible targets in any cell type is needed. We developed computational models that predict the binding and splicing targets of PTBP1. A Hidden Markov Model (HMM), trained on CLIP-seq data, was used to score probable PTBP1 binding sites. Scores from this model are highly correlated (ρ = -0.9) with experimentally determined dissociation constants. Notably, we find that the protein is not strictly pyrimidine specific, as interspersed guanosine residues are well tolerated within PTBP1 binding sites. This model identifies many previously unrecognized PTBP1 binding sites, and can score PTBP1 binding across the transcriptome in the absence of CLIP data. Using this model to examine the placement of PTBP1 binding sites in controlling splicing, we trained a multinomial logistic model on sets of PTBP1-regulated and unregulated exons. Applying this model to rank exons across the mouse transcriptome identifies known PTBP1 targets and many new exons that were confirmed as PTBP1-repressed by RT-PCR and RNA-seq after PTBP1 depletion. We find that PTBP1-dependent exons are diverse in structure and do not all fit previous descriptions of the placement of PTBP1 binding sites. Our study uncovers new features of RNA recognition and splicing regulation by PTBP1.
This approach can be applied to other multi-RRM domain proteins to assess binding site degeneracy and multifactorial splicing regulation.
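
The idea of scoring a candidate binding site with an HMM can be illustrated with a minimal sketch. The two-state model below ("site" versus "background") and every probability in it are illustrative assumptions, not the parameters of the CLIP-seq-trained model described in the abstract. The score is the forward-algorithm log-likelihood of the sequence minus its log-likelihood under a uniform null model, so a higher score suggests a more site-like sequence. The small non-zero G emission in the "site" state loosely mirrors the abstract's observation that interspersed guanosines are tolerated.

```python
import math

# Toy two-state HMM. All probabilities are illustrative assumptions,
# not values from the CLIP-seq-trained model in the abstract.
STATES = ("site", "background")
START = {"site": 0.5, "background": 0.5}
TRANS = {
    "site": {"site": 0.9, "background": 0.1},
    "background": {"site": 0.1, "background": 0.9},
}
EMIT = {
    "site": {"A": 0.05, "C": 0.35, "G": 0.15, "U": 0.45},
    "background": {"A": 0.25, "C": 0.25, "G": 0.25, "U": 0.25},
}


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_log_likelihood(seq):
    """Log P(seq) under the toy HMM (forward algorithm in log space).

    Expects a non-empty RNA or DNA string; T is treated as U.
    """
    seq = seq.upper().replace("T", "U")
    alpha = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    for base in seq[1:]:
        alpha = {
            s: math.log(EMIT[s][base])
            + log_sum_exp([alpha[p] + math.log(TRANS[p][s]) for p in STATES])
            for s in STATES
        }
    return log_sum_exp(list(alpha.values()))


def binding_score(seq):
    """Log-odds of the toy HMM against a uniform null (0.25 per base)."""
    return forward_log_likelihood(seq) - len(seq) * math.log(0.25)
```

Under these assumed parameters, a pyrimidine-rich sequence such as `UCUUCUGUCU` receives a positive score and a purine-rich one such as `AAGAAGAAGA` a negative score. Because dissociation constants fall as affinity rises, a score of this kind would be expected to correlate negatively with them, consistent with the sign of the ρ reported in the abstract.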

Crossref

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

The Research Repository @ WVU (West Virginia University)

Silencing disease genes in the laboratory and the clinic

Author: Corey David R.
Watts Jonathan K.
Publication venue: 'Wiley'
Publication date: 01/01/2012

Synthetic nucleic acids are commonly used laboratory tools for modulating gene expression and have the potential to be widely used in the clinic. Progress towards nucleic acid drugs, however, has been slow and many challenges remain to be overcome before their full impact on patient care can be understood. Antisense oligonucleotides (ASOs) and small interfering RNAs (siRNAs) are the two most widely used strategies for silencing gene expression. We first describe these two approaches and contrast their relative strengths and weaknesses for laboratory applications. We then review the choices faced during development of clinical candidates and the current state of clinical trials. Attitudes towards clinical development of nucleic acid silencing strategies have repeatedly swung from optimism to depression during the past 20 years. Our goal is to provide the information needed to design robust studies with oligonucleotides, making use of the strengths of each oligonucleotide technology.

Southampton (e-Prints Soton)

PubMed Central

Human Promoter Recognition Based on Principal Component Analysis

Author: Li Xiaomeng
Publication venue: Faculty of Engineering and Information Technologies, School of Electrical and Information Engineering
Publication date: 01/01/2008

This thesis presents an innovative human promoter recognition model, HPR-PCA. Principal component analysis (PCA) is applied to select context features from DNA sequences, and the prediction network is built with an artificial neural network (ANN). A thorough literature review of all the relevant topics in the promoter prediction field is also provided. As the main technique of HPR-PCA, the application of PCA to feature selection is developed first. In order to find informative and discriminative features for effective classification, PCA is applied to the different n-mer promoter and exon combined frequency matrices, and principal components (PCs) of each matrix are generated to construct the new feature space. ANN-built classifiers are used to test the discriminability of each feature space. Finally, the 3 and 5-mer feature matrix is selected as the context feature in this model. Two proposed schemes of the HPR-PCA model are discussed and the implementations of sub-modules in each scheme are introduced. The context features selected by PCA are used to build three promoter and non-promoter classifiers. CpG-island modules are embedded into models in different ways. In the comparison, Scheme I obtains better prediction results on two test sets, so it is adopted as the model for HPR-PCA for further evaluation. Three existing promoter prediction systems are compared with HPR-PCA on three test sets including the chromosome 22 sequence. The performance of HPR-PCA is outstanding compared to the other four systems.
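
The step of applying PCA to n-mer frequency matrices can be sketched as follows. This is a minimal illustration, not the HPR-PCA implementation: the function names are hypothetical, only the leading principal component is computed (by power iteration rather than a full eigendecomposition), and the ANN classifier and CpG-island modules mentioned in the abstract are not included.

```python
from itertools import product


def kmer_frequencies(seq, k):
    """Relative frequency of every DNA k-mer in seq, in lexicographic order."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    seq = seq.upper()
    total = 0
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip windows containing N or other symbols
            counts[word] += 1
            total += 1
    return [counts[w] / total if total else 0.0 for w in kmers]


def first_principal_component(rows, iterations=200):
    """Column means and leading PC of the rows, via power iteration on X^T X."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    vec = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start vector
    for _ in range(iterations):
        # proj = X v, then vec = X^T (X v), then renormalise
        proj = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
        vec = [sum(centered[i][j] * proj[i] for i in range(n)) for j in range(d)]
        norm = sum(v * v for v in vec) ** 0.5
        if norm == 0.0:
            raise ValueError("data has no variance")
        vec = [v / norm for v in vec]
    return means, vec


def project(row, means, component):
    """Coordinate of one feature row along the given principal component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))


# Usage: two GC-rich and two AT-rich toy sequences, 2-mer features.
sequences = ["GCGCGGCC", "CCGGCGCG", "ATATTAAT", "TTAATATA"]
features = [kmer_frequencies(s, 2) for s in sequences]
means, pc1 = first_principal_component(features)
scores = [project(f, means, pc1) for f in features]
```

In this toy example the first component separates the GC-rich from the AT-rich sequences, so the two groups land on opposite sides of zero. In a setting like the one the abstract describes, several components per n-mer matrix would be retained and passed to a classifier.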

CiteSeerX

Sydney eScholarship

The Expanding Landscape of Alternative Splicing Variation in Human Populations.

Author: Lin Lan
Pan Zhicheng
Park Eddie
Xing Yi
Zhang Zijun
Publication venue: eScholarship, University of California
Publication date: 01/01/2018

Alternative splicing is a tightly regulated biological process by which the number of gene products for any given gene can be greatly expanded. Genomic variants in splicing regulatory sequences can disrupt splicing and cause disease. Recent developments in sequencing technologies and computational biology have allowed researchers to investigate alternative splicing at an unprecedented scale and resolution. Population-scale transcriptome studies have revealed many naturally occurring genetic variants that modulate alternative splicing and consequently influence phenotypic variability and disease susceptibility in human populations. Innovations in experimental and computational tools such as massively parallel reporter assays and deep learning have enabled the rapid screening of genomic variants for their causal impacts on splicing. In this review, we describe technological advances that have greatly increased the speed and scale at which discoveries are made about the genetic variation of alternative splicing. We summarize major findings from population transcriptomic studies of alternative splicing and discuss the implications of these findings for human genetics and medicine.

eScholarship - University of California

Human Promoter Prediction Using DNA Numerical Representation

Author: Arniker Swarna Bai
Publication venue: 'University of Windsor Leddy Library'
Publication date: 01/01/2010

With the emergence of genomic signal processing, numerical representation techniques for the DNA alphabet set {A, G, C, T} play a key role in applying digital signal processing and machine learning techniques to the processing and analysis of DNA sequences. The choice of the numerical representation of a DNA sequence affects how well the biological properties can be reflected in the numerical domain for the detection and identification of the characteristics of special regions of interest within the DNA sequence. This dissertation presents a comprehensive study of various DNA numerical and graphical representation methods and their applications in processing and analyzing long DNA sequences. Discussions of the relative merits and demerits of the various methods, experimental results and possible future developments are also included. Another research focus is promoter prediction in human (Homo sapiens) DNA sequences with a neural-network-based multi-classifier system using DNA numerical representation methods. In spite of the recent development of several computational methods for human promoter prediction, there is a need for performance improvement. In particular, the high false positive rate of the feature-based approaches decreases the prediction reliability and leads to erroneous results in gene annotation. To improve the prediction accuracy and reliability, DigiPromPred, a numerical-representation-based promoter prediction system, is proposed to characterize DNA alphabets in different regions of a DNA sequence. The DigiPromPred system is found to predict promoters with a sensitivity of 90.8% while reducing the false prediction rate for non-promoter sequences with a specificity of 90.4%. The comparative study with state-of-the-art promoter prediction systems for human chromosome 22 shows that our proposed system maintains a good balance between prediction accuracy and reliability.
To reduce the system architecture and computational complexity compared to the existing system, a simple feed-forward neural network classifier known as SDigiPromPred is proposed. The SDigiPromPred system is found to predict promoters with a sensitivity of 87%, 87%, and 99% while reducing the false prediction rate for non-promoter sequences with a specificity of 92%, 94%, and 99% for Human, Drosophila, and Arabidopsis sequences respectively, with reconfigurable capability compared to the existing system.
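
To make "numerical representation" and the reported metrics concrete, here is a minimal sketch. It uses the binary indicator (Voss) mapping as one well-known example of a DNA numerical representation; the abstract does not say which representations DigiPromPred uses, so this choice is an assumption. The second function computes sensitivity as TP / (TP + FN) and specificity as TN / (TN + FP), the standard definitions of the two figures quoted above.

```python
def voss_representation(seq):
    """Map a DNA sequence to four binary indicator sequences, one per base.

    Position i of the list for base X is 1 if seq[i] == X, else 0.
    """
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def sensitivity_specificity(true_labels, predicted_labels):
    """Return (sensitivity, specificity) for binary labels.

    Labels are truthy for promoter and falsy for non-promoter.
    Assumes at least one example of each true class is present.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    tn = sum(1 for t, p in pairs if not t and not p)
    fp = sum(1 for t, p in pairs if not t and p)
    return tp / (tp + fn), tn / (tn + fp)


# Usage with made-up labels (1 = promoter, 0 = non-promoter).
indicators = voss_representation("ACGTTA")
sens, spec = sensitivity_specificity([1, 1, 0, 0], [1, 0, 0, 0])
```

The indicator sequences can be fed to signal-processing or machine-learning steps directly, which is the general motivation the abstract gives for numerical representations.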

Scholarship at UWindsor

IN-AIS-MACA: Integrated Artificial Immune System based Multiple Attractor Cellular Automata For Human Protein Coding and Promoter Prediction of 252bp Length DNA Sequence

Author: Inampudi Ramesh Babu
Pokkuluri Kiran Sree
Publication venue: Global Journals Inc. (US)
Publication date: 12/08/2014

Gene prediction involves protein coding and promoter predictions. There is a need for integrated algorithms that can predict both of these regions quickly. To date, only individual algorithms have addressed these problems. We have developed a novel classifier, IN-AIS-MACA, which can predict both of these regions in genomic DNA sequences of length 252bp with 93.5% accuracy and a total prediction time of 1031ms. This classifier should motivate the development of more classifiers of this kind.

Global Journal of Computer Science and Technology (GJCST)

Hidden Markov Models for Gene Sequence Classification: Classifying the VSG genes in the Trypanosoma brucei Genome

Author: Alvarez-Valin Fernando
Basterrech Sebastián
Guerberoff Gustavo
Mesa Andrea
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 07/08/2015

The article presents an application of Hidden Markov Models (HMMs) for pattern recognition on genome sequences. We apply HMMs to identify genes encoding the Variant Surface Glycoprotein (VSG) in the genomes of Trypanosoma brucei (T. brucei) and other African trypanosomes. These parasitic protozoa are the causative agents of sleeping sickness and several diseases in domestic and wild animals. These parasites have a peculiar strategy to evade the host's immune system that consists in periodically changing their predominant cellular surface protein (VSG). The motivation for using pattern recognition methods to identify these genes, instead of traditional homology-based ones, is that the levels of sequence identity (amino acid and DNA sequence) amongst these genes are often below what is considered reliable in these methods. Among pattern recognition approaches, HMMs are particularly suitable to tackle this problem because they can handle the determination of gene edges more naturally. We evaluate the performance of the model using different numbers of states in the Markov model, as well as several performance metrics. The model is applied using public genomic data. Our empirical results show that the VSG genes of T. brucei can be safely identified (high sensitivity and low rate of false positives) using HMMs. Comment: Accepted in July 2015 in Pattern Analysis and Applications, Springer. The article contains 23 pages, 4 figures, 8 tables and 51 references.
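
The remark that HMMs handle the determination of gene edges naturally can be illustrated with Viterbi decoding, which labels each position with its most likely hidden state, so that gene boundaries fall out as state changes. The sketch below is a toy under stated assumptions: the two states, the transition and emission probabilities, and the helper `gene_edges` are all hypothetical and unrelated to the VSG model evaluated in the article.

```python
import math

# Toy two-state model. All probabilities are illustrative assumptions.
STATES = ("gene", "intergenic")
START = {"gene": 0.5, "intergenic": 0.5}
TRANS = {
    "gene": {"gene": 0.9, "intergenic": 0.1},
    "intergenic": {"gene": 0.1, "intergenic": 0.9},
}
EMIT = {
    "gene": {"A": 0.1, "C": 0.4, "G": 0.4, "T": 0.1},
    "intergenic": {"A": 0.4, "C": 0.1, "G": 0.1, "T": 0.4},
}


def viterbi(seq):
    """Most likely state path for a non-empty DNA string (log-space Viterbi)."""
    seq = seq.upper()
    score = {s: math.log(START[s]) + math.log(EMIT[s][seq[0]]) for s in STATES}
    back = []  # back[i][s] = best predecessor of state s at position i + 1
    for base in seq[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + math.log(TRANS[p][s]))
            new_score[s] = (
                score[best_prev]
                + math.log(TRANS[best_prev][s])
                + math.log(EMIT[s][base])
            )
            pointers[s] = best_prev
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


def gene_edges(path):
    """Half-open (start, end) intervals of consecutive 'gene' labels."""
    edges, start = [], None
    for i, state in enumerate(path):
        if state == "gene" and start is None:
            start = i
        elif state != "gene" and start is not None:
            edges.append((start, i))
            start = None
    if start is not None:
        edges.append((start, len(path)))
    return edges


# Usage: an AT-rich flank, a GC-rich block, and another AT-rich flank.
toy = "ATATATAT" + "GCGCGCGC" + "ATATATAT"
edges = gene_edges(viterbi(toy))
```

With these assumed parameters the GC-rich block is decoded as a single "gene" segment spanning positions 8 to 16 (half-open). A model with more states, as explored in the article, would allow richer internal structure.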

arXiv.org e-Print Archive

Crossref

DSpace at VSB Technical University of Ostrava

The therapeutic potential of the CRISPR-Cas9 system for treating Duchenne muscular dystrophy

Author: Rubin David Sweeney
Publication venue
Publication date: 05/11/2016

The CRISPR-Cas9 gene editing system gives researchers the ability to manipulate and edit DNA with unprecedented ease and precision. It was discovered in bacteria as part of their adaptive immune system, but has been reengineered to target any double stranded DNA. This burgeoning molecular tool has created great excitement as scientists are rapidly adopting it to study fields including human gene therapy, disease modeling, agriculture, gene drive in mosquitos, and many others. This paper will explore the potential impact of CRISPR-Cas9 in human therapeutics. Specifically, the potential of CRISPR-Cas9 to treat Duchenne Muscular Dystrophy will be examined. In several ways, this debilitating degenerative disease is an ideal candidate for gene-editing with CRISPR-Cas9. Recent progress in the lab has demonstrated the gene editing system’s ability to rescue dystrophin protein levels in vivo. Although CRISPR-Cas9 holds great promise for previously incurable diseases, there are still many limitations that must be overcome before the gene editing system can be used in patients. This paper will discuss these barriers as well as recent advancements to overcome them.

Boston University Institutional Repository (OpenBU)

MSH3 polymorphisms and protein levels affect CAG repeat instability in Huntington's disease mice

Author: Anne Messer
Christopher E. Pearson
Darren G. Monckton
Elisabeth R. M. Tillier
Greg W. Clark
Jodie P. Simard
Kevin Manley
Meera Swami
Meghan M. Slean
Peggy F. Shelbourne
Stéphanie Tomé
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2013

Expansions of trinucleotide CAG/CTG repeats in somatic tissues are thought to contribute to ongoing disease progression through an affected individual's life with Huntington's disease or myotonic dystrophy. Broad ranges of repeat instability arise between individuals with expanded repeats, suggesting the existence of modifiers of repeat instability. Mice with expanded CAG/CTG repeats show variable levels of instability depending upon mouse strain. However, to date the genetic modifiers underlying these differences have not been identified. We show that in liver and striatum the R6/1 Huntington's disease (HD) (CAG)~100 transgene, when present in a congenic C57BL/6J (B6) background, incurred expansion-biased repeat mutations, whereas the repeat was stable in a congenic BALB/cByJ (CBy) background. Reciprocal congenic mice revealed the Msh3 gene as the determinant for the differences in repeat instability. Expansion bias was observed in congenic mice homozygous for the B6 Msh3 gene on a CBy background, while the CAG tract was stabilized in congenics homozygous for the CBy Msh3 gene on a B6 background. The CAG stabilization was as dramatic as genetic deficiency of Msh2. The B6 and CBy Msh3 genes had identical promoters but differed in coding regions and showed strikingly different protein levels. B6 MSH3 variant protein is highly expressed and associated with CAG expansions, while the CBy MSH3 variant protein is expressed at barely detectable levels, associating with CAG stability. The DHFR protein, which is divergently transcribed from a promoter shared by the Msh3 gene, did not show varied levels between mouse strains. Thus, naturally occurring MSH3 protein polymorphisms are modifiers of CAG repeat instability, likely through variable MSH3 protein stability. 
Since evidence supports that somatic CAG instability is a modifier and predictor of disease, our data are consistent with the hypothesis that variable levels of CAG instability associated with polymorphisms of DNA repair genes may have prognostic implications for various repeat-associated diseases.

Crossref

Directory of Open Access Journals

PubMed Central

Enlighten

FigShare