9,960 research outputs found
AIP1 is a novel Agenet/Tudor domain protein from Arabidopsis that interacts with regulators of DNA replication, transcription and chromatin remodeling
Background: DNA replication and transcription are dynamic processes that regulate plant development and depend on chromatin accessibility. Proteins belonging to the Agenet/Tudor domain family are known as histone modification "readers" and are classified as chromatin remodeling proteins. Histone modifications and chromatin remodeling have profound effects on gene expression as well as on DNA replication, but how these processes are integrated has not been completely elucidated. Members of the Agenet/Tudor family are clearly important regulators of development, yet their roles in plants are not well known.
Methods: Bioinformatics and phylogenetic analyses of the Agenet/Tudor domain family in the plant kingdom were carried out with sequences from available complete-genome databases. 3D structure predictions of Agenet/Tudor domains were calculated with the I-TASSER server. Protein interactions were tested in two-hybrid, GST pulldown, semi-in vivo pulldown and Tandem Affinity Purification assays. Gene function was studied in a T-DNA insertion GABI-line.
Results: In the present work we analyzed the family of Agenet/Tudor domain proteins in the plant kingdom and mapped the organization of this family throughout plant evolution. Furthermore, we characterized a member from Arabidopsis thaliana named AIP1 that harbors Agenet/Tudor and DUF724 domains. AIP1 interacts with ABAP1, a plant regulator of DNA replication licensing and gene transcription, with a plant histone modification "reader" (LHP1) and with unmodified histones. AIP1 is expressed in reproductive tissues and its down-regulation delays the timing of flower development. In addition, expression of ABAP1 and LHP1 target genes was repressed in flower buds of plants with reduced levels of AIP1.
Conclusions: AIP1 is a novel Agenet/Tudor domain protein in plants that could act as a link between DNA replication, transcription and chromatin remodeling during flower development.
Computational identification and analysis of noncoding RNAs - Unearthing the buried treasures in the genome
The central dogma of molecular biology states that genetic information flows from DNA to RNA to protein. This dogma has exerted a substantial influence on our understanding of genetic activity in cells. Under this influence, the prevailing assumption until the recent past was that genes are basically repositories for protein coding information, and that proteins are responsible for most of the important biological functions in all cells. Meanwhile, the importance of RNAs remained rather obscure, and RNA was mainly viewed as a passive intermediary that bridges the gap between DNA and protein. Except for classic examples such as tRNAs (transfer RNAs) and rRNAs (ribosomal RNAs), functional noncoding RNAs were considered to be rare.
However, this view has experienced a dramatic change during the last decade, as systematic screening of various genomes identified myriads of noncoding RNAs (ncRNAs), which are RNA molecules that function without being translated into proteins [11], [40]. It has been realized that many ncRNAs play important roles in various biological processes. As RNAs can interact with other RNAs and DNAs in a sequence-specific manner, they are especially useful in tasks that require highly specific nucleotide recognition [11]. Good examples are the miRNAs (microRNAs) that regulate gene expression by targeting mRNAs (messenger RNAs) [4], [20], and the siRNAs (small interfering RNAs) that take part in the RNAi (RNA interference) pathways for gene silencing [29], [30]. Recent developments show that ncRNAs are extensively involved in many gene regulatory mechanisms [14], [17].
The roles of ncRNAs known to this day are truly diverse. They include transcription and translation control, chromosome replication, RNA processing and modification, and protein degradation and translocation [40], to name just a few. It is now even claimed that ncRNAs dominate the genomic output of higher organisms such as mammals, and it has been suggested that the greater portion of their genome (which does not encode proteins) is dedicated to the control and regulation of cell development [27]. As more and more evidence piles up, greater attention is being paid to ncRNAs, which were neglected for a long time. Researchers have begun to realize that the vast majority of the genome that was regarded as “junk,” mainly because it was not well understood, may indeed hold the key to the best kept secrets in life, such as the mechanism of alternative splicing, the control of epigenetic variations and so forth [27]. The complete range and extent of the role of ncRNAs are not so obvious at this point, but it is certain that a comprehensive understanding of cellular processes is not possible without understanding the functions of ncRNAs [47].
Ab initio RNA folding
RNA molecules are essential cellular machines performing a wide variety of
functions for which a specific three-dimensional structure is required. Over
the last several years, experimental determination of RNA structures through
X-ray crystallography and NMR seems to have reached a plateau in the number of
structures resolved each year, but as more and more RNA sequences are being
discovered, the need for structure prediction tools to complement experimental data
is strong. Theoretical approaches to RNA folding have been developed since the
late nineties when the first algorithms for secondary structure prediction
appeared. Over the last 10 years a number of prediction methods for 3D
structures have been developed, first based on bioinformatics and data-mining,
and more recently based on a coarse-grained physical representation of the
systems. In this review we are going to present the challenges of RNA structure
prediction and the main ideas behind bioinformatic approaches and physics-based
approaches. We will focus on the description of the more recent physics-based
phenomenological models and on how they are built to include the specificity of
the interactions of RNA bases, whose role is critical in folding. Through
examples from different models, we will point out the strengths of
physics-based approaches, which are able not only to predict equilibrium
structures, but also to investigate dynamical and thermodynamical behavior, and
the open challenges to include more key interactions ruling RNA folding.
Comment: 28 pages, 18 figures
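The secondary structure prediction algorithms mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the simplest textbook formulation, Nussinov-style dynamic programming that maximizes the number of nested base pairs, and is not one of the physics-based models the review describes. The function name, the allowed pair set (Watson-Crick plus G-U wobble) and the `min_loop` parameter are illustrative assumptions.

```python
def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in an RNA sequence.

    Illustrative Nussinov-style dynamic programming. It only counts pairs,
    ignores energies and pseudoknots, and requires at least `min_loop`
    unpaired bases inside every hairpin.
    """
    allowed = {("A", "U"), ("U", "A"), ("G", "C"),
               ("C", "G"), ("G", "U"), ("U", "G")}
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] = best pair count for the subsequence seq[i..j]
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j stays unpaired
            # case 2: base j pairs with some k, leaving a loop of >= min_loop
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in allowed:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]
```

Calling `nussinov_max_pairs("GGGAAAUCC")` counts the pairs of a three-pair hairpin closed by an AAA loop. Practical tools replace the pair count with a thermodynamic energy model.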
Highly Accurate Fragment Library for Protein Fold Recognition
Proteins play a crucial role in living organisms as they perform many vital tasks in every living cell. Knowledge of protein folding has a deep impact on understanding the heterogeneity and molecular functions of proteins. Such information leads to crucial advances in drug design and disease understanding. Fold recognition is a key step in the protein structure discovery process, especially when traditional computational methods fail to yield convincing structural homologies. In this work, we present a new protein fold recognition approach using machine learning and data mining methodologies.
First, we identify a protein structural fragment library (Frag-K) composed of a set of backbone fragments ranging from 4 to 20 residues as the structural “keywords” that can effectively distinguish between major protein folds. We first apply randomized spectral clustering and random forest algorithms to construct representative and sensitive protein fragment libraries from a large set of high-quality, non-homologous protein structures available in the PDB. We analyze the impact of clustering cut-offs on the performance of the fragment libraries. Then, the Frag-K fragments are employed as structural features to classify protein structures into the major protein folds defined by SCOP (Structural Classification of Proteins). Our results show that a structural dictionary with ~400 4- to 20-residue Frag-K fragments is capable of classifying major SCOP folds with high accuracy.
Then, based on Frag-K, we design a novel deep learning architecture, called DeepFrag-k, which identifies fold-discriminative features to improve the accuracy of protein fold recognition. DeepFrag-k is composed of two stages: the first stage employs a multimodal Deep Belief Network (DBN) to predict the potential structural fragments given a sequence, represented as a fragment vector, and the second stage uses a deep convolutional neural network (CNN) to classify the fragment vectors into the corresponding folds. Our results show that DeepFrag-k yields 92.98% accuracy in predicting the top-100 most popular fragments, which can be used to generate discriminative fragment feature vectors to improve protein fold recognition.
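As a rough illustration of the raw material for such a fragment library, the sketch below enumerates every contiguous backbone window of 4 to 20 residues from a single chain. It covers only this enumeration step, under the assumption that a chain is given as a list of per-residue items (for example C-alpha coordinates). The clustering, random forest and deep learning stages are not reproduced, and the function name is hypothetical.

```python
def backbone_fragments(residues, min_len=4, max_len=20):
    """Yield (start, length, fragment) for each contiguous residue window.

    `residues` is any sequence of per-residue items, e.g. C-alpha
    coordinates. The 4 to 20 residue window sizes follow the abstract;
    everything else here is an illustrative assumption.
    """
    n = len(residues)
    for length in range(min_len, min(max_len, n) + 1):
        for start in range(n - length + 1):
            yield start, length, residues[start:start + length]
```

A chain of n residues yields n - L + 1 windows for each length L, so candidate fragments accumulate quickly across a structure database, which is one reason a clustering step is needed to reduce them to a few hundred representatives.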
PATTERNA: transcriptome-wide search for functional RNA elements via structural data signatures.
Establishing a link between RNA structure and function remains a great challenge in RNA biology. The emergence of high-throughput structure profiling experiments is revolutionizing our ability to decipher structure, yet principled approaches for extracting information on structural elements directly from these data sets are lacking. We present PATTERNA, an unsupervised pattern recognition algorithm that rapidly mines RNA structure motifs from profiling data. We demonstrate that PATTERNA detects motifs with an accuracy comparable to commonly used thermodynamic models and highlight its utility in automating data-directed structure modeling from large data sets. PATTERNA is versatile and compatible with diverse profiling techniques and experimental conditions.
Force-induced misfolding in RNA
RNA folding is a kinetic process governed by the competition of a large
number of structures stabilized by the transient formation of base pairs that
may induce complex folding pathways and the formation of misfolded structures.
Despite its importance in modern biophysics, the current understanding of
RNA folding kinetics is limited by the complex interplay between the weak
base-pair interactions that stabilize the native structure and the disordering
effect of thermal forces. The possibility of mechanically pulling individual
molecules offers a new perspective to understand the folding of nucleic acids.
Here we investigate the folding and misfolding mechanism in RNA secondary
structures pulled by mechanical forces. We introduce a model based on the
identification of the minimal set of structures that reproduce the patterns of
force-extension curves obtained in single molecule experiments. The model
requires only two fitting parameters: the attempt frequency at the level of
individual base pairs and a parameter associated with a free energy correction
that accounts for the configurational entropy of an exponentially large number
of neglected secondary structures. We apply the model to interpret results
recently obtained in pulling experiments in the three-helix junction S15 RNA
molecule (RNAS15). We show that RNAS15 undergoes force-induced misfolding where
force favors the formation of a stable non-native hairpin. The model reproduces
the pattern of unfolding and refolding force-extension curves, the distribution
of breakage forces and the misfolding probability obtained in the experiments.
Comment: 28 pages, 11 figures
Plant-mPLoc: A Top-Down Strategy to Augment the Power for Predicting Plant Protein Subcellular Localization
One of the fundamental goals in proteomics and cell biology is to identify the
functions of proteins in various cellular organelles and pathways. Information on
subcellular locations of proteins can provide useful insights for revealing their
functions and understanding how they interact with each other in cellular network
systems. Most of the existing methods in predicting plant protein subcellular
localization can only cover three or four location sites, and none of them can be
used to deal with multiplex plant proteins that can simultaneously exist at, or
move between, two or more different location sites. Actually, such multiplex proteins
might have special biological functions worthy of particular notice. The present
study was devoted to improving the existing plant protein subcellular location
predictors from the aforementioned two aspects. A new predictor called
“Plant-mPLoc” is developed by integrating the gene ontology
information, functional domain information, and sequential evolutionary information
through three different modes of pseudo amino acid composition. It can be used to
identify plant proteins among the following 12 location sites: (1) cell membrane, (2)
cell wall, (3) chloroplast, (4) cytoplasm, (5) endoplasmic reticulum, (6)
extracellular, (7) Golgi apparatus, (8) mitochondrion, (9) nucleus, (10) peroxisome,
(11) plastid, and (12) vacuole. Compared with the existing methods for predicting
plant protein subcellular localization, the new predictor is much more powerful and
flexible. Particularly, it also has the capacity to deal with multiple-location
proteins, which is beyond the reach of any existing predictors specialized for
identifying plant protein subcellular localization. As a user-friendly web-server,
Plant-mPLoc is freely accessible at http://www.csbio.sjtu.edu.cn/bioinf/plant-multi/. Moreover, for the
convenience of the vast majority of experimental scientists, a step-by-step guide is
provided on how to use the web-server to get the desired results. It is anticipated
that the Plant-mPLoc predictor as presented in this paper will become a very useful
tool in plant science as well as all the relevant areas.
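The abstract above encodes proteins through pseudo amino acid composition. A minimal sketch of the classic sequence-order form of that encoding follows: 20 composition terms plus `lam` correlation terms. It does not reproduce the gene ontology, functional domain or evolutionary modes used by Plant-mPLoc. The use of the Kyte-Doolittle hydropathy scale as the single physicochemical property, the function names and the default `lam` and `weight` values are all assumptions made for illustration.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values (assumed here as the only property).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _standardize(scale):
    """Rescale a property to zero mean and unit variance over the 20 residues."""
    values = [scale[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = (sum((v - mean) ** 2 for v in values) / len(values)) ** 0.5
    return {a: (scale[a] - mean) / sd for a in AMINO_ACIDS}


def pseudo_aac(sequence, lam=3, weight=0.05):
    """Return a (20 + lam)-dimensional pseudo amino acid composition vector."""
    prop = _standardize(KYTE_DOOLITTLE)
    seq = [a for a in sequence.upper() if a in prop]  # skip unknown symbols
    n = len(seq)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    # Conventional amino acid composition (order-independent part).
    freqs = [seq.count(a) / n for a in AMINO_ACIDS]
    # Sequence-order correlation factors for residues k positions apart.
    thetas = [
        sum((prop[seq[i]] - prop[seq[i + k]]) ** 2 for i in range(n - k)) / (n - k)
        for k in range(1, lam + 1)
    ]
    denom = sum(freqs) + weight * sum(thetas)
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]
```

The resulting vector sums to 1, with the last `lam` entries carrying the sequence-order information that plain composition discards. A multi-location predictor would then assign every location site whose score is high enough, rather than only the single best one.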
G-Quadruplex Aptamer Beacon for Detection of Prostate Cancer Biomarker
The prostate is the major male reproductive gland involved in male fertility and plays
an important role in triggering molecular pathways relevant to fertility success.
Unfortunately, in Portugal prostate cancer is the most common cancer type among
men and is asymptomatic in its earlier stages. Early detection of the disease is
therefore important.
Nucleolin (NCL) is a multifunctional protein involved in multiple biological processes under both
physiological and pathological conditions and can have several cellular localizations.
Overexpression of the protein at the cell surface was found to be restricted to cancer cells, notably
prostate cancer cells. NCL can thus be considered a potential biomarker for cancer
diagnosis and a target for cancer treatment. AS1411 is an aptamer able to
recognise and specifically bind NCL, and it has a therapeutic effect on cancer cells
by inducing antiproliferative activity. Beyond its therapeutic use, AS1411
can be used in imaging and diagnostics, particularly in aptasensor development. One
of the most relevant characteristics of this aptamer is its ability to fold into a G-quadruplex (G4)
conformation, a secondary structure of nucleic acids. The G4 structure stabilizes
the sequence and enables it to bind NCL.
This work presents a first approach to using the AS1411 aptamer in prostate
cancer diagnosis, namely through the design of a molecular beacon (MB) designated
AS1411-N5. Initially, AS1411-N5 was biophysically characterized by circular
dichroism, nuclear magnetic resonance and fluorometric spectroscopies. Additionally,
microfluidic experiments were performed to detect NCL using AS1411-N5 in biological
samples.
The results demonstrated that the proposed AS1411-N5 adopts a G4 structure and is
able to bind NCL specifically and selectively, even in plasma from human patients
with prostate cancer.
The prostate is the major male reproductive gland and plays an important role in the
molecular pathways relevant to successful fertilization. Unfortunately, in Portugal
prostate cancer is the most common cancer among men and is asymptomatic in its
early stages. Early detection of the disease is therefore imperative.
Nucleolin (NCL) is a multifunctional protein involved in multiple biological
processes under physiological and pathological conditions, and it can have several
cellular localizations. Overexpression of the protein at the cell surface is found only
in cancer cells, notably prostate cancer cells. NCL can therefore be
considered a potential biomarker for the diagnosis and treatment of
prostate cancer. AS1411 is an aptamer able to recognise and specifically bind
this protein, and to exert a therapeutic effect on cancer cells by inducing
antiproliferative activity. Beyond therapeutic use, the sequence can be used in
imaging and diagnostics, particularly through the development of
aptasensors. One of the most relevant characteristics of the AS1411 aptamer is its
ability to adopt a G-quadruplex (G4) configuration, a secondary structure
of nucleic acids. G4 structures stabilize the sequence and give it the ability
to bind NCL when it adopts this structure.
This work thus presents a first approach to using AS1411 in
prostate cancer diagnosis, namely through the construction of a probe
based on the sequence of this aptamer, designated AS1411-N5. Initially,
AS1411-N5 was biophysically characterized in terms of structure and interaction with the
target, using circular dichroism and nuclear magnetic resonance
spectroscopies and fluorometric assays. Additionally, microfluidic
experiments were performed to use AS1411-N5 as a probe for detecting NCL.
These results demonstrated that AS1411-N5 adopts the G4 structure and is able to bind
NCL specifically and selectively, even in biological samples
- …