7,612 research outputs found

    A structural study for the optimisation of functional motifs encoded in protein sequences

    BACKGROUND: A large number of PROSITE patterns select false positives and/or miss known true positives. It is possible that, at least in some cases, the weak specificity and/or sensitivity of a pattern arises because one or more functional and/or structural key residues are not represented in the pattern. Multiple sequence alignments are commonly used to build functional sequence patterns. If residues that are structurally conserved in proteins sharing a function cannot be aligned in a multiple sequence alignment, they are likely to be missed by a standard pattern construction procedure. RESULTS: Here we present a new procedure aimed at improving the sensitivity and/or specificity of poorly performing patterns. The procedure can be summarised as follows: (1) residues structurally conserved in different proteins that are true positives for a pattern are identified by means of a computational technique and by visual inspection; (2) the sequence positions of the structurally conserved residues falling outside the pattern are used to build extended sequence patterns; (3) the extended patterns are optimised on the SWISS-PROT database for their sensitivity and specificity. The method was applied to eight PROSITE patterns. Whenever structurally conserved residues are found in the surface region close to the pattern (seven out of eight cases), the addition of information inferred from structural analysis is shown to improve pattern selectivity and, in some cases, both selectivity and sensitivity. In some of the cases considered, the procedure allowed the identification of functionally interesting residues, whose biological role is also discussed. CONCLUSION: Our method can be applied to any type of functional motif or pattern (not only PROSITE ones) that is not able to select all and only the true positive hits and for which at least two true positive structures are available.
The computational technique for the identification of structurally conserved residues is already available on request and will soon be accessible on our web server. The procedure is intended for use by pattern database curators and by scientists interested in a specific protein family for which no specific or selective patterns are yet available.
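As a rough illustration of step 3 above, the sketch below translates a PROSITE-syntax pattern into a regular expression and scores it against a labelled set of sequences. It is a minimal sketch under stated assumptions, not the authors' procedure: the function names `prosite_to_regex` and `evaluate`, the toy sequences, and the toy patterns are all hypothetical; only the common pattern elements (`x`, `[..]`, `{..}`, `(n)`, `(n,m)`, `<`, `>`) are handled; and "selectivity" is assumed here to mean TP / (TP + FP).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-syntax pattern into a Python regular expression.

    Simplified: anchors inside brackets (e.g. "[G>]") are not supported.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # N-terminal anchor
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # C-terminal anchor
            suffix, element = "$", element[:-1]
        # Repetition counts: x(3) or x(2,4)
        rep = ""
        m = re.fullmatch(r"(.+?)\((\d+)(?:,(\d+))?\)", element)
        if m:
            element = m.group(1)
            low, high = m.group(2), m.group(3)
            rep = "{%s}" % low if high is None else "{%s,%s}" % (low, high)
        if element == "x":               # any residue
            core = "."
        elif element.startswith("{"):    # excluded residues
            core = "[^" + element[1:-1] + "]"
        else:                            # single residue or [ABC] set
            core = element
        parts.append(prefix + core + rep + suffix)
    return "".join(parts)


def evaluate(pattern, sequences, true_positive_ids):
    """Return (sensitivity, selectivity) of a pattern on labelled sequences."""
    regex = re.compile(prosite_to_regex(pattern))
    hits = {sid for sid, seq in sequences.items() if regex.search(seq)}
    tp = len(hits & true_positive_ids)
    fp = len(hits - true_positive_ids)
    fn = len(true_positive_ids - hits)
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    selectivity = tp / (tp + fp) if tp + fp else 0.0
    return sensitivity, selectivity


# Toy data (hypothetical): two family members and one unrelated sequence.
sequences = {
    "fam1": "AACGGCAAAHA",
    "fam2": "MCKLCTTSHW",
    "other": "GGCAACGGGGW",
}
family = {"fam1", "fam2"}

original = evaluate("C-x(2)-C", sequences, family)
extended = evaluate("C-x(2)-C-x(3)-H", sequences, family)
```

In this toy set the short pattern also matches the unrelated sequence, while the extended pattern, which adds one residue position outside the original motif, matches only the two family members. This mirrors the idea of extending a pattern with conserved positions that lie outside it.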

    Techniques for RNA in vivo imaging in plants

    Since the discovery of small RNAs and RNA silencing, RNA biology has taken centre stage in cell and developmental biology. Small RNAs, but also mRNAs and other types of cellular and viral RNAs, are processed at specific subcellular localizations. To fully understand cellular RNA metabolism and the various processes influenced by it, techniques are required that permit the sequence-specific tracking of RNAs in living cells. A variety of methods for RNA visualization have been developed since the 1990s, but plant cells pose particular challenges and not all approaches are applicable to them. On the other hand, plant RNA metabolism is particularly diverse and RNAs are even transported between cells, so RNA imaging can potentially provide many valuable insights into plant function at the cellular and tissue level. This Short Review briefly introduces the currently available techniques for plant RNA in vivo imaging and discusses their suitability for different biological questions.

    Structure-Function Studies of the ORF1 Protein From the Insertion-Site Specific Retroposon M5 Found in Indo-Pakistan Urban Malarial Vector Anopheles stephensi

    In the 1950s Barbara McClintock inferred the occurrence of transposition: the movement of small segments of DNA, entities known as transposable elements, from one position of the genome to another (McClintock, 1950). Classification of transposable elements by mechanism of transposition divides them into two groups: transposons (Class II) and retroposons (Class I). The term retrotransposon was coined because it illustrates that the transposition of these elements depends on the reverse transcription of RNA to DNA by a reverse transcriptase, also known as ‘copy and paste’ transposition. The M5 retroposon has been found in numerous mosquito species such as Anopheles stephensi. M5 present in these Anopheles is a Class I, non-LTR transposable element of the jockey clade family with two open reading frames (ORFs). Due to its APE-like endonuclease, M5 should transpose to random sites of the genome. However, in A. stephensi the element has been reported to transpose with site specificity. The aim of the project was to gain structural and functional information on the role of the ORF1 protein of M5 in order to understand the element’s site specificity. To perform functional and structural studies, an Escherichia coli expression vector was designed with a synthetic AsM5 ORF1 insert. Heterologous expression and purification of ORF1p in E. coli produced signs of degradation or a very low yield of the unfolded protein, possibly due to the host’s inability to process some eukaryotic features required for the protein. Saccharomyces cerevisiae was then chosen as an expression system for AsM5 ORF1p production. The AsM5 ORF1 gene was cloned from the E. coli expression vector into the pYES2/CT S. cerevisiae expression vector, and sequencing verified that the AsM5 ORF1 insert was successfully cloned into the pYES2/CT vector. Optimization of the lysis and expression protocol in S. cerevisiae slowed progress, but a highly effective method of cell lysis was developed.
Expression of the full-length ORF1p in S. cerevisiae was not confirmed. The difficulties in expression could be attributed to the fact that the original synthetic ORF1 sequence that was cloned is codon-optimised for expression in E. coli, hindering expression in S. cerevisiae. CPSF100_C was one of the conserved domains identified in the AsM5 ORF1 amino acid sequence using conserved domain web tools. For further analysis, the CPSF100_C domain was cloned into an E. coli vector and successfully expressed in rich media; the protein was then purified using immobilized metal ion affinity chromatography (IMAC) and ion exchange chromatography (IEC). In order to progress to NMR studies of the domain, 15N-labelled expression of CPSF100_Cp was performed. In several expression attempts, the non-peptide fusion partner glutathione S-transferase (GST) was expressed without the protein of interest, possibly due to mRNA instability when the gene is expressed in minimal media. The identification of protein domains such as CPSF100_C and their interactions with nucleic acids and other proteins will likely be the key to understanding AsM5’s site-specific retrotransposition.

    Towards Understanding the Origin of Genetic Languages

    Molecular biology is a nanotechnology that works--it has worked for billions of years and in an amazing variety of circumstances. At its core is a system for acquiring, processing and communicating information that is universal, from viruses and bacteria to human beings. Advances in genetics and experience in designing computers have taken us to a stage where we can understand the optimisation principles at the root of this system, from the availability of basic building blocks to the execution of tasks. The languages of DNA and proteins are argued to be the optimal solutions to the information processing tasks they carry out. The analysis also suggests simpler predecessors to these languages, and provides fascinating clues about their origin. Obviously, a comprehensive unraveling of the puzzle of life would have a lot to say about what we may design or convert ourselves into. Comment: (v1) 33 pages, contributed chapter to "Quantum Aspects of Life", edited by D. Abbott, P. Davies and A. Pati, (v2) published version with some editin

    Functional and Structural Insights into Novel Bacteriophage Defence Islands

    Bacteriophages are the most abundant organisms on the planet and are a major driving force in bacterial evolution. As obligate intracellular parasites, phages are reliant on their bacterial host for propagation, but bacteria have evolved means to prevent phage infections. Bacteriophage exclusion (BREX) is a novel phage-resistance system that confers resistance to a wide array of phages, functioning independently of restriction-modification, CRISPR-Cas and abortive infection mechanisms. BREX loci are present in ~10% of bacterial and archaeal genomes, including pathogenic strains such as non-typhoidal invasive Salmonella enterica and multidrug-resistant Escherichia fergusonii. Whilst investigating the mechanism of BREX in E. fergusonii, a putative endonuclease was discovered, clustered within the BREX locus. This enzyme, BrxU, was biochemically and structurally characterised, and shown to be a standalone phage defence system that targets modified phage genomes. It became clear that the BREX and BrxU phage defence systems were organised into a phage defence island, constituting a bacterial immune system capable of resisting multiple phage types. Both systems detailed in this thesis represent novel antiphage mechanisms with potential for biotechnological application. The BrxU endonuclease structure has been solved to 2.12 Å and provides insight into key protein domains implicated in type IV restriction enzymes. BrxU has been observed to utilise a range of nucleotide and metal cofactors and confers extensive protection on its bacterial host against phage infection.

    DotAligner: Identification and clustering of RNA structure motifs

    The diversity of processed transcripts in eukaryotic genomes poses a challenge for the classification of their biological functions. Sparse sequence conservation in non-coding sequences and the unreliable nature of RNA structure predictions further exacerbate this conundrum. Here, we describe a computational method, DotAligner, for the unsupervised discovery and classification of homologous RNA structure motifs from a set of sequences of interest. Our approach outperforms comparable algorithms at clustering known RNA structure families, both in speed and accuracy. It identifies clusters of known and novel structure motifs from ENCODE immunoprecipitation data for 44 RNA-binding proteins.

    A short survey on protein blocks.

    Protein structures are classically described in terms of secondary structures. Even if the regular secondary structures have relevant physical meaning, their recognition from atomic coordinates has some important limitations, such as uncertainties in the assignment of the boundaries of helical and β-strand regions. Further, on average about 50% of all residues are assigned to an irregular state, i.e., the coil. Thus different research teams have focused on abstracting the conformation of the protein backbone in localized short stretches. Using different geometric measures, local stretches in protein structures are clustered into a chosen number of states. A prototype representative of the local structures in each cluster is generally defined. These libraries of local structure prototypes are named "structural alphabets". We have developed a structural alphabet, named Protein Blocks, not only to approximate protein structures, but also to predict them from sequence. Since its development, we and other teams have explored numerous new research fields using this structural alphabet. We review here some of the most interesting applications.
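The idea of assigning each local stretch to the nearest prototype of a structural alphabet can be sketched as follows. This is a minimal sketch under stated assumptions rather than the Protein Blocks implementation: the two prototypes, their names, and the four-angle fragments are hypothetical toy values, and the distance used is a root-mean-square deviation on backbone dihedral angles, which is only one of the possible geometric measures mentioned above.

```python
import math


def angular_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = (a - b) % 360.0
    return min(d, 360.0 - d)


def rmsda(v, w) -> float:
    """Root-mean-square deviation between two vectors of dihedral angles."""
    return math.sqrt(sum(angular_diff(a, b) ** 2 for a, b in zip(v, w)) / len(v))


def assign_state(fragment, prototypes) -> str:
    """Return the name of the prototype closest to the fragment."""
    return min(prototypes, key=lambda name: rmsda(fragment, prototypes[name]))


# Hypothetical two-letter alphabet; values are illustrative (phi, psi) pairs.
prototypes = {
    "helix_like": (-60.0, -45.0, -60.0, -45.0),
    "strand_like": (-120.0, 130.0, -120.0, 130.0),
}

fragments = [
    (-65.0, -40.0, -58.0, -50.0),
    (-115.0, 140.0, -125.0, 135.0),
]

# Encode the toy backbone as a sequence of alphabet states.
encoded = [assign_state(f, prototypes) for f in fragments]
```

Angles are compared modulo 360 degrees so that, for example, 179 and -179 are treated as 2 degrees apart rather than 358.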

    Discovering Sequence Motifs with Arbitrary Insertions and Deletions

    Biology is encoded in molecular sequences: deciphering this encoding remains a grand scientific challenge. Functional regions of DNA, RNA, and protein sequences often exhibit characteristic but subtle motifs; thus, computational discovery of motifs in sequences is a fundamental and much-studied problem. However, most current algorithms do not allow for insertions or deletions (indels) within motifs, and the few that do have other limitations. We present a method, GLAM2 (Gapped Local Alignment of Motifs), for discovering motifs allowing indels in a fully general manner, and a companion method, GLAM2SCAN, for searching sequence databases using such motifs. GLAM2 is a generalization of the gapless Gibbs sampling algorithm. It re-discovers variable-width protein motifs from the PROSITE database significantly more accurately than the alternative methods PRATT and SAM-T2K. Furthermore, it usefully refines protein motifs from the ELM database: in some cases, the refined motifs make orders of magnitude fewer overpredictions than the original ELM regular expressions. GLAM2 performs respectably on the BAliBASE multiple alignment benchmark, and may be superior to leading multiple alignment methods for “motif-like” alignments with N- and C-terminal extensions. Finally, we demonstrate the use of GLAM2 to discover protein kinase substrate motifs and a gapped DNA motif for the LIM-only transcriptional regulatory complex: using GLAM2SCAN, we identify promising targets for the latter. GLAM2 is especially promising for short protein motifs, and it should improve our ability to identify the protein cleavage sites, interaction sites, post-translational modification attachment sites, etc., that underlie much of biology. It may be equally useful for arbitrarily gapped motifs in DNA and RNA, although fewer examples of such motifs are known at present. GLAM2 is public domain software, available for download at http://bioinformatics.org.au/glam2

    The development of aptamer-based probes for the detection of TB antigens ESAT-6.CFP-10 potential TB diagnostic tools

    Lack of point-of-care (PoC) diagnostic tools for TB hinders control of the disease, particularly in resource-limited countries with high HIV and TB prevalence. Therefore, there is a need for simple, rapid, accurate, and affordable PoC diagnostics to detect active TB early enough for opportune intervention. To develop TB detection probes that will constitute such diagnostics, our research group recently isolated DNA aptamers that bind to a putative marker for active TB: the ESAT-6.CFP-10 heterodimer. Aptamers are highly specific artificial mimics of antibodies that have shown great prospects in diagnostic applications. The aim of this study was to characterise the anti-ESAT-6.CFP-10 aptamers, and to optimise them into more specific and affordable detection probes for the development of potential PoC TB diagnostic tools.