23 research outputs found

    Multiple sequence alignments of partially coding nucleic acid sequences

    BACKGROUND: High quality sequence alignments of RNA and DNA sequences are an important prerequisite for the comparative analysis of genomic sequence data. Nucleic acid sequences, however, exhibit a much larger sequence heterogeneity compared to their encoded protein sequences due to the redundancy of the genetic code. It is desirable, therefore, to make use of the amino acid sequence when aligning coding nucleic acid sequences. In many cases, however, only a part of the sequence of interest is translated. On the other hand, overlapping reading frames may encode multiple alternative proteins, possibly with intermittent non-coding parts. Examples are, in particular, RNA virus genomes. RESULTS: The standard scoring scheme for nucleic acid alignments can be extended to incorporate simultaneously information on translation products in one or more reading frames. Here we present a multiple alignment tool, codaln, that implements a combined nucleic acid plus amino acid scoring model for pairwise and progressive multiple alignments that allows arbitrary weighting for almost all scoring parameters. Resource requirements of codaln are comparable with those of standard tools such as ClustalW. CONCLUSION: We demonstrate the applicability of codaln to various biologically relevant types of sequences (bacteriophage Levivirus and Vertebrate Hox clusters) and show that the combination of nucleic acid and amino acid sequence information leads to improved alignments. These, in turn, increase the performance of analysis tools that depend strictly on good input alignments such as methods for detecting conserved RNA secondary structure elements

    Can comprehensive background knowledge be incorporated into substitution models to improve phylogenetic analyses? A case study on major arthropod relationships

    BACKGROUND: Whenever different data sets arrive at conflicting phylogenetic hypotheses, only testable causal explanations of sources of errors in at least one of the data sets allow us to critically choose among the conflicting hypotheses of relationships. The large (28S) and small (18S) subunit rRNAs are among the most popular markers for studies of deep phylogenies. However, some nodes supported by these data are suspected of being artifacts caused by peculiarities of the evolution of these molecules. Arthropod phylogeny is an especially controversial subject dotted with conflicting hypotheses which are dependent on data set and method of reconstruction. We assume that phylogenetic analyses based on these genes can be improved further by i) enlarging the taxon sample, ii) employing more realistic models of sequence evolution incorporating non-stationary substitution processes, and iii) considering covariation and pairing of sites in rRNA genes. RESULTS: We analyzed a large set of arthropod sequences, applied new tools for quality control of data prior to tree reconstruction, and increased the biological realism of substitution models. Although the split-decomposition network indicated a high noise content in the data set, our measures were able to both improve the analyses and give causal explanations for some incongruities reported in analyses of rRNA sequences. However, misleading effects did not completely disappear. CONCLUSION: Analyses of data sets that result in ambiguous phylogenetic hypotheses demand methods that not only filter stochastic noise but also differentiate phylogenetic signal from systematic biases. Such methods can only rely on our findings regarding the evolution of the analyzed data. Analyses of independent data sets are then crucial to test the plausibility of the results. Our approach can easily be extended to genomic data as well, whereby layers of quality assessment are set up that are applicable to phylogenetic reconstructions in general.

    ESCO1 and CTCF enable formation of long chromatin loops by protecting cohesin-STAG1 from WAPL

    Eukaryotic genomes are folded into loops. It is thought that these are formed by cohesin complexes via extrusion, either until loop expansion is arrested by CTCF or until cohesin is removed from DNA by WAPL. Although WAPL limits cohesin's chromatin residence time to minutes, it has been reported that some loops exist for hours. How these loops can persist is unknown. We show that during G1-phase, mammalian cells contain acetylated cohesin-STAG1 which binds chromatin for hours, whereas cohesin-STAG2 binds chromatin for minutes. Our results indicate that CTCF and the acetyltransferase ESCO1 protect a subset of cohesin-STAG1 complexes from WAPL, thereby enabling formation of long and presumably long-lived loops, and that ESCO1, like CTCF, contributes to boundary formation in chromatin looping. Our data are consistent with a model of nested loop extrusion, in which acetylated cohesin-STAG1 forms stable loops between CTCF sites, demarcating the boundaries of more transient cohesin-STAG2 extrusion activity.

    Accurate and efficient reconstruction of deep phylogenies from structured RNAs

    Ribosomal RNA (rRNA) genes are probably the most frequently used data source in phylogenetic reconstruction. Individual columns of rRNA alignments are not independent as a consequence of their highly conserved secondary structures. Unless explicitly taken into account, these correlations can distort the phylogenetic signal and/or lead to gross overestimates of tree stability. Maximum likelihood and Bayesian approaches are of course amenable to using RNA-specific substitution models that treat conserved base pairs appropriately, but require accurate secondary structure models as input. So far, however, no accurate and easy-to-use tool has been available for computing structure-aware alignments and consensus structures that can deal with the large rRNAs. The RNAsalsa approach is designed to fill this gap. Capitalizing on the improved accuracy of pairwise consensus structures and informed by a priori knowledge of group-specific structural constraints, the tool provides both alignments and consensus structures that are of sufficient accuracy for routine phylogenetic analysis based on RNA-specific substitution models. The power of the approach is demonstrated using two rRNA data sets: a mitochondrial rRNA set of 26 Mammalia, and a collection of 28S nuclear rRNAs representative of the five major echinoderm groups.

    Multiple sequence alignments of partially coding nucleic acid sequences

    High quality sequence alignments of RNA and DNA sequences are an important prerequisite for the comparative analysis of genomic sequence data. Nucleic acid sequences, however, exhibit a much larger sequence heterogeneity compared to their encoded protein sequences due to the redundancy of the genetic code. It is desirable, therefore, to make use of the amino acid sequence when aligning coding nucleic acid sequences. In many cases, however, only a part of the sequence of interest is translated. On the other hand, overlapping reading frames may encode multiple alternative proteins, possibly with intermittent non-coding parts. Examples are, in particular, RNA virus genomes

    Central factors in external managerial recruitment (Centrala faktorer vid extern chefsrekrytering)

    An organization can engage an external recruitment consultant to take responsibility for the recruitment process when a new employee is to be hired. This often happens in connection with managerial recruitment. The aim of the study was to increase understanding of the central factors in external managerial recruitment and of which candidate factors external recruitment consultants judge to be important for the candidate to be presented to the client. The study found that the candidate's education, professional experience, and personal characteristics should match the requirement profile and the role description for the position. References, psychological tests, and personality questionnaires are used to confirm the candidate's own information and the recruitment consultant's overall assessment. They are also used as a basis for interviews with the candidate. The results of the study can be linked to previous studies and literature in the subject area. The study is useful for psychological studies in the field of managerial recruitment, for those interested in how external managerial recruitment works, and for those who can imagine a future in the recruitment industry.

    HiCognition: a visual exploration and hypothesis testing tool for 3D genomics

    The 3D organization of the genome and epigenetic marks play important roles in gene expression, DNA repair, and chromosome segregation. Understanding how structure and composition of the chromatin fiber contribute to function requires integrated analysis of multiple genomics datasets from various techniques, experimental conditions, and cell states. Genome browsers facilitate such analysis, yet currently visualize only a few regions at a time and lack statistical functions that are often necessary to extract meaningful information. Here, we present HiCognition, a visual exploration and machine-learning tool based on a new genomic region set concept, which enables detection of patterns and associations between 3D chromosome conformation and collections of 1D genomics profiles of any type. By revealing how transcriptional activity and cohesin subunit isoforms contribute to chromosome conformation, we showcase how the flexible user interface and machine learning tools of HiCognition can help understand the relationship between structure and function of the genome.

    HiCognition: a visual exploration and hypothesis testing tool for 3D genomics

    Genome browsers facilitate integrated analysis of multiple genomics datasets, yet visualize only a few regions at a time and lack statistical functions for extracting meaningful information. We present HiCognition, a visual exploration and machine-learning tool based on a new genomic region set concept, enabling detection of patterns and associations between 3D chromosome conformation and collections of 1D genomics profiles of any type. By revealing how transcription and cohesin subunit isoforms contribute to chromosome conformation, we showcase how the flexible user interface and machine learning tools of HiCognition help to understand the relationship between the structure and function of the genome.