538 research outputs found

    Protein Sequence Alignment Analysis by Local Covariation: Coevolution Statistics Detect Benchmark Alignment Errors

    The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High quality alignments are difficult to build and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criterion: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files

    Validation of differential gene expression algorithms: Application comparing fold-change estimation to hypothesis testing

    Background: Sustained research on the problem of determining which genes are differentially expressed on the basis of microarray data has yielded a plethora of statistical algorithms, each justified by theory, simulation, or ad hoc validation and yet differing in practical results from equally justified algorithms. Recently, a concordance method that measures agreement among gene lists has been introduced to assess various aspects of differential gene expression detection. This method has the advantage of basing its assessment solely on the results of real data analyses, but as it requires examining gene lists of given sizes, it may be unstable. Results: Two methodologies for assessing predictive error are described: a cross-validation method and a posterior predictive method. As a nonparametric method of estimating prediction error from observed expression levels, cross validation provides an empirical approach to assessing algorithms for detecting differential gene expression that is fully justified for large numbers of biological replicates. Because it leverages the knowledge that only a small portion of genes are differentially expressed, the posterior predictive method is expected to provide more reliable estimates of algorithm performance, allaying concerns about limited biological replication. In practice, the posterior predictive method can assess when its approximations are valid and when they are inaccurate. Under conditions in which its approximations are valid, it corroborates the results of cross validation. Both comparison methodologies are applicable to both single-channel and dual-channel microarrays. For the data sets considered, estimating prediction error by cross validation demonstrates that empirical Bayes methods based on hierarchical models tend to outperform algorithms based on selecting genes by their fold changes or by non-hierarchical model-selection criteria. (The latter two approaches have comparable performance.) The posterior predictive assessment corroborates these findings. Conclusions: Algorithms for detecting differential gene expression may be compared by estimating each algorithm's error in predicting expression ratios, whether such ratios are defined across microarray channels or between two independent groups. According to two distinct estimators of prediction error, algorithms using hierarchical models outperform the other algorithms of the study. The fact that fold-change shrinkage performed as well as conventional model selection criteria calls for investigating algorithms that combine the strengths of significance testing and fold-change estimation.
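
The cross-validation idea in this abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the function names, the fixed shrinkage weight, and the toy log-ratios are all assumptions made for illustration. Each replicate is held out in turn, the gene's log expression ratio is estimated from the remaining replicates, and the squared error against the held-out value is averaged.

```python
def mean(values):
    return sum(values) / len(values)


def raw_mean(train):
    """Unshrunken estimate: the average of the training log-ratios."""
    return mean(train)


def shrunken_mean(train, weight=0.5):
    """Hypothetical shrinkage estimate: pull the average toward zero.

    The fixed weight is an assumption; a hierarchical model would
    estimate the amount of shrinkage from the data instead.
    """
    return weight * mean(train)


def loo_prediction_error(log_ratios, estimator):
    """Leave-one-out mean squared prediction error over all genes."""
    errors = []
    for replicates in log_ratios.values():
        for i, held_out in enumerate(replicates):
            train = replicates[:i] + replicates[i + 1:]
            errors.append((estimator(train) - held_out) ** 2)
    return mean(errors)


if __name__ == "__main__":
    # Toy data (invented): log-ratios for three replicates per gene.
    toy = {"gene_a": [0.1, -0.2, 0.05], "gene_b": [1.9, 2.1, 2.0]}
    for name, estimator in [("raw", raw_mean), ("shrunken", shrunken_mean)]:
        print(name, loo_prediction_error(toy, estimator))
```

Whichever estimator yields the lower held-out error would be preferred under this criterion; with few replicates the estimate is noisy, which is the concern the posterior predictive method is said to address.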

    Identification of paediatric cancer patients with poor quality of life

    The primary objective was to describe predictors of physical, emotional and social quality of life (QoL) in children receiving active treatment for cancer. This Canadian multi-institutional cross-sectional study included children with cancer receiving any type of active treatment. The primary caregiver provided information on child physical, emotional and social QoL according to the PedsQL 4.0 Generic Core scales. Between November 2004 and February 2007, 376 families provided the data. In multiple regression, children with acute lymphoblastic leukemia had better physical health (OR: 0.37, 95% CI 0.23, 0.60; P<0.0001) while intensive chemotherapy treatment (OR: 2.34, 95% CI: 1.42, 3.85; P=0.0008) and having a sibling with a chronic condition (OR: 2.53, 95% CI: 1.54, 4.15; P=0.0002) were associated with poor physical QoL. Better emotional health was associated with good prognosis, less intensive chemotherapy treatment and greater household savings, whereas female children and those with a sibling with a chronic condition had poor social QoL. Physical, emotional and social QoL are influenced by demographic, diagnostic and treatment variables. Sibling and household characteristics are associated with QoL. This information will help to identify children at higher risk of poor QoL during treatment for cancer

    Identifying and Seeing beyond Multiple Sequence Alignment Errors Using Intra-Molecular Protein Covariation

    BACKGROUND: There is currently no way to verify the quality of a multiple sequence alignment that is independent of the assumptions used to build it. Sequence alignments are typically evaluated by a number of established criteria: sequence conservation, the number of aligned residues, the frequency of gaps, and the probable correct gap placement. Covariation analysis is used to find putatively important residue pairs in a sequence alignment. Different alignments of the same protein family give different results demonstrating that covariation depends on the quality of the sequence alignment. We thus hypothesized that current criteria are insufficient to build alignments for use with covariation analyses. METHODOLOGY/PRINCIPAL FINDINGS: We show that current criteria are insufficient to build alignments for use with covariation analyses as systematic sequence alignment errors are present even in hand-curated structure-based alignment datasets like those from the Conserved Domain Database. We show that current non-parametric covariation statistics are sensitive to sequence misalignments and that this sensitivity can be used to identify systematic alignment errors. We demonstrate that removing alignment errors due to 1) improper structure alignment, 2) the presence of paralogous sequences, and 3) partial or otherwise erroneous sequences, improves contact prediction by covariation analysis. Finally we describe two non-parametric covariation statistics that are less sensitive to sequence alignment errors than those described previously in the literature. CONCLUSIONS/SIGNIFICANCE: Protein alignments with errors lead to false positive and false negative conclusions (incorrect assignment of covariation and conservation, respectively). Covariation analysis can provide a verification step, independent of traditional criteria, to identify systematic misalignments in protein alignments. 
Two non-parametric statistics are shown to be somewhat insensitive to misalignment errors, providing increased confidence in contact prediction when analyzing alignments with erroneous regions because they emphasize pairwise covariation over group covariation
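
For readers unfamiliar with covariation statistics, the following minimal sketch computes mutual information between two alignment columns. Mutual information is a generic covariation measure used here only as an illustration; it is not claimed to be one of the statistics described in the abstract, and the toy alignment and function names are assumptions.

```python
from collections import Counter
from math import log2


def column(alignment, index):
    """Return the residues found at one position of every sequence."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), count in freq_ab.items():
        p_joint = count / n
        p_independent = (freq_a[a] / n) * (freq_b[b] / n)
        mi += p_joint * log2(p_joint / p_independent)
    return mi


if __name__ == "__main__":
    # Toy alignment (invented): columns 0 and 1 covary perfectly,
    # column 2 is fully conserved, column 3 varies independently of 0.
    alignment = ["ACDE", "ACDF", "GHDE", "GHDF"]
    print(mutual_information(column(alignment, 0), column(alignment, 1)))
    print(mutual_information(column(alignment, 0), column(alignment, 2)))
    print(mutual_information(column(alignment, 0), column(alignment, 3)))
```

A fully conserved column carries no covariation signal, which is consistent with the abstract's point that covariation offers a check that is independent of conservation.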

    Transat—A Method for Detecting the Conserved Helices of Functional RNA Structures, Including Transient, Pseudo-Knotted and Alternative Structures

    The prediction of functional RNA structures has attracted increased interest, as it allows us to study the potential functional roles of many genes. RNA structure prediction methods, however, assume that there is a unique functional RNA structure and also do not predict functional features required for in vivo folding. In order to understand how functional RNA structures form in vivo, we require sophisticated experiments or reliable prediction methods. So far, there exist only a few experimentally validated transient RNA structures. On the computational side, there exist several computer programs which aim to predict the co-transcriptional folding pathway in vivo, but these make a range of simplifying assumptions and do not capture all features known to influence RNA folding in vivo. We want to investigate whether evolutionarily related RNA genes fold in a similar way in vivo. To this end, we have developed a new computational method, Transat, which detects conserved helices of high statistical significance. We introduce the method, present a comprehensive performance evaluation and show that Transat is able to predict the structural features of known reference structures including pseudo-knotted ones as well as those of known alternative structural configurations. Transat can also identify unstructured sub-sequences bound by other molecules and provides evidence for new helices which may define folding pathways, supporting the notion that homologous RNA sequences not only assume a similar reference RNA structure, but also fold similarly. Finally, we show that the structural features predicted by Transat differ from those assuming thermodynamic equilibrium. Unlike the existing methods for predicting folding pathways, our method works in a comparative way. 
This has the disadvantage of not being able to predict features as a function of time, but has the considerable advantage of highlighting conserved features and of not requiring a detailed knowledge of the cellular environment

    Continuous-time modeling of cell fate determination in Arabidopsis flowers

    Background: The genetic control of floral organ specification is currently being investigated by various approaches, both experimentally and through modeling. Models and simulations have mostly involved Boolean or related methods, and so far a quantitative, continuous-time approach has not been explored. Results: We propose an ordinary differential equation (ODE) model that describes the gene expression dynamics of a gene regulatory network that controls floral organ formation in the model plant Arabidopsis thaliana. In this model, the dimerization of MADS-box transcription factors is incorporated explicitly. The unknown parameters are estimated from (known) experimental expression data. The model is validated by simulation studies of known mutant plants. Conclusions: The proposed model gives realistic predictions with respect to independent mutation data. A simulation study is carried out to predict the effects of a new type of mutation that has so far not been made in Arabidopsis, but that could be used as a severe test of the validity of the model. According to our predictions, the role of dimers is surprisingly important. Moreover, the functional loss of any dimer leads to one or more phenotypic alterations.
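
As a rough illustration of what "continuous-time model with explicit dimerization" can mean, the toy sketch below integrates a three-variable ODE (two monomers and their dimer) with forward Euler. The equations, rate constants, and the `simulate` function are assumptions chosen for illustration; they are not the published Arabidopsis model or its fitted parameters.

```python
def simulate(prod_a=1.0, prod_b=1.0, k_on=1.0, k_off=0.1, k_deg=0.1,
             dt=0.01, t_end=100.0):
    """Forward-Euler integration of a toy monomer/dimer system.

    a, b: monomer concentrations; d: concentration of the a-b dimer.
    All rate constants are hypothetical placeholders.
    """
    a = b = d = 0.0
    for _ in range(int(round(t_end / dt))):
        binding = k_on * a * b      # monomers associating into the dimer
        unbinding = k_off * d       # dimer dissociating into monomers
        da = prod_a - k_deg * a - binding + unbinding
        db = prod_b - k_deg * b - binding + unbinding
        dd = binding - unbinding - k_deg * d
        a, b, d = a + dt * da, b + dt * db, d + dt * dd
    return a, b, d


if __name__ == "__main__":
    print("wild type:", simulate())
    # A loss-of-function mutant is mimicked by switching off production
    # of one monomer, which removes the dimer entirely in this toy system.
    print("mutant   :", simulate(prod_b=0.0))
```

Setting a production rate to zero is one simple way to imitate a mutant in such a model; in this toy system it eliminates the dimer, loosely echoing the abstract's remark that losing any dimer changes the phenotype.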

    Identification and Characterization of a Mef2 Transcriptional Activator in Schistosome Parasites

    Myocyte enhancer factor 2 protein (Mef2) is an evolutionarily conserved activator of transcription that is critical to induce and control complex processes in myogenesis and neurogenesis in vertebrates and insects, and osteogenesis in vertebrates. In Drosophila, Mef2 null mutants are unable to produce differentiated muscle cells, and in vertebrates, Mef2 mutants are embryonic lethal. Schistosome worms are responsible for over 200 million cases of schistosomiasis globally, but little is known about early development of schistosome parasites after infecting a vertebrate host. Understanding basic schistosome development could be crucial to delineating potential drug targets. Here, we identify and characterize Mef2 from the schistosome worm Schistosoma mansoni (SmMef2). We initially identified SmMef2 as a homolog to the yeast Mef2 homolog, Resistance to Lethality of MKK1P386 overexpression (Rlm1), and we show that SmMef2 is homologous to conserved Mef2 family proteins. Using a genetics approach, we demonstrate that SmMef2 is a transactivator that can induce transcription of four separate heterologous reporter genes by yeast one-hybrid analysis. We also show that Mef2 is expressed during several stages of schistosome development by quantitative PCR and that it can bind to conserved Mef2 DNA consensus binding sequences

    A Regulatory Network for Coordinated Flower Maturation

    For self-pollinating plants to reproduce, male and female organ development must be coordinated as flowers mature. The Arabidopsis transcription factors AUXIN RESPONSE FACTOR 6 (ARF6) and ARF8 regulate this complex process by promoting petal expansion, stamen filament elongation, anther dehiscence, and gynoecium maturation, thereby ensuring that pollen released from the anthers is deposited on the stigma of a receptive gynoecium. ARF6 and ARF8 induce jasmonate production, which in turn triggers expression of MYB21 and MYB24, encoding R2R3 MYB transcription factors that promote petal and stamen growth. To understand the dynamics of this flower maturation regulatory network, we have characterized morphological, chemical, and global gene expression phenotypes of arf, myb, and jasmonate pathway mutant flowers. We found that MYB21 and MYB24 promoted not only petal and stamen development but also gynoecium growth. As well as regulating reproductive competence, both the ARF and MYB factors promoted nectary development or function and volatile sesquiterpene production, which may attract insect pollinators and/or repel pathogens. Mutants lacking jasmonate synthesis or response had decreased MYB21 expression and stamen and petal growth at the stage when flowers normally open, but had increased MYB21 expression in petals of older flowers, resulting in renewed and persistent petal expansion at later stages. Both auxin response and jasmonate synthesis promoted positive feedbacks that may ensure rapid petal and stamen growth as flowers open. MYB21 also fed back negatively on expression of jasmonate biosynthesis pathway genes to decrease flower jasmonate level, which correlated with termination of growth after flowers have opened. These dynamic feedbacks may promote timely, coordinated, and transient growth of flower organs

    An Observational Cohort Study of the Kynurenine to Tryptophan Ratio in Sepsis: Association with Impaired Immune and Microvascular Function

    Both endothelial and immune dysfunction contribute to the high mortality rate in human sepsis, but the underlying mechanisms are unclear. In response to infection, interferon-γ activates indoleamine 2,3-dioxygenase (IDO) which metabolizes the essential amino acid tryptophan to the toxic metabolite kynurenine. IDO can be expressed in endothelial cells, hepatocytes and mononuclear leukocytes, all of which contribute to sepsis pathophysiology. Increased IDO activity (measured by the kynurenine to tryptophan [KT] ratio in plasma) causes T-cell apoptosis, vasodilation and nitric oxide synthase inhibition. We hypothesized that IDO activity in sepsis would be related to plasma interferon-γ, interleukin-10, T cell lymphopenia and impairment of microvascular reactivity, a measure of endothelial nitric oxide bioavailability. In an observational cohort study of 80 sepsis patients (50 severe and 30 non-severe) and 40 hospital controls, we determined the relationship between IDO activity (plasma KT ratio) and selected plasma cytokines, sepsis severity, nitric oxide-dependent microvascular reactivity and lymphocyte subsets in sepsis. Plasma amino acids were measured by high performance liquid chromatography and microvascular reactivity by peripheral arterial tonometry. The plasma KT ratio was increased in sepsis (median 141 [IQR 64–235]) compared to controls (36 [28–52]; p<0.0001), and correlated with plasma interferon-γ and interleukin-10, and inversely with total lymphocyte count, CD8+ and CD4+ T-lymphocytes, systolic blood pressure and microvascular reactivity. In response to treatment of severe sepsis, the median KT ratio decreased from 162 [IQR 100–286] on day 0 to 89 [65–139] by day 7 (p = 0.0006), and this decrease in KT ratio correlated with a decrease in the Sequential Organ Failure Assessment score (p<0.0001). 
IDO-mediated tryptophan catabolism is associated with dysregulated immune responses and impaired microvascular reactivity in sepsis and may link these two fundamental processes in sepsis pathophysiology

    The wonders of flap endonucleases: structure, function, mechanism and regulation.

    Processing of Okazaki fragments to complete lagging strand DNA synthesis requires coordination among several proteins. RNA primers and DNA synthesised by DNA polymerase α are displaced by DNA polymerase δ to create bifurcated nucleic acid structures known as 5'-flaps. These 5'-flaps are removed by Flap Endonuclease 1 (FEN), a structure-specific nuclease whose divalent metal ion-dependent phosphodiesterase activity cleaves 5'-flaps with exquisite specificity. FENs are paradigms for the 5' nuclease superfamily, whose members perform a wide variety of roles in nucleic acid metabolism using a similar nuclease core domain that displays common biochemical properties and structural features. A detailed review of FEN structure is undertaken to show how DNA substrate recognition occurs and how FEN achieves cleavage at a single phosphate diester. A proposed double nucleotide unpairing trap (DoNUT) is discussed with regard to FEN and has relevance to the wider 5' nuclease superfamily. The homotrimeric proliferating cell nuclear antigen protein (PCNA) coordinates the actions of DNA polymerase, FEN and DNA ligase by facilitating the hand-off of intermediates between each protein during Okazaki fragment maturation to maximise through-put and minimise consequences of intermediates being released into the wider cellular environment. FEN has numerous partner proteins that modulate and control its action during DNA replication and is also controlled by several post-translational modification events, all acting in concert to maintain precise and appropriate cleavage of Okazaki fragment intermediates during DNA replication