88 research outputs found

    Normalized Affymetrix expression data are biased by G-quadruplex formation

    Probes with runs of four or more guanines (G-stacks) in their sequences can exhibit a level of hybridization that is unrelated to the expression levels of the mRNA that they are intended to measure. This is most likely caused by the formation of G-quadruplexes, which probes with G-stacks are capable of forming and in which inter-probe guanines form Hoogsteen hydrogen bonds. We demonstrate that for a specific microarray data set using the Human HG-U133A Affymetrix GeneChip and RMA normalization there is significant bias in the expression levels, the fold change and the correlations between expression levels. These effects grow more pronounced as the number of G-stack probes in a probe set increases. Approximately 14% of the probe sets are directly affected. The analysis was repeated for a number of other normalization pipelines and two, FARMS and PLIER, minimized the bias to some extent. We estimate that ∼15% of the data sets deposited in the GEO database are susceptible to the effect. The inclusion of G-stack probes in the affected data sets can bias key parameters used in the selection and clustering of genes. The benefit of eliminating these probes from any analysis of such affected data sets outweighs the resulting increase of noise in the signal. © 2011 The Author(s)
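
    As a rough illustration of the G-stack criterion stated in the abstract (a run of four or more guanines), the sketch below flags and filters such probes. The function names and the example probe sequences are hypothetical and are not taken from the paper or from any Affymetrix annotation file.

```python
import re

# A G-stack, as defined in the abstract: four or more consecutive guanines.
G_STACK = re.compile(r"G{4,}")


def has_g_stack(probe_seq: str) -> bool:
    """Return True if the probe sequence contains a run of >= 4 guanines."""
    return G_STACK.search(probe_seq.upper()) is not None


def count_g_stack_probes(probe_set):
    """Count how many probes in a probe set contain a G-stack."""
    return sum(has_g_stack(seq) for seq in probe_set)


def drop_g_stack_probes(probe_set):
    """Return the probe set with all G-stack probes removed."""
    return [seq for seq in probe_set if not has_g_stack(seq)]


# Hypothetical probe sequences, for illustration only.
example_probe_set = [
    "ATCGGGGTACCTAGCTAGGATCCAT",  # contains GGGG, so it is flagged
    "ATCGGGTACCTAGCTAGGATCCATA",  # longest guanine run is GGG, so it is kept
    "ggggATCTACCTAGCTAGATCCATA",  # lowercase input is handled as well
]

print(count_g_stack_probes(example_probe_set))
print(drop_g_stack_probes(example_probe_set))
```

    Filtering at the sequence level like this would happen before summarization, so that the remaining probes in each probe set are the only ones passed to the normalization pipeline.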

    Visualization of the protein-coding regions with a self adaptive spectral rotation approach

    Identifying protein-coding regions in DNA sequences is an active issue in computational biology. In this study, we present a self adaptive spectral rotation (SASR) approach, which visualizes coding regions in DNA sequences, based on investigation of the Triplet Periodicity property, without any preceding training process. It is proposed to help with rough prediction of coding regions when there is no extra information for the training required by other outstanding methods. In this approach, at each position in the DNA sequence, a Fourier spectrum is calculated from the posterior subsequence. Following the spectra, a random walk in the complex plane is generated as the SASR's graphic output. Applications of the SASR on real DNA data show that patterns in the graphic output reveal locations of the coding regions and the frame shifts between them: arcs indicate coding regions, stable points indicate non-coding regions and the shapes of corners reveal frame shifts. Tests on a genomic data set from Saccharomyces cerevisiae reveal that the graphic patterns for coding and non-coding regions differ to a great extent, so that the coding regions can be visually distinguished. Meanwhile, a time cost test shows that the SASR can be easily implemented with a computational complexity of O(N).
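
    The Triplet Periodicity property mentioned in the abstract can be illustrated with a minimal sketch that measures the Fourier component at frequency 1/3 of a subsequence. This is not the SASR algorithm itself, only the generic period-3 spectral measure it builds on; the function name and the use of four base-indicator sequences are assumptions made for illustration.

```python
import cmath


def period3_power(seq: str) -> float:
    """Sum over A, C, G, T of the squared magnitude of the Fourier
    component at frequency 1/3 of each base's indicator sequence."""
    seq = seq.upper()
    total = 0.0
    for base in "ACGT":
        # Fourier component at frequency 1/3 for positions holding this base.
        component = sum(
            cmath.exp(-2j * cmath.pi * n / 3)
            for n, symbol in enumerate(seq)
            if symbol == base
        )
        total += abs(component) ** 2
    return total


# A strictly 3-periodic toy sequence gives a strong period-3 signal,
# while a homopolymer of the same length gives essentially none.
print(period3_power("ATG" * 10))
print(period3_power("A" * 30))
```

    Evaluating such a measure over a window that slides along the sequence gives one value per position, which is the kind of position-wise spectral information the abstract describes.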

    RNA secondary structure prediction from multi-aligned sequences

    It has been well accepted that the RNA secondary structures of most functional non-coding RNAs (ncRNAs) are closely related to their functions and are conserved during evolution. Hence, prediction of conserved secondary structures from evolutionarily related sequences is one important task in RNA bioinformatics; the methods are useful not only to further functional analyses of ncRNAs but also to improve the accuracy of secondary structure predictions and to find novel functional RNAs from the genome. In this review, I focus on common secondary structure prediction from a given aligned RNA sequence, in which one secondary structure whose length is equal to that of the input alignment is predicted. I systematically review and classify existing tools and algorithms for the problem, by utilizing the information employed in the tools and by adopting a unified viewpoint based on maximum expected gain (MEG) estimators. I believe that this classification will allow a deeper understanding of each tool and provide users with useful information for selecting tools for common secondary structure predictions. Comment: A preprint of an invited review manuscript that will be published in a chapter of the book 'Methods in Molecular Biology'. Note that this version of the manuscript may differ from the published version.

    Predictors of Pulmonary Function Response to Treatment with Salmeterol/fluticasone in Patients with Chronic Obstructive Pulmonary Disease

    Chronic obstructive pulmonary disease (COPD) is a heterogeneous disease and responses to therapies are highly variable. The aim of this study was to identify the predictors of pulmonary function response to 3 months of treatment with salmeterol/fluticasone in patients with COPD. A total of 127 patients with stable COPD from the Korean Obstructive Lung Disease (KOLD) Cohort, who were prospectively recruited from June 2005 to September 2009, were analyzed retrospectively. The prediction models for the FEV1, FVC and IC/TLC changes after 3 months of treatment with salmeterol/fluticasone were constructed by using multiple, stepwise, linear regression analysis. The prediction model for the FEV1 change after 3 months of treatment included wheezing history, pre-bronchodilator FEV1, post-bronchodilator FEV1 change and emphysema extent on CT (R = 0.578). The prediction model for the FVC change after 3 months of treatment included pre-bronchodilator FVC and post-bronchodilator FVC change (R = 0.533), and that for the IC/TLC change after 3 months of treatment included pre-bronchodilator IC/TLC and post-bronchodilator FEV1 change (R = 0.401). Wheezing history, pre-bronchodilator pulmonary function, bronchodilator responsiveness, and emphysema extent may be used for predicting the pulmonary function response to 3 months of treatment with salmeterol/fluticasone in patients with COPD.

    Senecio conrathii N.E.Br. (Asteraceae), a New Hyperaccumulator of Nickel from Serpentinite Outcrops of the Barberton Greenstone Belt, South Africa

    Five nickel hyperaccumulators belonging to the Asteraceae are known from ultramafic outcrops in South Africa. Phytoremediation applications of the known hyperaccumulators in the Asteraceae, such as the indigenous Berkheya coddii Roessler, are well reported and necessitate further exploration to find additional species with such traits. This study targeted the most frequently occurring species of the Asteraceae on eight randomly selected serpentinite outcrops of the Barberton Greenstone Belt. Twenty species were sampled, including 12 that were tested for nickel accumulation for the first time. Although the majority of the species were excluders, the known hyperaccumulators Berkheya nivea N.E.Br. and B. zeyheri (Sond. & Harv.) Oliv. & Hiern subsp. rehmannii (Thell.) Roessler var. rogersiana (Thell.) Roessler hyperaccumulated nickel in the leaves at expected levels. A new hyperaccumulator of nickel was discovered, Senecio conrathii N.E.Br., which accumulated the element in its leaves at 1695 ± 637 µg g−1 on soil with a total and exchangeable nickel content of 503 mg kg−1 and 0.095 µg g−1, respectively. This makes it the third known species in the Senecioneae of South Africa to hyperaccumulate nickel after Senecio anomalochrous Hilliard and Senecio coronatus (Thunb.) Harv., although it is a weak accumulator compared with the latter. Seven tribes in the Asteraceae have now been screened for hyperaccumulation in South Africa, with hyperaccumulators only recorded for the Arctoteae and Senecioneae. This suggests that further exploration for hyperaccumulators should focus on these tribes, as they comprise all six species (of 68 Asteraceae taxa screened thus far) that hyperaccumulate nickel.

    Stacking of G-quadruplexes: NMR structure of a G-rich oligonucleotide with potential anti-HIV and anticancer activity

    G-rich oligonucleotides T30695 (or T30923), with the sequence of (GGGT)4, and T40214, with the sequence of (GGGC)4, have been reported to exhibit anti-HIV and anticancer activity. Here we report on the structure of a dimeric G-quadruplex adopted by a derivative of these sequences in K+ solution. It comprises two identical propeller-type parallel-stranded G-quadruplex subunits each containing three G-tetrad layers that are stacked via the 5′-5′ interface. We demonstrated control over the stacking of the two monomeric subunits by sequence modifications. Our analysis of possible structures at the stacking interface provides a general principle for stacking of G-quadruplexes, which could have implications for the assembly and recognition of higher-order G-quadruplex structures

    Domain architecture evolution of pattern-recognition receptors

    In animals, the innate immune system is the first line of defense against invading microorganisms, and the pattern-recognition receptors (PRRs) are the key components of this system, detecting microbial invasion and initiating innate immune defenses. Two families of PRRs, the intracellular NOD-like receptors (NLRs) and the transmembrane Toll-like receptors (TLRs), are of particular interest because of their roles in a number of diseases. Understanding the evolutionary history of these families and their pattern of evolutionary changes may lead to new insights into the functioning of this critical system. We found that the evolution of both NLR and TLR families included massive species-specific expansions and domain shuffling in various lineages, which resulted in the same domain architectures evolving independently within different lineages in a process that fits the definition of parallel evolution. This observation illustrates both the dynamics of the innate immune system and the effects of “combinatorially constrained” evolution, where existence of the limited numbers of functionally relevant domains constrains the choices of domain architectures for new members in the family, resulting in the emergence of independently evolved proteins with identical domain architectures, often mistaken for orthologs

    Transcriptomic profiling of host-parasite interactions in the microsporidian Trachipleistophora hominis

    BACKGROUND: Trachipleistophora hominis was isolated from an HIV/AIDS patient and is a member of a highly successful group of obligate intracellular parasites. METHODS: Here we have investigated the evolution of the parasite and the interplay between host and parasite gene expression using transcriptomics of T. hominis-infected rabbit kidney cells. RESULTS: T. hominis has about 30 % more genes than small-genome microsporidians. Highly expressed genes include those involved in growth, replication, defence against oxidative stress, and a large fraction of uncharacterised genes. Chaperones are also highly expressed and may buffer the deleterious effects of the large number of non-synonymous mutations observed in essential T. hominis genes. Host expression suggests a general cellular shutdown upon infection, but ATP, amino sugar and nucleotide sugar production appear enhanced, potentially providing the parasite with substrates it cannot make itself. Expression divergence of duplicated genes, including transporters used to acquire host metabolites, demonstrates ongoing functional diversification during microsporidian evolution. We identified overlapping transcription at more than 100 loci in the sparse T. hominis genome, demonstrating that this feature is not caused by genome compaction. The detection of additional transposons of insect origin strongly suggests that the natural host for T. hominis is an insect. CONCLUSIONS: Our results reveal that the evolution of contemporary microsporidian genomes is highly dynamic and innovative. Moreover, highly expressed T. hominis genes of unknown function include a cohort that are shared among all microsporidians, indicating that some strongly conserved features of the biology of these enormously successful parasites remain uncharacterised. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12864-015-1989-z) contains supplementary material, which is available to authorized users

    One thousand plant transcriptomes and the phylogenomics of green plants

    Green plants (Viridiplantae) include around 450,000–500,000 species of great diversity and have important roles in terrestrial and aquatic ecosystems. Here, as part of the One Thousand Plant Transcriptomes Initiative, we sequenced the vegetative transcriptomes of 1,124 species that span the diversity of plants in a broad sense (Archaeplastida), including green plants (Viridiplantae), glaucophytes (Glaucophyta) and red algae (Rhodophyta). Our analysis provides a robust phylogenomic framework for examining the evolution of green plants. Most inferred species relationships are well supported across multiple species tree and supermatrix analyses, but discordance among plastid and nuclear gene trees at a few important nodes highlights the complexity of plant genome evolution, including polyploidy, periods of rapid speciation, and extinction. Incomplete sorting of ancestral variation, polyploidization and massive expansions of gene families punctuate the evolutionary history of green plants. Notably, we find that large expansions of gene families preceded the origins of green plants, land plants and vascular plants, whereas whole-genome duplications are inferred to have occurred repeatedly throughout the evolution of flowering plants and ferns. The increasing availability of high-quality plant genome sequences and advances in functional genomics are enabling research on genome evolution across the green tree of life.