    Combinatorial motif analysis in yeast gene promoters: the benefits of a biological consideration of motifs

    There are three main categories of algorithms for identifying small transcription regulatory sequences in the promoters of genes: phylogenetic comparison, expectation maximization, and combinatorial. For convenience, the combinatorial methods typically define motifs in terms of a canonical sequence and a set of sequences that have a small number of differences compared to the canonical sequence. Such motifs are referred to as (l, d)-motifs, where l is the length of the motif and d indicates how many mismatches are allowed between an instance of the motif and the canonical motif sequence. There are limits to the complexity of the patterns of motifs that can be found by combinatorial methods. For some values of l and d, there will exist many sets of random words in a cluster of gene promoters that appear to form an (l, d)-motif. For these motifs, it will be impossible to distinguish biological motifs from randomly generated motifs. A better formalization of motifs is the (l, f, d)-motif, which is derived from a biological consideration of motifs. The motivation for (l, f, d)-motifs comes from an examination of known transcription factor binding sites, where typically a few positions in the motif are invariant. It is shown that there exist (l, f, d)-motifs that can be found in the promoters of gene clusters that would not be recognizable from random sequences if they were described as (l, d)-motifs. The inclusion of the f-value in the definition of motifs suggests that the sequence space that is occupied by a motif will consist of several clusters of closely related sequences. An algorithm, CM, has been developed that identifies small sets of overabundant sequences in the promoters from a cluster of genes and then combines these simple sets of sequences to form complex (l, f, d)-motif models. A dataset from a yeast gene expression experiment is analyzed with CM. Known biological motifs and novel motifs are identified by CM. The performance of CM is compared to that of a popular expectation maximization algorithm, AlignACE, and to that from a simple combinatorial motif finding program.
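
The (l, d)-motif definition in the abstract above can be illustrated with a minimal sketch. It assumes that "mismatches" means Hamming distance between equal-length words. The function names `hamming` and `find_instances` and the example sequences are hypothetical. This is not the CM algorithm, and it does not model the f-value of (l, f, d)-motifs.

```python
from typing import List, Tuple


def hamming(a: str, b: str) -> int:
    """Count positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("words must have the same length")
    return sum(x != y for x, y in zip(a, b))


def find_instances(promoter: str, canonical: str, d: int) -> List[Tuple[int, str]]:
    """Return (offset, word) for every l-mer in `promoter` that lies within
    d mismatches of `canonical`, where l = len(canonical)."""
    l = len(canonical)
    hits = []
    for i in range(len(promoter) - l + 1):
        word = promoter[i:i + l]
        if hamming(word, canonical) <= d:
            hits.append((i, word))
    return hits


# Toy (4, 1)-motif search on a made-up sequence.
print(find_instances("TTACGTAAACCTGG", "ACGT", 1))
```

With small l and large d, many windows of a random sequence pass this test, which is the ambiguity the abstract describes.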

    Quantum walks on circles in phase space via superconducting circuit quantum electrodynamics

    We show how a quantum walk can be implemented for the first time in a quantum quincunx created via superconducting circuit quantum electrodynamics (QED), and how interpolation from quantum to random walk is implemented by controllable decoherence using a two resonator system. Direct control over the coin qubit is difficult to achieve in either cavity or circuit QED, but we show that a Hadamard coin flip can be effected via direct driving of the cavity, with the result that the walker jumps between circles in phase space but still exhibits quantum walk behavior over 15 steps. Comment: 8 pages, 4 figures, 2 tables

    COMPUTATIONAL RESOURCES FOR BIOFUEL FEEDSTOCK SPECIES

    While current production of ethanol as a biofuel relies on starch and sugar inputs, it is anticipated that sustainable production of ethanol for biofuel use will utilize lignocellulosic feedstocks. Candidate plant species to be used for lignocellulosic ethanol production include a large number of species within the Grass, Pine and Birch plant families. For these biofuel feedstock species, there are variable amounts of genome sequence resources available, ranging from complete genome sequences (e.g. sorghum, poplar) to transcriptome data sets (e.g. switchgrass, pine). These data sets are not only dispersed in location but also disparate in content. It will be essential to leverage and improve these genomic data sets for the improvement of biofuel feedstock production. The objectives of this project were to provide computational tools and resources for data-mining genome sequence/annotation and large-scale functional genomic datasets available for biofuel feedstock species. We have created a Bioenergy Feedstock Genomics Resource that provides a web-based portal or "clearing house" for genomic data for plant species relevant to biofuel feedstock production. Sequence data from a total of 54 plant species are included in the Bioenergy Feedstock Genomics Resource including model plant species that permit leveraging of knowledge across taxa to biofuel feedstock species. We have generated additional computational analyses of these data, including uniform annotation, to facilitate genomic approaches to improved biofuel feedstock production. These data have been centralized in the publicly available Bioenergy Feedstock Genomics Resource (http://bfgr.plantbiology.msu.edu/).

    Expression Profiling of Cucumis sativus in Response to Infection by Pseudoperonospora cubensis

    The oomycete pathogen, Pseudoperonospora cubensis, is the causal agent of downy mildew on cucurbits, and at present, no effective resistance to this pathogen is available in cultivated cucumber (Cucumis sativus). To better understand the host response to a virulent pathogen, we performed expression profiling throughout a time course of a compatible interaction using whole transcriptome sequencing. As described herein, we were able to detect the expression of 15,286 cucumber genes, of which 14,476 were expressed throughout the infection process from 1 day post-inoculation (dpi) to 8 dpi. A large number of genes, 1,612 to 3,286, were differentially expressed in pair-wise comparisons between time points. We observed the rapid induction of key defense related genes, including catalases, chitinases, lipoxygenases, peroxidases, and protease inhibitors within 1 dpi, suggesting detection of the pathogen by the host. Co-expression network analyses revealed transcriptional networks with distinct patterns of expression, including down-regulation at 2 dpi of known defense response genes, suggesting coordinated suppression of host responses by the pathogen. Comparative analyses of cucumber gene expression patterns with those of orthologous Arabidopsis thaliana genes following challenge with Hyaloperonospora arabidopsidis revealed correlated expression patterns of single copy orthologs, suggesting that these two dicot hosts have similar transcriptional responses to related pathogens. In total, the work described herein presents an in-depth analysis of the interplay between host susceptibility and pathogen virulence in an agriculturally important pathosystem.

    Genetic Regulation of Development in Sorghum bicolor

    Computational and transcriptional evidence for microRNAs in the honey bee genome

    A total of 68 non-redundant candidate honey bee miRNAs were identified computationally; several of them appear to have previously unrecognized orthologs in the Drosophila genome. Several miRNAs showed caste- or age-related differences in transcript abundance and are likely to be involved in regulating honey bee development.

    The TIGR Rice Genome Annotation Resource: improvements and new features

    In The Institute for Genomic Research Rice Genome Annotation project (), we have continued to update the rice genome sequence with new data and improve the quality of the annotation. In our current release of annotation (Release 4.0; January 12, 2006), we have identified 42 653 non-transposable element-related genes encoding 49 472 gene models as a result of the detection of alternative splicing. We have refined our identification methods for transposable element-related genes, resulting in 13 237 genes that are related to transposable elements. Through incorporation of multiple transcript and proteomic expression data sets, we have been able to annotate 24 799 genes (31 739 gene models), representing ∼50% of the total gene models, as expressed in the rice genome. All structural and functional annotation is viewable through our Rice Genome Browser, which currently supports 59 tracks. Enhanced data access is available through web interfaces, FTP downloads and a Data Extractor tool developed in order to support discrete dataset downloads.

    Nucleotide polymorphism and copy number variant detection using exome capture and next-generation sequencing in the polyploid grass Panicum virgatum

    Switchgrass (Panicum virgatum) is a polyploid, outcrossing grass species native to North America and has recently been recognized as a potential biofuel feedstock crop. Significant phenotypic variation including ploidy is present across the two primary ecotypes of switchgrass, referred to as upland and lowland switchgrass. The tetraploid switchgrass genome is approximately 1400 Mbp, split between two subgenomes, with significant repetitive sequence content limiting the efficiency of re-sequencing approaches for determining genome diversity. To characterize genetic diversity in upland and lowland switchgrass as a first step in linking genotype to phenotype, we designed an exome capture probe set based on transcript assemblies that represent approximately 50 Mb of annotated switchgrass exome sequences. We then evaluated and optimized the probe set using solid phase comparative genome hybridization and liquid phase exome capture followed by next-generation sequencing. Using the optimized probe set, we assessed variation in the exomes of eight switchgrass genotypes representing tetraploid lowland and octoploid upland cultivars to benchmark our exome capture probe set design. We identified ample variation in the switchgrass genome including 1 395 501 single nucleotide polymorphisms (SNPs), 8173 putative copy number variants and 3336 presence/absence variants. While the majority of the SNPs detected (84%) were bi-allelic, a substantial number were tri-allelic, with limited occurrence of tetra-allelic polymorphisms, consistent with the heterozygous and polyploid nature of the switchgrass genome. Collectively, these data demonstrate the efficacy of exome capture for discovery of genome variation in a polyploid species with a large, repetitive and heterozygous genome.