98 research outputs found

    4DXpress: a database for cross-species expression pattern comparisons

    In the major animal model species like mouse, fish or fly, detailed spatial information on gene expression over time can be acquired through whole mount in situ hybridization experiments. In these species, expression patterns of many genes have been studied and data has been integrated into dedicated model organism databases like ZFIN for zebrafish, MEPD for medaka, BDGP for Drosophila or GXD for mouse. However, a central repository that allows users to query and compare gene expression patterns across different species has not yet been established. Therefore, we have integrated expression patterns for zebrafish, Drosophila, medaka and mouse into a central public repository called 4DXpress (expression database in four dimensions). Users can query anatomy ontology-based expression annotations across species and quickly jump from one gene to the orthologues in other species. Genes are linked to public microarray data in ArrayExpress. We have mapped developmental stages between the species to be able to compare developmental time phases. We store the largest collection of gene expression patterns available to date in an individual resource, reflecting 16 505 annotated genes. 4DXpress will be an invaluable tool for developmental as well as for computational biologists interested in gene regulation and evolution. 4DXpress is available at http://ani.embl.de/4DXpress

    An intuitionistic approach to scoring DNA sequences against transcription factor binding site motifs

    Background: Transcription factors (TFs) control transcription by binding to specific regions of DNA called transcription factor binding sites (TFBSs). The identification of TFBSs is a crucial problem in computational biology and includes the subtask of predicting the location of known TFBS motifs in a given DNA sequence. It has previously been shown that, when scoring matches to known TFBS motifs, interdependencies between positions within a motif should be taken into account. However, this remains a challenging task owing to the fact that sequences similar to those of known TFBSs can occur by chance with a relatively high frequency. Here we present a new method for matching sequences to TFBS motifs based on intuitionistic fuzzy sets (IFS) theory, an approach that has been shown to be particularly appropriate for tackling problems that embody a high degree of uncertainty. Results: We propose SCintuit, a new scoring method for measuring sequence-motif affinity based on IFS theory. Unlike existing methods that consider dependencies between positions, SCintuit is designed to prevent overestimation of less conserved positions of TFBSs. For a given pair of bases, SCintuit is computed not only as a function of their combined probability of occurrence, but also taking into account the individual importance of each single base at its corresponding position. We used SCintuit to identify known TFBSs in DNA sequences. Our method provides excellent results when dealing with both synthetic and real data, outperforming the sensitivity and the specificity of two existing methods in all the experiments we performed. Conclusions: The results show that SCintuit improves the prediction quality for TFs of the existing approaches without compromising sensitivity. In addition, we show how SCintuit can be successfully applied to real research problems. In this study the reliability of the IFS theory for motif discovery tasks is proven.

    Reticulated origin of domesticated emmer wheat supports a dynamic model for the emergence of agriculture in the fertile crescent

    We used supernetworks with datasets of nuclear gene sequences and novel markers detecting retrotransposon insertions in ribosomal DNA loci to reassess the evolutionary relationships among tetraploid wheats. We show that domesticated emmer has a reticulated genetic ancestry, sharing phylogenetic signals with wild populations from all parts of the wild range. The extent of the genetic reticulation cannot be explained by post-domestication gene flow between cultivated emmer and wild plants, and the phylogenetic relationships among tetraploid wheats are incompatible with simple linear descent of the domesticates from a single wild population. A more parsimonious explanation of the data is that domesticated emmer originates from a hybridized population of different wild lineages. The observed diversity and reticulation patterns indicate that wild emmer evolved in the southern Levant, and that the wild emmer populations in south-eastern Turkey and the Zagros Mountains are relatively recent reticulate descendants of a subset of the Levantine wild populations. Based on our results we propose a new model for the emergence of domesticated emmer. During a pre-domestication period, diverse wild populations were collected from a large area west of the Euphrates and cultivated in mixed stands. Within these cultivated stands, hybridization gave rise to lineages displaying reticulated genealogical relationships with their ancestral populations. Gradual movement of early farmers out of the Levant introduced the pre-domesticated reticulated lineages to the northern and eastern parts of the Fertile Crescent, giving rise to the local wild populations but also facilitating fixation of domestication traits. Our model is consistent with the protracted and dispersed transition to agriculture indicated by the archaeobotanical evidence, and also with previous genetic data affiliating domesticated emmer with the wild populations in southeast Turkey. 
    Unlike other protracted models, we assume that humans played an intuitive role throughout the process. Funding: Natural Environment Research Council [NE/E015948/1]; Slovak Research and Development Agency [APVV-0661-10, APVV-0197-10].

    Genomic analysis of European Drosophila melanogaster populations reveals longitudinal structure, continent-wide selection, and previously unknown DNA viruses

    Genetic variation is the fuel of evolution, with standing genetic variation especially important for short-term evolution and local adaptation. To date, studies of spatiotemporal patterns of genetic variation in natural populations have been challenging, as comprehensive sampling is logistically difficult, and sequencing of entire populations costly. Here, we address these issues using a collaborative approach, sequencing 48 pooled population samples from 32 locations, and perform the first continent-wide genomic analysis of genetic variation in European Drosophila melanogaster. Our analyses uncover longitudinal population structure, provide evidence for continent-wide selective sweeps, identify candidate genes for local climate adaptation, and document clines in chromosomal inversion and transposable element frequencies. We also characterize variation among populations in the composition of the fly microbiome, and identify five new DNA viruses in our samples.

    Combining Computational Prediction of Cis-Regulatory Elements with a New Enhancer Assay to Efficiently Label Neuronal Structures in the Medaka Fish

    The developing vertebrate nervous system contains a remarkable array of neural cells organized into complex, evolutionarily conserved structures. The labeling of living cells in these structures is key for the understanding of brain development and function, yet the generation of stable lines expressing reporter genes in specific spatio-temporal patterns remains a limiting step. In this study we present a fast and reliable pipeline to efficiently generate a set of stable lines expressing a reporter gene in multiple neuronal structures in the developing nervous system in medaka. The pipeline combines both the accurate computational genome-wide prediction of neuronal specific cis-regulatory modules (CRMs) and a newly developed experimental setup to rapidly obtain transgenic lines in a cost-effective and highly reproducible manner. 95% of the CRMs tested in our experimental setup show enhancer activity in various and numerous neuronal structures belonging to all major brain subdivisions. This pipeline represents a significant step towards the dissection of embryonic neuronal development in vertebrates.

    The Cell Cycle Regulated Transcriptome of Trypanosoma brucei

    Progression of the eukaryotic cell cycle requires the regulation of hundreds of genes to ensure that they are expressed at the required times. Integral to cell cycle progression in yeast and animal cells are temporally controlled, progressive waves of transcription mediated by cell cycle-regulated transcription factors. However, in the kinetoplastids, a group of early-branching eukaryotes including many important pathogens, transcriptional regulation is almost completely absent, raising questions about the extent of cell-cycle regulation in these organisms and the mechanisms whereby regulation is achieved. Here, we analyse gene expression over the Trypanosoma brucei cell cycle, measuring changes in mRNA abundance on a transcriptome-wide scale. We developed a “double-cut” elutriation procedure to select unperturbed, highly synchronous cell populations from log-phase cultures, and compared this to synchronization by starvation. Transcriptome profiling over the cell cycle revealed the regulation of at least 430 genes. While only a minority were homologous to known cell cycle regulated transcripts in yeast or human, their functions correlated with the cellular processes occurring at the time of peak expression. We searched for potential target sites of RNA-binding proteins in these transcripts, which might earmark them for selective degradation or stabilization. Over-represented sequence motifs were found in several co-regulated transcript groups and were conserved in other kinetoplastids. Furthermore, we found evidence for cell-cycle regulation of a flagellar protein regulon with a highly conserved sequence motif, bearing similarity to consensus PUF-protein binding motifs. RNA sequence motifs that are functional in cell-cycle regulation were more widespread than previously expected and conserved within kinetoplastids. These findings highlight the central importance of post-transcriptional regulation in the proliferation of parasitic kinetoplastids.

    Population structure and genetic bottleneck in sweet cherry estimated with SSRs and the gametophytic self-incompatibility locus

    Background: Domestication and breeding involve the selection of particular phenotypes, limiting the genomic diversity of the population and creating a bottleneck. These effects can be precisely estimated when the location of domestication is established. Few analyses have focused on understanding the genetic consequences of domestication and breeding in fruit trees. In this study, we aimed to analyse genetic structure and changes in the diversity in sweet cherry (Prunus avium L.). Results: Three subgroups were detected in sweet cherry, with one group of landraces genetically very close to the analysed wild cherry population. A limited number of SSR markers displayed deviations from the frequencies expected under neutrality. After the removal of these markers from the analysis, a very limited bottleneck was detected between wild cherries and sweet cherry landraces, with a much more pronounced bottleneck between sweet cherry landraces and modern sweet cherry varieties. The loss of diversity between wild cherries and sweet cherry landraces at the S-locus was more significant than that for microsatellites. Particularly high levels of differentiation were observed for some S-alleles. Conclusions: Several domestication events may have happened in sweet cherry, and/or intense gene flow from local wild cherry was probably maintained along the evolutionary history of the species. A marked bottleneck due to breeding was detected, with all markers, in the modern sweet cherry gene pool. The microsatellites did not detect the bottleneck due to domestication in the analysed sample. The vegetative propagation specific to some fruit trees may account for the differences in diversity observed at the S-locus. Our study provides insights into domestication events of cherry; however, it requires confirmation on a larger sampling scheme for both sweet cherry landraces and wild cherry.

    An atlas of over 90.000 conserved noncoding sequences provides insight into crucifer regulatory regions

    Despite the central importance of noncoding DNA to gene regulation and evolution, understanding of the extent of selection on plant noncoding DNA remains limited compared to that of other organisms. Here we report sequencing of genomes from three Brassicaceae species (Leavenworthia alabamica, Sisymbrium irio and Aethionema arabicum) and their joint analysis with six previously sequenced crucifer genomes. Conservation across orthologous bases suggests that at least 17% of the Arabidopsis thaliana genome is under selection, with nearly one-quarter of the sequence under selection lying outside of coding regions. Much of this sequence can be localized to approximately 90,000 conserved noncoding sequences (CNSs) that show evidence of transcriptional and post-transcriptional regulation. Population genomics analyses of two crucifer species, A. thaliana and Capsella grandiflora, confirm that most of the identified CNSs are evolving under medium to strong purifying selection. Overall, these CNSs highlight both similarities and several key differences between the regulatory DNA of plants and other species.