
    Application of regulatory sequence analysis and metabolic network analysis to the interpretation of gene expression data

    We present two complementary approaches for the interpretation of clusters of co-regulated genes, such as those obtained from DNA chips and related methods. Starting from a cluster of genes with similar expression profiles, two basic questions can be asked: 1. Which mechanism is responsible for the coordinated transcriptional response of the genes? This question is approached by extracting motifs that are shared between the upstream sequences of these genes. The motifs extracted are putative cis-acting regulatory elements. 2. What is the physiological significance, for the cell, of expressing these genes together? One way to answer this question is to search for potential metabolic pathways that could be catalyzed by the products of the genes. This can be done by selecting the genes from the cluster that code for enzymes, and trying to assemble the catalyzed reactions into metabolic pathways. We present tools to answer these two questions, and we illustrate their use with selected examples in the yeast Saccharomyces cerevisiae. The tools are available on the web (http://ucmb.ulb.ac.be/bioinformatics/rsa-tools/; http://www.ebi.ac.uk/research/pfbp/; http://www.soi.city.ac.uk/~msch/).
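
    The motif-extraction idea in this abstract (finding short sequences shared between the upstream regions of co-regulated genes) can be illustrated with a minimal sketch. This is not the algorithm used by the tools named above: the function names, the k-mer length, and the simple frequency ratio against a background set are all assumptions made for illustration, and a real analysis would use a proper significance test.

```python
from collections import Counter


def count_kmers(sequences, k):
    """Count, for each k-mer, the number of sequences containing it at least once."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        # A set ensures each k-mer is counted once per sequence.
        seen = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        counts.update(seen)
    return counts


def shared_kmers(cluster_seqs, background_seqs, k=6, min_ratio=2.0):
    """Return (kmer, n_cluster_seqs, ratio) for k-mers enriched in the cluster.

    The ratio compares the fraction of cluster sequences containing the k-mer
    with the (pseudocount-smoothed) fraction of background sequences.
    This crude ratio is an illustrative stand-in for a statistical test.
    """
    fg = count_kmers(cluster_seqs, k)
    bg = count_kmers(background_seqs, k)
    n_fg, n_bg = len(cluster_seqs), len(background_seqs)
    results = []
    for kmer, c in fg.items():
        fg_freq = c / n_fg
        # Pseudocount avoids division by zero for k-mers absent from the background.
        bg_freq = (bg.get(kmer, 0) + 1) / (n_bg + 1)
        ratio = fg_freq / bg_freq
        if ratio >= min_ratio:
            results.append((kmer, c, ratio))
    # Most enriched first; ties broken alphabetically for reproducibility.
    return sorted(results, key=lambda r: (-r[2], r[0]))
```

    Calling shared_kmers with the upstream sequences of a gene cluster and a set of unrelated upstream sequences returns candidate hexanucleotides ranked by enrichment. The sequences themselves are inputs the caller must supply.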

    Computational identification of developmental enhancers: conservation and function of transcription factor binding-site clusters in Drosophila melanogaster and Drosophila pseudoobscura

    BACKGROUND: The identification of sequences that control transcription in metazoans is a major goal of genome analysis. In a previous study, we demonstrated that searching for clusters of predicted transcription factor binding sites could discover active regulatory sequences, and identified 37 regions of the Drosophila melanogaster genome with high densities of predicted binding sites for five transcription factors involved in anterior-posterior embryonic patterning. Nine of these clusters overlapped known enhancers. Here, we report the results of in vivo functional analysis of 27 remaining clusters. RESULTS: We generated transgenic flies carrying each cluster attached to a basal promoter and reporter gene, and assayed embryos for reporter gene expression. Six clusters are enhancers of adjacent genes: giant, fushi tarazu, odd-skipped, nubbin, squeeze and pdm2; three drive expression in patterns unrelated to those of neighboring genes; the remaining 18 do not appear to have enhancer activity. We used the Drosophila pseudoobscura genome to compare patterns of evolution in and around the 15 positive and 18 false-positive predictions. Although conservation of primary sequence cannot distinguish true from false positives, conservation of binding-site clustering accurately discriminates functional binding-site clusters from those with no function. We incorporated conservation of binding-site clustering into a new genome-wide enhancer screen, and predict several hundred new regulatory sequences, including 85 adjacent to genes with embryonic patterns. CONCLUSIONS: Measuring conservation of sequence features closely linked to function - such as binding-site clustering - makes better use of comparative sequence data than commonly used methods that examine only sequence identity
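
    The search for regions with a high density of predicted binding sites, as described in this abstract, can be sketched as a sliding-window scan over site positions. This is an illustrative sketch only: the function name and the window and min_sites parameters are hypothetical, and the abstract does not state the thresholds the authors used.

```python
def find_site_clusters(positions, window, min_sites):
    """Find regions where at least `min_sites` predicted sites fall within `window` bp.

    positions: iterable of integer coordinates of predicted binding sites.
    Returns a list of (start, end) tuples; overlapping regions are merged.
    """
    pos = sorted(positions)
    clusters = []
    j = 0
    for i in range(len(pos)):
        # Advance j to the first site that lies outside the window starting at pos[i].
        while j < len(pos) and pos[j] - pos[i] < window:
            j += 1
        if j - i >= min_sites:
            start, end = pos[i], pos[j - 1]
            if clusters and start <= clusters[-1][1]:
                # Overlaps the previous dense region: extend it.
                clusters[-1] = (clusters[-1][0], max(clusters[-1][1], end))
            else:
                clusters.append((start, end))
    return clusters
```

    Running the same scan on the orthologous region of a second genome and checking whether a dense region is also found there would be one simple way to approximate the "conservation of binding-site clustering" criterion. That step is left out here because it depends on an alignment between the two genomes.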

    Probabilistic Clustering of Time-Evolving Distance Data

    We present a novel probabilistic clustering model for objects that are represented via pairwise distances and observed at different time points. The proposed method utilizes the information given by adjacent time points to find the underlying cluster structure and obtain a smooth cluster evolution. This approach allows the number of objects and clusters to differ at every time point, and the identities of the objects do not need to be known. Further, the model does not require the number of clusters to be specified in advance; it is instead determined automatically using a Dirichlet process prior. We validate our model on synthetic data, showing that the proposed method is more accurate than state-of-the-art clustering methods. Finally, we use our dynamic clustering model to analyze and illustrate the evolution of brain cancer patients over time.

    Stagnant ice and age modelling in the Dome C region, Antarctica

    The European Beyond EPICA project aims to extract a continuous ice core of up to 1.5 Ma, with a maximum age density of 20 kyr m⁻¹ at Little Dome C (LDC). We present a 1D numerical model which calculates the age of the ice around Dome C. The model inverts for basal conditions and accounts either for melting or for a layer of stagnant ice above the bedrock. It is constrained by internal reflecting horizons traced in radargrams and dated using the EPICA Dome C (EDC) ice core age profile. We used three different radar datasets ranging from a 10 000 km² airborne survey down to 5 km long ground-based radar transects over LDC. We find that stagnant ice exists in many places, including above the LDC relief where the new Beyond EPICA drill site (BELDC) is located. The modelled thickness of this layer of stagnant ice roughly corresponds to the thickness of the basal unit observed in one of the radar surveys and in the autonomous phase-sensitive radio-echo sounder (ApRES) dataset. At BELDC, the modelled stagnant ice thickness is 198±44 m and the modelled oldest age of ice is 1.45±0.16 Ma at a depth of 2494±30 m. This is very similar to all sites situated on the LDC relief, including that of the Million Year Ice Core project being conducted by the Australian Antarctic Division. The model was also applied to radar data in the area 10-15 km north of EDC (North Patch), where we find either a thin layer of stagnant ice (generally <60 m) or a negligible melt rate (<0.1 mm yr⁻¹). The modelled maximum age at North Patch is over 2 Ma in most places, with ice at 1.5 Ma having a resolution of 9-12 kyr m⁻¹, making it an exciting prospect for a future Oldest Ice drill site.
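
    To illustrate how a layer of stagnant basal ice affects a 1D age-depth profile, the sketch below uses the textbook Nye approximation (steady state, uniform vertical strain rate, no basal melt). This is a deliberately simplified stand-in and not the model described in the abstract, which inverts for basal conditions. The function names and any parameter values passed to them are assumptions for illustration.

```python
import math


def nye_age(depth_m, thickness_m, accumulation_m_per_yr, stagnant_m=0.0):
    """Age (years) at a given depth under the Nye approximation.

    The ice is assumed to thin uniformly over an effective thickness
    h_eff = thickness - stagnant layer; the stagnant layer does not flow.
    Valid only for 0 <= depth < h_eff.
    """
    h_eff = thickness_m - stagnant_m
    if not 0 <= depth_m < h_eff:
        raise ValueError("depth must lie within the flowing part of the column")
    return (h_eff / accumulation_m_per_yr) * math.log(h_eff / (h_eff - depth_m))


def nye_age_density(depth_m, thickness_m, accumulation_m_per_yr, stagnant_m=0.0):
    """Age density (years per metre), i.e. the derivative of nye_age with depth."""
    h_eff = thickness_m - stagnant_m
    if not 0 <= depth_m < h_eff:
        raise ValueError("depth must lie within the flowing part of the column")
    return h_eff / (accumulation_m_per_yr * (h_eff - depth_m))
```

    In this simplified picture, a thicker stagnant layer reduces the effective thickness over which the ice thins, so a given depth lies relatively closer to the base of the flowing column and the ice there is older. The age density function corresponds to the quantity the abstract reports in kyr m⁻¹, up to the unit conversion from years to kiloyears.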

    Limited polymorphism in Plasmodium falciparum ookinete surface antigen, von Willebrand factor A domain-related protein from clinical isolates

    BACKGROUND: As malaria becomes increasingly drug resistant and more costly to treat, there is increasing urgency to develop effective vaccines. In comparison to other stages of the malaria lifecycle, sexual stage antigens are under less immune selection pressure and hence are likely to have limited antigenic diversity. METHODS: Clinical isolates from a wide range of geographical regions were collected. Direct sequencing of PCR products was then used to determine the extent of polymorphisms for the novel Plasmodium falciparum sexual stage antigen von Willebrand Factor A domain-related Protein (PfWARP). These isolates were also used to confirm the extent of diversity of sexual stage antigen Pfs28. RESULTS: PfWARP was shown to have non-synonymous substitutions at 3 positions and Pfs28 was confirmed to have a single non-synonymous substitution as previously described. CONCLUSION: This study demonstrates the limited antigenic diversity of two prospective P. falciparum sexual stage antigens, PfWARP and Pfs28. This provides further encouragement for proceeding with vaccine trials based on these antigens.

    Comparison of measurements from different radio-echo sounding systems and synchronization with the ice core at Dome C, Antarctica

    We present a compilation of radio-echo sounding (RES) measurements of five radar systems (AWI, BAS, CReSIS, INGV and UTIG) around the EPICA Dome C (EDC) drill site, East Antarctica. The aim of our study is to investigate the differences of the various systems in their resolution of internal reflection horizons (IRHs) and bedrock topography, penetration depth, and quality of imaging the basal layer. We address the questions of the compatibility of existing radar data for common interpretation, and the suitability of the individual systems for Oldest Ice reconnaissance surveys. We find that the most distinct IRHs and IRH patterns can be identified and transferred between most data sets. Considerable differences between the RES systems exist in range resolution and depiction of the basal layer. Considering both aspects, which we judge as crucial factors in the search for old ice, the CReSIS and the UTIG systems are the most valuable ones. In addition to the RES data set comparison we calculate a synthetic radar trace from EDC density and conductivity profiles. We identify ten common IRHs in the measured RES data and the synthetic trace. The reflection-causing conductivity sections are determined by sensitivity studies with the synthetic trace. In this way, we accomplish an accurate two-way travel time to depth conversion for the reflectors, without having to use a precise velocity-depth function that would accumulate depth uncertainties with increasing depth. The identified IRHs are assigned with the AICC2012 time scale age. Due to the isochronous character of these conductivity-caused IRHs, they are a means to extend the Dome C age structure by tracing the IRHs along the RES profiles

    The effects of weather and climate change on dengue

    There is much uncertainty about the future impact of climate change on vector-borne diseases. Such uncertainty reflects the difficulties in modelling the complex interactions between disease, climatic and socioeconomic determinants. We used a comprehensive panel dataset from Mexico covering 23 years of province-specific dengue reports across nine climatic regions to estimate the impact of weather on dengue, accounting for the effects of non-climatic factors

    Bayesian hierarchical clustering for studying cancer gene expression data with unknown statistics

    Clustering analysis is an important tool in studying gene expression data. The Bayesian hierarchical clustering (BHC) algorithm can automatically infer the number of clusters and uses Bayesian model selection to improve clustering quality. In this paper, we present an extension of the BHC algorithm. Our Gaussian BHC (GBHC) algorithm represents data as a mixture of Gaussian distributions. It uses a normal-gamma distribution as a conjugate prior on the mean and precision of each of the Gaussian components. We tested GBHC on 11 cancer and 3 synthetic datasets. The results on cancer datasets show that in sample clustering, GBHC on average produces a clustering partition that is more concordant with the ground truth than those obtained from other commonly used algorithms. Furthermore, GBHC frequently infers a number of clusters that is close to the ground truth. In gene clustering, GBHC also produces a clustering partition that is more biologically plausible than several other state-of-the-art methods. This suggests GBHC as an alternative tool for studying gene expression data. The implementation of GBHC is available at https://sites.google.com/site/gaussianbhc.
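
    The normal-gamma conjugate prior mentioned in this abstract has a closed-form posterior and marginal likelihood, which is what makes Bayesian model comparison between candidate cluster merges tractable. The sketch below shows the standard textbook update for a single one-dimensional feature. It is not taken from the GBHC implementation: the function names and the hyperparameter names (mu0, kappa0, alpha0, beta0) are assumptions for illustration.

```python
import math


def normal_gamma_posterior(data, mu0, kappa0, alpha0, beta0):
    """Posterior hyperparameters of a normal-gamma prior after observing `data`.

    The prior is on the mean and precision of a univariate Gaussian.
    Returns (mu_n, kappa_n, alpha_n, beta_n).
    """
    n = len(data)
    xbar = sum(data) / n
    ss = sum((x - xbar) ** 2 for x in data)  # sum of squared deviations
    kappa_n = kappa0 + n
    mu_n = (kappa0 * mu0 + n * xbar) / kappa_n
    alpha_n = alpha0 + n / 2.0
    beta_n = beta0 + 0.5 * ss + kappa0 * n * (xbar - mu0) ** 2 / (2.0 * kappa_n)
    return mu_n, kappa_n, alpha_n, beta_n


def log_marginal_likelihood(data, mu0, kappa0, alpha0, beta0):
    """Log evidence of `data` with the mean and precision integrated out."""
    n = len(data)
    _, kappa_n, alpha_n, beta_n = normal_gamma_posterior(
        data, mu0, kappa0, alpha0, beta0
    )
    return (
        math.lgamma(alpha_n) - math.lgamma(alpha0)
        + alpha0 * math.log(beta0) - alpha_n * math.log(beta_n)
        + 0.5 * (math.log(kappa0) - math.log(kappa_n))
        - 0.5 * n * math.log(2.0 * math.pi)
    )
```

    In a hierarchical scheme of this kind, the log evidence of a merged cluster can be compared against the combined evidence of keeping its two children separate. How GBHC weights these alternatives is not stated in the abstract, so that step is not sketched here.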

    Affinity Inequality among Serum Antibodies That Originate in Lymphoid Germinal Centers

    Upon natural infection with pathogens or vaccination, antibodies are produced by a process called affinity maturation. As affinity maturation ensues, average affinity values between an antibody and ligand increase with time. Purified antibodies isolated from serum are invariably heterogeneous with respect to their affinity for the ligands they bind, whether macromolecular antigens or haptens (low molecular weight approximations of epitopes on antigens). However, less is known about how the extent of this heterogeneity evolves with time during affinity maturation. To shed light on this issue, we have taken advantage of previously published data from Eisen and Siskind (1964). Using the ratio of the strongest to the weakest binding subsets as a metric of heterogeneity (or affinity inequality), we analyzed antibodies isolated from individual serum samples. The ratios were initially as high as 50-fold, and decreased over a few weeks after a single injection of small antigen doses to around unity. This decrease in the effective heterogeneity of antibody affinities with time is consistent with Darwinian evolution in the strong selection limit. By contrast, neither the average affinity nor the heterogeneity evolves much with time for high doses of antigen, as competition between clones of the same affinity is minimal. Funding: Ragon Institute of MGH, MIT and Harvard; Samsung Scholarship Foundation; National Science Foundation (U.S.), Graduate Research Fellowship (Grant 1122374).