349 research outputs found

    A performance focused, development friendly and model aided parallelization strategy for scientific applications

    Advances in high performance computing platforms have provided unprecedented computing power through the evolution of multi-core CPUs, massively parallel architectures such as General Purpose Graphics Processing Units (GPGPUs), and Many Integrated Core (MIC) architectures such as Intel's Xeon Phi coprocessor. However, it is a great challenge to leverage the capabilities of such advanced supercomputing hardware, as it requires efficient and effective parallelization of scientific applications. This task is difficult mainly due to the complexity of scientific algorithms coupled with the variety of available hardware and disparate programming models. To address these challenges, this thesis presents a parallelization strategy to accelerate scientific applications that maximizes the opportunities for achieving speedup while minimizing development effort. Parallelization is a three-step process: (1) choose a compatible combination of architecture and parallel programming language, (2) translate the base code/algorithm to a parallel language, and (3) optimize and tune the application. In this research, a quantitative comparison of run times for various implementations of the k-means algorithm is used to establish that native languages (OpenMP, MPI, CUDA) perform better on their respective architectures than vendor-neutral languages such as OpenCL. A qualitative model is used to select an optimal architecture for a given application by aligning the capabilities of accelerators with the characteristics of the application. Once the optimal architecture is chosen, the corresponding native language is employed. This approach provides the best performance with reasonable accuracy (78%) in predicting a fitting combination, while eliminating the need to explore different architectures individually. It considerably reduces the required development effort, as the application need not be rewritten in multiple languages. The focus can be solely on optimization and tuning to achieve the best performance on available architectures with minimal investment in cost and effort. To verify the prediction accuracy of the qualitative model, the OpenDwarfs benchmark suite, which implements Berkeley's dwarfs in OpenCL, is used. A dwarf is an algorithmic method that captures a pattern of computation and communication. For the purpose of this research, the focus is on nine applications from various algorithmic domains that cover the seven dwarfs of symbolic computation, which were identified by Phillip Colella as omnipresent in scientific and engineering applications. To validate the parallelization strategy as a whole, a case study is undertaken. This case study involves parallelization of the Lower Upper Decomposition for the Gaussian Elimination algorithm from the linear algebra domain, using conventional trial and error methods as well as the proposed 'Architecture First, Language Later' strategy. The development efforts incurred are contrasted for both methods. The proposed strategy is observed to reduce development effort by an average of 50%.

    A new RNASeq-based reference transcriptome for sugar beet and its application in transcriptome-scale analysis of vernalization and gibberellin responses.

    BACKGROUND: Sugar beet (Beta vulgaris sp. vulgaris) crops account for about 30% of world sugar. Sugar yield is compromised by reproductive growth hence crops must remain vegetative until harvest. Prolonged exposure to cold temperature (vernalization) in the range 6 °C to 12 °C induces reproductive growth, leading to bolting (rapid elongation of the main stem) and flowering. Spring cultivation of crops in cool temperate climates makes them vulnerable to vernalization and hence bolting, which is initiated in the apical shoot meristem in processes involving interaction between gibberellin (GA) hormones and vernalization. The underlying mechanisms are unknown and genome scale next generation sequencing approaches now offer comprehensive strategies to investigate them; enabling the identification of novel targets for bolting control in sugar beet crops. In this study, we demonstrate the application of an mRNA-Seq based strategy for this purpose. RESULTS: There is no sugar beet reference genome, or public expression array platforms. We therefore used RNA-Seq to generate the first reference transcriptome. We next performed digital gene expression profiling using shoot apex mRNA from two sugar beet cultivars with and without applied GA, and also a vernalized cultivar with and without applied GA. Subsequent bioinformatics analyses identified transcriptional changes associated with genotypic difference and experimental treatments. Analysis of expression profiles in response to vernalization and GA treatment suggested previously unsuspected roles for a RAV1-like AP2/B3 domain protein in vernalization and efflux transporters in the GA response. CONCLUSIONS: Next generation RNA-Seq enabled the generation of the first reference transcriptome for sugar beet and the study of global transcriptional responses in the shoot apex to vernalization and GA treatment, without the need for a reference genome or established array platforms. 
Comprehensive bioinformatic analysis identified transcriptional programmes associated with different sugar beet genotypes as well as biological treatments, thus providing important new opportunities for basic scientists and sugar beet breeders. Transcriptome-scale identification of agronomically important traits as used in this study should be widely applicable to all crop plants where genomic resources are limiting. RIGHTS: This article is licensed under the BioMed Central licence at http://www.biomedcentral.com/about/license which is similar to the 'Creative Commons Attribution Licence'. In brief you may: copy, distribute, and display the work; make derivative works; or make commercial use of the work - under the following conditions: the original author must be given credit; for any reuse or distribution, it must be made clear to others what the license terms of this work are.

    To evaluate and compare the efficacy of alcoholic and aqueous extract of Lagenaria siceraria in high fat diet model in wistar rats

    Background: Obesity affects not only affluent societies but also developing countries like India. The incidence of obesity is rapidly increasing throughout the world. However, the current anti-obesity drugs have numerous limitations. Methods: Obesity was induced in male Wistar rats by giving a high-fat diet over 12 weeks. The variables assessed were body weight, abdominal girth, blood triglyceride level, liver weight, fat mass and histopathology of the liver. Aqueous and alcoholic extracts of Lagenaria siceraria (200 mg/kg and 400 mg/kg doses) were compared to orlistat (treatment control) and the high-fat diet group (disease control) for the different variables. Results: High-dose (400 mg/kg) alcoholic and aqueous extracts of Lagenaria siceraria significantly reduced total body weight (p < 0.05) and abdominal girth (p < 0.05) at weeks 10 and 12 compared to the high-fat diet group. Alcoholic extract (400 mg/kg) significantly reduced total blood triglyceride level (p < 0.05) and total liver weight (p < 0.05) compared to the high-fat diet group. None of the study drugs reduced % liver weight. High-dose alcoholic extract showed improvement in histopathological score (p < 0.05). Both aqueous and alcoholic extracts showed reduced fat mass compared to the high-fat diet group. Conclusions: The alcoholic extract (400 mg/kg) of Lagenaria siceraria alleviated high-fat diet induced obesity and dyslipidemic changes in rats. The alcoholic extract of Lagenaria siceraria has better anti-obesity potential than the aqueous extract.

    Comparative analysis of module-based versus direct methods for reverse-engineering transcriptional regulatory networks

    We have compared a recently developed module-based algorithm LeMoNe for reverse-engineering transcriptional regulatory networks to a mutual information based direct algorithm CLR, using benchmark expression data and databases of known transcriptional regulatory interactions for Escherichia coli and Saccharomyces cerevisiae. A global comparison using recall versus precision curves hides the topologically distinct nature of the inferred networks and is not informative about the specific subtasks for which each method is most suited. Analysis of the degree distributions and a regulator specific comparison show that CLR is 'regulator-centric', making true predictions for a higher number of regulators, while LeMoNe is 'target-centric', recovering a higher number of known targets for fewer regulators, with limited overlap in the predicted interactions between both methods. Detailed biological examples in E. coli and S. cerevisiae are used to illustrate these differences and to prove that each method is able to infer parts of the network where the other fails. Biological validation of the inferred networks cautions against over-interpreting recall and precision values computed using incomplete reference networks. Comment: 13 pages, 1 table, 6 figures + 6 pages supplementary information (1 table, 5 figures)
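
    As a rough illustration of why a global recall/precision comparison can hide topological differences, the following minimal Python sketch scores two sets of predicted (regulator, target) edges against a reference network. The function names and the toy edges are hypothetical and are not taken from LeMoNe, CLR or the benchmark data; the point is only that two predictions can share identical precision and recall while one spreads its true edges over many regulators and the other concentrates them on a single regulator.

```python
from collections import defaultdict


def precision_recall(predicted, reference):
    """Precision and recall of predicted (regulator, target) edges
    measured against a set of reference edges."""
    predicted, reference = set(predicted), set(reference)
    true_pos = predicted & reference
    precision = len(true_pos) / len(predicted) if predicted else 0.0
    recall = len(true_pos) / len(reference) if reference else 0.0
    return precision, recall


def true_targets_per_regulator(predicted, reference):
    """Map each regulator to the number of its predicted targets
    that are confirmed by the reference network."""
    counts = defaultdict(int)
    for regulator, _target in set(predicted) & set(reference):
        counts[regulator] += 1
    return dict(counts)


# Hypothetical toy reference network and two hypothetical predictions.
reference = {("R1", "g1"), ("R1", "g2"), ("R1", "g3"),
             ("R2", "g4"), ("R3", "g5"), ("R4", "g6")}
spread_out = {("R1", "g1"), ("R2", "g4"), ("R3", "g5"), ("R4", "g9")}
concentrated = {("R1", "g1"), ("R1", "g2"), ("R1", "g3"), ("R1", "g9")}

for name, edges in [("spread_out", spread_out), ("concentrated", concentrated)]:
    print(name, precision_recall(edges, reference),
          true_targets_per_regulator(edges, reference))
```

    Both toy predictions contain three true edges out of four, so their global scores coincide, yet the per-regulator counts differ. Edges missing from an incomplete reference would be counted as false positives here, which is one reason such scores should be read with care.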

    Validating module network learning algorithms using simulated data

    In recent years, several authors have used probabilistic graphical models to learn expression modules and their regulatory programs from gene expression data. Here, we demonstrate the use of the synthetic data generator SynTReN for the purpose of testing and comparing module network learning algorithms. We introduce a software package for learning module networks, called LeMoNe, which incorporates a novel strategy for learning regulatory programs. Novelties include the use of a bottom-up Bayesian hierarchical clustering to construct the regulatory programs, and the use of a conditional entropy measure to assign regulators to the regulation program nodes. Using SynTReN data, we test the performance of LeMoNe in a completely controlled situation and assess the effect of the methodological changes we made with respect to an existing software package, namely Genomica. Additionally, we assess the effect of various parameters, such as the size of the data set and the amount of noise, on the inference performance. Overall, application of Genomica and LeMoNe to simulated data sets gave comparable results. However, LeMoNe offers some advantages, one of them being that the learning process is considerably faster for larger data sets. Additionally, we show that the location of the regulators in the LeMoNe regulation programs and their conditional entropy may be used to prioritize regulators for functional validation, and that the combination of the bottom-up clustering strategy with the conditional entropy-based assignment of regulators improves the handling of missing or hidden regulators. Comment: 13 pages, 6 figures + 2 pages, 2 figures supplementary information

    Genome-wide Analysis of Simultaneous GATA1/2, RUNX1, FLI1, and SCL Binding in Megakaryocytes Identifies Hematopoietic Regulators

    Summary: Hematopoietic differentiation critically depends on combinations of transcriptional regulators controlling the development of individual lineages. Here, we report the genome-wide binding sites for the five key hematopoietic transcription factors (GATA1, GATA2, RUNX1, FLI1, and TAL1/SCL) in primary human megakaryocytes. Statistical analysis of the 17,263 regions bound by at least one factor demonstrated that simultaneous binding by all five factors was the most enriched pattern and often occurred near known hematopoietic regulators. Eight genes not previously appreciated to function in hematopoiesis that were bound by all five factors were shown to be essential for thrombocyte and/or erythroid development in zebrafish. Moreover, one of these genes encoding the PDZK1IP1 protein shared transcriptional enhancer elements with the blood stem cell regulator TAL1/SCL. Multifactor ChIP-Seq analysis in primary human cells coupled with a high-throughput in vivo perturbation screen therefore offers a powerful strategy to identify essential regulators of complex mammalian differentiation processes.

    Mammalian transcriptional hotspots are enriched for tissue specific enhancers near cell type specific highly expressed genes and are predicted to act as transcriptional activator hubs

    BACKGROUND: Transcriptional hotspots are defined as genomic regions bound by multiple factors. They have been identified recently as cell type specific enhancers regulating developmentally essential genes in many species such as worm, fly and humans. The in-depth analysis of hotspots across multiple cell types in the same species still remains to be explored and can bring new biological insights. RESULTS: We therefore collected 108 transcription-related factor (TF) ChIP sequencing data sets in ten murine cell types and classified the peaks in each cell type into three groups according to binding occupancy: singletons (low-occupancy), combinatorials (mid-occupancy) and hotspots (high-occupancy). The peaks in the three groups clustered largely according to the occupancy, suggesting priming of genomic loci for mid occupancy irrespective of cell type. We then characterized hotspots for diverse structural and functional properties. The genes neighbouring hotspots had a small overlap with hotspot genes in other cell types and were highly enriched for cell type specific function. Hotspots were enriched for sequence motifs of key TFs in that cell type and more than 90% of hotspots were occupied by pioneering factors. Though we did not find any sequence signature in the three groups, the H3K4me1 binding profile had bimodal peaks at hotspots, distinguishing hotspots from mono-modal H3K4me1 singletons. In ES cells, differentially expressed genes after perturbation of activators were enriched for hotspot genes, suggesting that hotspots primarily act as transcriptional activator hubs. Finally, we proposed that ES hotspots might be under control of SetDB1 and not DNMT for silencing. CONCLUSION: Transcriptional hotspots are enriched for tissue specific enhancers near cell type specific highly expressed genes. In ES cells, they are predicted to act as transcriptional activator hubs and might be under SetDB1 control for silencing. 
ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12859-014-0412-0) contains supplementary material, which is available to authorized users
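
    The occupancy-based grouping described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the abstract does not give the occupancy cut-offs, so the `hotspot_min` threshold, the function name and the toy peak table below are all hypothetical, and regions are assumed to have already been merged across factors.

```python
from collections import defaultdict


def classify_by_occupancy(peaks, hotspot_min=5):
    """Group regions by the number of distinct factors bound.

    peaks: iterable of (region_id, factor) pairs for one cell type.
    hotspot_min: hypothetical cut-off; regions bound by at least this
    many distinct factors are labelled hotspots.
    """
    factors_by_region = defaultdict(set)
    for region, factor in peaks:
        factors_by_region[region].add(factor)

    classes = {}
    for region, factors in factors_by_region.items():
        n = len(factors)
        if n == 1:
            classes[region] = "singleton"       # low occupancy
        elif n < hotspot_min:
            classes[region] = "combinatorial"   # mid occupancy
        else:
            classes[region] = "hotspot"         # high occupancy
    return classes


# Hypothetical toy peak table for a single cell type.
toy_peaks = [("chr1:100", "TF_A"),
             ("chr1:500", "TF_A"), ("chr1:500", "TF_B"),
             ("chr2:900", "TF_A"), ("chr2:900", "TF_B"),
             ("chr2:900", "TF_C"), ("chr2:900", "TF_D"),
             ("chr2:900", "TF_E")]
print(classify_by_occupancy(toy_peaks))
```

    In practice the thresholds would more likely depend on how many factors were profiled in each cell type, so a fixed cut-off like the one above should be treated as a placeholder.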

    Insights into mammalian transcription control by systematic analysis of ChIP sequencing data

    Background: Transcription regulation is a major controller of gene expression dynamics during development and disease, where transcription factors (TFs) modulate expression of genes through direct or indirect DNA interaction. ChIP sequencing has become the most widely used technique to get a genome wide view of TF occupancy in a cell type of interest, mainly due to established standard protocols and a rapid decrease in the cost of sequencing. The number of available ChIP sequencing data sets in the public domain is therefore ever increasing, including data generated by individual labs together with consortia such as the ENCODE project. Results: A total of 1735 ChIP-sequencing datasets in mouse and human cell types and tissues were used to perform bioinformatic analyses to unravel diverse features of transcription control. (1) We used the Heat*seq webtool to investigate global relations across the ChIP-seq samples. (2) We demonstrated that factors have specific genomic location preferences that are, for most factors, conserved across species. (3) Promoter proximal binding of factors was more conserved across cell types while the distal binding sites are more cell type specific. (4) We identified combinations of factors preferentially acting together in a cellular context. (5) Finally, by integrating the data with disease-associated gene loci from GWAS studies, we highlight the value of this data to associate novel regulators to disease. Conclusion: In summary, we demonstrate how ChIP sequencing data integration and analysis is powerful to get new insights into mammalian transcription control and demonstrate the utility of various bioinformatic tools to generate novel testable hypotheses using this public resource.

    Characterization of transcriptional networks in blood stem and progenitor cells using high-throughput single-cell gene expression analysis

    Cellular decision-making is mediated by a complex interplay of external stimuli with the intracellular environment, in particular transcription factor regulatory networks. Here we have determined the expression of a network of 18 key haematopoietic transcription factors in 597 single primary blood stem and progenitor cells isolated from mouse bone marrow. We demonstrate that different stem/progenitor populations are characterized by distinctive transcription factor expression states, and through comprehensive bioinformatic analysis reveal positively and negatively correlated transcription factor pairings, including previously unrecognized relationships between Gata2, Gfi1 and Gfi1b. Validation using transcriptional and transgenic assays confirmed direct regulatory interactions consistent with a regulatory triad in immature blood stem cells, where Gata2 may function to modulate cross-inhibition between Gfi1 and Gfi1b. Single-cell expression profiling therefore identifies network states and allows reconstruction of network hierarchies involved in controlling stem cell fate choices, and provides a blueprint for studying both normal development and human disease

    Dynamics of promoter bivalency and RNAP II pausing in mouse stem and differentiated cells

    Mammalian embryonic stem cells display a unique epigenetic and transcriptional state to facilitate pluripotency by maintaining lineage-specification genes in a poised state. Two epigenetic and transcription processes involved in maintaining the poised state are bivalent chromatin, characterized by the simultaneous presence of activating and repressive histone methylation marks, and RNA polymerase II (RNAPII) promoter proximal pausing. However, the dynamics of histone modifications and RNAPII at promoters in diverse cellular contexts remain underexplored. We collected genome wide data for bivalent chromatin marks H3K4me3 and H3K27me3, and RNAPII (8WG16) occupancy together with expression profiling in eight different cell types, including ESCs, in mouse. The epigenetic and transcription profiles at promoters grouped in over thirty clusters with distinct functional identities and transcription control. The clustering analysis identified distinct bivalent clusters where genes in one cluster retained bivalency across cell types while genes in the other were mostly cell type specific, but neither showed high RNAPII pausing. We noted that RNAPII pausing is more associated with active genes than bivalent genes in a cell type, and was globally reduced in differentiated cell types compared to multipotent