13 research outputs found

    The Effects of Microprocessor Architecture on Speedup in Distributed Memory Supercomputers

    Amdahl's Law states that speedup in moving from one processor to N identical processors can never be greater than N, and in fact usually is lower than N because of operations that must be done sequentially. Amdahl's Law gives us the following formula for speedup: Speedup <= (S + P)/(S + P/N), where N is the number of processors, S is the fraction of the code that is serial (i.e., cannot be parallelized), and P is the fraction of the code that is parallelizable. We can substitute 1 - S for P in the above formula, and we see that as S approaches zero, speedup approaches N. It can also be shown that seemingly small values of S can severely limit the maximum speedup. Researchers at the University of Maine saw speedups that seemed to contradict Amdahl's Law, and identified an assumption made by the law that is not always true. When this assumption is not true, it is possible to achieve speedups that are larger than the theoretical maximum speedup of N given by Amdahl's Law. The assumption in question is that computer performance scales linearly as the size of the problem is reduced by dividing it over a larger number of processors. This assumption is not valid for computers with tiered memory. In this thesis we investigate superlinear speedup through a series of test programs specifically designed to exhibit superlinear speedup. After demonstrating that these programs show superlinear speedup, we suggest methods for detecting the potential for superlinear speedup in a variety of algorithms.
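The bound quoted in this abstract can be checked numerically. The following is a minimal sketch, assuming S + P = 1 as the abstract's substitution implies; the function names `amdahl_speedup` and `amdahl_limit` are illustrative and do not come from the thesis.

```python
def amdahl_speedup(serial_fraction: float, n_processors: int) -> float:
    """Upper bound on speedup from Amdahl's Law, with P = 1 - S."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if n_processors < 1:
        raise ValueError("n_processors must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    # (S + P) / (S + P/N) with S + P = 1
    return 1.0 / (serial_fraction + parallel_fraction / n_processors)


def amdahl_limit(serial_fraction: float) -> float:
    """Limit of the bound as the number of processors grows without bound."""
    if serial_fraction == 0.0:
        return float("inf")
    return 1.0 / serial_fraction


if __name__ == "__main__":
    # With S = 0 the bound equals N; with S = 0.05 it stays below 1/S = 20
    # no matter how many processors are added.
    for n in (8, 64, 1024):
        print(n, amdahl_speedup(0.0, n), round(amdahl_speedup(0.05, n), 2))
```

Even a 5% serial fraction caps the bound below 20, which illustrates the abstract's remark that small values of S severely limit the maximum speedup. The superlinear cases studied in the thesis fall outside this model, because the model assumes per-processor performance does not change with problem size.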

    An imputed genotype resource for the laboratory mouse

    We have created a high-density SNP resource encompassing 7.87 million polymorphic loci across 49 inbred mouse strains of the laboratory mouse by combining data available from public databases and training a hidden Markov model to impute missing genotypes in the combined data. The strong linkage disequilibrium found in dense sets of SNP markers in the laboratory mouse provides the basis for accurate imputation. Using genotypes from eight independent SNP resources, we empirically validated the quality of the imputed genotypes and demonstrate that they are highly reliable for most inbred strains. The imputed SNP resource will be useful for studies of natural variation and complex traits. It will facilitate association study designs by providing high density SNP genotypes for large numbers of mouse strains. We anticipate that this resource will continue to evolve as new genotype data become available for laboratory mouse strains. The data are available for bulk download or query at http://cgd.jax.org/

    Genomic data analysis workflows for tumors from patient-derived xenografts (PDXs): challenges and guidelines.

    BACKGROUND: Patient-derived xenograft (PDX) models are in vivo models of human cancer that have been used for translational cancer research and therapy selection for individual patients. The Jackson Laboratory (JAX) PDX resource comprises 455 models originating from 34 different primary sites (as of 05/08/2019). The models undergo rigorous quality control and are genomically characterized to identify somatic mutations, copy number alterations, and transcriptional profiles. Bioinformatics workflows for analyzing genomic data obtained from human tumors engrafted in a mouse host (i.e., Patient-Derived Xenografts; PDXs) must address challenges such as discriminating between mouse and human sequence reads and accurately identifying somatic mutations and copy number alterations when paired non-tumor DNA from the patient is not available for comparison. RESULTS: We report here data analysis workflows and guidelines that address these challenges and achieve reliable identification of somatic mutations, copy number alterations, and transcriptomic profiles of tumors from PDX models that lack genomic data from paired non-tumor tissue for comparison. Our workflows incorporate commonly used software and public databases but are tailored to address the specific challenges of PDX genomics data analysis through parameter tuning and customized data filters and result in improved accuracy for the detection of somatic alterations in PDX models. We also report a gene expression-based classifier that can identify EBV-transformed tumors. We validated our analytical approaches using data simulations and demonstrated the overall concordance of the genomic properties of xenograft tumors with data from primary human tumors in The Cancer Genome Atlas (TCGA). 
CONCLUSIONS: The analysis workflows that we have developed to accurately predict somatic profiles of tumors from PDX models that lack normal tissue for comparison enable the identification of the key oncogenic genomic and expression signatures to support model selection and/or biomarker development in therapeutic studies. A reference implementation of our analysis recommendations is available at https://github.com/TheJacksonLaboratory/PDX-Analysis-Workflows

    CRISPRtools: a flexible computational platform for performing CRISPR/Cas9 experiments in the mouse.

    Genome editing using the CRISPR/Cas9 RNA-guided endonuclease system has rapidly become a driving force for discovery in modern biomedical research. This simple yet elegant system has been widely used to generate both loss-of-function alleles and precision knock-in mutations using single-stranded donor oligonucleotides. Our CRISPRtools platform supports both of these applications in order to facilitate the use of CRISPR/Cas9. While there are several tools that facilitate CRISPR/Cas9 design and screen for potential off-target sites, the process is typically performed sequentially on single genes, limiting scalability for large-scale programs. Here, the design principle underlying gene ablation is based upon using paired guides flanking a critical region/exon of interest to create deletions. Guide pairs are rank ordered based upon published efficiency scores and off-target analyses, and reported in a concise format for downstream implementation. The exon deletion strategy simplifies characterization of founder animals and is the strategy employed for the majority of knockouts in the mouse. In proof-of-principle experiments, the effectiveness of this approach is demonstrated using microinjection and electroporation to introduce CRISPR/Cas9 components into mouse zygotes to delete critical exons. Mamm Genome 2017 Aug; 28(7-8):283-290

    Hierarchical analysis of RNA-seq reads improves the accuracy of allele-specific expression.

    Motivation: Allele-specific expression (ASE) refers to the differential abundance of the allelic copies of a transcript. RNA sequencing (RNA-seq) can provide quantitative estimates of ASE for genes with transcribed polymorphisms. When short-read sequences are aligned to a diploid transcriptome, read-mapping ambiguities confound our ability to directly count reads. Multi-mapping reads aligning equally well to multiple genomic locations, isoforms or alleles can comprise the majority (>85%) of reads. Discarding them can result in biases and substantial loss of information. Methods have been developed that use weighted allocation of read counts but these methods treat the different types of multi-reads equivalently. We propose a hierarchical approach to allocation of read counts that first resolves ambiguities among genes, then among isoforms, and lastly between alleles. We have implemented our model in EMASE software (Expectation-Maximization for Allele Specific Expression) to estimate total gene expression, isoform usage and ASE based on this hierarchical allocation. Results: Methods that align RNA-seq reads to a diploid transcriptome incorporating known genetic variants improve estimates of ASE and total gene expression compared to methods that use reference genome alignments. Weighted allocation methods outperform methods that discard multi-reads. Hierarchical allocation of reads improves estimation of ASE even when data are simulated from a non-hierarchical model. Analysis of RNA-seq data from F1 hybrid mice using EMASE reveals widespread ASE associated with cis-acting polymorphisms and a small number of parent-of-origin effects. Availability and implementation: EMASE software is available at https://github.com/churchill-lab/emase. Supplementary information: Supplementary data are available at Bioinformatics online.
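The gene, then isoform, then allele ordering described in this abstract can be illustrated with a toy allocation step for a single multi-mapping read. This is a hedged sketch only: it is not the EMASE implementation or its API, the function `allocate_read` and its three weight tables are hypothetical, and it assumes all weights for the read's targets are positive. It shows one allocation pass, not the full expectation-maximization loop.

```python
from collections import defaultdict


def allocate_read(targets, gene_w, isoform_w, allele_w):
    """Split one read across its compatible (gene, isoform, allele) targets.

    gene_w, isoform_w and allele_w are hypothetical current abundance
    estimates keyed by gene, (gene, isoform) and (gene, isoform, allele).
    At each level, weights are renormalized over only the entries the
    read is compatible with. Returns fractional counts that sum to 1.
    """
    targets = set(targets)
    genes = {g for g, _, _ in targets}
    isoforms = {(g, i) for g, i, _ in targets}

    # Level 1: normalizer over compatible genes.
    gene_total = sum(gene_w[g] for g in genes)

    # Level 2: normalizer over compatible isoforms within each gene.
    iso_total = defaultdict(float)
    for g, i in isoforms:
        iso_total[g] += isoform_w[(g, i)]

    # Level 3: normalizer over compatible alleles within each isoform.
    allele_total = defaultdict(float)
    for g, i, a in targets:
        allele_total[(g, i)] += allele_w[(g, i, a)]

    weights = {}
    for g, i, a in targets:
        p_gene = gene_w[g] / gene_total
        p_isoform = isoform_w[(g, i)] / iso_total[g]
        p_allele = allele_w[(g, i, a)] / allele_total[(g, i)]
        weights[(g, i, a)] = p_gene * p_isoform * p_allele
    return weights


if __name__ == "__main__":
    # Made-up numbers: the read hits both alleles of G1/t1 and allele A of G2/t2.
    read_targets = [("G1", "t1", "A"), ("G1", "t1", "B"), ("G2", "t2", "A")]
    gene_w = {"G1": 3.0, "G2": 1.0}
    isoform_w = {("G1", "t1"): 1.0, ("G2", "t2"): 1.0}
    allele_w = {("G1", "t1", "A"): 0.75, ("G1", "t1", "B"): 0.25,
                ("G2", "t2", "A"): 0.5}
    print(allocate_read(read_targets, gene_w, isoform_w, allele_w))
```

In this sketch the gene-level ambiguity is resolved before any allele information is consulted, so an allele-level imbalance in one gene cannot pull counts away from another gene. A flat weighted allocation would instead treat every target as an equivalent competitor.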