39 research outputs found

    Similarities and Differences Between Variants Called With Human Reference Genome HG19 or HG38

    Background: Reference genome selection is a prerequisite for successful analysis of next generation sequencing (NGS) data. Current practice employs one of the two most recent human reference genome versions: HG19 or HG38. To date, the impact of genome version on SNV identification has not been rigorously assessed. Results: We conducted an analysis comparing the SNVs identified based on HG19 vs HG38, leveraging whole genome sequencing (WGS) data from the genome-in-a-bottle (GIAB) project. First, SNVs were called using 26 different bioinformatics pipelines with either HG19 or HG38. Next, two tools were used to convert the called SNVs between HG19 and HG38. Lastly, we calculated conversion rates, analyzed discordant rates between SNVs called with HG19 or HG38, and characterized the discordant SNVs. Conclusions: The conversion rates from HG38 to HG19 (average 95%) were lower than the conversion rates from HG19 to HG38 (average 99%). The conversion rates varied slightly among the various calling pipelines. Around 1.5% of SNVs were discordantly converted between HG19 and HG38. The conversions from HG38 to HG19 had more SNVs that failed conversion and more discordant SNVs than the opposite conversion (HG19 to HG38). Most of the discordant SNVs had low read depth, were low confidence SNVs as defined by GIAB, and/or were predominated by G/C alleles (52% observed versus 42% expected).

    A Novel Non-Volatile Inverter-based CiM: Continuous Sign Weight Transition and Low Power on-Chip Training

    In this work, we report a novel design, one-transistor-one-inverter (1T1I), to satisfy high speed and low power on-chip training requirements. By leveraging doped HfO2 with ferroelectricity, a non-volatile inverter is successfully demonstrated, enabling the desired continuous weight transition between negative and positive via the programmable threshold voltage (VTH) of ferroelectric field-effect transistors (FeFETs). Compared with commonly used designs with a similar function, 1T1I uniquely achieves pure on-chip-based weight transition at an optimized working current without relying on assistance from off-chip calculation units for signed-weight comparison, facilitating high-speed training at low power consumption. Further improvements in linearity and training speed can be obtained via a two-transistor-one-inverter (2T1I) design. Overall, focusing on energy and time efficiencies, this work provides a valuable design strategy for future FeFET-based computing-in-memory (CiM).

    The Reproducibility of Lists of Differentially Expressed Genes in Microarray Studies

    Reproducibility is a fundamental requirement in scientific experiments and clinical contexts. Recent publications raise concerns about the reliability of microarray technology because of the apparent lack of agreement between lists of differentially expressed genes (DEGs). In this study we demonstrate that (1) such discordance may stem from ranking and selecting DEGs solely by statistical significance (P) derived from widely used simple t-tests; (2) when fold change (FC) is used as the ranking criterion, the lists become much more reproducible, especially when fewer genes are selected; and (3) the instability of short DEG lists based on P cutoffs is an expected mathematical consequence of the high variability of the t-values. We recommend the use of FC ranking plus a non-stringent P cutoff as a baseline practice in order to generate more reproducible DEG lists. The FC criterion enhances reproducibility while the P criterion balances sensitivity and specificity
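The selection rule recommended in the abstract above (a non-stringent P cutoff followed by fold-change ranking) can be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the function name `select_degs`, the default cutoff of 0.05, the list length, and the toy gene values are all assumptions made for the example, and the P values are taken as given rather than computed from a t-test.

```python
from typing import Dict, List, Tuple


def select_degs(
    genes: Dict[str, Tuple[float, float]],
    p_cutoff: float = 0.05,  # assumed "non-stringent" cutoff, for illustration
    n_top: int = 2,          # assumed list length, for illustration
) -> List[str]:
    """Rank genes by |log2 FC| after a non-stringent P filter.

    `genes` maps a gene name to (log2 fold change, P value).
    """
    # Step 1: keep only genes that pass the P cutoff.
    passing = {g: v for g, v in genes.items() if v[1] < p_cutoff}
    # Step 2: rank the survivors by absolute log2 fold change, largest first.
    ranked = sorted(passing, key=lambda g: abs(passing[g][0]), reverse=True)
    # Step 3: report the top of the ranking as the DEG list.
    return ranked[:n_top]


# Hypothetical toy input: gene -> (log2 FC, P value).
example = {
    "geneA": (2.5, 0.01),
    "geneB": (-3.0, 0.04),
    "geneC": (0.2, 0.0001),  # highly significant but tiny fold change
    "geneD": (4.0, 0.30),    # large fold change but fails the P filter
}

print(select_degs(example))  # -> ['geneB', 'geneA']
```

In this toy case, ranking by P alone would put `geneC` first despite its negligible fold change, whereas the FC ranking with a P filter favors the genes with larger effects.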

    Assessing Reproducibility of Inherited Variants Detected With Short-Read Whole Genome Sequencing

    Background: Reproducible detection of inherited variants with whole genome sequencing (WGS) is vital for the implementation of precision medicine and is a complicated process in which each step affects variant call quality. Systematically assessing reproducibility of inherited variants with WGS and impact of each step in the process is needed for understanding and improving quality of inherited variants from WGS. Results: To dissect the impact of factors involved in detection of inherited variants with WGS, we sequence triplicates of eight DNA samples representing two populations on three short-read sequencing platforms using three library kits in six labs and call variants with 56 combinations of aligners and callers. We find that bioinformatics pipelines (callers and aligners) have a larger impact on variant reproducibility than WGS platform or library preparation. Single-nucleotide variants (SNVs), particularly outside difficult-to-map regions, are more reproducible than small insertions and deletions (indels), which are least reproducible when > 5 bp. Increasing sequencing coverage improves indel reproducibility but has limited impact on SNVs above 30×. Conclusions: Our findings highlight sources of variability in variant detection and the need for improvement of bioinformatics pipelines in the era of precision medicine with WGS.


    Differential gene expression in mouse primary hepatocytes exposed to the peroxisome proliferator-activated receptor α agonists

    BACKGROUND: Fibrates are a unique class of hypolipidemic drugs that lower plasma triglyceride and cholesterol levels through their action as peroxisome proliferator-activated receptor alpha (PPARα) agonists. The activation of PPARα leads to a cascade of events that result in the pharmacological (hypolipidemic) and adverse (carcinogenic) effects in rodent liver. RESULTS: To understand the molecular mechanisms responsible for the pleiotropic effects of PPARα agonists, we treated mouse primary hepatocytes with three PPARα agonists (bezafibrate, fenofibrate, and WY-14,643) at multiple concentrations (0, 10, 30, and 100 μM) for 24 hours. When primary hepatocytes were exposed to these agents, transactivation of PPARα was elevated as measured by luciferase assay. Global gene expression profiles in response to PPARα agonists were obtained by microarray analysis. Among differentially expressed genes (DEGs), there were 4, 8, and 21 genes commonly regulated by bezafibrate, fenofibrate, and WY-14,643 treatments across 3 doses, respectively, in a dose-dependent manner. Treatments with 100 μM of bezafibrate, fenofibrate, and WY-14,643 resulted in 151, 149, and 145 genes altered, respectively. Among them, 121 genes were commonly regulated by at least two drugs. Many genes are involved in fatty acid metabolism including oxidative reaction. Some of the gene changes were associated with production of reactive oxygen species, cell proliferation of peroxisomes, and hepatic disorders. In addition, 11 genes related to the development of liver cancer were observed. CONCLUSION: Our results suggest that treatment with PPARα agonists results in the production of oxidative stress and increased peroxisome proliferation, thus providing a better understanding of mechanisms underlying PPARα agonist-induced hepatic disorders and hepatocarcinomas.

    Assessing batch effects of genotype calling algorithm BRLMM for the Affymetrix GeneChip Human Mapping 500 K array set using 270 HapMap samples

    Background: Genome-wide association studies (GWAS) aim to identify genetic variants (usually single nucleotide polymorphisms [SNPs]) across the entire human genome that are associated with phenotypic traits such as disease status and drug response. Highly accurate and reproducible genotype calling is paramount since errors introduced by calling algorithms can lead to inflation of false associations between genotype and phenotype. Most genotype calling algorithms currently used for GWAS are based on multiple arrays. Because hundreds of gigabytes (GB) of raw data are generated from a GWAS, the samples are typically partitioned into batches containing subsets of the entire dataset for genotype calling. High call rates and accuracies have been achieved. However, the effects of batch size (i.e., number of chips analyzed together) and of batch composition (i.e., the choice of chips in a batch) on call rate and accuracy as well as the propagation of the effects into significantly associated SNPs identified have not been investigated. In this paper, we analyzed both the batch size and batch composition for effects on the genotype calling algorithm BRLMM using raw data of 270 HapMap samples analyzed with the Affymetrix Human Mapping 500 K array set. Results: Using data from 270 HapMap samples interrogated with the Affymetrix Human Mapping 500 K array set, three different batch sizes and three different batch compositions were used for genotyping using the BRLMM algorithm. Comparative analysis of the calling results and the corresponding lists of significant SNPs identified through association analysis revealed that both batch size and composition affected genotype calling results and significantly associated SNPs. Batch size and batch composition effects were more severe on samples and SNPs with lower call rates than ones with higher call rates, and on heterozygous genotype calls compared to homozygous genotype calls. Conclusion: Batch size and composition affect the genotype calling results in GWAS using BRLMM. The larger the differences in batch sizes, the larger the effect. The more homogeneous the samples in the batches, the more consistent the genotype calls. The inconsistency propagates to the lists of significantly associated SNPs identified in downstream association analysis. Thus, uniform and large batch sizes should be used to make genotype calls for GWAS. In addition, samples of high homogeneity should be placed into the same batch.

    The balance of reproducibility, sensitivity, and specificity of lists of differentially expressed genes in microarray studies

    Background: Reproducibility is a fundamental requirement in scientific experiments. Some recent publications have claimed that microarrays are unreliable because lists of differentially expressed genes (DEGs) are not reproducible in similar experiments. Meanwhile, new statistical methods for identifying DEGs continue to appear in the scientific literature. The resultant variety of existing and emerging methods exacerbates confusion and continuing debate in the microarray community on the appropriate choice of methods for identifying reliable DEG lists. Results: Using the data sets generated by the MicroArray Quality Control (MAQC) project, we investigated the impact on the reproducibility of DEG lists of a few widely used gene selection procedures. We present comprehensive results from inter-site comparisons using the same microarray platform, cross-platform comparisons using multiple microarray platforms, and comparisons between microarray results and those from TaqMan, the widely regarded "standard" gene expression platform. Our results demonstrate that (1) previously reported discordance between DEG lists could simply result from ranking and selecting DEGs solely by statistical significance (P) derived from widely used simple t-tests; (2) when fold change (FC) is used as the ranking criterion with a non-stringent P-value cutoff filtering, the DEG lists become much more reproducible, especially when fewer genes are selected as differentially expressed, as is the case in most microarray studies; and (3) the instability of short DEG lists solely based on P-value ranking is an expected mathematical consequence of the high variability of the t-values; the more stringent the P-value threshold, the less reproducible the DEG list is. These observations are also consistent with results from extensive simulation calculations. Conclusion: We recommend the use of FC-ranking plus a non-stringent P cutoff as a straightforward and baseline practice in order to generate more reproducible DEG lists. Specifically, the P-value cutoff should not be stringent (too small) and FC should be as large as possible. Our results provide practical guidance to choose the appropriate FC and P-value cutoffs when selecting a given number of DEGs. The FC criterion enhances reproducibility, whereas the P criterion balances sensitivity and specificity.

    Genomic analysis of microRNA time-course expression in liver of mice treated with genotoxic carcinogen N-ethyl-N-nitrosourea

    Background: Dysregulated expression of microRNAs (miRNAs) has been previously observed in human cancer tissues and has shown promise in defining tumor status. However, there is little information as to if or when expression changes of miRNAs occur in normal tissues after carcinogen exposure. Results: To explore the possible time-course changes of miRNA expression induced by a carcinogen, we treated mice with one dose of 120 mg/kg N-ethyl-N-nitrosourea (ENU), a model genotoxic carcinogen, and vehicle control. The miRNA expression profiles were assessed in the mouse livers in a time-course design. miRNAs were isolated from the livers at days 1, 3, 7, 15, 30 and 120 after the treatment and their expression was determined using a miRNA PCR Array. Principal component analysis of the miRNA expression profiles showed that miRNA expression at post-treatment days (PTDs) 7 and 15 was different from that at the other time points and the control. The number of differentially expressed miRNAs (DEMs) changed over time (3, 5, 14, 32, 5 and 5 at PTDs 1, 3, 7, 15, 30 and 120, respectively). The magnitude of the expression change varied with time, with the highest changes at PTDs 7 or 15 for most of the DEMs. In silico functional analysis of the DEMs at PTDs 7 and 15 indicated that the major functions of these ENU-induced DEMs were associated with DNA damage, DNA repair, apoptosis and other processes related to carcinogenesis. Conclusion: Our results showed that many miRNAs changed their expression in response to exposure to the genotoxic carcinogen ENU, and the number and magnitude of the changes were highest at PTDs 7 to 15. Thus, one to two weeks after the exposure is the best time for miRNA expression sampling.