56 research outputs found

    Identification of errors introduced during high throughput sequencing of the T cell receptor repertoire

    Background: Recent advances in massively parallel sequencing have increased the depth at which T cell receptor (TCR) repertoires can be probed by >3 log10, allowing for saturation sequencing of immune repertoires. The resolution of this sequencing depends on its accuracy, and direct assessments of the errors formed during high-throughput repertoire analyses are limited. Results: We analyzed 3 monoclonal TCRs from TCR transgenic, Rag-/- mice using Illumina® sequencing. A total of 27 sequencing reactions were performed for each TCR using a trifurcating design in which samples were divided into 3 at significant processing junctures. More than 20 million complementarity determining region (CDR) 3 sequences were analyzed. Filtering for lower quality sequences diminished but did not eliminate sequence errors, which occurred in 1-6% of sequences. Erroneous sequences were predominantly of correct length and contained single nucleotide substitutions. Rates of specific substitutions varied dramatically in a position-dependent manner. Four substitutions, all purine-pyrimidine transversions, predominated. Solid phase amplification and sequencing, rather than liquid sample amplification and preparation, appeared to be the primary sources of error. Analysis of polyclonal repertoires demonstrated the impact of error accumulation on data parameters. Conclusions: Caution is needed in interpreting repertoire data due to potential contamination with mis-sequenced reads. However, a high association of errors with phred score, high relatedness of erroneous sequences with the parental sequence, dominance of specific nt substitutions, and a skewed ratio of forward to reverse reads among erroneous sequences indicate approaches to filter erroneous sequences from repertoire data sets.

    Routes for breaching and protecting genetic privacy

    We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data. Comment: Draft for comment

    Allele-Specific Gene Expression Is Widespread Across the Genome and Biological Processes

    Allele-specific gene expression (ASGE) appears to be an important factor in human phenotypic variability and, as a consequence, in the development of complex traits and diseases. To study ASGE across the human genome, we coupled genotyping with an analysis of ASGE, screening 11,500 SNPs using the Mapping 10 K Array to identify differential allelic expression. We found that of the 5,133 SNPs that were suitable for analysis (heterozygous in our sample and expressed in peripheral blood mononuclear cells), 2,934 (57%) SNPs had differential allelic expression. Such SNPs were equally distributed along human chromosomes and biological processes. We validated the presence or absence of ASGE in 18 out of 20 randomly selected SNPs (90%) by real-time PCR in 48 human subjects. In addition, we observed that SNPs close to, but not included in, segmental duplications had increased levels of ASGE. Finally, we found that transcripts of unknown function or non-coding RNAs also display ASGE: from a total of 2,308 intronic SNPs, 1,510 (65%) SNPs underwent differential allelic expression. In summary, ASGE is a widespread mechanism in the human genome whose regulation seems to be far more complex than expected.

    Down-Regulation of ZnT8 Expression in INS-1 Rat Pancreatic Beta Cells Reduces Insulin Content and Glucose-Inducible Insulin Secretion

    The SLC30A8 gene codes for a pancreatic beta-cell-expressed zinc transporter, ZnT8. A polymorphism in the SLC30A8 gene is associated with susceptibility to type 2 diabetes, although the molecular mechanism through which this phenotype is manifested is incompletely understood. Such polymorphisms may exert their effect by altering the expression level of the gene product. We used an shRNA-mediated approach to reproducibly downregulate ZnT8 mRNA expression by >90% in the INS-1 pancreatic beta cell line. The ZnT8-downregulated cells exhibited diminished uptake of exogenous zinc, as determined using the zinc-sensitive reporter dye, zinquin. ZnT8-downregulated cells showed reduced insulin content and decreased insulin secretion (expressed as percent of total insulin content) in response to hyperglycemic stimulus, as determined by insulin immunoassay. ZnT8-depleted cells also showed fewer dense-core vesicles via electron microscopy. These data indicate that reduced ZnT8 expression in cultured pancreatic beta cells gives rise to a reduced insulin response to hyperglycemia. In addition, although we provide no direct evidence, these data suggest that an SLC30A8 expression-level polymorphism could affect insulin secretion and the glycemic response in vivo.

    Pseudomonas aeruginosa 4-Amino-4-Deoxychorismate Lyase: Spatial Conservation of an Active Site Tyrosine and Classification of Two Types of Enzyme

    4-Amino-4-deoxychorismate lyase (PabC) catalyzes the formation of 4-aminobenzoate, and release of pyruvate, during folate biosynthesis. This is an essential activity for the growth of Gram-negative bacteria, including important pathogens such as Pseudomonas aeruginosa. A high-resolution (1.75 Å) crystal structure of PabC from P. aeruginosa has been determined, and sequence-structure comparisons with orthologous structures are reported. Residues around the pyridoxal 5′-phosphate cofactor are highly conserved, adding support to aspects of a mechanism generic for enzymes carrying that cofactor. However, we suggest that PabC can be classified into two groups depending upon whether a structurally conserved active-site tyrosine is provided by the polypeptide that mainly forms the active site or by the partner subunit in the dimeric assembly. We considered that the conserved tyrosine might indicate a direct role in catalysis: that of providing a proton to reduce the olefin moiety of substrate as pyruvate is released. A threonine had previously been suggested to fulfill such a role prior to our observation of the structurally conserved tyrosine. We were unable to obtain an experimentally determined structure of PabC in complex with ligands to inform on mechanism and substrate specificity. Therefore, we constructed a computational model of the catalytic intermediate docked into the enzyme active site. The model suggests that the conserved tyrosine helps to create a hydrophobic wall on one side of the active site that provides important interactions to bind the catalytic intermediate. However, this residue does not appear to participate in interactions with the C atom that undergoes an sp² to sp³ conversion as pyruvate is produced. Rather, the model and our comparisons support the hypothesis that an active site threonine hydroxyl contributes a proton used in the reduction of the substrate methylene to pyruvate methyl in the final stage of the mechanism.

    Sequential Use of Transcriptional Profiling, Expression Quantitative Trait Mapping, and Gene Association Implicates MMP20 in Human Kidney Aging

    Kidneys age at different rates, such that some people show little or no effects of aging whereas others show rapid functional decline. We sequentially used transcriptional profiling and expression quantitative trait loci (eQTL) mapping to narrow down which genes to test for association with kidney aging. We first performed whole-genome transcriptional profiling to find 630 genes that change expression with age in the kidney. Using two methods to detect eQTLs, we found 101 of these age-regulated genes contain expression-associated SNPs. We tested the eQTLs for association with kidney aging, measured by glomerular filtration rate (GFR) using combined data from the Baltimore Longitudinal Study of Aging (BLSA) and the InCHIANTI study. We found a SNP association (rs1711437 in MMP20) with kidney aging (uncorrected p = 3.6×10⁻⁵, empirical p = 0.01) that explains 1%–2% of the variance in GFR among individuals. The results of this sequential analysis may provide the first evidence for a gene association with kidney aging in humans.

    GWAS and colocalization analyses implicate carotid intima-media thickness and carotid plaque loci in cardiovascular outcomes

    Carotid artery intima-media thickness (cIMT) and carotid plaque are measures of subclinical atherosclerosis associated with ischemic stroke and coronary heart disease (CHD). Here, we undertake meta-analyses of genome-wide association studies (GWAS) in 71,128 individuals for cIMT, and 48,434 individuals for carotid plaque traits. We identify eight novel susceptibility loci for cIMT, one independent association at the previously identified PINX1 locus, and one novel locus for carotid plaque. Colocalization analysis with nearby vascular expression quantitative trait loci (cis-eQTLs) derived from arterial wall and metabolic tissues obtained from patients with CHD identifies candidate genes at two potentially additional loci, ADAMTS9 and LOXL4. LD score regression reveals significant genetic correlations between cIMT and plaque traits, and of both cIMT and plaque with CHD, any stroke subtype, and ischemic stroke. Our study provides insights into genes and tissue-specific regulatory mechanisms linking atherosclerosis both to its functional genomic origins and its clinical consequences in humans.

    The genetic heterogeneity of colorectal cancer predisposition - guidelines for gene discovery
