8 research outputs found

    Identification of Subject-Specific Immunoglobulin Alleles From Expressed Repertoire Sequencing Data

    The adaptive immune receptor repertoire (AIRR) contains information on an individual's immune past, present and potential in the form of the evolving sequences that encode the B cell receptor (BCR) repertoire. AIRR sequencing (AIRR-seq) studies rely on databases of known BCR germline variable (V), diversity (D), and joining (J) genes to detect somatic mutations in AIRR-seq data via comparison to the best-aligning database alleles. However, it has been shown that these databases are far from complete, leading to systematic misidentification of mutated positions in subsets of sample sequences. We previously presented TIgGER, a computational method to identify subject-specific V gene genotypes, including the presence of novel V gene alleles, directly from AIRR-seq data. However, the original algorithm was unable to detect alleles that differed by more than 5 single nucleotide polymorphisms (SNPs) from a database allele. Here we present and apply an improved version of the TIgGER algorithm which can detect alleles that differ by any number of SNPs from the nearest database allele, and can construct subject-specific genotypes with minimal prior information. TIgGER predictions are validated both computationally (using a leave-one-out strategy) and experimentally (using genomic sequencing), resulting in the addition of three new immunoglobulin heavy chain V (IGHV) gene alleles to the IMGT repertoire. Finally, we develop a Bayesian strategy to provide a confidence estimate associated with genotype calls. Altogether, these methods allow for much higher accuracy in germline allele assignment, an essential step in AIRR-seq studies.
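To illustrate the general idea of attaching a confidence estimate to a genotype call, here is a minimal sketch. It is not TIgGER's actual model: the function names, the three-hypothesis setup, the equal priors, and the `error_rate` parameter are all assumptions made for this example. It compares how well "homozygous" and "heterozygous" explanations fit the read counts of two candidate alleles of one gene.

```python
from math import exp, lgamma, log


def log_binom_pmf(k, n, p):
    """Log probability of k successes in n trials with success rate p."""
    log_coef = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return log_coef + k * log(p) + (n - k) * log(1 - p)


def genotype_posterior(count_a, count_b, error_rate=0.02):
    """Posterior over three genotype hypotheses, assuming equal priors.

    error_rate (hypothetical) is the assumed fraction of reads that are
    misassigned to an allele the subject does not carry.
    """
    n = count_a + count_b
    # Expected fraction of reads assigned to allele A under each hypothesis.
    expected_a = {
        "homozygous_a": 1 - error_rate,
        "heterozygous": 0.5,
        "homozygous_b": error_rate,
    }
    log_like = {h: log_binom_pmf(count_a, n, p) for h, p in expected_a.items()}
    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_like.values())
    weights = {h: exp(v - top) for h, v in log_like.items()}
    total = sum(weights.values())
    return {h: w / total for h, w in weights.items()}


# Toy counts, invented for illustration.
balanced = genotype_posterior(48, 52)
skewed = genotype_posterior(99, 1)
```

With balanced counts the heterozygous hypothesis dominates, and with heavily skewed counts the homozygous one does; the posterior probability of the winning hypothesis can then be read as a rough confidence in the call.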

    Germline polymorphisms and alternative splicing of human immunoglobulin light chain genes

    Inference of germline polymorphisms in immunoglobulin genes from B cell receptor repertoires is complicated by somatic hypermutations, sequencing/PCR errors, and by varying length of reference alleles. The light chain inference is particularly challenging owing to large gene duplications and absence of D genes. We analyzed the light chain cDNA sequences from naïve B cell receptor repertoires from 100 individuals. We optimized light chain allele inference by tweaking parameters of the TIgGER functions, extending the germline reference sequences, and establishing mismatch frequency patterns at polymorphic positions to filter out false-positive candidates. We identified 48 previously unreported variants of light chain variable genes. We selected 14 variants for validation and successfully validated 11 by Sanger sequencing. Clustering of light chain 5′UTR, L-PART1, and L-PART2 revealed partial intron retention in 11 kappa and 9 lambda V alleles. Our results provide insight into germline variation in human light chain immunoglobulin loci.

    Polymorphisms in human immunoglobulin heavy chain variable genes and their upstream regions

    Germline variations in immunoglobulin genes influence the repertoire of B cell receptors and antibodies, and such polymorphisms may impact disease susceptibility. However, the knowledge of the genomic variation of the immunoglobulin loci is scarce. Here, we report 25 potential novel germline IGHV alleles as inferred from rearranged naïve B cell cDNA repertoires of 98 individuals. Thirteen novel alleles were selected for validation, out of which ten were successfully confirmed by targeted amplification and Sanger sequencing of non-B cell DNA. Moreover, we detected a high degree of variability upstream of the V-REGION in the 5′UTR, L-PART1 and L-PART2 sequences, and found that identical V-REGION alleles can differ in upstream sequences. Thus, we have identified a large genetic variation not only in the V-REGION but also in the upstream sequences of IGHV genes. Our findings provide a new perspective for annotating immunoglobulin repertoire sequencing data.

    Stereotyped antibody responses target posttranslationally modified gluten in celiac disease

    The role of B cells and posttranslational modifications in pathogenesis of organ-specific immune diseases is increasingly envisioned but remains poorly understood, particularly in human disorders. In celiac disease, transglutaminase 2–modified (TG2-modified; deamidated) gluten peptides drive disease-specific T cell and B cell responses, and antibodies to deamidated gluten peptides are excellent diagnostic markers. Here, we substantiate by high-throughput sequencing of IGHV genes that antibodies to a disease-specific, deamidated, and immunodominant B cell epitope of gluten (PLQPEQPFP) have biased and stereotyped usage of IGHV3-23 and IGHV3-15 gene segments with modest somatic mutations. X-ray crystal structures of 2 prototype IGHV3-15/IGKV4-1 and IGHV3-23/IGLV4-69 antibodies reveal peptide interaction mainly via germline-encoded residues. In-depth mutational analysis showed restricted selection and substitution patterns at positions involved in antigen binding. While the IGHV3-15/IGKV4-1 antibody interacts with Glu5 and Gln6, the IGHV3-23/IGLV4-69 antibody interacts with Gln3, Pro4, Pro7, and Phe8, residues involved in substrate recognition by TG2. Hence, both antibodies, despite different interaction with the epitope, recognize signatures of TG2 processing that facilitates B cell presentation of deamidated gluten peptides to T cells, thereby providing a molecular framework for the generation of these clinically important antibodies. The study provides essential insight into the pathogenic mechanism of celiac disease. © 2018 American Society for Clinical Investigation.

    Breast cancer is marked by specific, Public T-cell receptor CDR3 regions shared by mice and humans.

    The partial success of tumor immunotherapy induced by checkpoint blockade, which is not antigen-specific, suggests that the immune systems of some patients contain antigen receptors able to specifically identify tumor cells. Here we focused on T-cell receptor (TCR) repertoires associated with spontaneous breast cancer. We studied the alpha and beta chain CDR3 domains of TCR repertoires of CD4 T cells using deep sequencing of cell populations in mice and applied the results to published TCR sequence data obtained from human patients. We screened peripheral blood T cells obtained monthly from individual mice spontaneously developing breast tumors by 5 months. We then looked at identical TCR sequences in published human studies; we used TCGA data from tumors and healthy tissues of 1,256 breast cancer resections and from 4 focused studies including sequences from tumors, lymph nodes, blood and healthy tissues, and from a single-cell dataset of 3 breast cancer subjects. We now report that mice spontaneously developing breast cancer manifest shared, Public CDR3 regions in both their alpha and beta chains and that a significant number of women with early breast cancer manifest identical CDR3 sequences. These findings suggest that the development of breast cancer is associated, across species, with biomarker, exclusive TCR repertoires.

    Mosaic deletion patterns of the human antibody heavy chain gene locus shown by Bayesian haplotyping

    Analysis of antibody repertoires by high-throughput sequencing is of major importance in understanding adaptive immune responses. Our knowledge of variations in the genomic loci encoding immunoglobulin genes is incomplete, resulting in conflicting VDJ gene assignments and biased genotype and haplotype inference. Haplotypes can be inferred using IGHJ6 heterozygosity, observed in one third of people. Here, we propose a robust novel method for determining VDJ haplotypes by adapting a Bayesian framework. Our method extends haplotype inference to IGHD- and IGHV-based analysis, enabling inference of deletions and copy number variations in the entire population. To test this method, we generated a multi-individual data set of naive B-cell repertoires, and found allele usage bias, as well as a mosaic, tiled pattern of deleted IGHD and IGHV genes. The inferred haplotypes may have clinical implications for genetic disease predispositions. Our findings expand the knowledge that can be extracted from antibody repertoire sequencing data.
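The anchor-gene idea mentioned in this abstract can be sketched as a simple co-occurrence tally: in a subject heterozygous for IGHJ6, each V allele should appear mostly with the J6 allele that sits on the same chromosome. The sketch below covers only that counting step with a fixed threshold. It is not the Bayesian method the authors propose, and the function names, the `min_fraction` cutoff, and the toy read counts are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def tally_by_anchor(rearrangements, anchor_alleles):
    """Count how often each V allele is rearranged with each anchor J allele.

    rearrangements: iterable of (v_allele, j_allele) pairs, one per sequence.
    Sequences using a J allele outside anchor_alleles are ignored.
    """
    counts = defaultdict(Counter)
    for v_allele, j_allele in rearrangements:
        if j_allele in anchor_alleles:
            counts[v_allele][j_allele] += 1
    return counts


def assign_chromosomes(counts, min_fraction=0.9):
    """Assign each V allele to one anchor allele, or to 'both'.

    min_fraction (hypothetical cutoff) is the share of anchored sequences
    that must carry the same anchor allele for a single-chromosome call.
    """
    assignment = {}
    for v_allele, per_anchor in counts.items():
        total = sum(per_anchor.values())
        best_anchor, best_count = per_anchor.most_common(1)[0]
        if best_count / total >= min_fraction:
            assignment[v_allele] = best_anchor
        else:
            assignment[v_allele] = "both"
    return assignment


# Toy data, invented for illustration only.
reads = (
    [("IGHV1-69*01", "IGHJ6*02")] * 19
    + [("IGHV1-69*01", "IGHJ6*03")] * 1
    + [("IGHV1-69*02", "IGHJ6*03")] * 20
    + [("IGHV3-23*01", "IGHJ6*02")] * 10
    + [("IGHV3-23*01", "IGHJ6*03")] * 10
    + [("IGHV3-23*01", "IGHJ4*02")] * 5
)
counts = tally_by_anchor(reads, {"IGHJ6*02", "IGHJ6*03"})
haplotype = assign_chromosomes(counts)
```

A V gene whose alleles are all assigned to one anchor allele, with nothing on the other, would be a candidate for a deletion on the second chromosome. A probabilistic comparison in place of the fixed cutoff is what makes such calls robust at low read counts.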