28 research outputs found

    Synthetic Standards Combined With Error and Bias Correction Improve the Accuracy and Quantitative Resolution of Antibody Repertoire Sequencing in Human Naïve and Memory B Cells

    High-throughput sequencing of immunoglobulin (Ig) repertoires (Ig-seq) is a powerful method for quantitatively interrogating B cell receptor sequence diversity. When applied to human repertoires, Ig-seq provides insight into fundamental immunological questions, and can be implemented in diagnostic and drug discovery projects. However, a major challenge in Ig-seq is ensuring accuracy, as library preparation protocols and sequencing platforms can introduce substantial errors and bias that compromise immunological interpretation. Here, we have established an approach for performing highly accurate human Ig-seq by combining synthetic standards with a comprehensive error and bias correction pipeline. First, we designed a set of 85 synthetic antibody heavy-chain standards (in vitro transcribed RNA) to assess correction workflow fidelity. Next, we adapted a library preparation protocol that incorporates unique molecular identifiers (UIDs) for error and bias correction which, when applied to the synthetic standards, resulted in highly accurate data. Finally, we performed Ig-seq on purified human circulating B cell subsets (naïve and memory), combined with a cellular replicate sampling strategy. This strategy enabled robust and reliable estimation of key repertoire features such as clonotype diversity, germline segment and isotype subclass usage, and somatic hypermutation. We anticipate that our standards and error and bias correction pipeline will become a valuable tool for researchers to validate and improve accuracy in human Ig-seq studies, thus potentially leading to new insights and applications in human antibody repertoire profiling.

    Computational Strategies for Dissecting the High-Dimensional Complexity of Adaptive Immune Repertoires

    The adaptive immune system recognizes antigens via an immense array of antigen-binding antibodies and T-cell receptors, the immune repertoire. The interrogation of immune repertoires is of high relevance for understanding the adaptive immune response in disease and infection (e.g., autoimmunity, cancer, HIV). Adaptive immune receptor repertoire sequencing (AIRR-seq) has driven the quantitative and molecular-level profiling of immune repertoires, thereby revealing the high-dimensional complexity of the immune receptor sequence landscape. Several methods for the computational and statistical analysis of large-scale AIRR-seq data have been developed to resolve immune repertoire complexity and to understand the dynamics of adaptive immunity. Here, we review the current research on (i) diversity, (ii) clustering and network, (iii) phylogenetic, and (iv) machine learning methods applied to dissect, quantify, and compare the architecture, evolution, and specificity of immune repertoires. We summarize outstanding questions in computational immunology and propose future directions for systems immunology toward coupling AIRR-seq with the computational discovery of immunotherapeutics, vaccines, and immunodiagnostics.
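
As a rough illustration of the diversity methods mentioned in the abstract above, the sketch below computes Hill numbers (effective numbers of clones) from clone abundances. This is one common way to summarize repertoire diversity, not necessarily the formulation used in the review; the function name `hill_diversity` and the toy CDR3 reads are assumptions made for illustration.

```python
import math
from collections import Counter


def hill_diversity(clone_counts, q):
    """Hill number of order q (effective number of clones).

    q = 0 gives richness, q = 1 the exponential of Shannon entropy,
    q = 2 the inverse Simpson index.
    """
    total = sum(clone_counts)
    freqs = [c / total for c in clone_counts if c > 0]
    if q == 1:
        # Limit q -> 1 of the general formula: exp(Shannon entropy).
        return math.exp(-sum(p * math.log(p) for p in freqs))
    return sum(p ** q for p in freqs) ** (1.0 / (1.0 - q))


# Hypothetical toy repertoire: one CDR3 amino-acid string per read.
cdr3_reads = ["CARDYW"] * 9 + ["CARGFW"] * 1
clone_counts = list(Counter(cdr3_reads).values())

for order in (0, 1, 2):
    print(order, round(hill_diversity(clone_counts, order), 3))
```

Higher orders of `q` weight abundant clones more heavily, so a repertoire dominated by one expanded clone shows a steep drop from `q = 0` to `q = 2`.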

    Large-scale network analysis reveals the sequence space architecture of antibody repertoires

    The architecture of mouse and human antibody repertoires is defined by the sequence similarity networks of the clones that compose them. The major principles that define the architecture of antibody repertoires have remained largely unknown. Here, we establish a high-performance computing platform to construct large-scale networks from comprehensive human and murine antibody repertoire sequencing datasets (>100,000 unique sequences). Leveraging a network-based statistical framework, we identify three fundamental principles of antibody repertoire architecture: reproducibility, robustness and redundancy. Antibody repertoire networks are highly reproducible across individuals despite high antibody sequence dissimilarity. The architecture of antibody repertoires is robust to the removal of up to 50–90% of randomly selected clones, but fragile to the removal of public clones shared among individuals. Finally, repertoire architecture is intrinsically redundant. Our analysis provides guidelines for the large-scale network analysis of immune repertoires and may be used in the future to define disease-associated and synthetic repertoires.
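
A sequence similarity network of the kind described above can be sketched in a few lines. The example below is a minimal, brute-force illustration that assumes clones are nodes and that two clones are connected when their sequences differ by at most one edit (Levenshtein distance of 1); the abstract does not state the threshold, so `max_dist`, the function names, and the toy sequences are assumptions. The all-pairs comparison is quadratic and would not scale to >100,000 sequences without the kind of high-performance platform the abstract refers to.

```python
from itertools import combinations


def levenshtein(a, b):
    """Edit distance (substitutions, insertions, deletions) between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution or match
        prev = cur
    return prev[-1]


def build_network(sequences, max_dist=1):
    """Adjacency sets linking unique sequences within max_dist edits."""
    nodes = sorted(set(sequences))
    adjacency = {s: set() for s in nodes}
    for a, b in combinations(nodes, 2):
        # A length difference above max_dist already rules out an edge.
        if abs(len(a) - len(b)) <= max_dist and levenshtein(a, b) <= max_dist:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def largest_component_fraction(adjacency):
    """Share of nodes that belong to the largest connected component."""
    seen, best = set(), 0
    for start in adjacency:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:
            node = stack.pop()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        best = max(best, size)
    return best / len(adjacency) if adjacency else 0.0


# Hypothetical toy clones (CDR3 amino-acid sequences).
clones = ["CARDYW", "CARDFW", "CARDFFW", "CTTTTW"]
network = build_network(clones)
print({s: len(n) for s, n in network.items()})
print(largest_component_fraction(network))
```

Robustness in the sense of the abstract could then be probed by deleting a random subset of nodes and recomputing a statistic such as `largest_component_fraction`.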

    Maturation of the Human Immunoglobulin Heavy Chain Repertoire With Age

    B cells play a central role in adaptive immune processes, mainly through the production of antibodies. The maturation of the B cell system with age is poorly studied. We extensively investigated age-related alterations of naïve and antigen-experienced immunoglobulin heavy chain (IgH) repertoires. The most significant changes were observed in the first 10 years of life, and were characterized by altered immunoglobulin gene usage and an increased frequency of mutated antibodies structurally diverging from their germline precursors. Older age was associated with an increased usage of downstream IgH constant region genes and fewer antibodies with self-reactive properties. As mutations accumulated with age, the frequency of germline-encoded self-reactive antibodies decreased, indicating a possible beneficial role of self-reactive B cells in the developing immune system. Our results suggest a continuous process of change through childhood across a broad range of parameters characterizing IgH repertoires and stress the importance of using well-selected, age-appropriate controls in IgH studies.

    High-throughput sequencing error and bias correction increases the quantitative resolution of human naïve and memory B-cell receptor repertoires

    Accurate high-throughput sequencing of immunoglobulin (Ig) chains (Ig-Seq) is often problematic due to primer bias and sequencing errors. Human Ig sequencing is further complicated by factors such as greater population-level germline allelic diversity, longer CDR3 regions relative to murine sequences, and a more complex antigenic history combined with higher frequency of somatic hypermutation (SHM), particularly in affinity-matured memory B-cell subsets. As a result, Ig heavy chain repertoire analysis tends to underestimate combinatorial diversity while simultaneously overestimating SHM. To overcome these issues, we developed a workflow for highly accurate human antibody heavy chain sequencing. First, we designed a set of 85 synthetic (in vitro transcribed RNA) Ig heavy chain standards representing all known IGHV and IGHJ alleles, unique CDR3s, and incorporating point mutations to mimic SHM. These standards are used in both isotype-dependent and -independent manners at predetermined ratios as spike-ins with biological samples to control for sequencing accuracy. Next, we prepared antibody libraries from purified circulating human B cells and spike-in RNA using a protocol known as molecular amplification fingerprinting (MAF), which incorporates unique molecular identifiers before and during multiplexed PCR amplification. We then performed MAF-based error and bias correction, and cellular replicate sampling to generate a robust, reliable, and highly accurate analysis of human antibody repertoires. We applied the workflow to estimate clonal diversity, gene segment usage, and SHM in naïve (IgM+ CD27-) and memory (IgG+ CD27+) B-cell subsets isolated from three different donors. Based on the sampling size, we are able to estimate the clonal diversity of the human naïve B-cell repertoire and that of the IgG memory B-cell repertoire combined with the level of SHM.
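
The general idea behind identifier-based error and bias correction can be sketched as follows. This is a deliberately simplified illustration, not the MAF protocol itself: it assumes a single identifier per read (MAF adds identifiers both before and during amplification), equal-length reads within a group, and a per-position majority vote. The names `uid_consensus`, `molecule_counts`, and `min_reads`, as well as the toy reads, are hypothetical.

```python
from collections import Counter, defaultdict


def uid_consensus(reads, min_reads=2):
    """Collapse reads that share a UID into one consensus sequence.

    reads: iterable of (uid, sequence) pairs.
    UID groups with fewer than min_reads reads are discarded, because a
    majority vote cannot correct errors in a single read.
    """
    groups = defaultdict(list)
    for uid, seq in reads:
        groups[uid].append(seq)

    consensus = {}
    for uid, seqs in groups.items():
        if len(seqs) < min_reads:
            continue
        # Keep only reads of the most common length so positions line up.
        length = Counter(len(s) for s in seqs).most_common(1)[0][0]
        aligned = [s for s in seqs if len(s) == length]
        # Error correction: majority base at every position.
        consensus[uid] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*aligned)
        )
    return consensus


def molecule_counts(consensus):
    """Bias correction: count each UID (original molecule) once per sequence."""
    return Counter(consensus.values())


# Hypothetical toy data: (UID, read) pairs.
reads = [
    ("AAT", "ACGTACGT"), ("AAT", "ACGTACGT"), ("AAT", "ACGAACGT"),  # one read has an error
    ("CCG", "ACGTACGT"), ("CCG", "ACGTACGT"),
    ("TTC", "TTGTACGA"), ("TTC", "TTGTACGA"),
    ("GGA", "TTGTACGA"),                                            # singleton, dropped
]
corrected = uid_consensus(reads)
print(corrected)
print(molecule_counts(corrected))
```

In this toy example the raw read counts are 5 and 3 for the two sequences (counting the erroneous read toward the first), whereas counting identifiers gives 2 and 1 molecules, which is the quantity that is less sensitive to uneven amplification.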

    Benchmarking immunoinformatic tools for the analysis of antibody repertoire sequences

    Antibody repertoires reveal insights into the biology of the adaptive immune system and empower diagnostics and therapeutics. There are currently multiple tools available for the annotation of antibody sequences. All downstream analyses, such as choosing lead drug candidates, depend on the correct annotation of these sequences; however, a thorough comparison of the performance of these tools has not been carried out. Here, we benchmark the performance of commonly used immunoinformatic tools, i.e. IMGT/HighV-QUEST, IgBLAST and MiXCR, in terms of reproducibility of annotation output, accuracy and speed using simulated and experimental high-throughput sequencing datasets.
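
One way to picture the accuracy and reproducibility comparison described above is to score each tool's gene calls against the known truth of a simulated dataset and to measure how often two tools agree. The sketch below assumes the annotations have already been parsed into plain dictionaries mapping sequence IDs to V-gene names; it does not call or parse the output of any real tool, and `tool_a`, `tool_b`, the function names, and the data are hypothetical placeholders.

```python
def annotation_accuracy(truth, calls):
    """Fraction of simulated sequences whose gene call matches the truth.

    Sequences the tool failed to annotate count as errors.
    """
    hits = sum(1 for seq_id, gene in truth.items() if calls.get(seq_id) == gene)
    return hits / len(truth)


def pairwise_agreement(calls_a, calls_b):
    """Fraction of sequences annotated by both tools that get the same call."""
    shared = calls_a.keys() & calls_b.keys()
    if not shared:
        return 0.0
    return sum(1 for s in shared if calls_a[s] == calls_b[s]) / len(shared)


# Hypothetical simulated truth and parsed tool outputs.
truth = {"s1": "IGHV1-69", "s2": "IGHV3-23", "s3": "IGHV4-34", "s4": "IGHV3-23"}
tool_a = {"s1": "IGHV1-69", "s2": "IGHV3-23", "s3": "IGHV4-59", "s4": "IGHV3-23"}
tool_b = {"s1": "IGHV1-69", "s2": "IGHV3-30", "s3": "IGHV4-59"}  # s4 not annotated

print(annotation_accuracy(truth, tool_a))
print(annotation_accuracy(truth, tool_b))
print(pairwise_agreement(tool_a, tool_b))
```

Agreement between tools is reported separately from accuracy because two tools can agree with each other while both being wrong, as for `s3` in the toy data.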