2 research outputs found

    Presentation_1_Exploring the impact of clonal definition on B-cell diversity: implications for the analysis of immune repertoires.pdf

    The adaptive immune system has the extraordinary ability to produce a broad range of immunoglobulins that can bind a wide variety of antigens. During adaptive immune responses, activated B cells duplicate and undergo somatic hypermutation in their B-cell receptor (BCR) genes, resulting in clonal families of diversified B cells that can be related back to a common ancestor. Advances in high-throughput sequencing technologies have enabled the large-scale characterization of B-cell repertoires; however, the accurate identification of clonally related BCR sequences remains a major challenge. In this study, we compare three different clone identification methods on both simulated and experimental data, and investigate their impact on the characterization of B-cell diversity. We observe that different methods lead to different clonal definitions, which affects the quantification of clonal diversity in repertoire data. Our analyses show that direct comparisons between the clonal clusterings and clonal diversity of different repertoires should be avoided if different clone identification methods were used to define the clones. Despite this variability, the diversity indices inferred from the repertoires’ clonal characterization show similar patterns of variation across samples, regardless of the clone identification method used. We find the Shannon entropy to be the most robust index in terms of the variability of diversity rank across samples. Our analysis also suggests that the traditional germline gene alignment-based method for clonal identification remains the most accurate when the complete sequence information is known, but that alignment-free methods may be preferred for shorter sequencing read lengths. We make our implementation freely available as a Python library, cdiversity.
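To make the dependence of diversity on the clonal definition concrete, the following is a minimal sketch of the Shannon entropy computed from clone labels. It does not use or reproduce the cdiversity API; the function name `shannon_entropy` and the toy labels are assumptions made for illustration only, and the natural logarithm is an arbitrary choice of base.

```python
import math
from collections import Counter


def shannon_entropy(clone_labels):
    """Shannon entropy (natural log) of the clone-size distribution.

    clone_labels: one clone identifier per sequence (hypothetical input format).
    """
    counts = Counter(clone_labels)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("clone_labels must not be empty")
    # H = -sum_i p_i * ln(p_i), where p_i is the fraction of sequences in clone i
    return -sum((n / total) * math.log(n / total) for n in counts.values())


# The same six sequences, clustered by two hypothetical methods.
# The second method splits the first clone in two.
coarse_labels = ["A", "A", "A", "B", "B", "C"]
fine_labels = ["A", "A", "A2", "B", "B", "C"]

print("coarse clustering:", shannon_entropy(coarse_labels))
print("fine clustering:  ", shannon_entropy(fine_labels))
```

Because the finer clustering yields a higher entropy for identical sequences, entropies obtained with different clone identification methods are not directly comparable, which is the caution the abstract raises.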

    Additional file 1 of Systematic evaluation of B-cell clonal family inference approaches

    Additional file 1: Supplementary Figure 1. Data simulation pipeline. The simulation approach integrates the ImmuneSim, Alakazam and SHazaM tools and equally uses the CF groupings obtained from each of the 10 CF inference approaches. Supplementary Figure 2. Determination of the number of TP, TN, FP, and FN. Three simulated CFs (2 singletons) and two inferred CFs are shown. Supplementary Figure 3. Overall correlation between the log10(number of CFs) and the standardized sequence depth for all combinations of approach (except SCOPer; A7, A8) and dataset. Supplementary Figure 4. Overall trend between the log10(number of CFs) and the standardized mutation load for all combinations of approach (except SCOPer; A7, A8) and dataset. Supplementary Figure 5. Summary of significant pairwise comparisons between approaches. Supplementary Figure 6. Number of TP, TN, FP, and FN cases produced by the ten approaches when applied to six samples from three simulated datasets (D10, D11, D12). Supplementary Figure 7. Normalized number of TP, TN, FP, and FN cases produced by the ten approaches when applied to six samples from three simulated datasets (D10, D11, D12).
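The listing does not say how TP, TN, FP, and FN are determined, so the following is only a sketch under one common assumption: each pair of sequences is scored according to whether it is placed in the same CF in the simulated (true) grouping and in the inferred grouping. The function name `pairwise_confusion`, the sequence identifiers, and the toy groupings (three simulated CFs including two singletons, and two inferred CFs) are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_confusion(true_cf, inferred_cf):
    """Count pairwise TP, TN, FP, FN between two clusterings.

    true_cf, inferred_cf: dicts mapping sequence ID -> CF label.
    Assumed definition: a pair is "positive" when both sequences share a CF.
    """
    if true_cf.keys() != inferred_cf.keys():
        raise ValueError("both groupings must cover the same sequences")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for a, b in combinations(sorted(true_cf), 2):
        same_true = true_cf[a] == true_cf[b]
        same_inferred = inferred_cf[a] == inferred_cf[b]
        if same_true and same_inferred:
            counts["TP"] += 1  # correctly joined
        elif same_true:
            counts["FN"] += 1  # true relatives split apart
        elif same_inferred:
            counts["FP"] += 1  # unrelated sequences merged
        else:
            counts["TN"] += 1  # correctly kept apart
    return counts


# Hypothetical example: three simulated CFs (two singletons), two inferred CFs.
simulated = {"s1": "CF1", "s2": "CF1", "s3": "CF1", "s4": "CF2", "s5": "CF3"}
inferred = {"s1": "X", "s2": "X", "s3": "Y", "s4": "Y", "s5": "Y"}

print(pairwise_confusion(simulated, inferred))
```

Counting pairs rather than whole clusters avoids having to match inferred CFs to simulated CFs one to one, at the cost of weighting large CFs more heavily.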