23 research outputs found

Quantifying selection in immune receptor repertoires

Author: Callan Jr. Curtis G.
Elhanati Yuval
Mora Thierry
Murugan Anand
Walczak Aleksandra M.
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 20/04/2014

The efficient recognition of pathogens by the adaptive immune system relies on the diversity of receptors displayed at the surface of immune cells. T-cell receptor diversity results from an initial random DNA editing process, called VDJ recombination, followed by functional selection of cells according to the interaction of their surface receptors with self and foreign antigenic peptides. To quantify the effect of selection on the highly variable elements of the receptor, we apply a probabilistic maximum likelihood approach to the analysis of high-throughput sequence data from the β-chain of human T-cell receptors. We quantify selection factors for V and J gene choice, and for the length and amino-acid composition of the variable region. Our approach is necessary to disentangle the effects of selection from biases inherent in the recombination process. Inferred selection factors differ little between donors, or between naive and memory repertoires. The number of sequences shared between donors is well-predicted by the model, indicating a purely stochastic origin of such "public" sequences. We find a significant correlation between biases induced by VDJ recombination and our inferred selection factors, together with a reduction of diversity during selection. Both effects suggest that natural selection acting on the recombination process has anticipated the selection pressures experienced during somatic evolution.
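
As a rough illustration of what a "selection factor" means, the sketch below treats a single categorical feature (for example, V gene choice) in isolation. Under the assumption that the post-selection distribution is the generation probability multiplied by a factor q, the maximum likelihood estimate of q for one feature reduces to the ratio of observed frequency to generation probability. The function name `selection_factors` and its arguments are hypothetical, and the model described in the abstract combines several features at once, which this sketch does not attempt.

```python
from collections import Counter


def selection_factors(observed, p_gen):
    """Estimate q(f) = observed_frequency(f) / p_gen(f) for one feature.

    observed: list of feature values seen in the selected repertoire
              (hypothetical input, e.g. V gene names).
    p_gen:    dict mapping each feature value to its generation probability
              (assumed to be known from a separate recombination model).
    """
    if not observed:
        raise ValueError("observed must contain at least one sequence")
    counts = Counter(observed)
    total = len(observed)
    # Values never observed get q = 0; values with p_gen = 0 are skipped
    # because the ratio is undefined for them.
    return {
        value: (counts[value] / total) / prob
        for value, prob in p_gen.items()
        if prob > 0
    }


# Toy data (made up for illustration only).
toy_observed = ["V1"] * 6 + ["V2"] * 4
toy_p_gen = {"V1": 0.5, "V2": 0.5}
toy_q = selection_factors(toy_observed, toy_p_gen)
```

With these toy numbers, a feature that is over-represented after selection relative to its generation probability gets q above 1, and an under-represented one gets q below 1. Dividing by the generation probability is what separates selection from recombination bias in this simplified picture.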

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

PubMed Central

OLGA: fast computation of generation probabilities of B- and T-cell receptor amino acid sequences and motifs

Author: Callan Jr. Curtis G.
Elhanati Yuval
Mora Thierry
Sethna Zachary
Walczak Aleksandra M.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 13/11/2018

Motivation: High-throughput sequencing of large immune repertoires has enabled the development of methods to predict the probability of generation by V(D)J recombination of T- and B-cell receptors of any specific nucleotide sequence. These generation probabilities are very non-homogeneous, ranging over 20 orders of magnitude in real repertoires. Since the function of a receptor really depends on its protein sequence, it is important to be able to predict this probability of generation at the amino acid level. However, brute-force summation over all the nucleotide sequences with the correct amino acid translation is computationally intractable. The purpose of this paper is to present a solution to this problem. Results: We use dynamic programming to construct an efficient and flexible algorithm, called OLGA (Optimized Likelihood estimate of immunoGlobulin Amino-acid sequences), for calculating the probability of generating a given CDR3 amino acid sequence or motif, with or without V/J restriction, as a result of V(D)J recombination in B or T cells. We apply it to databases of epitope-specific T-cell receptors to evaluate the probability that a typical human subject will possess T cells responsive to specific disease-associated epitopes. The model prediction shows an excellent agreement with published data. We suggest that OLGA may be a useful tool to guide vaccine design. Availability: Source code is available at https://github.com/zsethna/OLG
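
The contrast between brute-force summation and dynamic programming can be sketched on a toy problem. The code below is not OLGA and does not model V(D)J recombination. It assumes a hypothetical first-order Markov model over nucleotides (`INIT` and `TRANS` are made-up parameters) and computes the probability of an amino acid sequence in two ways: by enumerating every encoding nucleotide sequence, and by a dynamic program that moves codon by codon and keeps only the last nucleotide as state.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered by TCAG at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
CODONS = {}
for codon, aa in CODON_TABLE.items():
    CODONS.setdefault(aa, []).append(codon)

# Hypothetical Markov model: uniform start, mild preference for repeats.
INIT = {b: 0.25 for b in BASES}
TRANS = {p: {n: (0.4 if n == p else 0.2) for n in BASES} for p in BASES}


def count_encodings(aa_seq):
    """Number of nucleotide sequences that translate to aa_seq."""
    n = 1
    for aa in aa_seq:
        n *= len(CODONS[aa])
    return n


def seq_prob(nt):
    """Probability of one non-empty nucleotide sequence under the toy model."""
    p = INIT[nt[0]]
    for prev, nxt in zip(nt, nt[1:]):
        p *= TRANS[prev][nxt]
    return p


def brute_force(aa_seq):
    """Sum over every encoding; cost grows exponentially with length."""
    return sum(
        seq_prob("".join(codons))
        for codons in product(*(CODONS[aa] for aa in aa_seq))
    )


def dynamic_programming(aa_seq):
    """Same sum, with cost linear in the length of a non-empty aa_seq."""
    state = None  # state[b]: probability of all valid prefixes ending in b
    for aa in aa_seq:
        new_state = {b: 0.0 for b in BASES}
        for c in CODONS[aa]:
            inner = TRANS[c[0]][c[1]] * TRANS[c[1]][c[2]]
            if state is None:
                enter = INIT[c[0]]
            else:
                enter = sum(state[b] * TRANS[b][c[0]] for b in BASES)
            new_state[c[2]] += enter * inner
        state = new_state
    return sum(state.values())


print(count_encodings("CASSL"))  # 1728
print(brute_force("CASSL"), dynamic_programming("CASSL"))
```

The two probabilities agree up to floating-point rounding, but the brute-force sum visits every encoding (1728 for this five-residue example, multiplying by up to 6 per added residue), while the dynamic program does a fixed amount of work per residue.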

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Hal-Diderot

On generative models of T-cell receptor sequences

Author: Elhanati Yuval
Isacchini Giulio
Mora Thierry
Nourmohammad Armita
Sethna Zachary
Walczak Aleksandra M.
Publication venue: 'American Physical Society (APS)'
Publication date: 01/01/2020

T-cell receptors (TCRs) are key proteins of the adaptive immune system, generated randomly in each individual, whose diversity underlies our ability to recognize infections and malignancies. Modeling the distribution of TCR sequences is of key importance for immunology and medical applications. Here, we compare two inference methods trained on high-throughput sequencing data: a knowledge-guided approach, which accounts for the details of sequence generation, supplemented by a physics-inspired model of selection; and a knowledge-free Variational Auto-Encoder based on deep artificial neural networks. We show that the knowledge-guided model outperforms the deep network approach at predicting TCR probabilities, while being more interpretable, at a lower computational cost.

arXiv.org e-Print Archive

MPG.PuRe

Hal-Diderot

Inferring processes underlying B-cell repertoire diversity

Author: Aleksandra M. Walczak
Curtis G. Callan
Quentin Marcou
Thierry Mora
Yuval Elhanati
Zachary Sethna
Publication venue: 'The Royal Society'
Publication date: 11/02/2015

We quantify the VDJ recombination and somatic hypermutation processes in human B-cells using probabilistic inference methods on high-throughput DNA sequence repertoires of human B-cell receptor heavy chains. Our analysis captures the statistical properties of the naive repertoire, first after its initial generation via VDJ recombination and then after selection for functionality. We also infer statistical properties of the somatic hypermutation machinery (exclusive of subsequent effects of selection). Our main results are the following: the B-cell repertoire is substantially more diverse than T-cell repertoires, due to longer junctional insertions; sequences that pass initial selection are distinguished by having a higher probability of being generated in a VDJ recombination event; somatic hypermutations have a non-uniform distribution along the V gene that is well explained by an independent site model for the sequence context around the hypermutation site.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

PubMed Central

Relating trajectories of uniform and variable yield populations.

Author: Naama Brenner (28701)
Yuval Elhanati (107362)

All gray trajectories, ending on the solid red line (variable-yield stopping line) at points corresponding to , build up the cumulative probability for the final population to have less than cells of type 1. Due to the monotone property of trajectories, they all also cross the dashed line (uniform-yield stopping line) that passes through the point obeying the same constraint, and thus the cumulative probability is the same. The parameters of the two lines are simply related through (see Eq. (11)).

FigShare

Final population size for a heterogeneous micro-population with metabolic tradeoff.

Author: Naama Brenner (28701)
Yuval Elhanati (107362)

(A) Distributions of the final population size from simulations with division rate ratio and yield ratio , for different initial population sizes - cells (solid line), cells (dashed line), cells (dotted line), cells (dash-dot line). (B) Standard deviation of the final population size as a function of initial population size, in good agreement with the analytic approximation.

FigShare

Average final vs. initial population size in micro-populations grown to saturation of resource.

Author: Naama Brenner (28701)
Yuval Elhanati (107362)

Dotted line: a population with a uniform yield. Symbols: Monte Carlo results for two-state populations with variability in yield and in growth rate. Dashed lines: analytic approximations relevant only for special parameter values. Crosses: Monte Carlo simulation for “metabolic tradeoff” (lower crosses), (upper crosses). Circles: variable yield positively correlated with division rate (upper circles), (lower circles).

FigShare

Scaled distribution of the number of cells of metabolic type 1 in the final population.

Author: Naama Brenner (28701)
Yuval Elhanati (107362)

All distributions are for symmetric initial composition, equal yields and a large number of divisions. Different distributions in a plot are for different initial populations (Blue - , Green - , Red - ). These distributions are plotted as a function of the scaling variable (see text for details), and their shape does not depend on the number of divisions but does depend on the initial number of cells. (A) The two types have the same growth rate and are therefore equal in all their properties. Population composition varies only because individual trajectories are composed of different sequences of divisions of the two types. Because of the symmetry between types, all distributions are symmetric around . (B) The two types have different growth rates, and the distribution of final composition becomes skewed.

FigShare