We Continue Each Other
Three female voices with different cultural backgrounds and practices explore the concept and possibilities of the we-narrative. Starting from a position of critical reflection, we dive into the question of how to speak as a female WE. WE is used to differentiate the particular collective dynamic that operates throughout this text from a more general use of the word ‘we’. Our framework is to work with the personal and vulnerable, but at the same time remain open to a dialogue that invites the other, through the concept of empathy. Our overarching aim is to look at what it means when we speak together collectively: whether it brings strength or dilution, and how speaking poly-vocally from a position of lived first person collective experience impacts current ideas around authorship. Is it possible to speak as a WE and write subjectively in a way that does not become a generalisation or a compromise? Guided by Virginia Woolf’s A Room of One’s Own, our text uses the format of autotheoretical writing, drawing on our creative–critical writing practices in the context of visual art. We seek to encompass our female ancestors in visual art. The text generates a dialogue that creates room for the articulation of one’s own voice and hand, whilst intending to leave space or gaps for the other to insert themselves. Appearing in seemingly disparate fragments, the text weaves together to form a tapestry, sometimes performative, sometimes narrative, incorporating both visual and language-based elements
Detection of enriched T cell epitope specificity in full T cell receptor sequence repertoires
High-throughput T cell receptor (TCR) sequencing allows the characterization of an individual's TCR repertoire and directly queries their immune state. However, it remains a non-trivial task to couple these sequenced TCRs to their antigenic targets. In this paper, we present a novel strategy to annotate full TCR sequence repertoires with their epitope specificities. The strategy is based on a machine learning algorithm to learn the TCR patterns common to the recognition of a specific epitope. These results are then combined with a statistical analysis to evaluate the occurrence of specific epitope-reactive TCR sequences per epitope in repertoire data. In this manner, we can directly study the capacity of full TCR repertoires to target specific epitopes of the relevant vaccines or pathogens. We demonstrate the usability of this approach on three independent datasets related to vaccine monitoring and infectious disease diagnostics by independently identifying the epitopes that are targeted by the TCR repertoire. The developed method is freely available as a web tool for academic use at tcrex.biodatamining.be
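The statistical step in the abstract above (evaluating how often epitope-reactive TCRs occur in a repertoire) can be illustrated with a one-sided binomial test. This is only a sketch under stated assumptions: the abstract does not name the test, and the function name `enrichment_p_value`, the background rate and the counts below are hypothetical values chosen for illustration.

```python
from math import exp, lgamma, log


def enrichment_p_value(n_tcrs: int, n_hits: int, background_rate: float) -> float:
    """Return P(X >= n_hits) for X ~ Binomial(n_tcrs, background_rate).

    n_tcrs:          number of TCR sequences in the repertoire (assumed input)
    n_hits:          number of TCRs predicted to recognise the epitope
    background_rate: assumed fraction of hits expected by chance alone
    """
    if not 0.0 < background_rate < 1.0:
        raise ValueError("background_rate must lie strictly between 0 and 1")
    if not 0 <= n_hits <= n_tcrs:
        raise ValueError("n_hits must lie between 0 and n_tcrs")

    log_p = log(background_rate)
    log_q = log(1.0 - background_rate)

    def log_pmf(k: int) -> float:
        # Log-space binomial probability, which avoids overflow for large repertoires.
        log_comb = lgamma(n_tcrs + 1) - lgamma(k + 1) - lgamma(n_tcrs - k + 1)
        return log_comb + k * log_p + (n_tcrs - k) * log_q

    tail = sum(exp(log_pmf(k)) for k in range(n_hits, n_tcrs + 1))
    return min(1.0, tail)  # guard against rounding slightly above 1


# Hypothetical example: 12 predicted hits among 1000 TCRs, 0.1% expected by chance.
p_value = enrichment_p_value(n_tcrs=1000, n_hits=12, background_rate=0.001)
print(f"enriched: {p_value < 0.05}")
```

A small p-value would suggest that the repertoire contains more epitope-reactive TCRs than the assumed background rate explains. With one test per epitope, a multiple-testing correction would normally be applied as well.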
Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification
The prediction of epitope recognition by T-cell receptors (TCRs) has seen many advancements in recent years, with several methods now available that can predict recognition for a specific set of epitopes. However, the generic case of evaluating all possible TCR-epitope pairs remains challenging, mainly due to the high diversity of the interacting sequences and the limited amount of currently available training data. In this work, we provide an overview of the current state of this unsolved problem. First, we examine appropriate validation strategies to accurately assess the generalization performance of generic TCR-epitope recognition models when applied to both seen and unseen epitopes. In addition, we present a novel feature representation approach, which we call ImRex (interaction map recognition). This approach is based on the pairwise combination of physicochemical properties of the individual amino acids in the CDR3 and epitope sequences, which provides a convolutional neural network with the combined representation of both sequences. Lastly, we highlight various challenges that are specific to TCR-epitope data and that can adversely affect model performance. These include the issue of selecting negative data, the imbalanced epitope distribution of curated TCR-epitope datasets and the potential exchangeability of TCR alpha and beta chains. Our results indicate that while extrapolation to unseen epitopes remains a difficult challenge, ImRex makes this feasible for a subset of epitopes that are not too dissimilar from the training data. We show that appropriate feature engineering methods and rigorous benchmark standards are required to create and validate TCR-epitope predictive models
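The pairwise feature representation described in the abstract above can be sketched as follows. This is a minimal illustration and not the ImRex code: it assumes a single property (the Kyte-Doolittle hydropathy index), the absolute difference as the pairwise operator, and zero-padding in the bottom-right corner up to a fixed 20 x 11 shape. The function names and the input strings are hypothetical.

```python
from typing import Dict, List

# Kyte-Doolittle hydropathy index, used here as one example physicochemical property.
HYDROPATHY: Dict[str, float] = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def interaction_map(cdr3: str, epitope: str,
                    scale: Dict[str, float] = HYDROPATHY) -> List[List[float]]:
    """Build a len(cdr3) x len(epitope) matrix of pairwise absolute differences."""
    return [[abs(scale[a] - scale[e]) for e in epitope] for a in cdr3]


def pad_map(matrix: List[List[float]], height: int = 20, width: int = 11,
            fill: float = 0.0) -> List[List[float]]:
    """Pad a map with `fill` so that maps of different sizes share one shape."""
    if len(matrix) > height or any(len(row) > width for row in matrix):
        raise ValueError("map exceeds the target shape")
    padded = [list(row) + [fill] * (width - len(row)) for row in matrix]
    padded += [[fill] * width for _ in range(height - len(matrix))]
    return padded


# Illustrative input strings only.
example = pad_map(interaction_map("CASSIRSSYEQYF", "GILGFVFTL"))
print(len(example), len(example[0]))  # 20 11
```

Repeating this for several properties and stacking the resulting matrices would give a multi-channel, image-like input of the kind a convolutional neural network expects.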
Data for "Current challenges for unseen-epitope TCR interaction prediction and a new perspective derived from image classification" (ImRex)
Repository containing the different experiments described in the manuscript titled: "Current challenges for epitope-agnostic TCR interaction prediction and a new perspective derived from image classification".
Publication DOI: TBA
Originally appeared as a preprint on bioRxiv: https://doi.org/10.1101/2019.12.18.880146.
Contains:
Trained model files (.h5)
Associated train and validation datasets for each model.
Learning curves and evaluation metrics.
Log files with training and data arguments (full training scripts are available in GitHub repository).
Comparisons between different models.
Complete raw and processed datasets (also available in the associated GitHub repository).
Please refer to the associated GitHub repository (https://github.com/pmoris/ImRex) for more information on the directory structure and contents, as well as the scripts that generated these output files.
Contents:
data.zip: contains raw and preprocessed datasets. READMEs in the subdirectories describe the data sources and preprocessing steps. Please refer to the associated GitHub repository for the specific scripts that generated these files. Note that the full training and test sets (i.e. containing both positive and negative examples) are stored separately for each model/CV iteration in the models archives.
models-main.zip: contains the trained models and evaluation metrics for the main experiments described in the bash and pbs scripts in ./src/scripts/hpc_scripts. Log files for the experiments outlined here can be found in ./src/scripts/hpc_scripts.
models-full.zip: contains models that were trained on the complete VDJdb dataset without cross-validation, filtered to human TRB data, excluding 10x data, and restricted to 10-20 (CDR3) and 8-11 (epitope) amino acid residues, with negatives that were generated by shuffling (i.e. sampling a negative epitope for each positive CDR3 sequence). One set of models uses downsampling to reduce the most abundant epitopes down to 400 pairs each; the other uses no downsampling. These models were also used for evaluating on the external Adaptive dataset, as outlined in ./src/scripts/evaluate/evaluate_adaptive.sh, and the TRA subset of sequences (./src/scripts/evaluate/evaluate_tra.sh).
models-decoyfit.zip: contains models that were trained on true data, but evaluated on data where epitopes were replaced by decoys.
models-padded-epitoperatio.zip: contains a quick test of trained models (padded/interaction map) that use a different type of negative shuffling, see docstrings in ./src/processing/negative_sampler.py for more info.
models-repeat-local.zip: contains a number of repeated runs from models-main, used to estimate variability in model performance for multiple identical runs.
comparisons.zip: contains comparison directories, each consisting of two or more model output directories, that contrast the performance metrics of the models. These outputs were generated with the ./src/scripts/evaluate/visualize.py script, or with the one-liners in ./src/scripts/evaluate/visualise.sh, which can operate on the entire comparisons directory at once.
Note that any file paths described here are in reference to the associated GitHub repository (https://github.com/pmoris/ImRex).
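The length restriction and epitope downsampling described for models-full.zip can be sketched as below. This is a hedged illustration and not the repository's preprocessing script (see ./src/scripts/preprocessing/preprocess_vdjdb.py for that): the function names, the representation of the data as (CDR3, epitope) tuples and the random seed are assumptions made for this example.

```python
import random
from collections import defaultdict
from typing import List, Tuple

Pair = Tuple[str, str]  # (CDR3 sequence, epitope sequence)


def filter_by_length(pairs: List[Pair],
                     cdr3_range: Tuple[int, int] = (10, 20),
                     epitope_range: Tuple[int, int] = (8, 11)) -> List[Pair]:
    """Keep pairs whose CDR3 and epitope lengths fall inside the given inclusive ranges."""
    return [
        (cdr3, epitope)
        for cdr3, epitope in pairs
        if cdr3_range[0] <= len(cdr3) <= cdr3_range[1]
        and epitope_range[0] <= len(epitope) <= epitope_range[1]
    ]


def downsample_epitopes(pairs: List[Pair], max_per_epitope: int = 400,
                        seed: int = 42) -> List[Pair]:
    """Randomly cap the number of pairs per epitope at max_per_epitope."""
    rng = random.Random(seed)  # fixed seed (assumed) for reproducibility
    by_epitope = defaultdict(list)
    for cdr3, epitope in pairs:
        by_epitope[epitope].append((cdr3, epitope))

    kept: List[Pair] = []
    for group in by_epitope.values():
        if len(group) > max_per_epitope:
            group = rng.sample(group, max_per_epitope)
        kept.extend(group)
    return kept
```

Epitopes with fewer pairs than the cap are left untouched, so only the most abundant epitopes are reduced.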
Overview of different experiments:
Two main architectures were compared: the interaction map (or padded) CNN and a dual input CNN based on NetTCR (nettcr).
Two different cross-validation strategies were used: a 5x repeated 5-fold CV (repeated5fold) and an epitope-grouped CV (epitope_grouped).
The different dataset subsets are labelled as follows. Check the Makefile's preprocess-vdjdb-aug-2019 command (and the underlying script ./src/scripts/preprocessing/preprocess_vdjdb.py) for a more thorough overview of the different filtering options.
mhci: only MHCI class presented epitopes.
trb: only TRB CDR3 sequences.
tra: only TRA CDR3 sequences.
tratrb: both types of CDR3 sequences.
down: moderate downsampling of most abundant epitopes to 1000 pairs.
down400: strong downsampling of most abundant epitopes to 400 pairs.
decoy: decoy epitope data.
reg001: regularization factor 0.01 (only for padded/interaction type models, fixed value)
Two different methods of generating negative TCR-epitope pairs were used: shuffling of positive pairs, i.e. sampling a single epitope from the positive pairs for each CDR3 sequence (shuffle), and sampling CDR3s from a reference repertoire (negref).
The batch size is encoded in the label, e.g. b32 denotes a batch size of 32.
The learning rate was either 0.0001 (lre4) or 0.001 (lre3).
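The shuffle strategy for generating negative pairs, as described in the overview above, can be sketched like this. It is a simplified illustration rather than the code in ./src/processing/negative_sampler.py: the function name is hypothetical, and rejecting epitopes already known to pair with the CDR3 is an assumption of this sketch.

```python
import random
from typing import List, Tuple

Pair = Tuple[str, str]  # (CDR3 sequence, epitope sequence)


def shuffle_negatives(positives: List[Pair], seed: int = 0) -> List[Pair]:
    """Pair each positive CDR3 with one epitope sampled from the positive pairs.

    Sampling from the list of positive pairs (not from the set of unique
    epitopes) keeps the epitope frequencies of the positive data.
    """
    rng = random.Random(seed)
    known = set(positives)
    epitope_pool = [epitope for _, epitope in positives]

    negatives: List[Pair] = []
    for cdr3, _ in positives:
        # Assumption: exclude epitopes that are known binders of this CDR3.
        candidates = [e for e in epitope_pool if (cdr3, e) not in known]
        if not candidates:
            continue  # no valid negative exists for this CDR3
        negatives.append((cdr3, rng.choice(candidates)))
    return negatives
```

This version scans the whole epitope pool for every CDR3, so it runs in quadratic time; it is meant to show the idea and is not tuned for datasets of VDJdb size. The negref alternative would instead keep the epitope and draw the CDR3 from a reference repertoire.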
Analysis of Wilms’ tumor protein 1 specific TCR repertoire in AML patients uncovers higher diversity in patients in remission than in relapsed
Abstract: The Wilms’ tumor protein 1 (WT1) is a well-known and prioritized tumor-associated antigen expressed in numerous solid and blood tumors. Its abundance and immunogenicity have led to the development of different WT1-specific immune therapies. The driving player in these therapies, the WT1-specific T-cell receptor (TCR) repertoire, has received much less attention. Importantly, T cells with high affinity against the WT1 self-antigen are normally eliminated after negative selection in the thymus and are thus rare in peripheral blood. Here, we developed computational models for the robust and fast identification of WT1-specific TCRs from TCR repertoire data. To this end, WT1 37-45 (WT1-37) and WT1 126-134 (WT1-126)-specific T cells were isolated from WT1 peptide-stimulated blood of healthy individuals. The TCR repertoire from these WT1-specific T cells was sequenced and used to train a pattern recognition model for the identification of WT1-specific TCR patterns for the WT1-37 or WT1-126 epitopes. The resulting computational models were applied to an independent published dataset from acute myeloid leukemia (AML) patients, treated with hematopoietic stem cell transplantation, to track WT1-specific TCRs in silico. Several WT1-specific TCRs were found in AML patients. Subsequent clustering analysis of all repertoires indicated the presence of more diverse TCR patterns within the WT1-specific TCR repertoires of AML patients in complete remission in contrast to relapsing patients. We demonstrate the possibility of tracking WT1-37 and WT1-126-specific TCRs directly from TCR repertoire data using computational methods, eliminating the need for additional blood samples and experiments for the two studied WT1 epitopes.