Human TCRs and CDR3s sequenced from healthy volunteers and HIV-infected patients, before and after 14 weeks of therapy

Abstract

<p>We have employed an error-correcting, high-throughput transcript sequencing protocol to profile the impact of HIV infection upon the T-cell receptor (TCR) repertoire.</p>
<p>We have sequenced both alpha and beta chain TCR repertoires from 2.5 ml of peripheral blood for 16 antiretroviral therapy (ART)-naive HIV patients (de-identified reference numbers prefixed 'P0'). Two samples per patient were processed: one immediately before starting therapy ('v1') and another shortly after starting ('v2', after an average of 14 weeks). We additionally sequenced the repertoires of ten healthy volunteers ('HV'), taking two blood samples three months apart for four of those donors (HV01 to HV04) in order to compare TCR dynamics over a comparable time frame. One of these healthy donors (HV01) and a further healthy donor (HVD1) also gave a larger blood sample, which was separated into CD4+ and CD8+ T-cell populations by FACS, and their TCR repertoires were sequenced.</p>
<p>Having amplified and sequenced the TCR repertoires of these 100 samples, we identified their VJ recombinations using a modified version (called vDCR) of Decombinator, TCR analysis software designed in our lab. We then used random barcode sequences, introduced before amplification, to error- and frequency-correct our Decombinator assignations (DCRs), before translating them and extracting their complementarity determining region 3 (CDR3) sequences.</p>
<p>Here we present the results of these analyses. Each line of the .dcrcdr3 files in this fileset holds a unique DCR assignation, the CDR3 sequence it encodes, and the frequency with which that TCR appeared in the data following error-correction. Each line follows the format:</p>
<p>'V, J, Vdel, Jdel, insert: CDR3, freq'</p>
<p>(V = V gene, J = J gene, Vdel = number of deletions from V, Jdel = number of deletions from J, insert = string of nucleotides from the end of the deleted V to the start of the deleted J, CDR3 = translated CDR3 sequence, from the second conserved cysteine residue in the V to the conserved phenylalanine of the FGXG motif of the J, freq = error-corrected frequency of that assignation.)</p>
<p>The raw FASTQ sequence files from which these TCRs were extracted are available in the Sequence Read Archive (SRA) under the Study accession number SRP045430. The AccessionKey.xls spreadsheet cross-references all filenames with their corresponding individual SRA accession numbers.</p>
<p>The Python scripts used to generate these data from the raw FASTQ data are also available on figshare (see links below).</p>
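<p>The line format described above can be read with a few lines of Python. The following is a minimal sketch rather than part of the published scripts: the names DcrCdr3Record and parse_dcrcdr3_line are hypothetical, the example line is invented for illustration and is not taken from the fileset, and it is an assumption that Vdel, Jdel and freq are stored as integers and that V and J can be kept as plain strings.</p>

```python
from typing import NamedTuple


class DcrCdr3Record(NamedTuple):
    """One line of a .dcrcdr3 file (hypothetical container, not from the published scripts)."""
    v: str        # V gene identifier, kept as text (assumption)
    j: str        # J gene identifier, kept as text (assumption)
    v_del: int    # number of deletions from the V gene
    j_del: int    # number of deletions from the J gene
    insert: str   # nucleotides between the deleted V and the deleted J (may be empty)
    cdr3: str     # translated CDR3 amino acid sequence
    freq: int     # error-corrected frequency (assumed to be an integer count)


def parse_dcrcdr3_line(line: str) -> DcrCdr3Record:
    """Parse a line of the form 'V, J, Vdel, Jdel, insert: CDR3, freq'."""
    # The colon separates the five-field DCR assignation from 'CDR3, freq'.
    dcr_part, colon, rest = line.strip().partition(":")
    if not colon:
        raise ValueError(f"no ':' separator found in line: {line!r}")

    # Split the assignation into exactly five comma-separated fields.
    fields = [field.strip() for field in dcr_part.split(",", 4)]
    if len(fields) != 5:
        raise ValueError(f"expected 5 assignation fields, got {len(fields)}: {line!r}")
    v, j, v_del, j_del, insert = fields

    # The frequency is the last comma-separated value after the colon.
    cdr3, comma, freq = rest.rpartition(",")
    if not comma:
        raise ValueError(f"no ', freq' field found in line: {line!r}")

    return DcrCdr3Record(v, j, int(v_del), int(j_del), insert, cdr3.strip(), int(freq))


# Invented example line, for illustration only.
example = "42, 7, 3, 5, ACGT: CASSLGQGNTEAFF, 12"
record = parse_dcrcdr3_line(example)
print(record.cdr3, record.freq)
```

<p>Splitting on the colon first keeps the parser tolerant of an empty insert field, which would otherwise leave nothing between the fourth comma and the colon.</p>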
