<p>(A) The Illumina sequencer outputs FASTQ files that are separated by barcode. For each of these files, the portion of DNA corresponding to the displayed peptides was isolated and translated. The number of times each sequence was read in a run was summed to obtain the frequency of that sequence, which was then divided by the total number of reads from the run and subsequently by the frequency of that sequence in the reference library. This processing yielded a normalized frequency for each sequence in a run. (B) Sequences present in one screen but absent from another were assigned the non-zero mode of the screen from which they were absent, rather than zero, to prevent later division by zero. Normalized frequencies were averaged across all positive screens and, separately, across all negative screens. The average positive normalized frequency was divided by the average negative normalized frequency, and this ratio was used to sort the sequences so that sequences high across positive screens and low across negative screens rose to the top of the list. The sequences, ordered by ratio, formed the rows of the comparison matrix, which shows the normalized frequency of each sequence in every screen and facilitates identification of the most selective sequences. * Ph.D. libraries from NEB are generated with constrained codons. When one of these libraries is used, sequences containing codons not represented in the library are removed.</p
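<p>The normalization in (A) and the ratio sorting in (B) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names (<code>normalized_frequencies</code>, <code>nonzero_mode</code>, <code>rank_sequences</code>) and the <code>reference_freq</code> mapping are hypothetical. It also assumes that peptides missing from the reference library are skipped, that the non-zero mode is taken over a screen's normalized frequencies with ties resolved to the smallest value, and that equal ratios are ordered alphabetically.</p>

```python
from collections import Counter


def normalized_frequencies(peptides, reference_freq):
    """Panel A: count reads per peptide, divide by total reads in the run,
    then by the peptide's frequency in the reference library."""
    counts = Counter(peptides)
    total = sum(counts.values())
    # Assumption: peptides not found in the reference library are skipped.
    return {
        seq: (n / total) / reference_freq[seq]
        for seq, n in counts.items()
        if seq in reference_freq
    }


def nonzero_mode(values):
    """Most common non-zero value; ties go to the smallest value (assumption)."""
    tally = Counter(v for v in values if v > 0)
    top = max(tally.values())
    return min(v for v, n in tally.items() if n == top)


def rank_sequences(positive, negative):
    """Panel B: fill absent sequences with each screen's non-zero mode,
    average positive and negative screens, and sort by their ratio.

    `positive` and `negative` are lists of dicts, one dict per screen,
    mapping peptide sequence to normalized frequency."""
    screens = positive + negative
    fill = [nonzero_mode(s.values()) for s in screens]
    all_seqs = set().union(*screens)

    rows = []
    for seq in all_seqs:
        # One row of the comparison matrix: positive screens, then negative.
        row = [s.get(seq, f) for s, f in zip(screens, fill)]
        pos, neg = row[:len(positive)], row[len(positive):]
        ratio = (sum(pos) / len(pos)) / (sum(neg) / len(neg))
        rows.append((seq, ratio, row))

    # Highest positive/negative ratio first; alphabetical order breaks ties.
    rows.sort(key=lambda r: (-r[1], r[0]))
    return rows
```

<p>Each returned tuple holds a sequence, its positive-to-negative ratio, and its row of the comparison matrix, so the first entries are the candidates that are high across positive screens and low across negative screens.</p>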