16 research outputs found

    Next-Generation TLC: A Quantitative Platform for Parallel Spotting and Imaging

    A high-throughput screening approach for simultaneous analysis and quantification of the percent conversion of up to 48 reactions has been developed using a thin-layer chromatography (TLC) imaging method. As a test-bed reaction, we monitored 48 thiol conjugate additions to a Meldrum's acid derivative (1) in parallel using TLC. The TLC elutions were imaged using a cell phone and a UV/vis light box constructed from LEGO bricks. Further, a spotting device was constructed from LEGO bricks that allows simple transfer of the samples from a well plate to the TLC plate. Using software that was developed to detect "blobs" and report their intensity, we were able to quantitatively determine the extent of completion of all 48 reactions in a single analysis.
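
The abstract does not say how blob intensities are converted into percent conversion. As a minimal sketch, assume each lane yields one integrated intensity for the product spot and one for the starting-material spot, and that both spots respond equally under the imaging conditions. The function name `percent_conversion` and the example intensities are hypothetical.

```python
def percent_conversion(product_intensity, reactant_intensity):
    """Estimate percent conversion for one TLC lane from two blob intensities.

    Assumption (not from the source): both spots respond equally under the
    imaging conditions, so intensity is proportional to amount.
    """
    total = product_intensity + reactant_intensity
    if total <= 0:
        raise ValueError("lane has no detectable signal")
    return 100.0 * product_intensity / total


# Hypothetical integrated intensities (product, reactant) for three lanes.
lanes = [(120.0, 40.0), (0.0, 75.0), (50.0, 50.0)]
conversions = [percent_conversion(p, r) for p, r in lanes]
print(conversions)  # [75.0, 0.0, 50.0]
```

In practice a calibration curve per compound would probably replace the equal-response assumption.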

    A theoretical justification for single molecule peptide sequencing.

    The proteomes of cells, tissues, and organisms reflect active cellular processes and change continuously in response to intracellular and extracellular cues. Deep, quantitative profiling of the proteome, especially if combined with mRNA and metabolite measurements, should provide an unprecedented view of cell state, better revealing functions and interactions of cell components. Molecular diagnostics and biomarker discovery should benefit particularly from the accurate quantification of proteomes, since complex diseases like cancer change protein abundances and modifications. Currently, shotgun mass spectrometry is the primary technology for high-throughput protein identification and quantification; while powerful, it lacks high sensitivity and coverage. We draw parallels with next-generation DNA sequencing and propose a strategy, termed fluorosequencing, for sequencing peptides in a complex protein sample at the level of single molecules. In the proposed approach, millions of individual fluorescently labeled peptides are visualized in parallel, monitoring changing patterns of fluorescence intensity as N-terminal amino acids are sequentially removed, and using the resulting fluorescence signatures (fluorosequences) to uniquely identify individual peptides. We introduce a theoretical foundation for fluorosequencing and, by using Monte Carlo computer simulations, we explore its feasibility, anticipate the most likely experimental errors, quantify their potential impact, and discuss the broad potential utility offered by a high-throughput peptide sequencing technology.

    Monte Carlo sampling reveals the confidence with which fluorosequences can be attributed to specific source proteins.

    <p>(A) and (B) represent two example fluorosequences, illustrating opposite extremes in terms of the number of proteins capable of yielding each sequence. In (A), the frequencies with which rival source proteins yield fluorosequence “xxxxExxKxK” in the Monte Carlo simulations indicate low confidence in attributing that fluorosequence to any one protein. In (B), a single protein is by far the most likely source of fluorosequence “EEEEExxKxK”. (X-axes represent incomplete lists of proteins, ordered by the frequencies with which they are observed to generate the given fluorosequence in the simulations.)</p>

    A simple example of the trie structure for storing and attributing fluorosequences to peptides or proteins.

    <p>Consider a toy peptide mixture with peptide X (sequence GK*EGC, where K* represents fluorescently labeled lysine; the sequence can be simplified to (K,2)) and peptide Y (GK*GK*EC; represented as (K,2),(K,4)). Panels (A) and (B) summarize populating the trie with fluorosequences from 500 copies each of peptides X and Y, respectively. For example, peptide X might generate fluorosequence xK*, incorporated into the trie as a new node (K,2), indicated by the dashed blue lines and arrows in panel (A). (B) Simulations on peptide Y add further nodes to the trie. For example, the fluorosequence xK*xK* yields an additional node (K,2),(K,4) after traversing node (K,2). Additional fluorosequences are incorporated into the trie in a similar fashion, along with a tally of the number of observations of each fluorosequence, stored for each trie node along with the source peptide identities. Following the Monte Carlo simulation, the frequency of each source protein or peptide can be calculated for each trie node. To simplify data analysis and visualization, thresholds can be applied (see <b><a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004080#sec013" target="_blank">Methods</a></b>) to identify and count those source proteins most confidently identified by the observed fluorosequences. Here, fluorosequences ((K,2),(K,5)) and ((K,2),(K,4)) confidently identify peptide Y, while peptide X is less confidently identified by fluorosequences (K,2) or (K,3).</p>
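
The trie described in this caption can be sketched roughly as follows, assuming a fluorosequence is represented as a tuple of (label, position) steps and each node keeps a per-source tally. The names `TrieNode`, `insert`, and `source_frequencies` are hypothetical, and the tallies are made up for illustration rather than taken from the simulations.

```python
from collections import Counter


class TrieNode:
    def __init__(self):
        self.children = {}        # (label, position) -> TrieNode
        self.counts = Counter()   # source peptide -> number of observations


def insert(root, fluoroseq, source):
    """Walk or extend the trie along `fluoroseq` and tally one observation."""
    node = root
    for step in fluoroseq:
        node = node.children.setdefault(step, TrieNode())
    node.counts[source] += 1


def source_frequencies(root, fluoroseq):
    """Return {source: fraction of observations} for one fluorosequence."""
    node = root
    for step in fluoroseq:
        node = node.children.get(step)
        if node is None:
            return {}
    total = sum(node.counts.values())
    return {src: n / total for src, n in node.counts.items()} if total else {}


root = TrieNode()
# Hypothetical tallies, not simulation results from the paper.
for _ in range(3):
    insert(root, (("K", 2),), "X")
insert(root, (("K", 2),), "Y")            # e.g. Y observed with one label lost
insert(root, (("K", 2), ("K", 4)), "Y")

print(source_frequencies(root, (("K", 2),)))            # {'X': 0.75, 'Y': 0.25}
print(source_frequencies(root, (("K", 2), ("K", 4))))   # {'Y': 1.0}
```

Because shared prefixes are stored once, a node such as (K,2) serves both as an endpoint for its own tallies and as the path to longer fluorosequences.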

    A strategy for single-molecule peptide sequencing.

    <p>Proteins are extracted and digested into peptides by a sequence-specific endo-peptidase. All occurrences of particular amino acids are selectively labeled by fluorescent dyes (e.g., yellow for tyrosine, green for tryptophan, and blue for lysine residues), and the peptides are surface immobilized for single-molecule imaging (e.g., by anchoring <i>via</i> cysteine). The peptides are subjected to cycles of Edman degradation; in each cycle, a fluorescent Edman reagent (pink trace) couples to and removes the most N-terminal amino acid. A stepwise drop in fluorescence intensity indicates when a labeled amino acid is removed; combined with the Edman cycle completion signal, this gives the resulting <i>fluorosequence</i> (e.g., “WKKxY…”). Matching this partial sequence to a reference protein database identifies the peptide.</p>
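
As a rough illustration of the encoding step, the sketch below maps a peptide to its ideal, error-free fluorosequence: labeled residues keep their one-letter code and all other positions become "x". The function name `fluorosequence` and the example peptides are hypothetical.

```python
def fluorosequence(peptide, labeled, cycles=None):
    """Encode a peptide as an ideal (error-free) fluorosequence string.

    Labeled residues keep their one-letter code; every other position is 'x'.
    `cycles` optionally limits the read to the first N Edman cycles.
    """
    n = len(peptide) if cycles is None else min(cycles, len(peptide))
    return "".join(aa if aa in labeled else "x" for aa in peptide[:n])


# Hypothetical peptide with Trp, Lys, and Tyr labeled.
print(fluorosequence("WKKAYG", labeled={"W", "K", "Y"}))  # WKKxYx
# Only lysine labeled, two Edman cycles observed.
print(fluorosequence("GKEGC", labeled={"K"}, cycles=2))   # xK
```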

    Simulations of ideal experimental conditions suggest relatively simple labeling schemes are sufficient to identify most proteins in the human proteome.

    <p>Each curve summarizes the fraction of human proteins uniquely identified by at least one peptide as a function of the number of sequential experimental cycles (a paired Edman degradation reaction and TIRF observation). Here, we consider peptides generated by different proteases (<i>e.g.</i>, Glu represents cleavage C-terminal to glutamic acid residues by GluC; Met represents cleavage after methionine residues by cyanogen bromide) and under different labeling schemes (<i>e.g.</i>, Lys + Tyr indicates Lys and Tyr selectively labeled with two distinguishable fluorophores; Asp/Glu indicates both residues labeled with identical fluorophores). Peptides are immobilized as indicated, with Cys representing anchoring by cysteines (thus, only cysteine-containing peptides are sequenced) and C-term representing anchoring by C-terminal amino acids. Increasing the number of distinct label types improves identification up to 80% within only 20 experimental cycles even when only Cys-containing peptides are sequenced; near-total proteome coverage is theoretically achievable when cyanogen bromide-generated peptides are anchored by their C-termini and labeled by a combination of four different fluorophores. Cycle numbers denote upper bounds, since each fluorosequence is not allowed to proceed past the anchoring residue (cysteine or C-terminus). Note also that the peptide length distributions change depending on the enzyme used for cleavage, with median lengths of 26 amino acids for cyanogen bromide, 8 for GluC, and 10 for trypsin digests.</p>
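
The quantity plotted here, the fraction of proteins uniquely identified by at least one peptide, can be sketched on a toy proteome as follows. This is a simplified illustration under stated assumptions, not the authors' code: cleavage is modeled as a cut after every listed residue, unlabeled trailing positions are dropped, label-free reads are ignored, and the read is not truncated at the anchoring residue. All names and sequences are hypothetical.

```python
from collections import defaultdict


def digest(protein, cleave_after):
    """Split a protein C-terminal to every residue in `cleave_after`."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa in cleave_after:
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


def encode(peptide, labeled, cycles):
    """Ideal fluorosequence over `cycles` cycles, trailing unlabeled positions dropped."""
    seq = "".join(aa if aa in labeled else "x" for aa in peptide[:cycles])
    return seq.rstrip("x")


def fraction_uniquely_identified(proteome, cleave_after, labeled, cycles, anchor="C"):
    owners = defaultdict(set)  # fluorosequence -> proteins that can produce it
    for name, seq in proteome.items():
        for pep in digest(seq, cleave_after):
            if anchor not in pep:
                continue  # peptide cannot be immobilized, so it is never read
            fs = encode(pep, labeled, cycles)
            if fs:  # skip reads with no labeled residue
                owners[fs].add(name)
    unique = {next(iter(names)) for names in owners.values() if len(names) == 1}
    return len(unique) / len(proteome)


# Hypothetical four-protein "proteome": GluC-like cleavage, Lys labeled, Cys anchor.
toy = {"P1": "GKECAAKCE", "P2": "GKECGGE", "P3": "AKKCE", "P4": "CAGKE"}
print(fraction_uniquely_identified(toy, cleave_after={"E"}, labeled={"K"}, cycles=6))  # 0.25
```

In this toy case P1 and P4 both yield "xxxK" and so remain ambiguous, P2 yields no labeled read, and only P3 is uniquely identified.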

    Typical proteolytic peptides have counts of labelable amino acids sufficiently low to sequence.

    <p>Frequency histograms of amino acids in <i>in silico</i> proteolytic peptides for lysine <b>(A)</b>, tyrosine <b>(B)</b>, tryptophan <b>(C)</b>, and glutamic acid/aspartic acid <b>(D)</b> indicate low median values. Peptide sequences in A–C were generated <i>in silico</i> from the human proteome by GluC digestion, and those in D by cyanogen bromide digestion. Low counts of labelable amino acids per peptide are expected to increase the ability to discriminate removal of one fluorophore amongst many on a peptide.</p>

    Overview of a Monte Carlo simulation of fluorosequencing with errors.

    <p>In detail, protein sequences are read as amino-acid character strings from the UniProt database. For each protein sequence, the subsequent steps are repeated: proteolysis is simulated, and peptides lacking the residue for surface attachment (e.g., cysteine) are discarded. All remaining peptides are encoded as fluorosequences, and the subsequent steps are repeated in accordance with the desired sampling depth. The fluorosequences are altered <i>via</i> random functions modeling experimental errors: (1) labels are removed, modeling failed fluorophores or failed fluorophore attachment; (2) positions of the remaining labels are randomly dilated, modeling Edman reaction failures; and (3) fluorophores are shifted upstream from their positions, modeling photobleaching. Each resulting fluorosequence is sorted based on its position and label type and merged into a prefix trie to tally the frequencies of observing each fluorosequence from a given source protein.</p>
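
One way the three error models might be sketched for a single peptide is shown below. This is an assumption-laden illustration rather than the authors' implementation: errors are drawn independently per label and per cycle, a bleached dye is recorded as a drop at the cycle in which it goes dark, and the error rates in the example are illustrative values only. The function name `simulate_read` is hypothetical.

```python
import random
from collections import Counter


def simulate_read(labels, n_cycles, p_label_fail, p_edman_fail, p_bleach, rng):
    """Return one error-containing fluorosequence as a tuple of (label, cycle)."""
    # (1) Failed or unattached fluorophores never produce a signal.
    live = [(lab, pos) for lab, pos in labels if rng.random() >= p_label_fail]
    observed = []
    removed = 0  # number of residues actually cleaved so far
    for cycle in range(1, n_cycles + 1):
        # (3) Photobleaching: a dye may go dark before its residue is removed,
        # so its intensity drop is seen upstream of the true position.
        surviving = []
        for lab, pos in live:
            if rng.random() < p_bleach:
                observed.append((lab, cycle))
            else:
                surviving.append((lab, pos))
        live = surviving
        # (2) Edman failure: no residue is removed this cycle, which delays
        # (dilates) the observed positions of all remaining labels.
        if rng.random() >= p_edman_fail:
            removed += 1
            observed.extend((lab, cycle) for lab, pos in live if pos == removed)
            live = [(lab, pos) for lab, pos in live if pos != removed]
    # Sort by position, then label type, before merging into a trie or tally.
    return tuple(sorted(observed, key=lambda s: (s[1], s[0])))


rng = random.Random(0)
peptide_y = [("K", 2), ("K", 4)]  # toy peptide with labeled lysines at 2 and 4
tally = Counter(
    simulate_read(peptide_y, 6, p_label_fail=0.15, p_edman_fail=0.06,
                  p_bleach=0.01, rng=rng)
    for _ in range(500)
)
print(tally.most_common(3))
```

With all three error rates set to zero the function returns the ideal fluorosequence, which gives a simple sanity check.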

    Surface plots illustrate the consequences of differing Edman efficiencies, photobleaching rates, and fluorophore failure rates.

    <p>Each panel summarizes the consequences of varying rates of photobleaching and Edman failures for a different fixed fluorophore failure rate, ranging from 0% to 25%, as calculated after simulating 30 experimental cycles on the complete human proteome at a simulation depth of 10,000 copies per protein. Photobleaching shows the strongest negative impact on proteome coverage when compared to other errors; increasing the number of distinguishable labels strongly increases proteome coverage. Labeling and immobilization schemes are denoted as in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004080#pcbi.1004080.g002" target="_blank">Fig. 2</a>. For comparison, literature evidence suggests that common failure rates of fluorophores may be about 15–20% [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004080#pcbi.1004080.ref018" target="_blank">18</a>,<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004080#pcbi.1004080.ref032" target="_blank">32</a>], Edman degradation proceeds with about 94% efficiency [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004080#pcbi.1004080.ref033" target="_blank">33</a>], and the mean photobleaching lifetime of a typical Atto680 dye is about 30 minutes [<a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1004080#pcbi.1004080.ref023" target="_blank">23</a>], corresponding to 1800 Edman cycles, assuming 1 sec exposure per Edman cycle. Thus, we expect error rates to be sufficiently low for effective fluorosequencing.</p>
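
The photobleaching arithmetic in this caption can be checked with a short calculation. The lifetime and exposure time come from the caption; treating photobleaching as a simple exponential decay is an added assumption, used here only to turn the lifetime into a per-cycle probability.

```python
import math

mean_lifetime_s = 30 * 60   # ~30 min mean photobleaching lifetime (from the caption)
exposure_s = 1.0            # 1 s of exposure per Edman cycle (from the caption)

# Number of Edman cycles that fit into one mean lifetime.
cycles_per_lifetime = mean_lifetime_s / exposure_s
print(cycles_per_lifetime)  # 1800.0

# Assumption: exponential decay, so the chance that a dye bleaches within
# one cycle is 1 - exp(-exposure / lifetime).
p_bleach_per_cycle = 1 - math.exp(-exposure_s / mean_lifetime_s)

# Probability that a single dye survives all 30 simulated cycles.
survival_30_cycles = math.exp(-30 * exposure_s / mean_lifetime_s)
print(p_bleach_per_cycle, survival_30_cycles)
```

Under this assumption the per-cycle bleaching probability is roughly 1/1800, and a dye has about a 98% chance of surviving a 30-cycle experiment.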