
    Regular Patterns for Proteome-Wide Distribution of Protein Abundance across Species

    A proteome of a bio-entity, whether a cell, tissue, organ, or organism, consists of proteins of diverse abundance. The principle that determines the abundance of different proteins in a proteome is of fundamental significance for an understanding of the building blocks of the bio-entity. Here, we report three regular patterns in the proteome-wide distribution of protein abundance across species such as human, mouse, fly, worm, yeast, and bacteria: in most cases, protein abundance is positively correlated with the protein's origination time or sequence conservation during evolution; it is negatively correlated with the protein's domain number and positively correlated with domain coverage in protein structure, and the correlations became stronger during the course of evolution; protein abundance can be further stratified by the function of the protein, whereby proteins that act on material conversion and transportation (mass category) are more abundant than those that act on information modulation (information category). Thus, protein abundance is intrinsically related to the protein's inherent characteristics of evolution, structure, and function.

    Proteome-wide correlation of proteins' abundance with their functional categorization across six species.

    <p>Abundance distribution of proteins in the mass and the information categories was compared by cumulative curves in <i>H. sapiens</i> (<b>A</b>), <i>M. musculus</i> (<b>B</b>), <i>D. melanogaster</i> (<b>C</b>), <i>C. elegans</i> (<b>D</b>), <i>S. cerevisiae</i> (<b>E</b>), and <i>E. coli</i> (<b>F</b>). Stratified comparison: mass <i>vs.</i> information processing activities in <i>H. sapiens</i> (<b>G</b>) and <i>S. cerevisiae</i> (<b>H</b>); metabolism subclasses in <i>H. sapiens</i> (<b>I</b>) and <i>S. cerevisiae</i> (<b>J</b>). Comparison among biogenesis machines of three bio-molecules in <i>H. sapiens</i> (<b>K</b>), <i>M. musculus</i> (<b>L</b>), <i>D. melanogaster</i> (<b>M</b>), <i>C. elegans</i> (<b>N</b>), <i>S. cerevisiae</i> (<b>O</b>), and <i>E. coli</i> (<b>P</b>).</p>

    Proteome-wide correlation of proteins' abundance with their origin time across six species.

    <p>The relationship between origin time and abundance of proteins in <i>H. sapiens</i> (<b>A</b>), <i>M. musculus</i> (<b>B</b>), <i>D. melanogaster</i> (<b>C</b>), <i>C. elegans</i> (<b>D</b>), <i>S. cerevisiae</i> (<b>E</b>), and <i>E. coli</i> (<b>F</b>) was analyzed by the Spearman rank correlation method. Protein origin times are categorized according to the data in the OrthoMCL database. For (A)–(E), I, <1 Gya; II, 1–1.58 Gya; III, 1.58–1.84 Gya; IV, 1.84–2.23 Gya; V, 2.23–4 Gya; VI, >4 Gya. For (F), I, <2.6 Gya; II, 2.6–4 Gya; III, >4 Gya. <i>R</i> represents the Spearman rank correlation coefficient and <i>P</i> represents its <i>P</i>-value. The upper and lower quartiles are indicated as the upper and lower edges of the box, and the median is indicated as a red bar in the box. The maximum whisker length is set to 1.5, which means points are drawn as outliers (dotted individually outside the bars) if they are larger than q3 + 1.5 × (q3 − q1) (shown as the upper bar) or smaller than q1 − 1.5 × (q3 − q1) (shown as the lower bar), where q1 and q3 are the 25th and 75th percentiles, respectively. (<b><i>SCIN</i></b>: Spectral Count Index Normalized (11); <b><i>SCI</i></b>: Spectral Count Index (11); <b><i>NSAF</i></b>: Normalized Spectral Abundance Factor (12)).</p>
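
    The outlier rule in this caption can be illustrated with a minimal Python sketch. It assumes quartiles are computed with the standard library's inclusive interpolation; the caption does not say which percentile method the authors used, so values near the fences may differ from theirs. The function names `tukey_fences` and `outliers` are hypothetical and chosen only for illustration.

```python
from statistics import quantiles


def tukey_fences(values, whisker=1.5):
    """Return (lower_fence, upper_fence) for a box plot.

    Assumption: quartiles use the 'inclusive' interpolation method;
    the original analysis may have used a different percentile rule.
    """
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1  # interquartile range, q3 - q1
    return q1 - whisker * iqr, q3 + whisker * iqr


def outliers(values, whisker=1.5):
    """Return the points that fall outside the whisker fences."""
    low, high = tukey_fences(values, whisker)
    return [v for v in values if v < low or v > high]


if __name__ == "__main__":
    # Hypothetical abundance-like values, not data from the paper.
    sample = [1, 2, 3, 4, 5, 6, 7, 8, 100]
    print(tukey_fences(sample))
    print(outliers(sample))
```

    With a whisker length of 1.5, a point counts as an outlier only when it lies more than 1.5 interquartile ranges beyond the box edges, which matches the two inequalities given in the caption.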

    Proteome-wide correlation of proteins' abundance with their sequence conservation across six species.

    <p>The abundance of a given <i>H. sapiens</i> protein was plotted against the sequence similarity of its ortholog: <i>H. sapiens vs. E. coli</i> (<b>A1</b>), <i>S. cerevisiae</i> (<b>A2</b>), <i>C. elegans</i> (<b>A3</b>), <i>D. melanogaster</i> (<b>A4</b>), and <i>M. musculus</i> (<b>A5</b>), respectively. The next five rows are for <i>M. musculus</i> (<b>B1–5</b>), <i>D. melanogaster</i> (<b>C1–5</b>), <i>C. elegans</i> (<b>D1–5</b>), <i>S. cerevisiae</i> (<b>E1–5</b>), and <i>E. coli</i> (<b>F1–5</b>), respectively. Medians of equal-sized bins are indicated as crosses; whiskers encompass the range from 25% to 75% of values. The orthologs of each pair of species are indicated as dots in the background. The correlation coefficients (<i>R</i>) between sequence conservation and abundance are shown in the inset.</p>

    Proteome-wide correlation between proteins' abundance and their domain characters.

    <p>Three parameters, namely, domain number (<i>DN</i>) (<b>A</b>), domain coverage (<i>DC</i>) (<b>B</b>), and <i>DC</i>/<i>DN</i> (<b>C</b>), were employed in the analyses. <i>R</i> represents the Spearman rank correlation coefficient and <i>P</i> represents its <i>P</i>-value. Medians are indicated as black dots (<b>A</b>) or crosses (<b>B</b>, <b>C</b>), and whiskers encompass the range from 25% to 75% of values.</p>