146 research outputs found

    Noun incorporation in Ainu

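    Ainu has verb forms that incorporate nouns. Previous studies have already pointed out that Ainu shows subject incorporation with transitive verbs, object incorporation with transitive verbs, and subject incorporation with intransitive verbs; that the semantic role of the incorporated noun may be the theme of a transitive verb, the theme of an intransitive verb, or, by means of an applicative prefix, an instrument or a location; and that in noun + intransitive verb combinations the verb is unaccusative. Building on this, the present paper argues that when a noun is incorporated into a verb and the result is a one-place verb taking a single noun phrase, the noun that fills the remaining slot may bear the semantic role of agent, theme, or possessor (of the incorporated noun), and is sometimes not an argument of the base form at all, but in every case it is marked nominative and functions as the subject. The paper further argues that when a one-place verb incorporates a noun without taking a causative suffix, the verb is an unaccusative verb whose subject is the theme, whether the noun is incorporated directly into the one-place verb or is incorporated after an applicative prefix has been attached to the verb.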

    Correspondence between codon-usage in highly expressed genes and tRNA gene copy number

    Copyright information: Taken from "Codon usage suggests that translational selection has a major impact on protein expression in trypanosomatids", http://www.biomedcentral.com/1471-2164/9/2. BMC Genomics 2008;9():2-2. Published online 3 Jan 2008. PMCID: PMC2217535. Correspondence between amino acid frequency and cognate tRNA gene copy number in ; patterns were broadly similar in and (data not shown). Correspondence between synonymous codon usage (lower charts with black bars) and cognate tRNA gene copy number (upper charts with grey bars). The GA3 codon-pairs with >30% bias in all three trypanosomatids are shown

    Systematic Analysis of Compositional Order of Proteins Reveals New Characteristics of Biological Functions and a Universal Correlate of Macroevolution

    We present a novel analysis of compositional order (CO) based on the occurrence of Frequent amino-acid Triplets (FTs) that appear much more than random in protein sequences. The method captures all types of proteomic compositional order including single amino-acid runs, tandem repeats, periodic structure of motifs and otherwise low complexity amino-acid regions. We introduce new order measures, distinguishing between ‘regularity’, ‘periodicity’ and ‘vocabulary’, to quantify these phenomena and to facilitate the identification of evolutionary effects. Detailed analysis of representative species across the tree-of-life demonstrates that CO proteins exhibit numerous functional enrichments, including a wide repertoire of particular patterns of dependencies on regularity and periodicity. Comparison between human and mouse proteomes further reveals the interplay of CO with evolutionary trends, such as faster substitution rate in mouse leading to decrease of periodicity, while innovation along the human lineage leads to larger regularity. Large-scale analysis of 94 proteomes leads to systematic ordering of all major taxonomic groups according to FT-vocabulary size. This is measured by the count of Different Frequent Triplets (DFT) in proteomes. The latter provides a clear hierarchical delineation of vertebrates, invertebrates, plants, fungi and prokaryotes, with thermophiles showing the lowest level of FT-vocabulary. Among eukaryotes, this ordering correlates with phylogenetic proximity. Interestingly, in all kingdoms CO accumulation in the proteome has universal characteristics. We suggest that CO is a genomic-information correlate of both macroevolution and various protein functions. The results indicate a mechanism of genomic ‘innovation’ at the peptide level, involved in protein elongation, shaped in a universal manner by mutational and selective forces.
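The triplet counting behind the FT and DFT measures can be illustrated with a minimal sketch. The abstract does not give the exact definition of a frequent triplet, so the code assumes one: a triplet counts as frequent when it occurs at least `n` times among the overlapping three-residue windows of a single protein. The function names and the default `n = 5` are illustrative choices, not the authors' implementation.

```python
from collections import Counter


def frequent_triplets(sequence, n=5):
    """Return {triplet: count} for triplets seen at least n times.

    Assumption (not from the abstract): windows overlap and the cutoff
    is a plain occurrence count within one protein sequence.
    """
    windows = (sequence[i:i + 3] for i in range(len(sequence) - 2))
    counts = Counter(windows)
    return {triplet: c for triplet, c in counts.items() if c >= n}


def different_frequent_triplets(sequence, n=5):
    """Count the distinct frequent triplets (a DFT-like number)."""
    return len(frequent_triplets(sequence, n))


# A single amino-acid run: "QQQ" fills all six windows of an 8-residue run.
run = "Q" * 8
# A two-residue tandem repeat: "ACA" and "CAC" each occur five times.
repeat = "AC" * 6

print(frequent_triplets(run))               # {'QQQ': 6}
print(different_frequent_triplets(repeat))  # 2
```

Under this assumed definition, single amino-acid runs and short tandem repeats both surface as frequent triplets, which matches the kinds of order the abstract says the method captures.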

    Frequent Triplets – Theory and simulation.

    Expected values of Frequent Triplets (FTs) in random proteins as function of sequence length. Length range is up to 35,000 amino-acids, approximately the length of the longest proteins found among the proteomes of the 94 species studied (TITIN in human, and beta-helical in Chlorobium). A) Blue curve is the theoretical expected value given by the Bernoulli probability, for n = 5. Dark circles are the corresponding results of a numerical search of triplets showing perfect match to the theoretical estimation. Red circles are the numerical results for restrictive FTs defined by n = 5 and M = 2000. Inset: same data is shown up to L = 8000 for clarity. Additional black curves represent the theoretical estimation for n = 4–6. B) P-value for FT misidentification as function of length on log-scale. C) Length distribution of human proteins showing log-normal characteristics. Length of CO proteins is right-shifted (see also Text S1, section 3 (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003346#pcbi.1003346.s026), and figure S6d (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003346#pcbi.1003346.s006)). Further analysis based on a human “unigram” reference model is provided in Text S1, sections 1 and 2, where the few very long proteins are analyzed in detail.
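One plausible reading of the "Bernoulli probability" curve in panel A can be sketched as a binomial tail. The caption does not spell out the formula, so everything below is an assumption: residues are drawn uniformly from 20 amino acids, each of the L − 2 overlapping windows is treated as an independent trial with success probability 1/8000, and a triplet is frequent when it occurs at least `n` times. Overlapping windows are not truly independent, so this is an approximation for illustration only.

```python
from math import comb

AMINO_ACIDS = 20
TRIPLETS = AMINO_ACIDS ** 3  # 8000 possible triplets


def prob_at_least(n, length):
    """P(a given triplet occurs >= n times) under the assumed binomial model."""
    windows = max(length - 2, 0)
    p = 1.0 / TRIPLETS
    # Sum the probabilities of seeing fewer than n occurrences.
    below = sum(
        comb(windows, k) * p**k * (1.0 - p) ** (windows - k)
        for k in range(min(n, windows + 1))
    )
    # Clamp tiny negative values caused by floating-point rounding.
    return max(0.0, 1.0 - below)


def expected_frequent_triplets(length, n=5):
    """Expected number of distinct frequent triplets in a random protein."""
    return TRIPLETS * prob_at_least(n, length)


for length in (1000, 8000, 35000):
    print(length, expected_frequent_triplets(length))
```

In this model the expectation stays far below one triplet for proteins of typical length and only becomes large for sequences of tens of thousands of residues.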

    List of 94 species.

    List of the 94 species distributed across the tree-of-life studied in the large-scale analysis and their taxonomic identities, Eukaryotes (1–39) and Prokaryotes (40–94). The ordering of species is according to the tree-of-life [47] (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003346#pcbi.1003346-Ciccarelli1). Within Eukaryotes, kingdoms are first ordered from Animalia to Plantae (P) to Fungi (F). Animalia are classified as vertebrates (V) and invertebrates (IV). Within each kingdom, ordering is according to the phylogenetic distance from the first species, i.e. Human within Animalia, A. thaliana within Plantae and Nectria within Fungi. Protista (PRT) are added at the end with no phylogenetic analysis. Bacteria are also ordered according to phylum as presented in [47], where within each phylum the ordering is according to DFT counts. Archaea are ordered by DFT counts. Mesophiles (M) and Thermophiles (T) are indicated.

    Universal dependence of RP and DFT on protein length.

    The relationship, on a log-log scale, between the CO measures RP, RC and DFT and protein length, L. The upper panel (A–C) displays human proteins, indicating strong correlation of RP (A) and DFT (C) but not RC (B) with L; ρ indicates the Pearson correlation coefficient. A clear linear boundary in RC is due to its lower bound 3/L. Linear regression analysis shows excellent power-law fits of the RP and DFT dependence on L. Data were binned into 50 equally spaced intervals along the y-axis. ‘X’ symbols denote the average of L in each bin; the error (SD) on the mean is about the size of the symbol and is therefore not shown. The blue line is the result of a linear regression fit. The middle panel (D–F) shows a superposition of RP-L data for all species (D) and the quality of its linear regression fits (E, F). Slopes increase from eukaryotes to prokaryotes (E), coupled with a decrease in the goodness of fit (F). The lower panel (G–I) shows the same type of analysis for the DFT-L dependence. Note that the slope trends are opposite. The ratio of the RP-L and DFT-L slopes is close to −1 in all species: it is −1.11±0.05 in eukaryotes. In prokaryotes, excluding 9 outliers, the ratio is −0.85±0.05.
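A power-law fit by linear regression on log-log axes, as the caption describes, can be sketched as follows. This is a generic ordinary least-squares fit on log10-transformed values, not the authors' pipeline: the binning into 50 intervals is omitted, and the function name and the synthetic data are illustrative assumptions.

```python
import math


def fit_power_law(lengths, values):
    """Fit values ~ prefactor * lengths**exponent by least squares in log space.

    Returns (exponent, prefactor). All inputs must be positive.
    """
    xs = [math.log10(length) for length in lengths]
    ys = [math.log10(value) for value in values]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    exponent = sxy / sxx                    # slope of the log-log line
    intercept = mean_y - exponent * mean_x  # log10 of the prefactor
    return exponent, 10 ** intercept


# Synthetic check: data generated from 2 * L**1.5 should recover both numbers.
lengths = [100, 1000, 10000]
values = [2 * length ** 1.5 for length in lengths]
exponent, prefactor = fit_power_law(lengths, values)
print(round(exponent, 6), round(prefactor, 6))
```

The fitted exponent corresponds to the slope of the regression line in the log-log plot, which is the quantity compared across species in panels E and H.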

    Functional enrichment in A. thaliana and S. cerevisiae.

    Similarly to figure 2 (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003346#pcbi-1003346-g002), functional enrichment in A. thaliana (A–C) and S. cerevisiae (D–F) is shown with respect to RC (black) or RP (red). Portions of cell wall genes (A, D) and extracellular related genes (C, F) become enriched as the RC threshold increases, while portions of response related genes (B, E) are enriched with RP in A. thaliana but with RC in yeast.

    Examples of compositional order and functional enrichment.

    Examples of selected functional groups with high CO in human. Based on Swiss-Prot records, the portions of each functional group in the entire proteome and within the CO set (i.e., proteins containing FTs) are given in numbers and percentages. Last columns indicate the average RC and RP, which should be compared with the overall mean values of 0.1 (RC) and 0.35 (RP) in the CO set (n = 5511).

    Typical Examples of proteins containing FTs.

    Typical examples of order patterns, as obtained by FT search in the human proteome. For each protein, Swiss-Prot entry name and main function is given in the first column, and then follow the protein length, the number of different frequent-triplets (DFT), the leading FTs, defined by the maximal number of occurrences of a FT, and the CO measures MFI, RC, RP. The leading FTs are highlighted within the protein sequence, displayed in the last column; in some cases they form runs of amino-acids (A–B), while in other cases they form large repetitive motifs of various purities (C–F). See Methods (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003346#s4) for more details.

    DFT enrichment in prokaryotes.

    DFT count and correlation C_IJ of the 55 studied prokaryotes. Bacteria are grouped into phyla which are ordered according to their phylogenetic distance, from firmicutes to proteobacteria, and within each phylum species are ordered by DFT counts. Archaea are ordered by DFT counts. Upper panel displays the heatmap of C_IJ, lower panel displays DFT counts (red points indicate thermophiles). Color scale is different from figure 6 (http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003346#pcbi-1003346-g006), in order to be able to trace trends which extend over several orders of magnitude. Abbreviations: Firmicutes (Firm); Actinobacteria (Act); Bacteroidetes (Bac); Chlamydiae (Ch); Cyanobacteria (Cya); Proteobacteria (Proto); Mesophiles (M); Thermophiles (T).