1,013 research outputs found

    Role of neutral evolution in word turnover during centuries of English word popularity

    Here, we test neutral models against the evolution of English word frequency and vocabulary at the corpus scale, as recorded in annual word frequencies from three centuries of English language books. Against these data, we test both static and dynamic predictions of two neutral models, including the relation between corpus size and vocabulary size, frequency distributions, and turnover within those frequency distributions. Although a commonly used neutral model fails to replicate all these emergent properties at once, we find that a modified two-stage neutral model does replicate the static and dynamic properties of the corpus data. This two-stage model is meant to represent a relatively small corpus of English books, analogous to a ‘canon’, sampled by an exponentially increasing corpus of books among the wider population of authors. More broadly, this model, a smaller neutral model within a larger neutral model, could represent those situations where mass attention is focused on a small subset of the cultural variants.
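To make the idea of a neutral model concrete, here is a minimal sketch of the generic one-stage random-copying model with innovation. It is not the authors' two-stage model. The function names and the parameters `n_tokens`, `mu`, `generations` and `seed` are assumptions chosen for illustration only.

```python
import random
from collections import Counter


def neutral_step(population, mu, next_id, rng):
    """One generation of random copying with innovation rate mu."""
    new_population = []
    for _ in range(len(population)):
        if rng.random() < mu:
            # Innovation: introduce a variant (word) never seen before.
            new_population.append(next_id)
            next_id += 1
        else:
            # Neutral copying: pick any token from the previous generation,
            # so a variant is copied in proportion to its current frequency.
            new_population.append(rng.choice(population))
    return new_population, next_id


def simulate(n_tokens=1000, mu=0.01, generations=200, seed=0):
    """Return variant counts after the given number of generations."""
    rng = random.Random(seed)
    population = list(range(n_tokens))  # start with all-distinct variants
    next_id = n_tokens
    for _ in range(generations):
        population, next_id = neutral_step(population, mu, next_id, rng)
    return Counter(population)


counts = simulate()
# Ranked frequencies, e.g. for comparison with a rank-frequency plot.
ranked = [count for _, count in counts.most_common()]
```

Under these assumptions the population size stays fixed while the set of variants turns over. Comparing `ranked` across generations gives a rough picture of the kind of frequency distribution and turnover statistics the abstract refers to.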

    The Logic of Fashion Cycles

    Many cultural traits exhibit volatile dynamics, commonly dubbed fashions or fads. Here we show that realistic fashion-like dynamics emerge spontaneously if individuals can copy others' preferences for cultural traits as well as traits themselves. We demonstrate these dynamics in simple mathematical models of the diffusion, and subsequent abandonment, of a single cultural trait which individuals may or may not prefer. We then simulate the coevolution between many cultural traits and the associated preferences, reproducing power-law frequency distributions of cultural traits (most traits are adopted by few individuals for a short time, and very few by many for a long time), as well as correlations between the rate of increase and the rate of decrease of traits (traits that increase rapidly in popularity are also abandoned quickly and vice versa). We also establish that alternative theories, that fashions result from individuals signaling their social status, or from individuals randomly copying each other, do not satisfactorily reproduce these empirical observations.

    Within-host microevolution of Streptococcus pneumoniae is rapid and adaptive during natural colonisation

    Genomic evolution, transmission and pathogenesis of Streptococcus pneumoniae, an opportunistic human-adapted pathogen, are driven principally by nasopharyngeal carriage. However, little is known about genomic changes during natural colonisation. Here, we use whole-genome sequencing to investigate within-host microevolution of naturally carried pneumococci in ninety-eight infants intensively sampled sequentially from birth until twelve months in a high-carriage African setting. We show that neutral evolution and nucleotide substitution rates up to forty-fold faster than observed over longer timescales in S. pneumoniae and other bacteria drive high within-host pneumococcal genetic diversity. Highly divergent co-existing strain variants emerge during colonisation episodes through real-time intra-host homologous recombination while the rest are co-transmitted or acquired independently during multiple colonisation episodes. Genic and intergenic parallel evolution occur particularly in antibiotic resistance, immune evasion and epithelial adhesion genes. Our findings suggest that within-host microevolution is rapid and adaptive during natural colonisation.

    Carriage Dynamics of Pneumococcal Serotypes in Naturally Colonized Infants in a Rural African Setting During the First Year of Life

    Streptococcus pneumoniae (the pneumococcus) carriage precedes invasive disease and influences population-wide strain dynamics, but limited data exist on temporal carriage patterns of serotypes due to the prohibitive costs of longitudinal studies. Here, we report carriage prevalence, clearance and acquisition rates of pneumococcal serotypes sampled from newborn infants bi-weekly from weeks 1 to 27, and then bi-monthly from weeks 35 to 52 in the Gambia. We used sweep latex agglutination and whole genome sequencing to serotype the isolates. We show rapid pneumococcal acquisition with nearly 31% of the infants colonized by the end of the first week after birth and quickly exceeding 95% after 2 months. Co-colonization with multiple serotypes was consistently observed in over 40% of the infants at each sampling point during the first year of life. Overall, the mean acquisition time and carriage duration regardless of serotype were 38 and 24 days, respectively, but varied considerably between serotypes, comparable to observations from other regions. Our data will inform disease prevention and control measures, including by providing baseline data for parameterising mathematical models of infectious disease such as those assessing the impact of clinical interventions like pneumococcal conjugate vaccines.

    Emotional Sentence Annotation Helps Predict Fiction Genre

    Fiction, a prime form of entertainment, has evolved into multiple genres which one can broadly attribute to different forms of stories. In this paper, we examine the hypothesis that works of fiction can be characterised by the emotions they portray. To investigate this hypothesis, we use the works of fiction in Project Gutenberg and we attribute basic emotional content to each individual sentence using Ekman’s model. A time-smoothed version of the emotional content for each basic emotion is used to train extremely randomized trees. We show through 10-fold cross-validation that the emotional content of each work of fiction can help identify each genre with significantly higher probability than random. We also show that the most important differentiator between genre novels is fear.

    Plasma Homeostasis and Cloacal Urine Composition in Crocodylus porosus Caught Along a Salinity Gradient

    Juveniles of the Estuarine or Saltwater Crocodile, Crocodylus porosus, maintain both osmotic pressure and plasma electrolyte homeostasis along a salinity gradient from fresh water to the sea. In fresh water (FW) the cloacal urine is a clear solution rich in ammonium and bicarbonate and containing small amounts of white precipitated solids with high concentrations of calcium and magnesium. In salt water (SW) the cloacal urine has a much higher proportion of solids, cream rather than white in colour, which are the major route for excretion of potassium in addition to calcium and magnesium. Neither liquid nor solid fractions of the cloacal urine represent a major route for excretion of sodium chloride. The solids are urates and uric acid, and their production probably constitutes an important strategy for water conservation by C. porosus in SW. These data, coupled with natural history observations and the recent identification of lingual salt glands, contribute to the conclusion that C. porosus is able to live and breed in either fresh or salt water and may be as euryhaline as any reptile

    International genomic definition of pneumococcal lineages, to contextualise disease, antibiotic resistance and vaccine impact

    Background: Pneumococcal conjugate vaccines have reduced the incidence of invasive pneumococcal disease, caused by vaccine serotypes, but non-vaccine-serotypes remain a concern. We used whole genome sequencing to study pneumococcal serotype, antibiotic resistance and invasiveness, in the context of genetic background. / Methods: Our dataset of 13,454 genomes, combined with four published genomic datasets, represented Africa (40%), Asia (25%), Europe (19%), North America (12%), and South America (5%). These 20,027 pneumococcal genomes were clustered into lineages using PopPUNK, and named Global Pneumococcal Sequence Clusters (GPSCs). From our dataset, we additionally derived serotype and sequence type, and predicted antibiotic sensitivity. We then measured invasiveness using odds ratios relating prevalence in invasive pneumococcal disease to carriage. / Findings: The combined collections (n = 20,027) were clustered into 621 GPSCs. Thirty-five GPSCs observed in our dataset were represented by >100 isolates, and subsequently classed as dominant-GPSCs. In 22/35 (63%) of dominant-GPSCs both non-vaccine serotypes and vaccine serotypes were observed in the years up until, and including, the first year of pneumococcal conjugate vaccine introduction. Penicillin and multidrug resistance were higher (p < .05) in a subset of dominant-GPSCs (14/35, 9/35 respectively), and resistance to an increasing number of antibiotic classes was associated with increased recombination (R² = 0.27, p < .0001). In 28/35 dominant-GPSCs, the country of isolation was a significant predictor (p < .05) of its antibiogram (mean misclassification error 0.28, SD ± 0.13). We detected increased invasiveness of six genetic backgrounds, when compared to other genetic backgrounds expressing the same serotype. Up to 1.6-fold changes in invasiveness odds ratio were observed. / Interpretation: We define GPSCs that can be assigned to any pneumococcal genomic dataset, to aid international comparisons. Existing non-vaccine-serotypes in most GPSCs preclude the removal of these lineages by pneumococcal conjugate vaccines, leaving potential for serotype replacement. A subset of GPSCs have increased resistance, and/or serotype-independent invasiveness.
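The abstract above measures invasiveness with odds ratios relating prevalence in disease to prevalence in carriage. A minimal sketch of that calculation for one lineage follows, assuming a plain 2x2 table. The function name and the counts are hypothetical and are not taken from the study, which may also apply adjustments not shown here.

```python
def invasiveness_odds_ratio(disease_in, disease_out, carriage_in, carriage_out):
    """Odds ratio for a lineage from a 2x2 table of isolate counts.

    disease_in / disease_out:   disease isolates inside / outside the lineage
    carriage_in / carriage_out: carriage isolates inside / outside the lineage
    """
    if disease_out == 0 or carriage_in == 0:
        raise ValueError("odds ratio is undefined when a denominator cell is zero")
    # (odds of the lineage among disease) / (odds of the lineage among carriage)
    return (disease_in * carriage_out) / (disease_out * carriage_in)


# Hypothetical counts, for illustration only.
odds_ratio = invasiveness_odds_ratio(
    disease_in=30, disease_out=70, carriage_in=10, carriage_out=90
)
```

A value above 1 would indicate that the lineage is over-represented among disease isolates relative to carriage. A value near 1 would indicate no difference.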

    Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments

    Background: High-throughput sequencing technologies, such as the Illumina Genome Analyzer, are powerful new tools for investigating a wide range of biological and medical questions. Statistical and computational methods are key for drawing meaningful and accurate conclusions from the massive and complex datasets generated by the sequencers. We provide a detailed evaluation of statistical methods for normalization and differential expression (DE) analysis of Illumina transcriptome sequencing (mRNA-Seq) data. / Results: We compare statistical methods for detecting genes that are significantly DE between two types of biological samples and find that there are substantial differences in how the test statistics handle low-count genes. We evaluate how DE results are affected by features of the sequencing platform, such as varying gene lengths, base-calling calibration method (with and without phi X control lane), and flow-cell/library preparation effects. We investigate the impact of the read count normalization method on DE results and show that the standard approach of scaling by total lane counts (e.g., RPKM) can bias estimates of DE. We propose more general quantile-based normalization procedures and demonstrate an improvement in DE detection. / Conclusions: Our results have significant practical and methodological implications for the design and analysis of mRNA-Seq experiments. They highlight the importance of appropriate statistical methods for normalization and DE inference, to account for features of the sequencing platform that could impact the accuracy of results. They also reveal the need for further research in the development of statistical and computational methods for mRNA-Seq.
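The contrast between total-count scaling and quantile-based scaling can be illustrated with a toy example. The sketch below assumes upper-quartile scaling of non-zero counts as one instance of a quantile-based procedure. The abstract does not state the exact procedure, and the function names and counts here are hypothetical.

```python
import math


def total_count_factors(lanes):
    """Scaling factors proportional to each lane's total read count."""
    totals = [sum(lane) for lane in lanes]
    mean_total = sum(totals) / len(totals)
    return [total / mean_total for total in totals]


def upper_quartile(counts):
    """Nearest-rank 75th percentile of the non-zero counts in one lane."""
    nonzero = sorted(c for c in counts if c > 0)
    rank = math.ceil(0.75 * len(nonzero)) - 1
    return nonzero[rank]


def upper_quartile_factors(lanes):
    """Scaling factors proportional to each lane's upper quartile."""
    quartiles = [upper_quartile(lane) for lane in lanes]
    mean_quartile = sum(quartiles) / len(quartiles)
    return [q / mean_quartile for q in quartiles]


# Hypothetical lanes: identical except for one very highly expressed gene.
lane_a = [10, 20, 30, 40, 100]
lane_b = [10, 20, 30, 40, 1000]

tc = total_count_factors([lane_a, lane_b])
uq = upper_quartile_factors([lane_a, lane_b])

# Normalized count of the first gene in each lane under each scheme.
gene1_tc = [lane_a[0] / tc[0], lane_b[0] / tc[1]]
gene1_uq = [lane_a[0] / uq[0], lane_b[0] / uq[1]]
```

In this toy case a single dominant gene inflates the total of `lane_b`, so total-count scaling makes the four unchanged genes look different between lanes. The upper-quartile factors are equal, so the unchanged genes stay equal after scaling.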

    Putative novel cps loci in a large global collection of pneumococci

    The pneumococcus produces a polysaccharide capsule, encoded by the cps locus, that provides protection against phagocytosis and determines serotype. Nearly 100 serotypes have been identified with new serotypes still being discovered, especially in previously understudied regions. Here we present an analysis of the cps loci of more than 18 000 genomes from the Global Pneumococcal Sequencing (GPS) project with the aim of identifying novel cps loci with the potential to produce previously unrecognized capsule structures. Serotypes were assigned using whole genome sequence data and 66 of the approximately 100 known serotypes were included in the final dataset. Closer examination of each serotype’s sequences identified nine putative novel cps loci (9X, 11X, 16X, 18X1, 18X2, 18X3, 29X, 33X and 36X) found in ~2.6 % of the genomes. The large number and global distribution of GPS genomes provided an unprecedented opportunity to identify novel cps loci and consider their phylogenetic and geographical distribution. Nine putative novel cps loci were identified and examples of each will undergo subsequent structural and immunological analysis.

    You Name It – How Memory and Delay Govern First Name Dynamics

    The adoption and abandonment of first names through time is a fascinating phenomenon that may shed light on social dynamics and the forces that determine cultural taste in general. Here we show that baby name dynamics are governed almost solely by deterministic forces, even though the emerging abundance statistics resemble those obtained from a pure drift model. Exogenous events are shown to affect the name dynamics very rarely, and most of the year-to-year fluctuations around the deterministic trend may be attributed solely to demographic noise. We suggest that the rise and fall of a name reflect an “infection” process with delay and memory. The symmetry between adoption and abandonment speed emerges from our model without further assumptions.