65 research outputs found

    Don't blame distributional semantics if it can't do entailment

    Distributional semantics has had enormous empirical success in Computational Linguistics and Cognitive Science in modeling various semantic phenomena, such as semantic similarity, and distributional models are widely used in state-of-the-art Natural Language Processing systems. However, the theoretical status of distributional semantics within a broader theory of language and cognition is still unclear: What does distributional semantics model? Can it be, on its own, a fully adequate model of the meanings of linguistic expressions? The standard answer is that distributional semantics is not fully adequate in this regard, because it falls short on some of the central aspects of formal semantic approaches: truth conditions, entailment, reference, and certain aspects of compositionality. We argue that this standard answer rests on a misconception: these aspects do not belong in a theory of expression meaning; they are instead aspects of speaker meaning, i.e., communicative intentions in a particular context. In a slogan: words do not refer, speakers do. Clearing this up enables us to argue that distributional semantics on its own is an adequate model of expression meaning. Our proposal sheds light on the role of distributional semantics in a broader theory of language and cognition, its relationship to formal semantics, and its place in computational models.
    Funding: Horizon 2020 (H2020), grant 715154.

    A framework for Distributional Formal Semantics

    Formal semantics and distributional semantics offer complementary strengths in capturing the meaning of natural language. As such, a considerable amount of research has sought to unify them, either by augmenting formal semantic systems with a distributional component, or by defining a formal system on top of distributed representations. Arriving at such a unified framework has, however, proven extremely challenging. One reason for this is that formal and distributional semantics operate on a fundamentally different 'representational currency': formal semantics defines meaning in terms of models of the world, whereas distributional semantics defines meaning in terms of linguistic co-occurrence. Here, we pursue an alternative approach by deriving a vector space model that defines meaning in a distributed manner relative to formal models of the world. We will show that the resulting Distributional Formal Semantics offers probabilistic distributed representations that are also inherently compositional, and that naturally capture quantification and entailment. We moreover show that, when used as part of a neural network model, these representations allow for capturing incremental meaning construction and probabilistic inferencing. This framework thus lays the groundwork for an integrated distributional and formal approach to meaning.

    Tempo and drivers of plant diversification in the European mountain system

    There is still limited consensus on the evolutionary history of species-rich temperate alpine floras due to a lack of comparable and high-quality phylogenetic data covering multiple plant lineages. Here we reconstructed when and how European alpine plant lineages diversified, i.e., the tempo and drivers of speciation events. We performed full-plastome phylogenomics and used multi-clade comparative models applied to six representative angiosperm lineages that have diversified in European mountains (212 sampled species, 251 ingroup species total). Diversification rates remained surprisingly steady for most clades, even during the Pleistocene, with speciation events being mostly driven by geographic divergence and bedrock shifts. Interestingly, we inferred asymmetrical historical migration rates from siliceous to calcareous bedrocks, and from higher to lower elevations, likely due to repeated shrinkage and expansion of high elevation habitats during the Pleistocene. This may have buffered climate-related extinctions, but prevented speciation along elevation gradients as often documented for tropical alpine floras.

    A large genome-wide association study of age-related macular degeneration highlights contributions of rare and common variants.

    This is the author accepted manuscript. The final version is available from Nature Publishing Group via http://dx.doi.org/10.1038/ng.3448
    Advanced age-related macular degeneration (AMD) is the leading cause of blindness in the elderly, with limited therapeutic options. Here we report on a study of >12 million variants, including 163,714 directly genotyped, mostly rare, protein-altering variants. Analyzing 16,144 patients and 17,832 controls, we identify 52 independently associated common and rare variants (P < 5 × 10⁻⁸) distributed across 34 loci. Although wet and dry AMD subtypes exhibit predominantly shared genetics, we identify the first genetic association signal specific to wet AMD, near MMP9 (difference P value = 4.1 × 10⁻¹⁰). Very rare coding variants (frequency <0.1%) in CFH, CFI and TIMP3 suggest causal roles for these genes, as does a splice variant in SLC16A8. Our results support the hypothesis that rare coding variants can pinpoint causal genes within known genetic loci and illustrate that applying the approach systematically to detect new loci requires extremely large sample sizes.
    We thank all participants of all the studies included for enabling this research by their participation in these studies. Computer resources for this project have been provided by the high-performance computing centers of the University of Michigan and the University of Regensburg. Group-specific acknowledgments can be found in the Supplementary Note. The Center for Inherited Diseases Research (CIDR) Program contract number is HHSN268201200008I. This and the main consortium work were predominantly funded by 1X01HG006934-01 to G.R.A. and R01 EY022310 to J.L.H.

    The seeds of divergence: the economy of French North America, 1688 to 1760

    Generally, Canada has been ignored in the literature on the colonial origins of divergence, with most of the attention going to the United States. Late-nineteenth-century estimates of income per capita show that Canada was relatively poorer than the United States and that within Canada, the French and Catholic population of Quebec was considerably poorer. Was this gap long-standing? Some evidence has been advanced for earlier periods, but it is quite limited and not well-suited for comparison with other societies. This thesis aims to contribute both to Canadian economic history and to comparative work on inequality across nations during the early modern period. With the use of novel prices and wages from Quebec (then the largest settlement in Canada and under French rule), a price index, a series of real wages and a measurement of Gross Domestic Product (GDP) are constructed. They are used to shed light both on the course of economic development until the French were defeated by the British in 1760 and on standards of living in that colony relative to the mother country, France, as well as the American colonies. The work is divided into three components. The first component relates to the construction of a price index. The absence of such an index has been a thorn in the side of Canadian historians as it has limited their ability to obtain real values of wages, output and living standards. This index shows that prices did not follow any trend and remained at a stable level. However, there were episodes of wide swings, mostly due to wars and the monetary experiment of playing card money. The creation of this index lays the foundation for the next component. The second component constructs a standardized real wage series in the form of welfare ratios (a consumption basket divided by nominal wage rate multiplied by length of work year) to compare Canada with France, England and Colonial America. Two measures are derived. 
The first relies on a “bare bones” definition of consumption with a large share of land-intensive goods. This measure indicates that Canada was poorer than England and Colonial America and not appreciably richer than France. However, this measure overestimates the relative position of Canada to the Old World because of the strong presence of land-intensive goods. A second measure is created using a “respectable” definition of consumption in which the basket includes a larger share of manufactured goods and capital-intensive goods. This second basket better reflects differences in living standards since the abundance of land in Canada (and Colonial America) made it easy to achieve bare subsistence, but the scarcity of capital and skilled labor made the consumption of luxuries and manufactured goods (clothing, lighting, imported goods) highly expensive. With this measure, the advantage of New France over France evaporates and turns slightly negative. In comparison with Britain and Colonial America, the gap widens appreciably. This element is the most important for future research. By showing a reversal because of a shift to a different type of basket, it shows that Old World and New World comparisons are very sensitive to how we measure the cost of living. Furthermore, there are no sustained improvements in living standards over the period regardless of the measure used. Gaps in living standards observed later in the nineteenth century existed as far back as the seventeenth century. In a wider American perspective that includes the Spanish colonies, Canada fares better. The third component computes a new series for Gross Domestic Product (GDP). This is to avoid problems associated with using real wages in the form of welfare ratios which assume a constant labor supply. This assumption is hard to defend in the case of Colonial Canada as there were many signs of increasing industriousness during the eighteenth and nineteenth centuries. 
The GDP series suggests no long-run trend in living standards (from 1688 to circa 1765). The long peace era of 1713 to 1740 was marked by modest economic growth which offset a steady decline that had started in 1688, but by 1760 (as a result of constant warfare) living standards had sunk below their 1688 levels. These developments are accompanied by observations suggesting that other indicators of living standards declined. The flat-lining of incomes is accompanied by substantial increases in the amount of time worked, rising mortality and rising infant mortality. In addition, comparisons of incomes with the American colonies confirm the results obtained with wages: Canada was considerably poorer. At the end, a long conclusion provides an exploratory discussion of why Canada would have diverged early on. In structural terms, it is argued that the French colony was plagued by the problem of a small population which prohibited the existence of scale effects. In combination with the fact that it was dispersed throughout the territory, the small population of New France limited the scope for specialization and economies of scale. However, this problem was in part created, and in part aggravated, by institutional factors like seigneurial tenure. The colonial origins of French America’s divergence from the rest of North America are thus partly institutional.

    A scaling law beyond Zipf's law and its relation with Heaps' law

    The dependence on text length of the statistical properties of word occurrences has long been considered a severe limitation for the usefulness of quantitative linguistics. We propose a simple scaling form for the distribution of absolute word frequencies which uncovers the robustness of this distribution as text grows. In this way, the shape of the distribution is always the same and it is only a scale parameter which increases linearly with text length. By analyzing very long novels we show that this behavior holds both for raw, unlemmatized texts and for lemmatized texts. For the latter case, the word-frequency distribution is well fit by a double power law, maintaining the Zipf's exponent value γ ≃ 2 for large frequencies but yielding a smaller exponent in the low frequency regime. The growth of the distribution with text length allows us to estimate the size of the vocabulary at each step and to propose an alternative to Heaps' law, which turns out to be intimately connected to Zipf's law, thanks to the scaling behavior.
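
    The two quantities this abstract relates, the distribution of absolute word frequencies and the vocabulary size as text length grows, can be measured with a few lines of Python. The sketch below is illustrative only and is not the authors' code: the function names `frequency_spectrum` and `vocabulary_growth` and the toy token list are assumptions introduced here, and no fitting of the scaling form or the double power law is attempted.

```python
from collections import Counter


def frequency_spectrum(tokens):
    """Map each absolute frequency n to the number of word types occurring n times."""
    word_counts = Counter(tokens)          # word type -> absolute frequency
    return Counter(word_counts.values())   # frequency n -> number of types with that n


def vocabulary_growth(tokens, step):
    """Return (L, V) pairs: vocabulary size V after the first L tokens.

    A pair is recorded every `step` tokens and once more at the end of the text.
    """
    seen = set()
    growth = []
    for length, token in enumerate(tokens, start=1):
        seen.add(token)
        if length % step == 0 or length == len(tokens):
            growth.append((length, len(seen)))
    return growth


# Toy input (hypothetical); a real analysis would use a tokenized long novel.
tokens = "a b a c b a d".split()

print(sorted(frequency_spectrum(tokens).items()))  # [(1, 2), (2, 1), (3, 1)]
print(vocabulary_growth(tokens, step=2))           # [(2, 2), (4, 3), (6, 3), (7, 4)]
```

    Computing the spectrum on successively longer prefixes of the same text, and plotting it against frequency divided by prefix length, is one simple way to inspect by eye whether the shape stays the same as the text grows.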