    Cultural transmission results in convergence towards colour term universals.

    As in biological evolution, multiple forces are involved in cultural evolution. One force is analogous to selection, and acts on differences in the fitness of aspects of culture by influencing who people choose to learn from. Another force is analogous to mutation, and influences how culture changes over time owing to errors in learning and the effects of cognitive biases. Which of these forces need to be appealed to in explaining any particular aspect of human cultures is an open question. We present a study that explores this question empirically, examining the role that the cognitive biases that influence cultural transmission might play in universals of colour naming. In a large-scale laboratory experiment, participants were shown labelled examples from novel artificial systems of colour terms and were asked to classify other colours on the basis of those examples. The responses of each participant were used to generate the examples seen by subsequent participants. By simulating cultural transmission in the laboratory, we were able to isolate a single evolutionary force (the effects of cognitive biases, analogous to mutation) and examine its consequences. Our results show that this process produces convergence towards systems of colour terms similar to those seen across human languages, providing support for the conclusion that the effects of cognitive biases, brought out through cultural transmission, can account for universals in colour naming.

    Lost in semantic space: a multi-modal, non-verbal assessment of feature knowledge in semantic dementia

    A novel, non-verbal test of semantic feature knowledge is introduced, enabling subordinate knowledge of four important concept attributes (colour, sound, environmental context and motion) to be individually probed. This methodology provides more specific information than existing non-verbal semantic tests about the status of attribute knowledge relating to individual concept representations. Performance on this test of a group of 12 patients with semantic dementia (10 male, mean age: 64.4 years) correlated strongly with their scores on more conventional tests of semantic memory, such as naming and word-to-picture matching. The test's overlapping structure, in which individual concepts were probed in two, three or all four modalities, provided evidence of performance consistency on individual items between feature conditions. Group and individual analyses revealed little evidence for differential performance across the four feature conditions, though sound and colour correlated most strongly, and motion least strongly, with other semantic tasks, and patients were less accurate on the motion features of living than non-living concepts (with no such conceptual domain differences in the other conditions). The results are discussed in the context of their implications for the place of semantic dementia within the classification of progressive aphasic syndromes, and for contemporary models of semantic representation and organization.

    Predicting Good Configurations for GitHub and Stack Overflow Topic Models

    Software repositories contain large amounts of textual data, ranging from source code comments and issue descriptions to questions, answers, and comments on Stack Overflow. To make sense of this textual data, topic modelling is frequently used as a text-mining tool for the discovery of hidden semantic structures in text bodies. Latent Dirichlet allocation (LDA) is a commonly used topic model that aims to explain the structure of a corpus by grouping texts. LDA requires multiple parameters to work well, and there are only rough and sometimes conflicting guidelines available on how these parameters should be set. In this paper, we contribute (i) a broad study of parameters to arrive at good local optima for GitHub and Stack Overflow text corpora, (ii) an a-posteriori characterisation of text corpora related to eight programming languages, and (iii) an analysis of corpus feature importance via per-corpus LDA configuration. We find that (1) popular rules of thumb for topic modelling parameter configuration are not applicable to the corpora used in our experiments, (2) corpora sampled from GitHub and Stack Overflow have different characteristics and require different configurations to achieve good model fit, and (3) we can predict good configurations for unseen corpora reliably. These findings support researchers and practitioners in efficiently determining suitable configurations for topic modelling when analysing textual data contained in software repositories. Comment: to appear as full paper at MSR 2019, the 16th International Conference on Mining Software Repositories.

    The National Superficial Deposit Thickness Model. (Version 5)

    The Superficial Deposits Thickness Model (SDTM) is a raster-based dataset designed to demonstrate the variation in thickness of Quaternary-age superficial deposits across Great Britain. Quaternary deposits (all unconsolidated material deposited in the last 2.6 million years) are of particular importance to environmental scientists and consultants concerned with our landscape, environment and habitats. The BGS has been generating national models of the thickness of Quaternary-age deposits since 2001, and this latest version of the model is based upon DiGMapGB-50 Version 5 geological mapping and borehole records registered with BGS before August 2008.

    What are natural concepts? A design perspective

    Conceptual spaces have become an increasingly popular modeling tool in cognitive psychology. The core idea of the conceptual spaces approach is that concepts can be represented as regions in similarity spaces. While it is generally acknowledged that not every region in such a space represents a natural concept, it is still an open question what distinguishes those regions that represent natural concepts from those that do not. The central claim of this paper is that natural concepts are represented by the cells of an optimally designed similarity space.

    A fast no-rejection algorithm for the Category Game

    The Category Game is a multi-agent model that accounts for the emergence of shared categorization patterns in a population of interacting individuals. In the framework of the model, linguistic categories appear as long-lived consensus states that are constantly reshaped and re-negotiated by the communicating individuals. It is therefore crucial to investigate the long-time behavior to gain a clear understanding of the dynamics. However, it turns out that the evolution of the emerging category system is so slow, even for small populations, that such an analysis has so far remained impossible. Here, we introduce a fast no-rejection algorithm for the Category Game that disentangles the physical simulation time from the CPU time, thus opening the way for thorough analysis of the model. We verify that the new algorithm is equivalent to the old one in terms of the emerging phenomenology and we quantify the CPU performance of the two algorithms, pointing out the clear advantages offered by the no-rejection one. This technical advance has already opened the way to new investigations of the model, thus helping to shed light on the fundamental issue of categorization. Comment: 17 pages, 7 figures.

    Learning Colour Representations of Search Queries

    Image search engines rely on appropriately designed ranking features that capture various aspects of the content semantics as well as the historic popularity. In this work, we consider the role of colour in this relevance matching process. Our work is motivated by the observation that a significant fraction of user queries have an inherent colour associated with them. While some queries contain explicit colour mentions (such as 'black car' and 'yellow daisies'), other queries have implicit notions of colour (such as 'sky' and 'grass'). Furthermore, grounding queries in colour is not a mapping to a single colour, but a distribution in colour space. For instance, a search for 'trees' tends to have a bimodal distribution around the colours green and brown. We leverage historical clickthrough data to produce a colour representation for search queries and propose a recurrent neural network architecture to encode unseen queries into colour space. We also show how this embedding can be learnt alongside a cross-modal relevance ranker from impression logs where a subset of the result images were clicked. We demonstrate that the use of a query-image colour distance feature leads to an improvement in the ranker performance as measured by users' preferences of clicked versus skipped images. Comment: Accepted as a full paper at SIGIR 202