48,068 research outputs found

    Holistic corpus-based dialectology

    Get PDF
    This paper is concerned with sketching future directions for corpus-based dialectology. We advocate a holistic approach to the study of geographically conditioned linguistic variability, and we present a suitable methodology, 'corpusbased dialectometry', in exactly this spirit. Specifically, we argue that in order to live up to the potential of the corpus-based method, practitioners need to (i) abandon their exclusive focus on individual linguistic features in favor of the study of feature aggregates, (ii) draw on computationally advanced multivariate analysis techniques (such as multidimensional scaling, cluster analysis, and principal component analysis), and (iii) aid interpretation of empirical results by marshalling state-of-the-art data visualization techniques. To exemplify this line of analysis, we present a case study which explores joint frequency variability of 57 morphosyntax features in 34 dialects all over Great Britain

    Joint perceptual decision-making: a case study in explanatory pluralism.

    Get PDF
    Traditionally different approaches to the study of cognition have been viewed as competing explanatory frameworks. An alternative view, explanatory pluralism, regards different approaches to the study of cognition as complementary ways of studying the same phenomenon, at specific temporal and spatial scales, using appropriate methodological tools. Explanatory pluralism has been often described abstractly, but has rarely been applied to concrete cases. We present a case study of explanatory pluralism. We discuss three separate ways of studying the same phenomenon: a perceptual decision-making task (Bahrami et al., 2010), where pairs of subjects share information to jointly individuate an oddball stimulus among a set of distractors. Each approach analyzed the same corpus but targeted different units of analysis at different levels of description: decision-making at the behavioral level, confidence sharing at the linguistic level, and acoustic energy at the physical level. We discuss the utility of explanatory pluralism for describing this complex, multiscale phenomenon, show ways in which this case study sheds new light on the concept of pluralism, and highlight good practices to critically assess and complement approaches

    QuesNet: A Unified Representation for Heterogeneous Test Questions

    Full text link
    Understanding learning materials (e.g. test questions) is a crucial issue in online learning systems, which can promote many applications in education domain. Unfortunately, many supervised approaches suffer from the problem of scarce human labeled data, whereas abundant unlabeled resources are highly underutilized. To alleviate this problem, an effective solution is to use pre-trained representations for question understanding. However, existing pre-training methods in NLP area are infeasible to learn test question representations due to several domain-specific characteristics in education. First, questions usually comprise of heterogeneous data including content text, images and side information. Second, there exists both basic linguistic information as well as domain logic and knowledge. To this end, in this paper, we propose a novel pre-training method, namely QuesNet, for comprehensively learning question representations. Specifically, we first design a unified framework to aggregate question information with its heterogeneous inputs into a comprehensive vector. Then we propose a two-level hierarchical pre-training algorithm to learn better understanding of test questions in an unsupervised way. Here, a novel holed language model objective is developed to extract low-level linguistic features, and a domain-oriented objective is proposed to learn high-level logic and knowledge. Moreover, we show that QuesNet has good capability of being fine-tuned in many question-based tasks. We conduct extensive experiments on large-scale real-world question data, where the experimental results clearly demonstrate the effectiveness of QuesNet for question understanding as well as its superior applicability

    Using Fuzzy Linguistic Representations to Provide Explanatory Semantics for Data Warehouses

    Get PDF
    A data warehouse integrates large amounts of extracted and summarized data from multiple sources for direct querying and analysis. While it provides decision makers with easy access to such historical and aggregate data, the real meaning of the data has been ignored. For example, "whether a total sales amount 1,000 items indicates a good or bad sales performance" is still unclear. From the decision makers' point of view, the semantics rather than raw numbers which convey the meaning of the data is very important. In this paper, we explore the use of fuzzy technology to provide this semantics for the summarizations and aggregates developed in data warehousing systems. A three layered data warehouse semantic model, consisting of quantitative (numerical) summarization, qualitative (categorical) summarization, and quantifier summarization, is proposed for capturing and explicating the semantics of warehoused data. Based on the model, several algebraic operators are defined. We also extend the SQL language to allow for flexible queries against such enhanced data warehouses

    On the universal structure of human lexical semantics

    Full text link
    How universal is human conceptual structure? The way concepts are organized in the human brain may reflect distinct features of cultural, historical, and environmental background in addition to properties universal to human cognition. Semantics, or meaning expressed through language, provides direct access to the underlying conceptual structure, but meaning is notoriously difficult to measure, let alone parameterize. Here we provide an empirical measure of semantic proximity between concepts using cross-linguistic dictionaries. Across languages carefully selected from a phylogenetically and geographically stratified sample of genera, translations of words reveal cases where a particular language uses a single polysemous word to express concepts represented by distinct words in another. We use the frequency of polysemies linking two concepts as a measure of their semantic proximity, and represent the pattern of such linkages by a weighted network. This network is highly uneven and fragmented: certain concepts are far more prone to polysemy than others, and there emerge naturally interpretable clusters loosely connected to each other. Statistical analysis shows such structural properties are consistent across different language groups, largely independent of geography, environment, and literacy. It is therefore possible to conclude the conceptual structure connecting basic vocabulary studied is primarily due to universal features of human cognition and language use.Comment: Press embargo in place until publicatio

    Painting and Language: A Pictoral Syntax of Shapes

    Get PDF
    In previous articles, the author proposed that paintings can have syntactic rules. In this article he develops his proposal further and shows that shapes act as syntactic elements in the languages of painting styles. He meets Nelson Goodman\u27s objections to his proposal by showing that shapes meet the criterion of syntactic discreteness proposed by the latter to separate linguistic from other symbolic systems. His approach is to specify style as the domain of a language of painting, to show that style is syntactical and to argue that shapes are the primitive syntactic elements of style. His essay relates current research on the development of syntax for picture-reading machines to the question of syntax for paintings
    • 

    corecore