8,557 research outputs found

    The Use of Lexical and Referential Cues in Children’s Online Interpretation of Adjectives

    Recent research on moment-to-moment language comprehension has revealed striking differences between adults and preschool children. Adults rapidly use the referential principle to resolve syntactic ambiguity, assuming that modification is more likely when there are 2 possible referents for a definite noun phrase. Young children do not. We examine the scope of this phenomenon by exploring whether children use the referential principle to resolve another form of ambiguity. Scalar adjectives (big, small) are typically used to refer to an object when contrasting members of the same category are present in the scene (big and small coins). In the present experiment, 5-year-olds and adults heard instructions like “Point to the big (small) coin” while their eye-movements were measured to displays containing 1 or 2 coins. Both groups rapidly recruited the meaning of the adjective to distinguish between referents of different sizes. Critically, like adults, children were quicker to look to the correct item in trials containing 2 possible referents compared with 1. Nevertheless, children's sensitivity to the referential principle was substantially delayed compared to adults', suggesting possible differences in the recruitment of this top-down cue. The implications of current and previous findings are discussed with respect to the development of the architecture of language comprehension.

    Syntactic Topic Models

    The syntactic topic model (STM) is a Bayesian nonparametric model of language that discovers latent distributions of words (topics) that are both semantically and syntactically coherent. The STM models dependency-parsed corpora where sentences are grouped into documents. It assumes that each word is drawn from a latent topic chosen by combining document-level features and the local syntactic context. Each document has a distribution over latent topics, as in topic models, which provides the semantic consistency. Each element in the dependency parse tree also has a distribution over the topics of its children, as in latent-state syntax models, which provides the syntactic consistency. These distributions are convolved so that the topic of each word is likely under both its document and syntactic context. We derive a fast posterior inference algorithm based on variational methods. We report qualitative and quantitative studies on both synthetic data and hand-parsed documents. We show that the STM is a more predictive model of language than current models based only on syntax or only on topics.
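
    The combination step described in this abstract can be illustrated with a minimal sketch. It assumes that "convolved" means a point-wise product of the document's topic distribution and the parent node's distribution over child topics, followed by renormalization; the abstract does not spell this out. The function name and the toy numbers are hypothetical and are not taken from the paper.

```python
def combine_topic_distributions(doc_topics, parent_child_topics):
    """Multiply two distributions over the same topics and renormalize.

    Assumption: this point-wise product stands in for the combination of
    document-level and syntactic context described in the abstract.
    """
    if len(doc_topics) != len(parent_child_topics):
        raise ValueError("both distributions must cover the same topics")
    products = [d * s for d, s in zip(doc_topics, parent_child_topics)]
    total = sum(products)
    if total == 0:
        raise ValueError("the two distributions share no topic with nonzero mass")
    return [p / total for p in products]


# Hypothetical toy values: three topics.
theta_d = [0.5, 0.25, 0.25]   # document-level topic distribution
pi_parent = [0.5, 0.5, 0.0]   # parent's distribution over child topics
combined = combine_topic_distributions(theta_d, pi_parent)
print(combined)
```

    A topic keeps probability mass only if it is plausible under both sources, which is why the third topic drops to zero in this toy case.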

    Towards an Indexical Model of Situated Language Comprehension for Cognitive Agents in Physical Worlds

    We propose a computational model of situated language comprehension based on the Indexical Hypothesis that generates meaning representations by translating amodal linguistic symbols to modal representations of beliefs, knowledge, and experience external to the linguistic system. This Indexical Model incorporates multiple information sources, including perceptions, domain knowledge, and short-term and long-term experiences during comprehension. We show that exploiting diverse information sources can alleviate ambiguities that arise from contextual use of underspecific referring expressions and unexpressed argument alternations of verbs. The model is being used to support linguistic interactions in Rosie, an agent implemented in Soar that learns from instruction. Comment: Advances in Cognitive Systems 3 (2014)

    Distributional Effects of Gender Contrasts Across Categories

    This paper proposes a methodology for comparing grammatical contrasts across categories with the tools of distributional semantics. After outlining why such a comparison is relevant to current theoretical work on gender and other morphosyntactic features, we present intrinsic and extrinsic predictability as instruments for analyzing semantic contrasts between pairs of words. We then apply our method to a dataset of gender pairs of French nouns and adjectives. We find that, while the distributional effect of gender is overall less predictable for nouns than for adjectives, it is heavily influenced by semantic properties of the adjectives.

    Detecting and Monitoring Hate Speech in Twitter

    Social media are sensors in the real world that can be used to measure the pulse of societies. However, the massive and unfiltered feed of messages posted in social media is a phenomenon that nowadays raises social alarms, especially when these messages contain hate speech targeted at a specific individual or group. In this context, governments and non-governmental organizations (NGOs) are concerned about the possible negative impact that these messages can have on individuals or on society. In this paper, we present HaterNet, an intelligent system currently being used by the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that identifies and monitors the evolution of hate speech in Twitter. The contributions of this research are manifold: (1) It introduces the first intelligent system that monitors and visualizes, using social network analysis techniques, hate speech in social media. (2) It introduces a novel public dataset on hate speech in Spanish consisting of 6000 expert-labeled tweets. (3) It compares several classification approaches based on different document representation strategies and text classification models. (4) The best approach is an LSTM+MLP neural network that takes as input the embeddings of the tweet's word, emoji, and expression tokens, enriched by tf-idf, and obtains an area under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the literature. The work by Quijano-Sanchez was supported by the Spanish Ministry of Science and Innovation grant FJCI-2016-28855. The research of Liberatore was supported by the Government of Spain, grant MTM2015-65803-R, and by the European Union’s Horizon 2020 Research and Innovation Programme, under the Marie Sklodowska-Curie grant agreement No. 691161 (GEOSAFE). All the financial support is gratefully acknowledged.

    Analysis of characteristics of semantics of spoken language in normally developing Hindi speaking children

    Background: There appears to be no database of, and a dearth of studies on, the characteristics of semantics in Hindi-speaking school-aged children. Such a database would be useful for building vocabulary for language-disordered children and for constructing AAC boards for non-verbal children. Hence, it is essential to study the characteristics of semantics in normally developing children. This paper describes the semantic characteristics of spoken language in Hindi-speaking children. Methods: 200 normally developing Hindi-speaking children aged 3-7 years were shown three validated pictures of daily events and instructed to describe them. The responses were recorded and transcribed. Analyses included type-token ratio, frequency of occurrence, and comparisons between different word classes. Results: The percentage of nouns is highest, followed by verbs, pronouns, and adjectives. Frequency of occurrence of words increases with age. The common words with high frequency of occurrence are hƐ, hũ, rΛhe, rΛha, rΛhi, dƷa, ɔr, khel, gaɖi, log, pe, ke. There appear to be two marked increases in the different classes of words, one at 4 years of age (after Sr. KG) and the other at 6 years of age (standard I). Conclusions: One of the highlights of this study is the large database of semantics (of spoken language) collected from 200 school-going children. Creating such a database and using it to assess the language of the disordered population appears to be the need of the hour.
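
    The two count-based measures named under Methods (type-token ratio and frequency of occurrence) can be sketched as follows. This is a minimal illustration under stated assumptions: it takes an already tokenized transcript, and the sample tokens are an invented sequence built from words listed in the abstract, not data from the study. The function names are hypothetical.

```python
from collections import Counter


def type_token_ratio(tokens):
    """Number of distinct word forms (types) divided by total words (tokens)."""
    if not tokens:
        raise ValueError("transcript contains no tokens")
    return len(set(tokens)) / len(tokens)


def word_frequencies(tokens):
    """Count how often each word form occurs in the transcript."""
    return Counter(tokens)


# Invented toy transcript (assumption, not study data).
sample_tokens = ["khel", "log", "khel", "gaɖi", "log", "khel"]

ttr = type_token_ratio(sample_tokens)       # 3 types over 6 tokens
freqs = word_frequencies(sample_tokens)
print(ttr, freqs.most_common(1))
```

    A lower ratio indicates more repetition of the same words; the study's own tokenization and word-class tagging choices are not given in the abstract.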