
8,557 research outputs found

The Use of Lexical and Referential Cues in Children’s Online Interpretation of Adjectives

Author: Huang Yi Ting
Snedeker Jesse
Publication venue: 'American Psychological Association (APA)'
Publication date: 01/01/2013

Recent research on moment-to-moment language comprehension has revealed striking differences between adults and preschool children. Adults rapidly use the referential principle to resolve syntactic ambiguity, assuming that modification is more likely when there are 2 possible referents for a definite noun phrase. Young children do not. We examine the scope of this phenomenon by exploring whether children use the referential principle to resolve another form of ambiguity. Scalar adjectives (big, small) are typically used to refer to an object when contrasting members of the same category are present in the scene (big and small coins). In the present experiment, 5-year-olds and adults heard instructions like “Point to the big (small) coin” while their eye-movements were measured to displays containing 1 or 2 coins. Both groups rapidly recruited the meaning of the adjective to distinguish between referents of different sizes. Critically, like adults, children were quicker to look to the correct item in trials containing 2 possible referents compared with 1. Nevertheless, children's sensitivity to the referential principle was substantially delayed compared to adults', suggesting possible differences in the recruitment of this top-down cue. The implications of current and previous findings are discussed with respect to the development of the architecture of language comprehension.

Crossref

Harvard University - DASH

Syntactic Topic Models

Author: Blei David M.
Boyd-Graber Jordan
Publication date: 01/01/2008

The syntactic topic model (STM) is a Bayesian nonparametric model of language that discovers latent distributions of words (topics) that are both semantically and syntactically coherent. The STM models dependency-parsed corpora where sentences are grouped into documents. It assumes that each word is drawn from a latent topic chosen by combining document-level features and the local syntactic context. Each document has a distribution over latent topics, as in topic models, which provides the semantic consistency. Each element in the dependency parse tree also has a distribution over the topics of its children, as in latent-state syntax models, which provides the syntactic consistency. These distributions are convolved so that the topic of each word is likely under both its document and syntactic context. We derive a fast posterior inference algorithm based on variational methods. We report qualitative and quantitative studies on both synthetic data and hand-parsed documents. We show that the STM is a more predictive model of language than current models based only on syntax or only on topics.
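As a rough illustration of how a word's topic can be made "likely under both its document and syntactic context", the sketch below multiplies two topic distributions pointwise and renormalizes the result. Reading "convolved" as a renormalized pointwise product is an assumption here, not something the abstract spells out. The function name `combine_topic_weights` and the numbers are hypothetical.

```python
def combine_topic_weights(doc_weights, syntax_weights):
    """Pointwise product of two topic distributions, renormalized to sum to 1.

    doc_weights: topic probabilities for the document (assumed input).
    syntax_weights: topic probabilities given the parent's topic in the
    dependency tree (assumed input).
    """
    if len(doc_weights) != len(syntax_weights):
        raise ValueError("both distributions must cover the same topics")
    products = [d * s for d, s in zip(doc_weights, syntax_weights)]
    total = sum(products)
    if total == 0:
        raise ValueError("no topic has nonzero weight under both distributions")
    return [p / total for p in products]


# Hypothetical three-topic example: topic 0 is favored by the document,
# topic 2 by the syntactic context, topic 1 is moderately likely under both.
doc_weights = [0.6, 0.3, 0.1]
syntax_weights = [0.1, 0.3, 0.6]
combined = combine_topic_weights(doc_weights, syntax_weights)
best_topic = combined.index(max(combined))
```

In this made-up example the combined distribution favors the middle topic, the only one that is reasonably probable under both sources, which is the behavior the abstract describes.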

arXiv.org e-Print Archive

CiteSeerX


Corpus approaches to language in the media

Author: Jaworska Sylvia
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2018

The main aim of this chapter is to offer an overview of research that has adopted the methodology of Corpus Linguistics to study aspects of language use in the media. The overview begins by introducing the key principles and analytical tools adopted in corpus research. To demonstrate the contribution of corpus approaches to media linguistics, a selection of recent corpus studies is subsequently discussed. The final section summarises the strengths and limitations of corpus approaches and discusses avenues for further research.

Central Archive at the University of Reading

Towards an Indexical Model of Situated Language Comprehension for Cognitive Agents in Physical Worlds

Author: Laird John
Mininger Aaron
Mohan Shiwali
Publication date: 08/04/2016

We propose a computational model of situated language comprehension based on the Indexical Hypothesis that generates meaning representations by translating amodal linguistic symbols to modal representations of beliefs, knowledge, and experience external to the linguistic system. This Indexical Model incorporates multiple information sources, including perceptions, domain knowledge, and short-term and long-term experiences during comprehension. We show that exploiting diverse information sources can alleviate ambiguities that arise from contextual use of underspecific referring expressions and unexpressed argument alternations of verbs. The model is being used to support linguistic interactions in Rosie, an agent implemented in Soar that learns from instruction. Comment: Advances in Cognitive Systems 3 (2014)

arXiv.org e-Print Archive

Distributional Effects of Gender Contrasts Across Categories

Author: Bonami Olivier
Mickus Timothee
Paperno Denis
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2019

This paper proposes a methodology for comparing grammatical contrasts across categories with the tools of distributional semantics. After outlining why such a comparison is relevant to current theoretical work on gender and other morphosyntactic features, we present intrinsic and extrinsic predictability as instruments for analyzing semantic contrasts between pairs of words. We then apply our method to a dataset of gender pairs of French nouns and adjectives. We find that, while the distributional effect of gender is overall less predictable for nouns than for adjectives, it is heavily influenced by semantic properties of the adjectives.

ScholarWorks@UMass Amherst

HAL Descartes

Utrecht University Repository

Hal-Diderot

Detecting and Monitoring Hate Speech in Twitter

Author: Camacho-Collados Miguel
Liberatore Federico
Pereira-Kohatsu Juan Carlos
Quijano-Sánchez Lara
Publication venue: 'MDPI AG'
Publication date: 01/01/2019

Social Media are sensors in the real world that can be used to measure the pulse of societies. However, the massive and unfiltered feed of messages posted in social media is a phenomenon that nowadays raises social alarms, especially when these messages contain hate speech targeted at a specific individual or group. In this context, governments and non-governmental organizations (NGOs) are concerned about the possible negative impact that these messages can have on individuals or on society. In this paper, we present HaterNet, an intelligent system currently being used by the Spanish National Office Against Hate Crimes of the Spanish State Secretariat for Security that identifies and monitors the evolution of hate speech on Twitter. The contributions of this research are manifold: (1) It introduces the first intelligent system that monitors and visualizes, using social network analysis techniques, hate speech in Social Media. (2) It introduces a novel public dataset on hate speech in Spanish consisting of 6000 expert-labeled tweets. (3) It compares several classification approaches based on different document representation strategies and text classification models. (4) The best approach consists of a combination of an LSTM+MLP neural network that takes as input the tweet’s word, emoji, and expression tokens’ embeddings enriched by the tf-idf, and obtains an area under the curve (AUC) of 0.828 on our dataset, outperforming previous methods presented in the literature. The work by Quijano-Sanchez was supported by the Spanish Ministry of Science and Innovation grant FJCI-2016-28855. The research of Liberatore was supported by the Government of Spain, grant MTM2015-65803-R, and by the European Union’s Horizon 2020 Research and Innovation Programme, under the Marie Sklodowska-Curie grant agreement No. 691161 (GEOSAFE). All the financial support is gratefully acknowledged.

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Online Research @ Cardiff

Universidad Carlos III de Madrid e-Archivo

Biblos-e Archivo

Analysis of characteristics of semantics of spoken language in normally developing Hindi speaking children

Author: Banik Arun A.
Dafadar Bisma S.
Kant Anjali R.
Publication venue: 'Medip Academy'
Publication date: 17/01/2017

Background: There appears to be a lack of databases and a dearth of studies focusing on the characteristics of semantics in Hindi speaking school aged children. Such a database will be useful for building vocabulary for language disordered children and for constructing AAC boards for non-verbal children. Hence, it is essential to study the characteristics of semantics of normally developing children. This paper focuses on describing the semantic characteristics of spoken language in Hindi speaking children. Methods: 200 normally developing Hindi speaking children within the age group of 3-7 years were shown three validated pictures of daily events and instructed to describe them. The responses were recorded and transcribed. Analyses included type-token ratio, frequency of occurrence and comparisons between different word classes. Results: The percentage of nouns is highest, followed by verbs, pronouns and adjectives. Frequency of occurrence of words increases with increase in age. The common words with high frequency of occurrence are hƐ, hũ, rΛhe, rΛha, rΛhi, dƷa, ɔr, khel, gaɖi, log, pe, ke. There appear to be marked increases in different classes of words, one at 4 yrs of age (after Sr. KG) and the other at 6 yrs of age (standard I). Conclusions: One of the highlighting features of this study is the huge database of semantics (of spoken language) collected from 200 school going children. Creating such a database and utilizing it for assessing language of the disordered population appears to be the need of the hour.
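The type-token ratio named in the Methods can be illustrated with a minimal sketch: the number of distinct word forms (types) divided by the total number of words (tokens). The function name `type_token_ratio` and the token sequence are hypothetical. The sequence reuses a few of the high-frequency words listed above but is made up for illustration and is not data from the study.

```python
def type_token_ratio(tokens):
    """Return the number of distinct tokens divided by the total token count."""
    if not tokens:
        return 0.0  # convention chosen here for an empty transcript
    return len(set(tokens)) / len(tokens)


# Made-up token sequence (not from the study): 7 tokens, 4 distinct types.
sample = ["log", "khel", "pe", "log", "ke", "khel", "log"]
ratio = type_token_ratio(sample)  # 4 types / 7 tokens
```

A higher ratio indicates a more varied vocabulary. Because the ratio tends to fall as transcripts get longer, comparisons are usually made on samples of similar length.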

International Journal of Research in Medical Sciences