45 research outputs found
Incrementality and the Dynamics of Routines in Dialogue
We propose a novel dual processing model of linguistic routinisation, specifically formulaic expressions (from relatively fixed idioms, all the way through to looser collocational phenomena). This model is formalised using the Dynamic Syntax (DS) formal account of language processing, whereby we make a specific extension to the core DS lexical architecture to capture the dynamics of linguistic routinisation. This extension is inspired by work within cognitive science more broadly. DS has a range of attractive modelling features, such as full incrementality, as well as recent accounts of using resources of the core grammar for modelling a range of dialogue phenomena, all of which we deploy in our account. This leads to not only a fully incremental model of formulaic language, but further, this straightforwardly extends to routinised dialogue phenomena. We consider this approach to be a proof of concept of how interdisciplinary work within cognitive science holds out the promise of meeting challenges faced by modellers of dialogue and discourse
Building the Emirati Arabic FrameNet
The Emirati Arabic FrameNet (EAFN) project aims to initiate a FrameNet for Emirati Arabic, utilizing the Emirati Arabic Corpus. The goal is to create a resource comparable to the initial stages of the Berkeley FrameNet. The project is divided into manual and automatic tracks, based on the predominant techniques being used to collect frames in each track. Work on the EAFN is progressing, and we here report on initial results for annotations and evaluation. The EAFN project aims to provide a general semantic resource for the Arabic language, sure to be of interest to researchers from general linguistics to natural language processing. As we report here, the EAFN is well on target for the first release of data in the coming year
Learning Neural Word Salience Scores
Measuring the salience of a word is an essential step in numerous NLP tasks. Heuristic approaches such as tf-idf have been used so far to estimate the salience of words. We propose Neural Word Salience (NWS) scores, which, unlike heuristics, are learnt from a corpus. Specifically, we learn word salience scores such that, using pre-trained word embeddings as the input, they can accurately predict the words that appear in a sentence, given the words that appear in the sentences preceding or succeeding that sentence. Experimental results on sentence similarity prediction show that the learnt word salience scores perform comparably to or better than some of the state-of-the-art approaches for representing sentences on benchmark datasets for sentence similarity, while using only a fraction of the training and prediction times required by prior methods. Moreover, our NWS scores positively correlate with psycholinguistic measures such as concreteness and imageability, implying a close connection to salience as perceived by humans
DiVE-Arabic: Gulf Arabic Dialogue in a Virtual Environment
Documentation of communicative behaviour across languages seems to be at a crossroads. While methods for collecting data on spoken or written communication, backed up by computational techniques, are evolving, the actual data being collected remain largely the same. Inspired by the efforts of some innovative researchers who are directly tackling the various obstacles to investigating language in the field (e.g. see various papers collected in Enfield & Stivers 2007), we report here on ongoing work to solve the general problem of collecting in situ data for situated linguistic interaction. The initial stages of this project have involved employing a portable format designed to increase the range and flexibility of such collections in the field. Our motivation is to combine this with a parallel data set for a typologically distinct language, in order to contribute a parallel corpus of situated language use
Dimensions of Metaphorical Meaning
Recent work suggests that concreteness and imageability play an important role in the meanings of figurative expressions. We investigate this idea in several ways. First, we try to define more precisely the context within which a figurative expression may occur, by parsing a corpus annotated for metaphor. Next, we add both concreteness and imageability as “features” to the parsed metaphor corpus, by marking up words in this corpus using a psycholinguistic database of scores for concreteness and imageability. Finally, we carry out detailed statistical analyses of the augmented version of the original metaphor corpus, cross-matching the features of concreteness and imageability with others in the corpus such as parts of speech and dependency relations, in order to investigate in detail the use of such features in predicting whether a given expression is metaphorical or not
Dialogue Modelling and the Remit of Core Grammar
In confronting the challenge of providing formal models of dialogue, with its plethora of fragments and rich variation in modes of context-dependent construal, it might seem that linguists face two types of methodological choice: either (a) conversation employs dialogue-specific mechanisms, for which a grammar specific to such activity must be constructed; or (b) variation arises due to independent parsing/production systems which invoke a process-neutral grammar. However, as dialogue research continues to develop, there are intermediate possibilities, and in this paper we discuss the approach developed within Dynamic Syntax (DS, Kempson et al. 2001, Cann et al. 2005), a grammar framework within which, not only the parser, but indeed “syntax” itself are just a single mechanism allowing the progressive construction of semantic representations in context. Here we take as a case study the set of phenomena classifiable as clarifications, reformulations, fragment requests and corrections accompanied by extensions, and argue that though these may seem to be uniquely constitutive of dialogue, they are grounded in the mechanisms of apposition equivalently usable in monologue for presenting reformulations, extensions, self-corrections etc
Generation of Human Female Reproductive Tract Epithelium from Human Embryonic Stem Cells
BACKGROUND: Recent studies have identified stem/progenitor cells in human and mouse uterine epithelium, which are postulated to be responsible for tissue regeneration and proliferative disorders of human endometrium. These progenitor cells are thought to be derived from the Müllerian duct (MD), the primordial female reproductive tract (FRT). METHODOLOGY/PRINCIPAL FINDINGS: We have developed a model of human reproductive tract development in which inductive neonatal mouse uterine mesenchyme (nMUM) is recombined with green fluorescent protein (GFP)-tagged human embryonic stem cells (hESCs); GFP-hESC (ENVY). We demonstrate for the first time that hESCs can be differentiated into cells with a human FRT epithelial cell phenotype. hESC-derived FRT epithelial cells emerged from cultures containing MIXL1(+) mesendodermal precursors, paralleling events occurring during normal organogenesis. Following transplantation, nMUM-treated embryoid bodies (EBs) generated epithelial structures with a typical MD phenotype that expressed the MD markers PAX2 and HOXA10. Functionally, the hESC-derived FRT epithelium responded to exogenous estrogen by proliferating and secreting uterine-specific glycodelin A (GdA). CONCLUSIONS/SIGNIFICANCE: These data show that nMUM can induce differentiation of hESCs to form FRT epithelium. This may provide a model to study early developmental events of the human FRT
Report on the Second NLG Challenge on Generating Instructions in Virtual Environments (GIVE-2)
We describe the second installment of the Challenge on Generating Instructions in Virtual Environments (GIVE-2), a shared task for the NLG community which took place in 2009-10. We evaluated seven NLG systems by connecting them to 1825 users over the Internet, and report the results of this evaluation in terms of objective and subjective measures
A Salvage Grammar of Malgana - The Language of Shark Bay, Western Australia
There are no longer any speakers of the West Australian Aboriginal language Malgana who have any degree of fluency, and the series of analyses in this report are based on data from audio tapes made in the middle of the last decade of the 20th century, as well as various written materials produced over more than 150 years. This grammar is therefore an attempt to salvage from the scarce material available as complete a description of Malgana as possible. Nevertheless, the character of Malgana shines through what remains. For example, typical of Pama-Nyungan languages in general, Malgana exhibits split-ergative nominal marking, and of Aboriginal languages of the central West of Australia in particular, Malgana displays a full contrastive laminal series of stops in its phonology. A conscious effort has been made to provide in this grammar as many resources as possible for the researcher interested in comparative study of the surrounding languages. To this end, a (Malgana-based) comparative wordlist has been constructed for the languages of the region centring on the Murchison River: Malgana, Nhanda, Badimaya, Wajarri, and (Southern and Northern) Yingkarta