A Survey of Word Reordering in Statistical Machine Translation: Computational Models and Language Phenomena
Word reordering is one of the most difficult aspects of statistical machine
translation (SMT), and an important factor of its quality and efficiency.
Despite the vast amount of research published to date, the interest of the
community in this problem has not decreased, and no single method appears to be
strongly dominant across language pairs. Instead, the choice of the optimal
approach for a new translation task still seems to be mostly driven by
empirical trials. To orientate the reader in this vast and complex research
area, we present a comprehensive survey of word reordering viewed as a
statistical modeling challenge and as a natural language phenomenon. The survey
describes in detail how word reordering is modeled within different
string-based and tree-based SMT frameworks and as a stand-alone task, including
systematic overviews of the literature in advanced reordering modeling. We then
question why some approaches are more successful than others in different
language pairs. We argue that, besides measuring the amount of reordering, it
is important to understand which kinds of reordering occur in a given language
pair. To this end, we conduct a qualitative analysis of word reordering
phenomena in a diverse sample of language pairs, based on a large collection of
linguistic knowledge. Empirical results in the SMT literature are shown to
support the hypothesis that a few linguistic facts can be very useful to
anticipate the reordering characteristics of a language pair and to select the
SMT framework that best suits them. Comment: 44 pages, to appear in Computational Linguistic
Investigating the role of phonological awareness on phonological recoding during reading in deaf children
This study uses eye-tracking to investigate the role of phonological awareness in phonological recoding during reading in deaf and hard-of-hearing (DHH) children who predominantly use sign language, as compared to typically hearing children. Phonological recoding is one of the earliest strategies employed in reading, in which the reader maps each grapheme directly to the corresponding speech sound of the language (Jared, Levy, Ashby, and Agauas, 2015). Many DHH children struggle with reading, and the severity of the delays in some children increases with age. Although there are a few studies examining eye-movement patterns during reading in DHH adults, there are considerably fewer studies examining phonological recoding and the role of phonological awareness during reading in DHH children (Belanger, Baum, and Mayberry, 2011; Belanger, Rayner, and Mayberry, 2013). This study tests the influence of the visual language signal on reading in deaf children. I compare phonological awareness skills in English, ASL, and mouthing gestures to reading fluency, measured via eye-movement patterns recorded with an eye-tracker while participants read a sequence of sentences. Sentences are manipulated to target phonological recoding during reading by altering target words embedded in the sentence in three experimental conditions: no change, homophone foil, and spelling control (Jared et al., 2015). Preliminary results indicate that deaf signers are proficient readers and seemingly rely on ASL skills to read. In addition, I suggest that deaf signers do not engage in phonological recoding. Linguistic
Lessons learned in multilingual grounded language learning
Recent work has shown how to learn better visual-semantic embeddings by
leveraging image descriptions in more than one language. Here, we investigate
in detail which conditions affect the performance of this type of grounded
language learning model. We show that multilingual training improves over
bilingual training, and that low-resource languages benefit from training with
higher-resource languages. We demonstrate that a multilingual model can be
trained equally well on either translations or comparable sentence pairs, and
that annotating the same set of images in multiple languages enables further
improvements via an additional caption-caption ranking objective. Comment: CoNLL 201
Making sense of higher education: students as consumers and the value of the university experience
In the global university sector, competitive funding models are progressively becoming the norm, and institutions/courses are now frequently subject to the same kind of consumerist pressures typical of a highly marketised environment. In the United Kingdom, for example, students are increasingly demonstrating customer-like behaviour and are now demanding even more ‘value’ from institutions. Value, though, is a slippery concept and has proven problematic both in terms of its conceptualisation and measurement. This article explores the relationship between student value and higher education and, via a study in one United Kingdom business school, suggests how this might be better understood and operationalised. Adopting a combined qualitative/quantitative approach, this article also looks to identify which of the key value drivers has most practical meaning and, coincidentally, identifies a value-related difference between home and international students.
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS Think-Tanks and workshops, this document reports on the state of the art in multimedia content search from a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines.
From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research.
Centering, Anaphora Resolution, and Discourse Structure
Centering was formulated as a model of the relationship between attentional
state, the form of referring expressions, and the coherence of an utterance
within a discourse segment (Grosz, Joshi and Weinstein, 1986; Grosz, Joshi and
Weinstein, 1995). In this chapter, I argue that the restriction of centering to
operating within a discourse segment should be abandoned in order to integrate
centering with a model of global discourse structure. The within-segment
restriction causes three problems. The first problem is that centers are often
continued over discourse segment boundaries with pronominal referring
expressions whose form is identical to those that occur within a discourse
segment. The second problem is that recent work has shown that listeners
perceive segment boundaries at various levels of granularity. If centering
models a universal processing phenomenon, it is implausible that each listener
is using a different centering algorithm. The third problem is that even for
utterances within a discourse segment, there are strong contrasts between
utterances whose adjacent utterance within a segment is hierarchically recent
and those whose adjacent utterance within a segment is linearly recent. This
chapter argues that these problems can be eliminated by replacing Grosz and
Sidner's stack model of attentional state with an alternate model, the cache
model. I show how the cache model is easily integrated with the centering
algorithm, and provide several types of data from naturally occurring
discourses that support the proposed integrated model. Future work should
provide additional support for these claims with an examination of a larger
corpus of naturally occurring discourses. Comment: 35 pages, uses elsart12, lingmacros, named, psfi
Recent Trends in Computational Intelligence
Traditional models struggle to cope with complexity, noise, and changing environments, while Computational Intelligence (CI) offers solutions to complicated problems as well as inverse problems. The main feature of CI is adaptability, spanning the fields of machine learning and computational neuroscience. CI also comprises biologically inspired technologies, such as swarm intelligence as part of evolutionary computation, and encompasses wider areas such as image processing, data collection, and natural language processing. This book discusses the use of CI to solve a variety of applications optimally, demonstrating its wide reach and relevance. The combination of optimization methods and data mining strategies makes a strong and reliable prediction tool for handling real-life applications.