10,390 research outputs found

MIRACLE evaluation of results for ImageCLEF 2003

Author: Fombella Mourelle Jorge
García Serrano Ana
González Cristóbal José Carlos
Goñi Menoyo José Miguel
Martínez Fernández José Luis
Martínez Fernández Paloma
Ruiz Cristina Alberto
Villena Román Julio
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2003

ImageCLEF is a new pilot experiment introduced in CLEF 2003. It is devoted to the cross-language retrieval of images using textual descriptions of the image contents. This paper presents the MIRACLE research team's experiments and results for this track.

Archivo Digital UPM

MIRACLE Retrieval Experiments with East Asian Languages

Author: González Cristóbal José Carlos
Goñi Menoyo José Miguel
Martínez Fernández José Luis
Villena Román Julio
Publication venue: E.T.S.I. Telecomunicación (UPM)
Publication date: 01/01/2005

This paper describes the participation of MIRACLE in the NTCIR 2005 CLIR task. Although our group has a strong background and long expertise in Computational Linguistics and Information Retrieval applied to European languages using the Latin and Cyrillic alphabets, this was our first attempt at East Asian languages. Our main goal was to study the particularities and distinctive characteristics of Japanese, Chinese and Korean, focusing especially on their similarities to and differences from European languages, and to carry out research on CLIR tasks that include those languages. The basic idea behind our participation in NTCIR is to test whether the same familiar linguistic-based techniques may also be applicable to East Asian languages, and to study the necessary adaptations.

Archivo Digital UPM

Beyond English text: Multilingual and multimedia information retrieval.

Author: Jones Gareth J.F.
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 01/01/2005

Non

CiteSeerX

DCU Online Research Access Service

Domain-specific query translation for multilingual access to digital libraries

Author: Fantino Fabio
Fuller Marguerite
Jones Gareth J.F.
Newman Eamonn
Zhang Ying
Publication date: 15/06/2009

Accurate high-coverage translation is a vital component of reliable cross language information access (CLIR) systems. This is particularly true of access to archives such as Digital Libraries, which are often specific to certain domains. While general machine translation (MT) has been shown to be effective for CLIR tasks in information retrieval evaluation workshops, it is not well suited to specialized tasks where domain-specific translations are required. We demonstrate that effective query translation in the domain of cultural heritage (CH) can be achieved by augmenting a standard MT system with domain-specific phrase dictionaries automatically mined from the online Wikipedia. Experiments using our hybrid translation system with sample query logs from users of CH websites demonstrate a large improvement in the accuracy of domain-specific phrase detection and translation.
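
The hybrid idea in this abstract, a phrase dictionary applied ahead of a general MT system, might be sketched roughly as follows. This is an illustration only, not the authors' system: the function name `translate_query`, the greedy longest-match strategy, the hand-written toy dictionary, and the word-by-word `toy_mt` stand-in for a real MT engine are all assumptions made for the example (the paper mines its dictionaries from Wikipedia).

```python
def translate_query(query, phrase_dict, fallback_mt, max_len=4):
    """Translate a query, preferring domain phrases over generic MT.

    Hypothetical sketch: scan left to right, take the longest phrase found
    in phrase_dict, and send every other token to fallback_mt.
    """
    tokens = query.lower().split()
    out, i = [], 0
    while i < len(tokens):
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n])
            if phrase in phrase_dict:
                out.append(phrase_dict[phrase])
                i += n
                break
        else:
            # No dictionary phrase starts here: fall back to generic MT.
            out.append(fallback_mt(tokens[i]))
            i += 1
    return " ".join(out)


# Toy English-to-French data, written by hand for illustration.
phrase_dict = {"mona lisa": "La Joconde"}
word_table = {"painting": "peinture", "of": "de"}


def toy_mt(word):
    # Stand-in for a real MT system: word lookup, unknown words unchanged.
    return word_table.get(word, word)


result = translate_query("painting of Mona Lisa", phrase_dict, toy_mt)
```

Without the dictionary entry, the name would pass through the fallback word by word, which is the kind of domain-specific error the abstract describes.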

Irish Universities

DCU Online Research Access Service

On the Reproducibility and Generalisation of the Linear Transformation of Word Embeddings

Author: Fang Anjie
Macdonald Craig
McCreadie Richard
Ounis Iadh
Yang Xiao
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 22/12/2017

Linear transformation is a way to learn a linear relationship between two word embeddings, such that words in the two different embedding spaces can be semantically related. In this paper, we examine the reproducibility and generalisation of the linear transformation of word embeddings. Linear transformation is particularly useful when translating word embedding models in different languages, since it can capture the semantic relationships between two models. We first reproduce two linear transformation approaches, a recent one using orthogonal transformation and the original one using simple matrix transformation. Previous findings on a machine translation task are re-examined, validating that linear transformation is indeed an effective way to transform word embedding models in different languages. In particular, we show that the orthogonal transformation can better relate the different embedding models. Following the verification of previous findings, we then study the generalisation of linear transformation in a multi-language Twitter election classification task. We observe that the orthogonal transformation outperforms the matrix transformation. In particular, it significantly outperforms the random classifier by at least 10% under the F1 metric across English and Spanish datasets. In addition, we also provide best practices when using linear transformation for multi-language Twitter election classification
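
The two kinds of mapping contrasted in this abstract can be illustrated with a small NumPy sketch. It rests on assumptions not taken from the paper: the "matrix transformation" is shown as an unconstrained least-squares fit, the "orthogonal transformation" as the standard orthogonal Procrustes solution via SVD, and the data are synthetic vectors rather than real word embeddings. The function names are hypothetical.

```python
import numpy as np


def fit_matrix(X, Y):
    """Unconstrained least-squares map W minimising ||X W - Y||_F."""
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return W


def fit_orthogonal(X, Y):
    """Orthogonal Procrustes solution: W = U V^T from the SVD of X^T Y."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


# Synthetic "source" and "target" spaces: Y is a rotated, noisy copy of X.
rng = np.random.default_rng(0)
d = 5
X = rng.normal(size=(50, d))
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))  # hidden orthogonal map
Y = X @ Q + 0.01 * rng.normal(size=(50, d))

W_mat = fit_matrix(X, Y)
W_orth = fit_orthogonal(X, Y)

# The orthogonal map preserves vector lengths and angles; the free matrix need not.
is_orthogonal = np.allclose(W_orth.T @ W_orth, np.eye(d))
```

In a bilingual setting, the rows of `X` and `Y` would be the embeddings of word pairs from a seed dictionary, and the fitted map would then be applied to the remaining source-language vectors.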

Enlighten: Research Data (University of Glasgow)

Crossref

Enlighten

Cognate facilitation effects in bilingual children of varying language dominance

Author: Ramirez Mayra Chantal
Publication date: 02/04/2018

A widely accepted theory is that bilinguals activate both of their languages regardless of which is in use. Though there is abundant research on this phenomenon in bilingual adults, less research has focused on bilingual children. Cognates (i.e., words that share meaning and sound across languages) have frequently been used to explore language co-activation. The present study investigates cognate facilitation effects in child bilinguals of varying language dominance. Spanish-English bilingual children between 6 and 10 years old performed a picture-naming task that included pictures of cognates and non-cognates. Children who were more English-dominant experienced larger cognate facilitation effects when producing words in their non-dominant language but not in their dominant language. In contrast, children with more balanced dominance did not experience cognate facilitation effects in either language. The findings from this study may have implications for the development of the bilingual lexicon.

Texas ScholarWorks

GeoCLEF 2006: the CLEF 2006 cross-language geographic information retrieval track overview

Author: Bischoff K.
Di Nunzio G.M.
Ferro N.
Gey F.
Larson R.
Mandl T.
Rocha P.
Sanderson M.
Santos D.
Womser-Hacker C.
Publication date: 01/01/2006

After being a pilot track in 2005, GeoCLEF advanced to be a regular track within CLEF 2006. The purpose of GeoCLEF is to test and evaluate cross-language geographic information retrieval (GIR): retrieval for topics with a geographic specification. For GeoCLEF 2006, twenty-five search topics were defined by the organizing groups for searching English, German, Portuguese and Spanish document collections. Topics were translated into English, German, Portuguese, Spanish and Japanese. Several topics in 2006 were significantly more geographically challenging than in 2005. Seventeen groups submitted 149 runs (up from eleven groups and 117 runs in GeoCLEF 2005). The groups used a variety of approaches, including geographic bounding boxes, named entity extraction and external knowledge bases (geographic thesauri and ontologies and gazetteers)

CiteSeerX

Crossref

Repositório Comum

White Rose Research Online

Frequency drives lexical access in reading but not in speaking: the frequency-lag hypothesis

Author: Duyck Wouter
Goldenberg Diane
Gollan Tamar H
Rayner Keith
Slattery Timothy J
Van Assche Eva
Publication venue: 'American Psychological Association (APA)'
Publication date: 01/01/2011

To contrast mechanisms of lexical access in production versus comprehension we compared the effects of word frequency (high, low), context (none, low constraint, high constraint), and level of English proficiency (monolingual, Spanish-English bilingual, Dutch-English bilingual) on picture naming, lexical decision, and eye fixation times. Semantic constraint effects were larger in production than in reading. Frequency effects were larger in production than in reading without constraining context but larger in reading than in production with constraining context. Bilingual disadvantages were modulated by frequency in production but not in eye fixation times, were not smaller in low-constraint contexts, and were reduced by high-constraint contexts only in production and only at the lowest level of English proficiency. These results challenge existing accounts of bilingual disadvantages and reveal fundamentally different processes during lexical access across modalities, entailing a primarily semantically driven search in production but a frequency-driven search in comprehension. The apparently more interactive process in production than comprehension could simply reflect a greater number of frequency-sensitive processing stages in production

Ghent University Academic Bibliography

PubMed Central