
2,230 research outputs found

Introduction to the special issue on cross-language algorithms and applications

Author: Bangalore Srinivas
Lambert Patrik
Montiel-Ponsoda Elena
Màrquez Lluís
Ruiz Costa-Jussà Marta
Publication date: 01/01/2016

With the increasingly global nature of our everyday interactions, the need for multilingual technologies to support efficient and effective information access and communication cannot be overemphasized. Computational modeling of language has been the focus of Natural Language Processing, a subdiscipline of Artificial Intelligence. One of the current challenges for this discipline is to design methodologies and algorithms that are cross-language in order to create multilingual technologies rapidly. The goal of this JAIR special issue on Cross-Language Algorithms and Applications (CLAA) is to present leading research in this area, with emphasis on developing unifying themes that could lead to the development of the science of multi- and cross-lingualism. In this introduction, we provide the reader with the motivation for this special issue and summarize the contributions of the papers that have been included. The selected papers cover a broad range of cross-lingual technologies including machine translation, domain and language adaptation for sentiment analysis, cross-language lexical resources, dependency parsing, information retrieval and knowledge representation. We anticipate that this special issue will serve as an invaluable resource for researchers interested in topics of cross-lingual natural language processing. Postprint (published version)

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

Software service adaptation based on interface localisation

Author: Collins Luke
Pahl Claus
Publication venue: IGI Global
Publication date: 10/09/2014

The aim of Web services is the provision of software services to a range of different users in different locations. Service localisation in this context can facilitate the internationalisation and localisation of services by allowing their adaptation to different locales. The authors investigate three dimensions: (i) lingual localisation by providing service-level language translation techniques to adapt services to different languages, (ii) regulatory localisation by providing standards-based mappings to achieve regulatory compliance with regionally varying laws, standards and regulations, and (iii) social localisation by taking into account preferences and customs for individuals and the groups or communities in which they participate. The objective is to support and implement an explicit modelling of aspects that are relevant to localisation, together with runtime support consisting of tools and middleware services to automate the deployment based on models of locales, driven by the two localisation dimensions. The authors focus here on an ontology-based conceptual information model that integrates locale specification into service architectures in a coherent way.

Irish Universities

DCU Online Research Access Service

Multilingual Schema Matching for Wikipedia Infoboxes

Author: Freire Juliana
Moreira Viviane
Nguyen Hoa
Nguyen Huong
Nguyen Thanh
Publication date: 01/01/2011

Recent research has taken advantage of Wikipedia's multilingualism as a resource for cross-language information retrieval and machine translation, as well as proposed techniques for enriching its cross-language structure. The availability of documents in multiple languages also opens up new opportunities for querying structured Wikipedia content, and in particular, to enable answers that straddle different languages. As a step towards supporting such queries, in this paper, we propose a method for identifying mappings between attributes from infoboxes that come from pages in different languages. Our approach finds mappings in a completely automated fashion. Because it does not require training data, it is scalable: not only can it be used to find mappings between many language pairs, but it is also effective for languages that are under-represented and lack sufficient training samples. Another important benefit of our approach is that it does not depend on syntactic similarity between attribute names, and thus, it can be applied to language pairs that have distinct morphologies. We have performed an extensive experimental evaluation using a corpus consisting of pages in Portuguese, Vietnamese, and English. The results show that not only does our approach obtain high precision and recall, but it also outperforms state-of-the-art techniques. We also present a case study which demonstrates that the multilingual mappings we derive lead to substantial improvements in answer quality and coverage for structured queries over Wikipedia content. Comment: VLDB201
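
Illustrative note: the abstract above does not spell out the matching procedure, so the following is only a minimal sketch of one way attributes could be aligned without training data and without comparing attribute names. It assumes infoboxes for the same entity are already paired through cross-language links, and it scores attribute pairs by how often their values agree. The function name `match_attributes`, the threshold, and the sample data are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def match_attributes(pairs, threshold=0.5):
    """Align infobox attributes across two languages by value agreement.

    pairs: list of (infobox_a, infobox_b) dicts describing the same entity
    in two languages (assumed to be linked via cross-language links).
    Attribute names are never compared, only the values they carry.
    """
    agree = defaultdict(int)  # (attr_a, attr_b) -> pages where values agree
    seen = defaultdict(int)   # (attr_a, attr_b) -> pages where both occur
    for box_a, box_b in pairs:
        for attr_a, val_a in box_a.items():
            for attr_b, val_b in box_b.items():
                seen[(attr_a, attr_b)] += 1
                if val_a.strip().lower() == val_b.strip().lower():
                    agree[(attr_a, attr_b)] += 1

    # Keep, for each source attribute, the best-scoring target attribute.
    best = {}
    for (attr_a, attr_b), count in seen.items():
        score = agree[(attr_a, attr_b)] / count
        if score >= threshold and score > best.get(attr_a, (None, 0.0))[1]:
            best[attr_a] = (attr_b, score)
    return {attr_a: attr_b for attr_a, (attr_b, _) in best.items()}


# Hypothetical Portuguese/English infobox pairs for two entities.
pairs = [
    ({"nascimento": "1950", "país": "Brasil"},
     {"born": "1950", "country": "Brazil"}),
    ({"nascimento": "1962", "país": "Portugal"},
     {"born": "1962", "country": "Portugal"}),
]
mapping = match_attributes(pairs)
```

Exact value equality is a deliberately crude signal: it misses values that are themselves translated (here "Brasil" and "Brazil"), which is one reason a practical approach would need richer evidence than this sketch uses.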

arXiv.org e-Print Archive

CiteSeerX

Challenges for the Multilingual Web of Data

Author: Buitelaar Paul
Cimiano Philipp
Gracia del Río Jorge
Gómez-Pérez A.
McCrae J.
Montiel-Ponsoda Elena
Publication venue: Facultad de Informática (UPM)
Publication date: 01/01/2011

The Web has witnessed an enormous growth in the amount of semantic information published in recent years. This growth has been stimulated to a large extent by the emergence of Linked Data. Although this brings us a big step closer to the vision of a Semantic Web, it also raises new issues such as the need for dealing with information expressed in different natural languages. Indeed, although the Web of Data can contain any kind of information in any language, it still lacks explicit mechanisms to automatically reconcile such information when it is expressed in different languages. This leads to situations in which data expressed in a certain language is not easily accessible to speakers of other languages. The Web of Data shows the potential for being extended to a truly multilingual web as vocabularies and data can be published in a language-independent fashion, while associated language-dependent (linguistic) information supporting the access across languages can be stored separately. In this sense, the multilingual Web of Data can be realized in our view as a layer of services and resources on top of the existing Linked Data infrastructure adding i) linguistic information for data and vocabularies in different languages, ii) mappings between data with labels in different languages, and iii) services to dynamically access and traverse Linked Data across different languages. In this article we present this vision of a multilingual Web of Data. We discuss challenges that need to be addressed to make this vision come true and discuss the role that techniques such as ontology localization, ontology mapping, and cross-lingual ontology-based information access and presentation will play in achieving this. Further, we propose an initial architecture and describe a roadmap that can provide a basis for the implementation of this vision.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Access to Research at National University of Ireland, Galway

Archivo Digital UPM

Cross-lingual RDF thesauri interlinking

Author: David Jérôme
Euzenat Jérôme
Lesnikova Tatiana
Publication venue: No commercial editor.
Publication date: 23/05/2016

Various lexical resources are being published in RDF. To enhance the usability of these resources, identical resources in different data sets should be linked. If lexical resources are described in different natural languages, then techniques to deal with multilinguality are required for interlinking. In this paper, we evaluate machine translation for interlinking concepts, i.e., generic entities named with a common noun or term. In our previous work, the evaluated method was applied to named entities. We conduct two experiments involving different thesauri in different languages. The first experiment involves concepts from the TheSoz multilingual thesaurus in three languages: English, French and German. The second experiment involves concepts from the EuroVoc and AGROVOC thesauri in English and Chinese respectively. Our results demonstrate that machine translation can be beneficial for cross-lingual thesauri interlinking independently of dataset structure.
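
Illustrative note: the general idea of translation-based interlinking described above can be sketched as "translate each label into a common language, then compare labels with a string similarity". The code below is a minimal sketch under stated assumptions, not the authors' pipeline: a small lookup table stands in for a machine translation system, `difflib.SequenceMatcher` stands in for whatever similarity measure is actually used, and all URIs, labels, and function names are hypothetical.

```python
from difflib import SequenceMatcher

# Toy stand-in for a machine translation system (assumption for illustration).
TOY_FR_EN = {
    "chômage": "unemployment",
    "politique sociale": "social policy",
}


def translate(label, table=TOY_FR_EN):
    """Return the English rendering of a label, or the label itself if unknown."""
    return table.get(label.lower(), label)


def interlink(source_concepts, target_concepts, threshold=0.8):
    """Link concepts whose translated labels are sufficiently similar.

    source_concepts: {uri: label in the source language}
    target_concepts: {uri: label in English}
    Returns a list of (source_uri, target_uri) candidate links.
    """
    links = []
    for s_uri, s_label in source_concepts.items():
        translated = translate(s_label).lower()
        best_uri, best_score = None, 0.0
        for t_uri, t_label in target_concepts.items():
            score = SequenceMatcher(None, translated, t_label.lower()).ratio()
            if score > best_score:
                best_uri, best_score = t_uri, score
        if best_uri is not None and best_score >= threshold:
            links.append((s_uri, best_uri))
    return links


# Hypothetical concepts from two thesauri.
source = {"ex:fr/1": "Chômage", "ex:fr/2": "Politique sociale"}
target = {
    "ex:en/a": "Unemployment",
    "ex:en/b": "Social policy",
    "ex:en/c": "Agriculture",
}
links = interlink(source, target)
```

Comparing only the best-scoring candidate against a threshold keeps the sketch short; it produces at most one link per source concept and says nothing about how ties or many-to-one links should be handled.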

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

Challenges for the multilingual Web of Data

Author: Buitelaar Paul
Cimiano Philipp
Gracia Jorge
Gómez-Pérez Asunción
McCrae John
Montiel-Ponsoda Elena
Publication venue: Elsevier BV
Publication date: 01/01/2012

Gracia J, Montiel-Ponsoda E, Cimiano P, Gómez-Pérez A, Buitelaar P, McCrae J. Challenges for the multilingual Web of Data. Journal of Web Semantics: Science, Services and Agents on the World Wide Web. 2012;11:63-71.

Publications at Bielefeld University
