Language engineering - a champion for European culture
Language is key to culture. It is a direct cultural medium as well as a means of recording and providing access to non-lingual elements of culture. Language is also fundamental to a sense of cultural identity. For this reason, it is vital, in a changing Europe, that we preserve the multi-lingual character of our society in order to move successfully towards closer co-operation at a political, economic, and social level.
Language engineering is the application of knowledge of language to the development of computer software which can recognise, understand, interpret, and generate human language in all its forms.
The paper provides a high-level view of the ‘state of the art’ in language engineering and indicates ways in which it will have a profound impact on our culture in the future. It shows how advances in language engineering are an important aid in maintaining cultural diversity in a multi-lingual European society, while enabling the development of social cohesion across cultural and national divides. It addresses issues raised by the prospect of the Multi-lingual Information Society, including education, human communication with technology and information management, as well as aspects of digital cities such as tele-presence in digital libraries, virtual art galleries and electronic museums. The paper raises the issue of language as a factor in cultural domination, showing the contribution that language engineering can make towards countering it.
The paper also raises a number of controversial issues concerning the likely benefits arising from the ways in which language is likely to influence the culture of Europe.
Embedding Web-based Statistical Translation Models in Cross-Language Information Retrieval
Although more and more language pairs are covered by machine translation
services, there are still many pairs that lack translation resources.
Cross-language information retrieval (CLIR) is an application which needs
translation functionality of a relatively low level of sophistication since
current models for information retrieval (IR) are still based on a
bag-of-words representation. The Web provides a vast resource for the automatic construction
of parallel corpora which can be used to train statistical translation models
automatically. The resulting translation models can be embedded in several ways
in a retrieval model. In this paper, we will investigate the problem of
automatically mining parallel texts from the Web and different ways of
integrating the translation models within the retrieval process. Our
experiments on standard test collections for CLIR show that the Web-based
translation models can surpass commercial MT systems in CLIR tasks. These
results open the perspective of constructing a fully automatic query
translation device for CLIR at a very low cost.
Comment: 37 pages
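As a rough illustration of one way a translation model could be embedded in a bag-of-words retrieval model, the sketch below expands each source-language query term into weighted target-language words and scores documents by weighted term frequency. It is a minimal sketch and not the method of the paper: the translation table, its probabilities, and the function names `translate_query`, `score`, and `rank` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy translation table: P(target word | source word).
# A real table would be estimated from a mined parallel corpus.
TRANSLATION_TABLE = {
    "maison": {"house": 0.8, "home": 0.2},
    "rouge": {"red": 1.0},
}


def translate_query(query_terms, table):
    """Map source-language terms to a weighted bag of target-language words."""
    weights = Counter()
    for term in query_terms:
        # Terms missing from the table contribute nothing.
        for target, prob in table.get(term, {}).items():
            weights[target] += prob
    return dict(weights)


def score(document_terms, weighted_query):
    """Sum, over query words, the translation weight times the term frequency."""
    tf = Counter(document_terms)
    return sum(weight * tf[word] for word, weight in weighted_query.items())


def rank(documents, query_terms, table):
    """Return ids of documents with a positive score, best first."""
    weighted = translate_query(query_terms, table)
    scored = [
        (score(text.lower().split(), weighted), doc_id)
        for doc_id, text in documents.items()
    ]
    # Sort by descending score, breaking ties by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for s, doc_id in scored if s > 0]


if __name__ == "__main__":
    docs = {"d1": "the red house", "d2": "a blue home", "d3": "green car"}
    print(rank(docs, ["maison", "rouge"], TRANSLATION_TABLE))
```

Keeping every candidate translation with its probability, rather than picking a single best translation, is one simple way to let the retrieval step absorb translation ambiguity.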
Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition
A human-computer interface is developed to provide services for computer-assisted translation (CAT) and computer-assisted transcription of handwritten text images (CATTI). The back-end machine translation (MT) and handwritten text recognition (HTR) systems are provided by the Pattern Recognition and Human Language Technology (PRHLT) research group. The idea is to provide users with easy-to-use tools that make interactive translation and transcription feasible tasks. The assisted service is provided by remote servers with CAT or CATTI capabilities. The interface supplies the user with tools for efficient local editing: deletion, insertion and substitution.
Ocampo Sepúlveda, JC. (2009). Implementation of a Human-Computer Interface for Computer Assisted Translation and Handwritten Text Recognition. http://hdl.handle.net/10251/14318
A Word Sense-Oriented User Interface for Interactive Multilingual Text Retrieval
In this paper we present an interface for supporting a user in an interactive cross-language search process using semantic classes. In order to enable users to access multilingual information, different problems have to be solved: disambiguating and translating the query words, as well as categorizing and presenting the results appropriately. Therefore, we first give a brief introduction to word sense disambiguation, cross-language text retrieval and document categorization, and then describe recent achievements of our research towards an interactive multilingual retrieval system. We focus especially on the problem of browsing and navigating the different word senses in one source language and possibly several target languages. In the last part of the paper, we discuss the developed user interface and its functionalities in more detail.
JTEC panel report on machine translation in Japan
The goal of this report is to provide an overview of the state of the art of machine translation (MT) in Japan and to provide a comparison between Japanese and Western technology in this area. The term 'machine translation', as used here, includes both the science and technology required for automating the translation of text from one human language to another. Machine translation is viewed in Japan as an important strategic technology that is expected to play a key role in Japan's increasing participation in the world economy. MT is seen in Japan as important both for assimilating information into Japanese and for disseminating Japanese information throughout the world. Most of the MT systems now available in Japan are transfer-based systems. The majority of them exploit a case-frame representation of the source text as the basis of the transfer process. There is a gradual movement toward the use of deeper semantic representations, and some groups are beginning to look at interlingua-based systems.
Proceedings
Proceedings of the NODALIDA 2009 workshop
Nordic Perspectives on the CLARIN Infrastructure of Language Resources.
Editors: Rickard Domeij, Kimmo Koskenniemi, Steven Krauwer, Bente Maegaard,
Eiríkur Rögnvaldsson and Koenraad de Smedt.
NEALT Proceedings Series, Vol. 5 (2009), v+45 pp.
© 2009 The editors and contributors.
Published by Northern European Association for Language Technology (NEALT),
http://omilia.uio.no/nealt.
Electronically published at Tartu University Library (Estonia),
http://hdl.handle.net/10062/9207
Towards MKM in the Large: Modular Representation and Scalable Software Architecture
MKM has been defined as the quest for technologies to manage mathematical
knowledge. MKM "in the small" is well-studied, so the real problem is to scale
up to large, highly interconnected corpora: "MKM in the large". We contend that
advances in two areas are needed to reach this goal. We need representation
languages that support incremental processing of all primitive MKM operations,
and we need software architectures and implementations that carry out these
operations scalably on large knowledge bases.
We present instances of both in this paper: the MMT framework for modular
theory-graphs that integrates meta-logical foundations, which forms the base of
the next OMDoc version; and TNTBase, a versioned storage system for XML-based
document formats. TNTBase becomes an MMT database by instantiating it with
special MKM operations for MMT.
Comment: To appear in The 9th International Conference on Mathematical
Knowledge Management: MKM 201