6,597 research outputs found
Providing Multilingual Access to Health-Oriented Content
Finding health-related content is not an easy task. People have to know what to search for, which medical terms to use, and where to find accurate information. This task becomes even harder when people such as immigrants wish to find information in their country of residence and do not speak the national language very well. In this paper, we present a new health information system that allows users to search for health information using natural language queries composed of multiple languages. We present the technical details of the system and outline the results of a preliminary user study to demonstrate the usability of the system
Extending, trimming and fusing WordNet for technical documents
This paper describes a tool for the automatic extension and trimming of a multilingual WordNet database for cross-lingual retrieval and multilingual ontology building in intranets and domain-specific document collections. Hierarchies, built from automatically extracted terms and combined with the WordNet relations, are trimmed with a disambiguation method based on the document salience of the words in the glosses. The disambiguation is tested in a cross-lingual retrieval task, showing considerable improvement (7%-11%). The condensed hierarchies can be used as browse-interfaces to the documents complementary to retrieval
Dublin City University at QA@CLEF 2008
We describe our participation in Multilingual Question Answering at CLEF 2008 using German and English as our source and target languages respectively. The system was built using UIMA (Unstructured Information Management Architecture) as the underlying framework
HILT IV : subject interoperability through building and embedding pilot terminology web services
A report of work carried out within the JISC-funded HILT Phase IV project, the paper looks at the project's context against the background of other recent and ongoing terminologies work, describes its outcome and conclusions, including technical outcomes and terminological characteristics, and considers possible future research and development directions. The Phase IV project has taken HILT to the point where the launch of an operational support service in the area of subject interoperability is a feasible option and where both investigation of specific needs in this area and practical collaborative work are sensible and feasible next steps. Moving forward requires detailed work, not only on terminology interoperability and associated service delivery issues, but also on service and end user needs and engagement, service sustainability issues, and the practicalities of interworking with other terminology services and projects in UK, Europe, and global contexts
Towards a Universal Wordnet by Learning from Combined Evidence
Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification
Intelligent multimedia indexing and retrieval through multi-source information extraction and merging
This paper reports work on automated meta-data creation for multimedia content. The approach results in the generation of a conceptual index of the content which may then be searched via semantic categories instead of keywords. The novelty of the work is to exploit multiple sources of information relating to video content (in this case the rich range of sources covering important sports events). News, commentaries and web reports covering international football games in multiple languages and multiple modalities are analysed and the resultant data merged. This merging process leads to increased accuracy relative to individual sources