Modularity and Neural Integration in Large-Vocabulary Continuous Speech Recognition
This thesis tackles the problems of modularity in Large-Vocabulary Continuous Speech Recognition with the use of neural networks
Establishing a New State-of-the-Art for French Named Entity Recognition
The French TreeBank developed at the University Paris 7 is the main source of
morphosyntactic and syntactic annotations for French. However, it does not
include explicit information related to named entities, which are among the
most useful pieces of information for several natural language processing tasks
and applications. Moreover, no large-scale French corpus with named entity
annotations contains referential information, which complements the type and the
span of each mention with an indication of the entity it refers to. We have
manually annotated the French TreeBank with such information, after an
automatic pre-annotation step. We sketch the underlying annotation guidelines
and we provide a few figures about the resulting annotations
CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines
Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from both a technical and a socio-economic perspective.
The technical perspective includes an up-to-date view of content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives that measure the performance of multimedia search engines.
From a socio-economic perspective, we take inventory of the impact and legal consequences of these technical advances and point out future directions of research
QCompere @ REPERE 2013
We describe the QCompere consortium's submissions to the REPERE 2013 evaluation campaign. The REPERE challenge aims to bring together four communities (face recognition, speaker identification, optical character recognition and named entity detection) around a common goal: multimodal person recognition in TV broadcasts. First, four mono-modal components are introduced (one for each of these communities); they constitute the elementary building blocks of our various submissions. Then, depending on the target modality (speaker or face recognition) and on the task (supervised or unsupervised recognition), four different fusion techniques are introduced: they can be summarized as propagation-, classifier-, rule- or graph-based approaches. Finally, their performance is evaluated on the REPERE 2013 test set and their advantages and limitations are discussed
The search and hyperlinking task at MediaEval 2013
The Search and Hyperlinking Task formed part of the MediaEval 2013 evaluation workshop. The Task consisted of two sub-tasks: (1) answering known-item queries from a collection of roughly 1200 hours of broadcast TV material, and (2) linking anchors within the known item to other parts of the video collection. We provide an overview of the task and the data sets used
Development of Hybrid ASR Systems for Low Resource Medical Domain Conversational Telephone Speech
Language barriers present a great challenge in our increasingly connected and
global world. Especially within the medical domain, e.g. hospital or emergency
room, communication difficulties and delays may lead to malpractice and
non-optimal patient care. In the HYKIST project, we consider patient-physician
communication, more specifically between a German-speaking physician and an
Arabic- or Vietnamese-speaking patient. Currently, a doctor can call the
Triaphon service to get assistance from an interpreter in order to help
facilitate communication. The HYKIST goal is to support the usually
non-professional bilingual interpreter with an automatic speech translation
system to improve patient care and help overcome language barriers. In this
work, we present our ASR system development efforts for this conversational
telephone speech translation task in the medical domain for two language
pairs, including data collection, various acoustic model architectures and
dialect-induced difficulties.