Guidelines for a Dynamic Ontology - Integrating Tools of Evolution and Versioning in Ontology
Ontologies are built on systems that conceptually evolve over time. In
addition, techniques and languages for building ontologies evolve too. This has
led to numerous studies in the field of ontology versioning and ontology
evolution. This paper presents a new way to manage the lifecycle of an ontology
incorporating both versioning tools and evolution process. This solution,
called VersionGraph, is integrated in the source ontology since its creation in
order to make it possible to evolve and to be versioned. Change management is
strongly related to the model in which the ontology is represented. Therefore,
we focus on the OWL language in order to take into account the impact of the
changes on the logical consistency of the ontology like specified in OWL DL
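The abstract's central idea, a version structure embedded alongside the ontology from creation, can be sketched as a small data structure. This is a hypothetical simplification, not the authors' actual VersionGraph design: each node holds an axiom snapshot plus the change set applied to its parent.

```python
class VersionGraph:
    """Toy version graph: each version stores its axioms, its parent
    version, and the change set that produced it (illustrative only)."""

    def __init__(self, initial_axioms):
        self.nodes = {0: {"axioms": set(initial_axioms),
                          "parent": None, "changes": []}}
        self.next_id = 1

    def evolve(self, parent_id, added=(), removed=()):
        """Create a new version by applying a change set to a parent."""
        parent = self.nodes[parent_id]
        axioms = (parent["axioms"] | set(added)) - set(removed)
        vid = self.next_id
        self.next_id += 1
        self.nodes[vid] = {"axioms": axioms, "parent": parent_id,
                           "changes": [("add", a) for a in added] +
                                      [("del", r) for r in removed]}
        return vid

    def lineage(self, vid):
        """Walk back to the root version, oldest-first."""
        chain = []
        while vid is not None:
            chain.append(vid)
            vid = self.nodes[vid]["parent"]
        return list(reversed(chain))
```

Recording change sets per edge, rather than only snapshots, is what lets versioning and evolution coexist: a consistency checker can replay the changes on each edge to locate where a logical violation was introduced.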
Spanish generation from Spanish Sign Language using a phrase-based translation system
This paper describes the development of a Spoken Spanish generator from Spanish Sign Language (LSE – Lengua de Signos Española) in a specific domain: the renewal of the Identity Document and Driver's license. The system is composed of three modules. The first one is an interface where a deaf person can specify a sign sequence in sign-writing. The second one is a language translator for converting the sign sequence into a word sequence. Finally, the last module is a text-to-speech converter. The paper also describes the generation of a parallel corpus for the system development, composed of more than 4,000 Spanish sentences and their LSE translations in the application domain. The paper focuses on the translation module, which uses a statistical strategy with a phrase-based translation model, and analyses the effect of the alignment configuration used during the generation of the word-based translation model. The best configuration gives a 3.90% mWER and a 0.9645 BLEU.
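The mWER figure reported above is a word error rate. As a reference point for how such a score is computed, here is a standard WER implementation via Levenshtein distance over words (a generic metric sketch, not the paper's evaluation code):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + insertions + deletions) / reference length,
    computed with dynamic-programming edit distance over word tokens."""
    r, h = reference.split(), hypothesis.split()
    # dp[i][j]: edit distance between r[:i] and h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i
    for j in range(len(h) + 1):
        dp[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution / match
    return dp[len(r)][len(h)] / len(r)
```

For example, dropping one word from a five-word reference yields a WER of 0.2; the "m" in mWER denotes averaging over multiple reference translations.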
Towards Language-Universal End-to-End Speech Recognition
Building speech recognizers in multiple languages typically involves
replicating a monolingual training recipe for each language, or utilizing a
multi-task learning approach where models for different languages have separate
output labels but share some internal parameters. In this work, we exploit
recent progress in end-to-end speech recognition to create a single
multilingual speech recognition system capable of recognizing any of the
languages seen in training. To do so, we propose the use of a universal
character set that is shared among all languages. We also create a
language-specific gating mechanism within the network that can modulate the
network's internal representations in a language-specific way. We evaluate our
proposed approach on the Microsoft Cortana task across three languages and show
that our system outperforms both the individual monolingual systems and systems
built with a multi-task learning approach. We also show that this model can be
used to initialize a monolingual speech recognizer, and can be used to create a
bilingual model for use in code-switching scenarios.

Comment: submitted to ICASSP 201
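The language-specific gating described above can be pictured as a sigmoid gate, derived from a language embedding, that multiplicatively modulates a shared hidden representation. The following is a simplified reading of that mechanism; the weight names `W_g` and `b_g` are illustrative, not the paper's parameterisation:

```python
import numpy as np

def language_gate(hidden, lang_embedding, W_g, b_g):
    """Element-wise gating of a shared hidden vector by a
    language-conditioned sigmoid gate (simplified sketch)."""
    gate = 1.0 / (1.0 + np.exp(-(W_g @ lang_embedding + b_g)))  # in (0, 1)
    return hidden * gate  # scale each hidden unit per language
```

Because the gate depends only on the language embedding, the same shared network weights produce differently modulated internal representations for each language, which is what allows one model to serve all languages over a universal character set.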
Advanced Speech Communication System for Deaf People
This paper describes the development of an Advanced Speech Communication System for Deaf People and its field evaluation in a real application domain: the renewal of the Driver's License. The system is composed of two modules. The first one is a Spanish into Spanish Sign Language (LSE: Lengua de Signos Española) translation module made up of a speech recognizer, a natural language translator (for converting a word sequence into a sequence of signs), and a 3D avatar animation module (for playing back the signs). The second module is a Spoken Spanish generator from sign writing composed of a visual interface (for specifying a sequence of signs), a language translator (for generating the sequence of words in Spanish), and finally, a text-to-speech converter. For language translation, the system integrates three technologies: an example-based strategy, a rule-based translation method and a statistical translator. This paper also includes a detailed description of the evaluation carried out in the Local Traffic Office in the city of Toledo (Spain), involving real government employees and deaf people. This evaluation includes objective measurements from the system and subjective information from questionnaires.
Grouping axioms for more coherent ontology descriptions
Ontologies and datasets for the Semantic Web are encoded in OWL formalisms that are not easily comprehended by people. To make ontologies accessible to human domain experts, several research groups have developed ontology verbalisers using Natural Language Generation. In practice, ontologies are usually composed of simple axioms, so that realising them separately is relatively easy; there remains, however, the problem of producing texts that are coherent and efficient. We describe in this paper some methods for producing sentences that aggregate over sets of axioms that share the same logical structure. Because these methods are based on logical structure rather than domain-specific concepts or language-specific syntax, they are generic both as regards domain and language.
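The aggregation idea, grouping axioms with the same logical structure and realising each group as one sentence, can be sketched for SubClassOf axioms. This toy verbaliser and its English templates are illustrative, not the authors' grammar:

```python
from collections import defaultdict

def aggregate_subclass_axioms(axioms):
    """Group (sub, super) SubClassOf pairs by shared superclass and
    realise each group as a single aggregated sentence."""
    groups = defaultdict(list)
    for sub, sup in axioms:
        groups[sup].append(sub)
    sentences = []
    for sup, subs in groups.items():
        if len(subs) == 1:
            sentences.append(f"{subs[0]} is a kind of {sup}.")
        else:
            listed = ", ".join(subs[:-1]) + " and " + subs[-1]
            sentences.append(f"{listed} are kinds of {sup}.")
    return sentences
```

Grouping on the logical pattern (here, a shared superclass) rather than on domain vocabulary is what makes this kind of aggregation domain- and language-generic: only the surface templates need localising.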
Assessing and refining mappings to RDF to improve dataset quality
RDF dataset quality assessment is currently performed primarily after data is published. However, there is no systematic way to incorporate the assessment results into the dataset, nor the assessment itself into the publishing workflow. Adjustments are applied manually, and rarely. Moreover, the root cause of the violations, which often lies in the mappings that specify how the RDF dataset will be generated, is not identified. We suggest an incremental, iterative and uniform validation workflow for RDF datasets stemming originally from (semi-)structured data (e.g., CSV, XML, JSON). In this work, we focus on assessing and improving their mappings. We incorporate (i) a test-driven approach for assessing the mappings instead of the RDF dataset itself, as mappings reflect how the dataset will be formed when generated; and (ii) semi-automatic mapping refinements based on the results of the quality assessment. The proposed workflow is applied to diverse cases, e.g., large, crowdsourced datasets such as DBpedia, or newly generated ones, such as iLastic. Our evaluation indicates the efficiency of our workflow, as it significantly improves the overall quality of an RDF dataset in the observed cases.
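The test-driven point, running quality tests against the mapping rules rather than the generated triples, can be sketched as follows. The mapping and test shapes here are hypothetical stand-ins, not the actual mapping language or test suite used in the work:

```python
def assess_mappings(mappings, tests):
    """Apply each quality test to each mapping rule; collect violations.
    A test returns an error string for a failing mapping, else None."""
    violations = []
    for m in mappings:
        for test in tests:
            error = test(m)
            if error:
                violations.append((m["id"], error))
    return violations

def datatype_declared(mapping):
    """Example test: literal-valued mappings should declare a datatype."""
    if mapping.get("term_type") == "literal" and "datatype" not in mapping:
        return "literal mapping without an explicit datatype"
    return None
```

The payoff of testing the mapping rule is scale: one fixed rule repairs every triple it would have generated, whereas post-publication assessment flags each faulty triple individually.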
Domain-specific query translation for multilingual access to digital libraries
Accurate high-coverage translation is a vital component of reliable cross-language information retrieval (CLIR) systems. This is particularly true of access to archives such as Digital Libraries, which are often specific to certain domains. While general machine translation (MT) has been shown to be effective for CLIR tasks in information retrieval evaluation workshops, it is not well suited to specialized tasks where domain-specific translations are required. We demonstrate that effective query translation in the domain of cultural heritage (CH) can be achieved by augmenting a standard MT system with domain-specific phrase dictionaries automatically mined from the online Wikipedia. Experiments using our hybrid translation system with sample query logs from users of CH websites demonstrate a large improvement in the accuracy of domain-specific phrase detection and translation.
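One plausible shape for such a hybrid translator is a greedy longest-match pass over the domain phrase dictionary, falling back to general MT for uncovered words. The dictionary contents and the MT callable below are placeholders, not the authors' actual resources:

```python
def translate_query(query, domain_dict, general_mt):
    """Translate a query word sequence: prefer the longest matching
    domain-dictionary phrase at each position; otherwise fall back to
    a general MT function for the single word."""
    words = query.split()
    out, i = [], 0
    while i < len(words):
        # try the longest domain phrase starting at position i
        for j in range(len(words), i, -1):
            phrase = " ".join(words[i:j])
            if phrase in domain_dict:
                out.append(domain_dict[phrase])
                i = j
                break
        else:
            out.append(general_mt(words[i]))  # no phrase matched
            i += 1
    return " ".join(out)
```

Matching longest phrases first is what lets multi-word CH terms (e.g., a hypothetical entry mapping "stained glass" to a single target-language term) survive translation intact instead of being translated word by word.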
- …