
    Robust Spoken Language Understanding for House Service Robots

    Service robotics has been growing significantly in the last years, leading to several research results and to a number of consumer products. One of the essential features of these robotic platforms is the ability to interact with users through natural language. Spoken commands can be processed by a Spoken Language Understanding chain in order to obtain the desired behavior of the robot. The entry point of such a process is an Automatic Speech Recognition (ASR) module that provides a list of transcriptions for a given spoken utterance. Although several well-performing ASR engines are available off-the-shelf, they operate in a general-purpose setting. Hence, they may not be well suited to the recognition of utterances given to robots in specific domains. In this work, we propose a practical yet robust strategy to re-rank lists of transcriptions. This approach improves the quality of ASR systems in situated scenarios, i.e., the transcription of robotic commands. The proposed method relies upon evidence derived from a semantic grammar with semantic actions, designed to model typical commands expressed in scenarios that are specific to human service robotics. The outcomes obtained through an experimental evaluation show that the approach is able to effectively outperform the ASR baseline, obtained by selecting the first transcription suggested by the ASR.
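The re-ranking idea described in this abstract can be illustrated with a minimal sketch: each ASR hypothesis is scored by how well it fits a domain command grammar, and the list is re-ordered by that score. The grammar, the token-coverage score, and the hypotheses below are all invented for the example; the paper's actual method uses a semantic grammar with semantic actions, not a word set.

```python
# Toy domain "grammar": the vocabulary of expected robot commands.
# A real system would use a semantic grammar parse score instead.
COMMAND_GRAMMAR = {"go", "to", "the", "kitchen", "bring", "me", "a", "cup"}

def grammar_score(transcription: str) -> float:
    """Fraction of tokens covered by the domain vocabulary
    (a toy proxy for a semantic-grammar parse score)."""
    tokens = transcription.lower().split()
    if not tokens:
        return 0.0
    return sum(t in COMMAND_GRAMMAR for t in tokens) / len(tokens)

def rerank(transcriptions: list[str]) -> list[str]:
    # Stable sort, so ties keep the original ASR order.
    return sorted(transcriptions, key=grammar_score, reverse=True)

hypotheses = ["go to the chicken", "go to the kitchen", "goat of the kitchen"]
print(rerank(hypotheses)[0])  # "go to the kitchen"
```

The baseline the paper compares against corresponds to simply taking `transcriptions[0]` as returned by the ASR engine.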

    Knowledge Expansion of a Statistical Machine Translation System using Morphological Resources

    The translation capability of a Phrase-Based Statistical Machine Translation (PBSMT) system depends mostly on parallel data, and phrases that are not present in the training data are not correctly translated. This paper describes a method that efficiently expands the existing knowledge of a PBSMT system without adding more parallel data, using external morphological resources instead. A set of new phrase associations is added to the translation and reordering models; each of them corresponds to a morphological variation of the source, target, or both phrases of an existing association. New associations are generated using a string similarity score based on morphosyntactic information. We tested our approach on En-Fr and Fr-En translations, and the results showed performance improvements in terms of automatic scores (BLEU and Meteor) and a reduction of out-of-vocabulary (OOV) words. We believe that our knowledge expansion framework is generic and could be used to add different types of information to the model.
    JRC.G.2-Global security and crisis management
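The core step, generating new phrase associations from morphological variants filtered by a string similarity score, can be sketched as follows. This is a stand-in using plain character-level similarity; the paper's actual score is based on morphosyntactic information, and the lexicon and threshold here are invented.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1] (toy stand-in for the
    paper's morphosyntactic similarity score)."""
    return SequenceMatcher(None, a, b).ratio()

def expand(known_phrase: str, variants: list[str], threshold: float = 0.8) -> list[str]:
    """Keep only morphological variants close enough to a phrase
    already present in the translation model."""
    return [v for v in variants if similarity(known_phrase, v) >= threshold]

print(expand("translated", ["translates", "translating", "transmitted"]))
```

Each surviving variant would then be paired with the (possibly also varied) target phrase and inserted into the translation and reordering models, inheriting scores from the original association.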

    A logic programming approach for the conservation of buildings based on an extension of the Eindhoven Classification Model

    The identification, classification and recording of events that may lead to the deterioration of buildings are crucial for the development of appropriate repair strategies. This work presents an extension of the Eindhoven Classification Model to sort the root causes of adverse events in Building Conservation. Logic Programming was used for knowledge representation and reasoning, allowing the modelling of the universe of discourse in terms of defective data, information and knowledge. Indeed, a systematization of the evolution process of the body of knowledge in terms of a new factor, Quality of Information, embedded in the Root Cause Analysis was accomplished; i.e., the proposed system led to a process of Quality of Information quantification that permits the study of events' root causes in due time.

    Applying the technology acceptance model to evaluation of recommender systems

    In general, the study of recommender systems emphasizes the efficiency of techniques to provide accurate recommendations rather than the factors influencing users' acceptance of the system; however, accuracy alone cannot account for a satisfying user experience. Bearing in mind this gap in the research, we apply the technology acceptance model (TAM) to evaluate user acceptance of a recommender system in the movies domain. Within the basic TAM model, we incorporate a new latent variable representing self-assessed user skills in using a recommender system. The experiment included 116 users who answered a satisfaction survey after using a movie recommender system. The results evince that perceived usefulness of the system has more impact than perceived ease of use in motivating acceptance of recommendations. Additionally, users' previous skills strongly influence perceived ease of use, which directly impacts perceived usefulness of the system. These findings can assist developers of recommender systems in their attempt to maximize users' experience.
    Fil: Armentano, Marcelo Gabriel. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tandil. Instituto Superior de Ingenieria del Software; Argentina
    Fil: Christensen, Ingrid Alina. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tandil. Instituto Superior de Ingenieria del Software; Argentina
    Fil: Schiaffino, Silvia Noemi. Consejo Nacional de Investigaciones Científicas y Técnicas. Centro Científico Tecnológico Tandil. Instituto Superior de Ingenieria del Software; Argentina

    From Task Classification Towards Similarity Measures for Recommendation in Crowdsourcing Systems

    Task selection in micro-task markets can be supported by recommender systems that help individuals find appropriate tasks. Previous work showed that in the selection of a micro-task, semantic aspects, such as the required action and the comprehensibility, are rated as more important than factual aspects, such as the payment or the required completion time. This work lays a foundation for creating such similarity measures: we show that automatic classification based on task descriptions is possible, and we propose similarity measures to cluster micro-tasks according to semantic aspects.
    Comment: Work in Progress Paper at HCOMP 201
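One simple baseline for a similarity measure over task descriptions, of the kind this abstract proposes to build on, is bag-of-words cosine similarity. The task texts below are invented; the paper's measures target semantic aspects such as the required action rather than raw tokens.

```python
from collections import Counter
from math import sqrt

def cosine(a: str, b: str) -> float:
    """Cosine similarity between bag-of-words vectors of two task
    descriptions (a toy baseline, not the paper's measure)."""
    va, vb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(va[w] * vb[w] for w in va)
    norm = (sqrt(sum(c * c for c in va.values()))
            * sqrt(sum(c * c for c in vb.values())))
    return dot / norm if norm else 0.0

t1 = "transcribe the audio clip"
t2 = "transcribe this short audio recording"
t3 = "label the image with categories"
# The two transcription tasks should come out closer to each other.
print(cosine(t1, t2) > cosine(t1, t3))  # True
```

Clustering micro-tasks then amounts to grouping descriptions whose pairwise similarity exceeds a threshold, or feeding the pairwise matrix to any standard clustering algorithm.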

    More effective boilerplate removal – the GoldMiner algorithm

    The ever-increasing web is an important source for building large-scale corpora. However, dynamically generated web pages often contain much irrelevant and duplicated text, which impairs the quality of the corpus. To ensure the high quality of web-based corpora, a good boilerplate removal algorithm is needed to extract only the relevant content from web pages. In this article, we present an automatic text extraction procedure, GoldMiner, which, by enhancing a previously published boilerplate removal algorithm, minimizes the occurrence of irrelevant duplicated content in corpora and keeps the text more coherent than previous tools. The algorithm exploits similarities in the HTML structure of pages coming from the same domain. A new evaluation document set (CleanPortalEval) is also presented, which can demonstrate the power of boilerplate removal algorithms for web portal pages.
    Index Terms: corpus building, boilerplate removal, the web as corpus
    I. THE TASK
    When constructing corpora from web content, the extraction of relevant text from dynamically generated HTML pages is not a trivial task, due to the great amount of irrelevant repeated text that needs to be identified and removed so that it does not compromise the quality of the corpus. This task, called boilerplate removal in the literature, consists of categorizing HTML content as valuable vs. irrelevant, filtering out menus, headers and footers, advertisements, and structure repeated on many pages. In this paper, we present a boilerplate removal algorithm that removes irrelevant content from crawled content more effectively than previous tools. The structure of our paper is as follows. First, we present some tools that we used as baselines when evaluating the performance of our system. The algorithm implemented in one of these tools, jusText, is also used as part of our enhanced boilerplate removal algorithm. This is followed by the presentation of the enhanced system, called GoldMiner, and the evaluation of the results.
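The cross-page intuition the abstract describes, that content recurring across many pages of the same domain is boilerplate, can be sketched as a document-frequency filter over text blocks. The pages, blocks, and threshold below are invented; the real GoldMiner operates on HTML structure and builds on jusText rather than on raw block counts.

```python
from collections import Counter

def strip_boilerplate(pages: dict[str, list[str]], max_df: float = 0.5) -> dict[str, list[str]]:
    """Drop any text block that occurs in more than max_df of the
    pages crawled from one domain (a toy version of the idea that
    repeated structure is boilerplate)."""
    df = Counter(block for blocks in pages.values() for block in set(blocks))
    limit = max_df * len(pages)
    return {url: [b for b in blocks if df[b] <= limit]
            for url, blocks in pages.items()}

pages = {
    "/a": ["Site menu", "Article about corpora", "Footer"],
    "/b": ["Site menu", "Article about parsing", "Footer"],
    "/c": ["Site menu", "Article about tagging", "Footer"],
}
print(strip_boilerplate(pages)["/a"])  # ['Article about corpora']
```

The menu and footer blocks appear on every page of the domain and are filtered out, while the article body, unique to each page, survives.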

    Wikification of learning objects using metadata as an alternative context for disambiguation

    We present a methodology to wikify learning objects. Our proposal focuses on two processes: word sense disambiguation and relevant phrase selection. The disambiguation process uses the learning object's metadata as either additional or alternative context, which increases the probability of success when a learning object has a low-quality context. The selection of relevant phrases is performed by identifying the highest values of semantic relatedness between the main subject of a learning object and the phrases. This criterion is useful for achieving the didactic objectives of the learning object.
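The metadata-as-context idea can be sketched as follows: when the learning object's own text is sparse, its metadata terms are merged into the disambiguation context before picking the candidate sense with the best overlap. The senses, glosses, and metadata terms below are all invented for the example, and simple word overlap stands in for a real semantic relatedness measure.

```python
def overlap(context: set[str], sense_words: set[str]) -> float:
    """Fraction of a sense's gloss words found in the context
    (toy stand-in for a semantic relatedness measure)."""
    return len(context & sense_words) / len(sense_words)

def disambiguate(context_words, metadata_words, senses):
    """Pick the sense whose gloss best overlaps the learning object's
    text context augmented with its metadata terms."""
    context = set(context_words) | set(metadata_words)
    return max(senses, key=lambda s: overlap(context, senses[s]))

senses = {
    "Java_(programming_language)": {"class", "object", "virtual", "machine"},
    "Java_(island)": {"indonesia", "island", "volcano"},
}
print(disambiguate(["see", "the", "class", "example"],
                   ["programming", "object", "oriented"], senses))
# "Java_(programming_language)"
```

With an empty or noisy text context, the metadata terms alone can still steer the choice toward the intended sense, which is exactly the low-quality-context case the abstract targets.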