
    Using hybrid algorithmic-crowdsourcing methods for academic knowledge acquisition

    such as Figures, Tables, Definitions, Algorithms, etc., which are called Knowledge Cells hereafter. An advanced academic search engine which could take advantage of Knowledge Cells and their various relationships to obtain more accurate search results is expected. Further, it is expected to provide a fine-grained search over Knowledge Cells for deep-level information discovery and exploration. Therefore, it is important to identify and extract the Knowledge Cells and their various relationships, which are often intrinsic and implicit in articles. With the exponential growth of scientific publications, discovery and acquisition of such useful academic knowledge impose some practical challenges. For example, existing algorithmic methods can hardly extend to handle diverse layouts of journals, nor scale up to process massive numbers of documents. As crowdsourcing has become a powerful paradigm for large-scale problem-solving, especially for tasks that are difficult for computers but easy for humans, we consider the problem of academic knowledge discovery and acquisition as a crowdsourced database problem and show a hybrid framework to integrate the accuracy of crowdsourcing workers and the speed of automatic algorithms. In this paper, we introduce our current system implementation, a platform for academic knowledge discovery and acquisition (PANDA), as well as some interesting observations and promising future directions. Peer reviewed

    Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses

    Automatically evaluating the quality of dialogue responses for unstructured domains is a challenging problem. Unfortunately, existing automatic evaluation metrics are biased and correlate very poorly with human judgements of response quality. Yet having an accurate automatic evaluation procedure is crucial for dialogue research, as it allows rapid prototyping and testing of new models with fewer expensive human evaluations. In response to this challenge, we formulate automatic dialogue evaluation as a learning problem. We present an evaluation model (ADEM) that learns to predict human-like scores for input responses, using a new dataset of human response scores. We show that the ADEM model's predictions correlate significantly, and at a level much higher than word-overlap metrics such as BLEU, with human judgements at both the utterance and system level. We also show that ADEM can generalize to evaluating dialogue models unseen during training, an important step for automatic dialogue evaluation. Comment: ACL 201
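
    The claim that a metric "correlates" with human judgements can be made concrete with a correlation coefficient. The following is a minimal sketch, not the ADEM model or the paper's evaluation code: it computes Pearson correlation in plain Python, and the scores in `human`, `metric_a`, and `metric_b` are made-up illustrative values, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Covariance and variances, all unnormalized (the 1/n factors cancel).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical data: human ratings (1-5) for five responses, and the scores
# two imaginary automatic metrics assigned to the same responses.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
metric_a = [1.2, 1.9, 3.1, 3.8, 5.1]  # tracks the human ratings closely
metric_b = [3.0, 1.0, 4.0, 2.0, 3.0]  # mostly unrelated to the human ratings

print("metric_a vs human:", round(pearson(human, metric_a), 3))
print("metric_b vs human:", round(pearson(human, metric_b), 3))
```

    Under this measure, a metric whose scores rise and fall with the human ratings scores near 1, while an unrelated metric scores near 0. Rank-based coefficients such as Spearman's are a common alternative when only the ordering of responses matters.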

    TiFi: Taxonomy Induction for Fictional Domains [Extended version]

    Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases and highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains such as Lord of the Rings, The Simpsons or Greek Mythology with very high precision and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.

    Towards Automatically Extracting UML Class Diagrams from Natural Language Specifications

    In model-driven engineering (MDE), UML class diagrams serve as a way to plan and communicate between developers. However, creating them is complex and resource-consuming. We propose an automated approach for the extraction of UML class diagrams from natural language software specifications. To develop our approach, we create a dataset of UML class diagrams and their English specifications with the help of volunteers. Our approach is a pipeline of steps consisting of the segmentation of the input into sentences, the classification of the sentences, the generation of UML class diagram fragments from sentences, and the composition of these fragments into one UML class diagram. We develop a quantitative testing framework specific to UML class diagram extraction. Our approach yields low precision and recall but serves as a benchmark for future research. Comment: 8 pages, 7 tables, 9 figures, 2 algorithms, to be published in MODELS '22 Companion
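
    The four pipeline stages named in this abstract (segmentation, classification, fragment generation, composition) can be illustrated with a toy sketch. This is not the authors' implementation: the single regular-expression rule `HAS_PATTERN`, the `ClassFragment` type, and every function name below are hypothetical stand-ins chosen only to show how the stages chain together.

```python
import re
from dataclasses import dataclass, field


@dataclass
class ClassFragment:
    """A partial class diagram: one class name and some of its attributes."""
    name: str
    attributes: set = field(default_factory=set)


# Hypothetical rule: sentences shaped like "A <class> has a/an <attribute>."
HAS_PATTERN = re.compile(
    r"^(?:An?|Each|Every|The)\s+(\w+)\s+has\s+an?\s+(\w+)\.?$",
    re.IGNORECASE,
)


def segment(text):
    """Stage 1: split the specification into sentences."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def classify(sentence):
    """Stage 2: label a sentence by the kind of fragment it can yield."""
    return "attribute" if HAS_PATTERN.match(sentence) else "other"


def generate_fragment(sentence):
    """Stage 3: turn an 'attribute' sentence into a class fragment."""
    match = HAS_PATTERN.match(sentence)
    return ClassFragment(match.group(1).capitalize(), {match.group(2).lower()})


def compose(fragments):
    """Stage 4: merge fragments that describe the same class."""
    classes = {}
    for fragment in fragments:
        classes.setdefault(fragment.name, set()).update(fragment.attributes)
    return classes


def extract(text):
    """Run the four stages in order on a plain-text specification."""
    sentences = segment(text)
    fragments = [
        generate_fragment(s) for s in sentences if classify(s) == "attribute"
    ]
    return compose(fragments)


spec = "A customer has a name. Each customer has an address. The system must be fast."
print(extract(spec))
```

    In this sketch the third sentence is classified as "other" and contributes nothing, while the two fragments for the customer class are merged during composition. A real system would replace the hand-written rule with learned or much richer components at each stage.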

    Open semantic service networks

    Online service marketplaces will soon be part of the economy to scale the provision of specialized multi-party services through automation and standardization. Current research, such as the *-USDL service description language family, is already defining the basic building blocks to model the next generation of business services. Nonetheless, the developments being made do not aim to interconnect services via service relationships. Without the concept of relationship, marketplaces will be seen as mere functional silos containing service descriptions. Yet, in real economies, all services are related and connected. Therefore, to address this gap we introduce the concept of open semantic service network (OSSN), concerned with the establishment of rich relationships between services. These networks will provide valuable knowledge on the global service economy, which can be exploited for many socio-economic and scientific purposes such as service network analysis, management, and control.