24,343 research outputs found

    Collaborative Development and Evaluation of Text-processing Workflows in a UIMA-supported Web-based Workbench

    Get PDF
    Challenges in creating comprehensive text-processing worklows include a lack of the interoperability of individual components coming from different providers and/or a requirement imposed on the end users to know programming techniques to compose such workflows. In this paper we demonstrate Argo, a web-based system that addresses these issues in several ways. It supports the widely adopted Unstructured Information Management Architecture (UIMA), which handles the problem of interoperability; it provides a web browser-based interface for developing workflows by drawing diagrams composed of a selection of available processing components; and it provides novel user-interactive analytics such as the annotation editor which constitutes a bridge between automatic processing and manual correction. These features extend the target audience of Argo to users with a limited or no technical background. Here, we focus specifically on the construction of advanced workflows, involving multiple branching and merging points, to facilitate various comparative evalutions. Together with the use of user-collaboration capabilities supported in Argo, we demonstrate several use cases including visual inspections, comparisions of multiple processing segments or complete solutions against a reference standard, inter-annotator agreement, and shared task mass evaluations. Ultimetely, Argo emerges as a one-stop workbench for defining, processing, editing and evaluating text processing tasks

    ADEPT2 - Next Generation Process Management Technology

    Get PDF
    If current process management systems shall be applied to a broad spectrum of applications, they will have to be significantly improved with respect to their technological capabilities. In particular, in dynamic environments it must be possible to quickly implement and deploy new processes, to enable ad-hoc modifications of single process instances at runtime (e.g., to add, delete or shift process steps), and to support process schema evolution with instance migration, i.e., to propagate process schema changes to already running instances. These requirements must be met without affecting process consistency and by preserving the robustness of the process management system. In this paper we describe how these challenges have been addressed and solved in the ADEPT2 Process Management System. Our overall vision is to provide a next generation process management technology which can be used in a variety of application domains

    Building trainable taggers in a web-based, UIMA-supported NLP workbench

    Get PDF
    Argo is a web-based NLP and text mining workbench with a convenient graphical user interface for designing and executing processing workflows of various complexity. The workbench is intended for specialists and nontechnical audiences alike, and provides the ever expanding library of analytics compliant with the Unstructured Information Management Architecture, a widely adopted interoperability framework. We explore the flexibility of this framework by demonstrating workflows involving three processing components capable of performing self-contained machine learning-based tagging. The three components are responsible for the three distinct tasks of 1) generating observations or features, 2) training a statistical model based on the generated features, and 3) tagging unlabelled data with the model. The learning and tagging components are based on an implementation of conditional random fields (CRF); whereas the feature generation component is an analytic capable of extending basic token information to a comprehensive set of features. Users define the features of their choice directly from Argo’s graphical interface, without resorting to programming (a commonly used approach to feature engineering). The experimental results performed on two tagging tasks, chunking and named entity recognition, showed that a tagger with a generic set of features built in Argo is capable of competing with taskspecific solutions.
    • …
    corecore