Search CORE

73 research outputs found

ConceptScope: Organizing and Visualizing Knowledge in Documents based on Domain Ontology

Author: Chandrasegaran Senthil
Ma Kwan-Liu
Zhang Xiaoyu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021
Field of study

Current text visualization techniques typically provide overviews of document content and structure using intrinsic properties such as term frequencies, co-occurrences, and sentence structures. Such visualizations lack conceptual overviews incorporating domain-relevant knowledge, needed when examining documents such as research articles or technical reports. To address this shortcoming, we present ConceptScope, a technique that utilizes a domain ontology to represent the conceptual relationships in a document in the form of a Bubble Treemap visualization. Multiple coordinated views of document structure and concept hierarchy with text overviews further aid document analysis. ConceptScope facilitates exploration and comparison of single and multiple documents respectively. We demonstrate ConceptScope by visualizing research articles and transcripts of technical presentations in computer science. In a comparative study with DocuBurst, a popular document visualization tool, ConceptScope was found to be more informative in exploring and comparing domain-specific documents, but less so when it came to documents that spanned multiple disciplines.Comment: 19 pages, 5 figure

arXiv.org e-Print Archive

TU Delft Repository

Proceedings of the Conference on Natural Language Processing 2010

Author
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2010
Field of study

This book contains state-of-the-art contributions to the 10th conference on Natural Language Processing, KONVENS 2010 (Konferenz zur Verarbeitung natürlicher Sprache), with a focus on semantic processing. The KONVENS in general aims at offering a broad perspective on current research and developments within the interdisciplinary field of natural language processing. The central theme draws specific attention towards addressing linguistic aspects ofmeaning, covering deep as well as shallow approaches to semantic processing. The contributions address both knowledgebased and data-driven methods for modelling and acquiring semantic information, and discuss the role of semantic information in applications of language technology. The articles demonstrate the importance of semantic processing, and present novel and creative approaches to natural language processing in general. Some contributions put their focus on developing and improving NLP systems for tasks like Named Entity Recognition or Word Sense Disambiguation, or focus on semantic knowledge acquisition and exploitation with respect to collaboratively built ressources, or harvesting semantic information in virtual games. Others are set within the context of real-world applications, such as Authoring Aids, Text Summarisation and Information Retrieval. The collection highlights the importance of semantic processing for different areas and applications in Natural Language Processing, and provides the reader with an overview of current research in this field

Acronym

Robust Dialog Management Through A Context-centric Architecture

Author: Hung Victor C.
Publication venue: 'Information Bulletin on Variable Stars (IBVS)'
Publication date: 01/01/2010
Field of study

This dissertation presents and evaluates a method of managing spoken dialog interactions with a robust attention to fulfilling the human user’s goals in the presence of speech recognition limitations. Assistive speech-based embodied conversation agents are computer-based entities that interact with humans to help accomplish a certain task or communicate information via spoken input and output. A challenging aspect of this task involves open dialog, where the user is free to converse in an unstructured manner. With this style of input, the machine’s ability to communicate may be hindered by poor reception of utterances, caused by a user’s inadequate command of a language and/or faults in the speech recognition facilities. Since a speech-based input is emphasized, this endeavor involves the fundamental issues associated with natural language processing, automatic speech recognition and dialog system design. Driven by ContextBased Reasoning, the presented dialog manager features a discourse model that implements mixed-initiative conversation with a focus on the user’s assistive needs. The discourse behavior must maintain a sense of generality, where the assistive nature of the system remains constant regardless of its knowledge corpus. The dialog manager was encapsulated into a speech-based embodied conversation agent platform for prototyping and testing purposes. A battery of user trials was performed on this agent to evaluate its performance as a robust, domain-independent, speech-based interaction entity capable of satisfying the needs of its users

University of Central Florida (UCF): STARS (Showcase of Text, Archives, Research & Scholarship)

A robust methodology for automated essay grading

Author: Fazal Anhar
Publication venue: Curtin University
Publication date: 01/01/2013
Field of study

None of the available automated essay grading systems can be used to grade essays according to the National Assessment Program – Literacy and Numeracy (NAPLAN) analytic scoring rubric used in Australia. This thesis is a humble effort to address this limitation. The objective of this thesis is to develop a robust methodology for automatically grading essays based on the NAPLAN rubric by using heuristics and rules based on English language and neural network modelling

espace@Curtin

Idiom treatment experiments in machine translation

Author: Anastasiou Dimitra
Publication venue: Fakultät 4 - Philosophische Fakultät II. Fachrichtung 4.6 - Angewandte Sprachwissenschaft sowie Übersetzen und Dolmetschen
Publication date: 01/01/2010
Field of study

Idiomatic expressions pose a particular challenge for the today\u27;s Machine Translation systems, because their translation mostly does not result literally, but logically. The present dissertation shows, how with the help of a corpus, and morphosyntactic rules, such idiomatic expressions can be recognized and finally correctly translated. The work leads the reader in the first chapter generally to the field of Machine Translation and following that, it focuses on the special field of Example-based Machine Translation. Next, an important part of the doctoral thesis dissertation is devoted to the theory of idiomatic expressions. The practical part of the thesis describes how the hybrid Example-based Machine Translation system METIS-II, with the help of morphosyntactic rules, is able to correctly process certain idiomatic expressions and finally, to translate them. The following chapter deals with the function of the transfer system CAT2 and its handling of the idiomatic expressions. The last part of the thesis includes the evaluation of three commercial systems, namely SYSTRAN, T1 Langenscheidt, and Power Translator Pro, with respect to continuous and discontinuous idiomatic expressions. For this, both small corpora and a part of the extensive corpus Europarl and the Digital Lexicon of the German Language in 20th century were processed, firstly manually and then automatically. The dissertation concludes with results from this evaluation.Idiomatische Redewendungen stellen für heutige maschinelle Übersetzungssysteme eine besondere Herausforderung dar, da ihre Übersetzung nicht wörtlich, sondern stets sinngemäß erfolgen muss. Die vorliegende Dissertation zeigt, wie mit Hilfe eines Korpus sowie morphosyntaktischer Regeln solche idiomatische Redewendungen erkannt und am Ende richtig übersetzt werden können. Die Arbeit führt den Leser im ersten Kapitel allgemein in das Gebiet der Maschinellen Übersetzung ein und vertieft im Anschluss daran das Spezialgebiet der Beispielbasierten Maschinellen Übersetzung. Im Folgenden widmet sich ein wesentlicher Teil der Doktorarbeit der Theorie über idiomatische Redewendungen. Der praktische Teil der Arbeit beschreibt wie das hybride Beispielbasierte Maschinelle Übersetzungssystem METIS-II mit Hilfe von morphosyntaktischen Regeln befähigt wurde, bestimmte idiomatische Redewendungen korrekt zu bearbeiten und am Ende zu übersetzen. Das nachfolgende Kapitel behandelt die Funktion des Transfersystems CAT2 und dessen Umgang mit idiomatischen Wendungen. Der letzte Teil der Arbeit beinhaltet die Evaluation von drei kommerzielle Systemen, nämlich SYSTRAN, T1 Langenscheidt und Power Translator Pro, in Bezug auf deren Umgang mit kontinuierlichen und diskontinuierlichen idiomatischen Redewendungen. Hierzu wurden sowohl kleine Korpora als auch ein Teil des umfangreichen Korpus Europarl und des Digatalen Wörterbuchs der deutschen Sprache des 20. Jh. erst manuell und dann maschinell bearbeitet. Die Dissertation wird mit Folgerungen aus der Evaluation abgeschlossen

Profiling lexical bundles in an EAP pre-sessional course: A corpus-based study on textbooks and instructors' materials

Author: Fattani Reem Abdulkareem I
Publication venue: 'University of Sheffield Conference Proceedings'
Publication date: 01/10/2018
Field of study

White Rose E-theses Online

Design of a Controlled Language for Critical Infrastructures Protection

Author: CANTARELLA SIMONA
FERIGATO Carlo
OWUSU EVANS BOATENG
Publication venue: European Language Resources Association
Publication date: 28/03/2012
Field of study

We describe a project for the construction of controlled language for critical infrastructures protection (CIP). This project originates from the need to coordinate and categorize the communications on CIP at the European level. These communications can be physically represented by official documents, reports on incidents, informal communications and plain e-mail. We explore the application of traditional library science tools for the construction of controlled languages in order to achieve our goal. Our starting point is an analogous work done during the sixties in the field of nuclear science known as the Euratom Thesaurus.JRC.G.6-Security technology assessmen

JRC Publications Repository

Language technology in multimedia information retrieval:Proceedings of the fourteenth International Twente Workshop on Language Technology

Author
Publication venue: 'University Library/University of Twente'
Publication date: 01/12/1998
Field of study

University of Twente Research Information