7,401 research outputs found

Automatic case acquisition from texts for process-oriented case-based reasoning

Author: Le Ber Florence
Dufour-Lussier Valmi
Lieber Jean
Nauer Emmanuel
Publication venue: 'Elsevier BV'
Publication date: 20/12/2012

This paper introduces a method for the automatic acquisition of a rich case representation from free text for process-oriented case-based reasoning. Case engineering is among the most complicated and costly tasks in implementing a case-based reasoning system. This is especially so for process-oriented case-based reasoning, where more expressive case representations are generally used and, in our opinion, actually required for satisfactory case adaptation. In this context, the ability to acquire cases automatically from procedural texts is a major step forward in order to reason on processes. We therefore detail a methodology that makes case acquisition from processes described as free text possible, with special attention given to assembly instruction texts. This methodology extends the techniques we used to extract actions from cooking recipes. We argue that techniques taken from natural language processing are required for this task, and that they give satisfactory results. An evaluation based on our implemented prototype extracting workflows from recipe texts is provided.

Comment: In press, publication planned for 201

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Tagging and linking lecture audio recordings: goals and practice

Author: Barr Niall
Draper Stephen
Given Michael
Gray Norman
Honeychurch Sarah
Labrosse Nicolas
Publication date: 01/01/2013

Making and distributing audio recordings of lectures is cheap and technically straightforward, and these recordings represent an underexploited teaching resource. We explore the reasons why such recordings are not used more widely; we believe the barriers inhibiting such use should be easily overcome. Students can listen to a lecture they missed, or re-listen to a lecture at revision time, but their interaction is limited by the affordances of the replaying technology. Listening to lecture audio is generally solitary, linear, and disjoint from other available media. In this paper, we describe a tool we are developing at the University of Glasgow, which enriches students' interactions with lecture audio. We describe our experiments with this tool in session 2012-13. Fewer students used the tool than we expected would naturally do so, and we discuss some possible explanations for this.

arXiv.org e-Print Archive

CiteSeerX

Enlighten

From chunks to function-argument structure: a similarity-based approach

Author: Hinrichs Erhard
Kübler Sandra
Publication date: 01/01/2001

Chunk parsing has focused on the recognition of partial constituent structures at the level of individual chunks. Little attention has been paid to the question of how such partial analyses can be combined into larger structures for complete utterances. Such larger structures are not only desirable for a deeper syntactic analysis. They also constitute a necessary prerequisite for assigning function-argument structure. The present paper offers a similarity-based algorithm for assigning functional labels such as subject, object, head, complement, etc. to complete syntactic structures on the basis of prechunked input. The evaluation of the algorithm has concentrated on measuring the quality of functional labels. It was performed on a German and an English treebank using two different annotation schemes at the level of function-argument structure. The results of 89.73% correct functional labels for German and 90.40% for English validate the general approach.

CiteSeerX

Crossref

Publikationsserver der Universität Tübingen

Hochschulschriftenserver - Universität Frankfurt am Main

Towards Universal Semantic Tagging

Author: Abzianidze Lasha
Bos Johan
Publication date: 29/09/2017

The paper proposes the task of universal semantic tagging, that is, tagging word tokens with language-neutral, semantically informative tags. We argue that the task, with its independent nature, contributes to better semantic analysis for wide-coverage multilingual text. We present the initial version of the semantic tagset and show that (a) the tags provide semantically fine-grained information, and (b) they are suitable for cross-lingual semantic parsing. An application of the semantic tagging in the Parallel Meaning Bank supports both of these points as the tags contribute to formal lexical semantics and their cross-lingual projection. As a part of the application, we annotate a small corpus with the semantic tags and present a new baseline result for universal semantic tagging.

Comment: 9 pages, International Conference on Computational Semantics (IWCS

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

Dissertations of the University of Groningen

Annotating patient clinical records with syntactic chunks and named entities: the Harvey corpus

Author: A Roberts
A Shah
Aleksandar Savkov
B Efron
G Hripcsak
G Savova
J Cohen
J Foster
J-W Fan
Jackie Cassell
John Carroll
K Verspoor
KH Krippendorff
LK Tanabe
M Bada
MP Marcus
Rob Koeling
S Abney
W Sun
Ö Uzuner
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016

The free text notes typed by physicians during patient consultations contain valuable information for the study of disease and treatment. These notes are difficult to process by existing natural language analysis tools since they are highly telegraphic (omitting many words), and contain many spelling mistakes, inconsistencies in punctuation, and non-standard word order. To support information extraction and classification tasks over such text, we describe a de-identified corpus of free text notes, a shallow syntactic and named entity annotation scheme for this kind of text, and an approach to training domain specialists with no linguistic background to annotate the text. Finally, we present a statistical chunking system for such clinical text with a stable learning rate and good accuracy, indicating that the manual annotation is consistent and that the annotation scheme is tractable for machine learning.

Crossref

Springer - Publisher Connector

PubMed Central

Sussex Research Online

Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews

Author: Turney Peter D.
Publication date: 01/01/2002

This paper presents a simple unsupervised learning algorithm for classifying reviews as recommended (thumbs up) or not recommended (thumbs down). The classification of a review is predicted by the average semantic orientation of the phrases in the review that contain adjectives or adverbs. A phrase has a positive semantic orientation when it has good associations (e.g., "subtle nuances") and a negative semantic orientation when it has bad associations (e.g., "very cavalier"). In this paper, the semantic orientation of a phrase is calculated as the mutual information between the given phrase and the word "excellent" minus the mutual information between the given phrase and the word "poor". A review is classified as recommended if the average semantic orientation of its phrases is positive. The algorithm achieves an average accuracy of 74% when evaluated on 410 reviews from Epinions, sampled from four different domains (reviews of automobiles, banks, movies, and travel destinations). The accuracy ranges from 84% for automobile reviews to 66% for movie reviews.
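
The scoring rule in this abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not say how the co-occurrence counts are obtained, so the counts are passed in directly, and the function names, parameters, and the optional smoothing term are all hypothetical choices made for illustration.

```python
import math


def pmi(joint, count_a, count_b, total, smoothing=0.0):
    """Pointwise mutual information (base 2) estimated from raw counts.

    joint   : how often a and b occur together (hypothetical count)
    count_a : how often a occurs
    count_b : how often b occurs
    total   : size of the collection the counts come from
    smoothing is an assumed small constant added to `joint` to avoid log(0).
    """
    return math.log2(((joint + smoothing) * total) / (count_a * count_b))


def semantic_orientation(near_excellent, near_poor, phrase_count,
                         excellent_count, poor_count, total, smoothing=0.0):
    """Mutual information with "excellent" minus mutual information with "poor"."""
    return (pmi(near_excellent, phrase_count, excellent_count, total, smoothing)
            - pmi(near_poor, phrase_count, poor_count, total, smoothing))


def classify_review(phrase_orientations):
    """Label a review by the sign of its average phrase orientation."""
    average = sum(phrase_orientations) / len(phrase_orientations)
    return "recommended" if average > 0 else "not recommended"


# Illustrative counts only; they are not taken from the paper.
so = semantic_orientation(near_excellent=8, near_poor=2, phrase_count=10,
                          excellent_count=100, poor_count=100, total=1000)
label = classify_review([so, -0.5])
```

Because the phrase count and the collection size appear in both mutual-information terms, they cancel in the difference; the sketch keeps the two terms separate only to mirror the wording of the abstract.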

arXiv.org e-Print Archive

CiteSeerX

NRC Publications Archive

CogPrints Cognitive Sciences Eprint Archive

NLP Resources for a Rare Language Morphological Analyzer: Danish Case

Author: Котов М.В.
Publication venue: National Technical University «KhPI», Lviv Polytechnic National University
Publication date: 01/01/2017

ORCID ID: http://orcid.org/0000-0001-8327-5197

The paper discusses the characteristics and practical aspects of application of the natural language processing resources available for developing a rare language morphological analysis solution. The case under consideration reveals the pipeline design needed to prepare the grammatical resources for Danish. Being rare not only in terms of distribution, but also in the amount of natural language resources available, the Danish language represents a significant problem in terms of application of third-party tools to help solve various NLP-related issues. The paper focuses on part-of-speech tagging and lemmatization, typical but indispensable tasks at the pre-processing stage within the framework of developing a morphological analyzer as a custom NLP solution.

Електронного архіву Харківського національного університету імені В.Н.Каразіна (Electronic Archive V.N. Karazin Kharkiv National University)