Event-based Access to Historical Italian War Memoirs

Author: Nanni Federico
Ponzetto Simone Paolo
Rovera Marco
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2021

The progressive digitization of historical archives provides new, often domain-specific, textual resources that report on past facts and events; among these, memoirs are a very common type of primary source. In this paper, we present an approach for extracting information from Italian historical war memoirs and turning it into structured knowledge. The approach is based on the semantic notions of events, participants and roles. We quantitatively evaluate each of the key steps of our approach and provide a graph-based representation of the extracted knowledge, which allows the reader to move between a Close and a Distant Reading of the collection. Comment: 23 pages, 6 figures
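The abstract above describes structured knowledge built from events, participants and roles, and a graph over it. A minimal sketch of that idea follows. The event tuples, role labels and function names are hypothetical and chosen for illustration only; this is not the authors' pipeline or data model.

```python
from collections import defaultdict

# Hypothetical extracted events: (event_id, predicate, {role: participant}).
# The role labels "Agent" and "Patient" are assumptions for this sketch.
events = [
    ("e1", "attack", {"Agent": "partisans", "Patient": "convoy"}),
    ("e2", "retreat", {"Agent": "convoy"}),
]

def build_graph(events):
    """Index events by participant: participant -> [(role, event_id, predicate)]."""
    graph = defaultdict(list)
    for event_id, predicate, roles in events:
        for role, participant in roles.items():
            graph[participant].append((role, event_id, predicate))
    return dict(graph)

graph = build_graph(events)

# "Close" view: every event in which a single participant takes part.
convoy_events = graph["convoy"]

# "Distant" view: how many events each participant is involved in.
event_counts = {participant: len(edges) for participant, edges in graph.items()}
```

Following the edges of one participant gives a close view of individual passages, while aggregating counts over all participants gives a collection-level view.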

arXiv.org e-Print Archive

MAnnheim DOCument Server

A tool for enhancing MetaMap performance when annotating clinical guideline documents with UMLS concepts

Author: Gooch P.
Roudsari A.
Publication date: 01/01/2011

We developed a tool that integrates the National Library of Medicine's MetaMap software with GATE, an open-source text analytics framework. The tool allows non-ASCII encoded documents of numerous formats to be annotated with UMLS concepts. We created a GATE pipeline to chunk cardiovascular disease guideline text into default segments (blank-line delimited), XML element content, sentences and phrases, which were sequentially submitted to MetaMap for annotation. XML element, sentence and phrase chunking allowed term extraction and mapping to be completed in around 1/3 of the time taken with default chunking, although with slight loss of accuracy (F1.0s=0.94-0.99). However, phrase chunking allows more complex input to be processed in real time, which is not possible with the other approaches. We discuss the results in relation to use of MetaMap's --term processing option for generating pre- and post-coordinated mappings from composite phrases.
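To illustrate the difference between blank-line-delimited chunking and sentence chunking mentioned in this abstract, here is a minimal sketch. It uses plain regular expressions rather than GATE or MetaMap, and the function names and sample text are assumptions made for illustration.

```python
import re

def chunk_blank_line(text):
    """Split text into segments separated by one or more blank lines."""
    return [seg.strip() for seg in re.split(r"\n\s*\n", text) if seg.strip()]

def chunk_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    return [s for s in re.split(r"(?<=[.!?])\s+", flat) if s]

# Hypothetical guideline-like text, not taken from the paper.
sample = "Assess blood pressure. Offer statin therapy.\n\nReview annually."

blank_line_chunks = chunk_blank_line(sample)
sentence_chunks = chunk_sentences(sample)
```

Smaller chunks mean each unit submitted for annotation is shorter, which is the kind of trade-off between speed and accuracy the abstract reports.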

City Research Online

Structural variation in generated health reports

Author: Hallett Catalina
Scott Donia
Publication date: 01/01/2005

We present a natural language generator that produces a range of medical reports on the clinical histories of cancer patients, and discuss the problem of conceptual restatement in generating various textual views of the same conceptual content. We focus on two features of our system: the demand for 'loose paraphrases' between the various reports on a given patient, with a high degree of semantic overlap but some necessary amount of distinctive content; and the requirement for paraphrasing primarily at the discourse level.

CiteSeerX

Open Research Online (The Open University)

Open Data Platform for Knowledge Access in Plant Health Domain : VESPA Mining

Author: Andro Mathieu
Corbière Roselyne
Phan Tien T.
Turenne Nicolas
Publication date: 01/01/2015

Important data are locked in ancient literature. It would be uneconomic to produce these data again today, or to extract them without the help of text mining technologies. Vespa is a text mining project whose aim is to extract data on pest and crop interactions, to model and predict attacks on crops, and to reduce the use of pesticides. Only a few previous attempts have proposed access to agricultural information. Another original aspect of our work is that documents are parsed according to their document architecture.

arXiv.org e-Print Archive

CiteSeerX

ProdInra

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL-Rennes 1

HAL - UPEC / UPEM

Lexical patterns, features and knowledge resources for coreference resolution in clinical notes

Author: Abdul Roudsari
D’Avolio
Miller
Phil Gooch
Rahman
Recasens
Rosse
Savova
Uzuner
van Deemter
Zheng
Publication venue: 'Elsevier BV'
Publication date: 01/10/2012

Generation of entity coreference chains provides a means to extract linked narrative events from clinical notes, but despite being a well-researched topic in natural language processing, general-purpose coreference tools perform poorly on clinical texts. This paper presents a knowledge-centric and pattern-based approach to resolving coreference across a wide variety of clinical records comprising discharge summaries, progress notes, pathology, radiology and surgical reports from two corpora (Ontology Development and Information Extraction (ODIE) and i2b2/VA). In addition, a method for generating coreference chains using progressively pruned linked lists is demonstrated that reduces the search space and facilitates evaluation by a number of metrics. Independent evaluation results show an F-measure for each corpus of 79.2% and 87.5%, respectively, which offers performance at least as good as human annotators, greatly increased performance over general-purpose tools, and improvement on previously reported clinical coreference systems. The system uses a number of open-source components that are available to download.

City Research Online

Elsevier - Publisher Connector

Crossref

Summarisation and visualisation of e-Health data repositories

Author: Hallett Catalina
Power Richard
Scott Donia
Publication date: 01/01/2006

At the centre of the Clinical e-Science Framework (CLEF) project is a repository of well-organised, detailed clinical histories, encoded as data that will be available for use in clinical care and in-silico medical experiments. We describe a system, developed as part of the CLEF project, that generates a diverse range of textual and graphical summaries of a patient’s clinical history from a data-encoded model, a chronicle, representing the record of the patient’s medical history. Although the focus of our current work is on cancer patients, the approach we describe is generalisable to a wide range of medical areas.

CiteSeerX

Open Research Online (The Open University)

Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples

Author: Anantharam Pramod
Anantharam Pramod
Balasuriya Lakshika
Ferrucci David
Kimmig Angelika
McMahon Connor
Meng Lingling
Perera Sujan
Sheth Amit
Wijeratne Sanjaya
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017

Machine Learning has been a big success story during the AI resurgence. One particular standout success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques. Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International Conference on Web Intelligence (WI). arXiv admin note: substantial text overlap with arXiv:1610.0770

arXiv.org e-Print Archive

Crossref

Scholar Commons - Institutional Repository of the University of South Carolina
