Security and confidentiality approach for the Clinical E-Science Framework (CLEF)
CLEF is an MRC sponsored project in the E-Science programme that aims to
establish policies and infrastructure for the next generation of integrated clinical and
bioscience research. One of the major goals of the project is to provide a
pseudonymised repository of histories of cancer patients that can be accessed by
researchers. Robust mechanisms and policies are needed to ensure that patient
privacy and confidentiality are preserved while delivering a repository of such
medically rich information for the purposes of scientific research. This paper
summarises the overall approach adopted by CLEF to meet data protection
requirements, including the data flows and pseudonymisation mechanisms that are
currently being developed. Intended constraints and monitoring policies that will
apply to research interrogation of the repository are also outlined. Once evaluated, it
is hoped that the CLEF approach can serve as a model for other distributed
electronic health record repositories to be accessed for research.
On Quantifying Qualitative Geospatial Data: A Probabilistic Approach
Living in the era of data deluge, we have witnessed a web content explosion,
largely due to the massive availability of User-Generated Content (UGC). In
this work, we specifically consider the problem of geospatial information
extraction and representation, where one can exploit diverse sources of
information (such as image and audio data, text data, etc.), going beyond
traditional volunteered geographic information. Our ambition is to include
available narrative information in an effort to better explain geospatial
relationships: with spatial reasoning being a basic form of human cognition,
narratives expressing such experiences typically contain qualitative spatial
data, i.e., spatial objects and spatial relationships.
To this end, we formulate a quantitative approach for the representation of
qualitative spatial relations extracted from UGC in the form of texts. The
proposed method quantifies such relations based on multiple text observations.
Such observations provide distance and orientation features which are utilized
by a greedy Expectation Maximization-based (EM) algorithm to infer a
probability distribution over predefined spatial relationships; the latter
represent the quantified relationships under user-defined probabilistic
assumptions. We evaluate the applicability and quality of the proposed approach
using real UGC data originating from an actual travel blog text corpus. To
verify the quality of the result, we generate grid-based maps visualizing the
spatial extent of the various relations.
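As a rough illustration of the idea in the abstract above, the sketch below runs a plain (not greedy) EM loop that estimates a probability distribution over predefined spatial relations from several distance observations. It is a minimal sketch under stated assumptions, not the authors' algorithm: the relation names, their Gaussian parameters, the use of distance alone (orientation is omitted), and the function names are all hypothetical.

```python
import math

# Hypothetical predefined relations, each modelled as a fixed Gaussian
# over distance in kilometres: (mean, standard deviation).
RELATIONS = {"near": (1.0, 0.5), "far": (10.0, 3.0)}


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))


def em_relation_weights(distances, relations=RELATIONS, iterations=50):
    """Estimate mixture weights over fixed relation models with standard EM.

    Only the weights are updated; the per-relation Gaussians stay fixed,
    which plays the role of the user-defined probabilistic assumptions.
    """
    names = list(relations)
    weights = {name: 1.0 / len(names) for name in names}
    for _ in range(iterations):
        # E-step: accumulate each relation's responsibility per observation.
        totals = {name: 0.0 for name in names}
        for d in distances:
            joint = {n: weights[n] * gaussian_pdf(d, *relations[n]) for n in names}
            norm = sum(joint.values())
            if norm == 0.0:
                continue  # observation is implausible under every relation
            for n in names:
                totals[n] += joint[n] / norm
        # M-step: renormalise the accumulated responsibilities into weights.
        total = sum(totals.values())
        if total == 0.0:
            break  # nothing to update; keep the previous weights
        weights = {n: totals[n] / total for n in names}
    return weights
```

Calling `em_relation_weights([0.8, 1.1, 1.3, 0.9, 9.5])` returns a dictionary of weights that sum to one, with most of the mass on the hypothetical "near" relation.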
Rhetorical structure and reader manipulation in Agatha Christie's Murder on the Orient Express
This paper describes Agatha Christie’s use of rhetoric to convince readers of the ‘truth’ of her detective’s solution in Murder on the Orient Express, using an adaptation of Rhetorical Structure Theory (RST) designed for analysing long extracts of narrative text. It aims to demonstrate, first, Christie’s rhetorical practice and, second, a tabular, non-diagrammatic exposition of RST, with some suggestions for future alterations to this method.
Learning from the learners' experience: e-Learning@greenwich post-conference reflections
This publication comprises papers from presenters who, having made a conference presentation, were invited to author an academic paper about their work.
Implementing a Portable Clinical NLP System with a Common Data Model - a Lisp Perspective
This paper presents a Lisp architecture for a portable NLP system, termed
LAPNLP, for processing clinical notes. LAPNLP integrates multiple standard,
customized and in-house developed NLP tools. Our system facilitates portability
across different institutions and data systems by incorporating an enriched
Common Data Model (CDM) to standardize necessary data elements. It utilizes
UMLS to perform domain adaptation when integrating generic domain NLP tools. It
also features stand-off annotations that are specified by positional reference
to the original document. We built an interval tree based search engine to
efficiently query and retrieve the stand-off annotations by specifying
positional requirements. We also developed a utility to convert an inline
annotation format to stand-off annotations to enable the reuse of clinical text
datasets with inline annotations. We experimented with our system on several
NLP facilitated tasks including computational phenotyping for lymphoma patients
and semantic relation extraction for clinical notes. These experiments
showcased the broader applicability and utility of LAPNLP.
Comment: 6 pages, accepted by IEEE BIBM 2018 as regular paper
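The stand-off annotation idea in the abstract above can be illustrated with a small sketch. The abstract describes an interval-tree search engine in Lisp; the Python code below swaps in a simpler structure, a list sorted by start offset and searched with `bisect`, purely to show what querying annotations by positional requirement looks like. The class and field names are hypothetical and are not taken from LAPNLP.

```python
from bisect import bisect_left
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    """A stand-off annotation: character offsets into the original text."""
    start: int  # inclusive
    end: int    # exclusive
    label: str


class StandoffIndex:
    """Sorted-list index (a simplification, not an interval tree)."""

    def __init__(self, annotations):
        self._items = sorted(annotations, key=lambda a: (a.start, a.end))
        self._starts = [a.start for a in self._items]

    def overlapping(self, start, end):
        """Return annotations that overlap the half-open span [start, end)."""
        # Only annotations starting before `end` can overlap the span.
        cutoff = bisect_left(self._starts, end)
        # Of those, keep the ones that finish after the span begins.
        return [a for a in self._items[:cutoff] if a.end > start]
```

For the text `"Patient has lymphoma."`, an index holding `Annotation(0, 7, "PERSON")` and `Annotation(12, 20, "DISEASE")` returns only the second annotation for `overlapping(10, 15)`. The filter step is linear in the number of candidates, which is the cost an interval tree is meant to avoid on large documents.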
Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Machine Learning has been a big success story during the AI resurgence. One
particular standout success relates to learning from a massive amount of data.
In spite of early assertions of the unreasonable effectiveness of data, there
is increasing recognition of the value of utilizing knowledge whenever it is available or
can be created purposefully. In this paper, we discuss the indispensable role
of knowledge for deeper understanding of content where (i) large amounts of
training data are unavailable, (ii) the objects to be recognized are complex
(e.g., implicit entities and highly subjective content), and (iii) applications
need to use complementary or related data in multiple modalities/media. What
brings us to the cusp of rapid progress is our ability to (a) create relevant
and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP
techniques. Using diverse examples, we seek to foretell unprecedented progress
in our ability for deeper understanding and exploitation of multimodal data and
continued incorporation of knowledge in learning techniques.
Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International
Conference on Web Intelligence (WI). arXiv admin note: substantial text
overlap with arXiv:1610.0770