Rule-based approach for identifying assertions in clinical free-text data
A rule-based approach is presented for classifying previously identified medical concepts in clinical free text into assertion categories. The task defines six categories of assertion: Present, Absent, Possible, Conditional, Hypothetical and Not associated with the patient. The assertion classification algorithms were largely based on extending the popular NegEx and Context algorithms. In addition, the clinical terminology SNOMED CT and other publicly available dictionaries were used to classify assertions that did not fit the NegEx/Context model. The data for this task includes discharge summaries from Partners HealthCare and from Beth Israel Deaconess Medical Centre, as well as discharge summaries and progress notes from University of Pittsburgh Medical Centre. The data set consists of 349 discharge reports for system development, each paired with ground-truth concept and assertion files, and 477 reports for evaluation. On the evaluation data set the system achieved recall, precision and F1-measure of 0.83 each. Although the rule-based system shows promise, further improvements can be made by incorporating machine learning approaches.
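As a rough illustration of the NegEx-style idea the abstract builds on, the sketch below looks for trigger phrases in a small window of tokens before an already identified concept. The trigger lists, the window size and the function name `classify_assertion` are assumptions made for this example, not the authors' rules or the actual NegEx/Context lexicons, and the Conditional category is omitted for brevity.

```python
import re

# Hypothetical trigger phrases, for illustration only. The real NegEx and
# Context lexicons are much larger and are not reproduced here.
TRIGGERS = [
    ("Not associated with the patient", ["family history of", "mother had", "father had"]),
    ("Absent", ["no", "denies", "without", "negative for"]),
    ("Hypothetical", ["if", "should"]),
    ("Possible", ["possible", "probable", "suspected", "cannot rule out"]),
]
WINDOW = 5  # assumed scope: number of tokens inspected before the concept


def classify_assertion(sentence, concept):
    """Return an assertion label for `concept` as it appears in `sentence`."""
    text = sentence.lower()
    idx = text.find(concept.lower())
    if idx < 0:
        raise ValueError("concept not found in sentence")
    # Keep only the last WINDOW tokens that precede the concept.
    preceding = re.findall(r"[a-z']+", text[:idx])[-WINDOW:]
    scope = " " + " ".join(preceding) + " "
    # The first category with a whole-phrase match inside the scope wins.
    for label, phrases in TRIGGERS:
        if any(" " + phrase + " " in scope for phrase in phrases):
            return label
    # No trigger found: default to Present.
    return "Present"
```

For example, `classify_assertion("The patient denies chest pain.", "chest pain")` matches the trigger "denies" and yields the Absent label. A full system would also need post-concept triggers and scope-terminating words, which this sketch leaves out.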
Robustness of Bayesian Pool-based Active Learning Against Prior Misspecification
We study the robustness of active learning (AL) algorithms against prior
misspecification: whether an algorithm achieves similar performance using a
perturbed prior as compared to using the true prior. In both the average and
worst cases of the maximum coverage setting, we prove that all
α-approximate algorithms are robust (i.e., near α-approximate) if
the utility is Lipschitz continuous in the prior. We further show that
robustness may not be achieved if the utility is non-Lipschitz. This suggests
we should use a Lipschitz utility for AL if robustness is required. For the
minimum cost setting, we can also obtain a robustness result for approximate AL
algorithms. Our results imply that many commonly used AL algorithms are robust
against perturbed priors. We then propose the use of a mixture prior to
alleviate the problem of prior misspecification. We analyze the robustness of
the uniform mixture prior and show experimentally that it performs reasonably
well in practice.
Comment: This paper is published at the AAAI Conference on Artificial Intelligence
(AAAI 2016).
Fluorescent optical fibre chemosensor for the detection of mercury
This work aims to develop a stable, compact and portable fibre optic sensing system capable of real-time detection of the mercury(II) ion, Hg2+. A novel fluorescent polymeric material for Hg2+ detection, based on a coumarin derivative (acting as the fluorophore) and an azathia crown ether moiety (acting as the mercury ion receptor), has been designed and synthesized. The material was covalently attached to the distal end of an optical fibre and exhibited a significant increase in fluorescence intensity in response to Hg2+ in the μM concentration range via a photoinduced electron transfer (PET) mechanism. The sensor also demonstrated high selectivity for Hg2+ over other metal ions. A washing protocol was identified for sensor regeneration, allowing the probe to be re-used. The approach developed in this work can also be used to prepare sensors for other heavy metals.
GraphH: High Performance Big Graph Analytics in Small Clusters
It is common for real-world applications to analyze big graphs using
distributed graph processing systems. Popular in-memory systems require an
enormous amount of resources to handle big graphs. While several out-of-core
approaches have been proposed for processing big graphs on disk, the high disk
I/O overhead could significantly reduce performance. In this paper, we propose
GraphH to enable high-performance big graph analytics in small clusters.
Specifically, we design a two-stage graph partition scheme to evenly divide the
input graph into partitions, and propose a GAB (Gather-Apply-Broadcast)
computation model to make each worker process a partition in memory at a time.
We use an edge cache mechanism to reduce the disk I/O overhead, and design a
hybrid strategy to improve the communication performance. GraphH can
efficiently process big graphs in small clusters or even a single commodity
server. Extensive evaluations show that GraphH can be up to 7.8x faster
than popular in-memory systems, such as Pregel+ and PowerGraph, when
processing generic graphs, and more than 100x faster than recently proposed
out-of-core systems, such as GraphD and Chaos, when processing big graphs.
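The Gather-Apply-Broadcast idea named in the abstract can be pictured with a small single-process sketch. This is only one reading of the three phase names, applied to PageRank. The function `gab_pagerank`, the partition layout and all parameters are assumptions for illustration and do not reflect GraphH's actual implementation or API.

```python
def gab_pagerank(partitions, num_vertices, out_degree, iterations=20, damping=0.85):
    """Toy PageRank organised as gather, apply and broadcast phases.

    partitions: list of dicts mapping a target vertex to its list of source
                vertices. Every vertex should appear in exactly one partition,
                with an empty list if it has no in-edges.
    out_degree: list giving the out-degree of each vertex.
    """
    ranks = [1.0 / num_vertices] * num_vertices  # vertex values visible to all
    for _ in range(iterations):
        updates = {}
        for part in partitions:  # only one partition is processed at a time
            for v, sources in part.items():
                # Gather: combine the contributions of v's in-neighbours.
                total = sum(ranks[u] / out_degree[u] for u in sources)
                # Apply: compute the new value of v from the gathered sum.
                updates[v] = (1 - damping) / num_vertices + damping * total
        # Broadcast: publish the new values so the next round can read them.
        for v, value in updates.items():
            ranks[v] = value
    return ranks
```

On a three-vertex cycle, for instance `gab_pagerank([{0: [2], 1: [0]}, {2: [1]}], 3, [1, 1, 1])`, every vertex keeps a rank of one third, which is the expected fixed point for a symmetric cycle.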