14,640 research outputs found
BioCaster: detecting public health rumors with a Web-based text mining system
Summary: BioCaster is an ontology-based text mining system for detecting and tracking the distribution of infectious disease outbreaks from linguistic signals on the Web. The system continuously analyzes documents reported from over 1700 RSS feeds, classifies them for topical relevance and plots them onto a Google map using geocoded information. The background knowledge for bridging the gap between layman's terms and formal coding systems is contained in the freely available BioCaster ontology, which includes information in eight languages focused on the epidemiological role of pathogens as well as geographical locations with their latitudes/longitudes. The system consists of four main stages: topic classification, named entity recognition (NER), disease/location detection and event recognition. Higher order event analysis is used to detect more precisely specified warning signals, which can then be sent to registered users via email alerts. Evaluation of the system for topic recognition and entity identification is conducted on a gold standard corpus of annotated news articles.
i-JEN: Visual interactive Malaysia crime news retrieval system
Supporting crime news investigation involves a mechanism to help monitor the current and past status of criminal events. We believe this could be well facilitated by focusing on the user interfaces and the crime event model aspects. In this paper we discuss the development of the Visual Interactive Malaysia Crime News Retrieval System (i-JEN) and describe the approach, the planned user studies, the system architecture and future plans. Our main objectives are to construct crime-based events; investigate the use of crime-based events in improving classification and clustering; develop an interactive crime news retrieval system; visualize crime news in an effective and interactive way; integrate these components into a usable and robust system; and evaluate its usability and performance. The system will serve as a news monitoring system that aims to automatically organize, retrieve and present crime news in a way that supports effective monitoring, searching and browsing for the target user groups of the general public, news analysts, and policemen or crime investigators. The study will contribute to a better understanding of crime data consumption in the Malaysian context, to a system with visualisation features for exploring crime data, and to the eventual goal of combating crime.
What's unusual in online disease outbreak news?
Background: Accurate and timely detection of public health events of
international concern is necessary to help support risk assessment and response
and save lives. Novel event-based methods that use the World Wide Web as a
signal source offer potential to extend health surveillance into areas where
traditional indicator networks are lacking. In this paper we address the issue
of systematically evaluating online health news to support automatic alerting
using daily disease-country counts text mined from real world data using
BioCaster. For 18 data sets produced by BioCaster, we compare 5 aberration
detection algorithms (EARS C2, C3, W2, F-statistic and EWMA) for performance
against expert moderated ProMED-mail postings. Results: We report sensitivity,
specificity, positive predictive value (PPV), negative predictive value (NPV),
mean alerts/100 days and F1, at 95% confidence interval (CI) for 287
ProMED-mail postings on 18 outbreaks across 14 countries over a 366 day period.
Results indicate that W2 had the best F1 with a slight benefit for day of week
effect over C2. In drill down analysis we indicate issues arising from the
granular choice of country-level modeling, sudden drops in reporting due to day
of week effects and reporting bias. Automatic alerting has been implemented in
BioCaster available from http://born.nii.ac.jp. Conclusions: Online health news
alerts have the potential to enhance manual analytical methods by increasing
throughput, timeliness and detection rates. Systematic evaluation of health
news aberrations is necessary to push forward our understanding of the complex
relationship between news report volumes and case numbers and to select the
best performing features and algorithms
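The measures named in this abstract (sensitivity, specificity, PPV, NPV, F1 and mean alerts per 100 days) can be illustrated with a minimal sketch. It assumes that alerts and the gold standard are each represented as one boolean per day; the function name `evaluate_alerts` and this daily representation are illustrative assumptions, not the authors' evaluation code, and confidence intervals are not computed here.

```python
def evaluate_alerts(alerts, gold):
    """Compare daily alert flags against daily gold-standard flags.

    alerts, gold: equal-length sequences of truthy/falsy values, one per day
    (an assumed representation, chosen only for illustration).
    """
    if len(alerts) != len(gold):
        raise ValueError("alerts and gold must cover the same days")

    tp = sum(1 for a, g in zip(alerts, gold) if a and g)
    fp = sum(1 for a, g in zip(alerts, gold) if a and not g)
    fn = sum(1 for a, g in zip(alerts, gold) if not a and g)
    tn = sum(1 for a, g in zip(alerts, gold) if not a and not g)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as NaN.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "ppv": ratio(tp, tp + fp),
        "npv": ratio(tn, tn + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
        "alerts_per_100_days": ratio(100 * (tp + fp), len(alerts)),
    }


if __name__ == "__main__":
    # Made-up ten-day example, not data from the study.
    alerts = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
    gold = [1, 0, 0, 0, 1, 1, 0, 0, 0, 0]
    for name, value in evaluate_alerts(alerts, gold).items():
        print(f"{name}: {value:.3f}")
```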
Towards cross-lingual alerting for bursty epidemic events
Background: Online news reports are increasingly becoming a source for event
based early warning systems that detect natural disasters. Harnessing the
massive volume of information available from multilingual newswire presents as
many challenges as opportunities due to the patterns of reporting complex
spatiotemporal events. Results: In this article we study the problem of
utilising correlated event reports across languages. We track the evolution of
16 disease outbreaks using 5 temporal aberration detection algorithms on
text-mined events classified according to disease and outbreak country. Using
ProMED reports as a silver standard, comparative analysis of news data for 13
languages over a 129 day trial period showed improved sensitivity, F1 and
timeliness across most models using cross-lingual events. We report a detailed
case study analysis for Cholera in Angola 2010 which highlights the challenges
faced in correlating news events with the silver standard. Conclusions: The
results show that automated health surveillance using multilingual text mining
has the potential to turn low value news into high value alerts if informed
choices are used to govern the selection of models and data sources. An
implementation of the C2 alerting algorithm using multilingual news is
available at the BioCaster portal http://born.nii.ac.jp/?page=globalroundup
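For readers unfamiliar with C2, the following is a minimal sketch of the commonly described EARS C2 rule: each day's count is compared with the mean and standard deviation of a 7-day baseline that ends 2 days earlier, and an alert is raised when the standardised difference exceeds 3. The parameter names, the `min_sd` floor for flat baselines and the function name `c2_alerts` are assumptions made for illustration; this is not the BioCaster implementation.

```python
from statistics import mean, stdev


def c2_alerts(counts, baseline=7, guard=2, threshold=3.0, min_sd=0.2):
    """Return one boolean per day: True where a C2-style alert fires.

    counts: daily event counts (e.g. news reports for one disease-country pair).
    baseline, guard, threshold: usual C2 defaults as commonly described.
    min_sd: assumed floor on the standard deviation to avoid division by zero.
    """
    alerts = []
    for t in range(len(counts)):
        start = t - guard - baseline
        if start < 0:
            # Not enough history to form a full baseline window.
            alerts.append(False)
            continue
        window = counts[start:t - guard]
        mu = mean(window)
        sigma = max(stdev(window), min_sd)
        score = (counts[t] - mu) / sigma
        alerts.append(score > threshold)
    return alerts


if __name__ == "__main__":
    # Made-up counts: a flat series followed by a sudden burst on the last day.
    daily_counts = [1, 1, 1, 1, 1, 1, 1, 1, 1, 10]
    print(c2_alerts(daily_counts))
```

The guard band keeps the first days of an emerging burst out of the baseline, so a gradual rise is less likely to mask itself.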
High-level feature detection from video in TRECVid: a 5-year retrospective of achievements
Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, or on matching one video frame against others using low-level characteristics like colour, texture, or shapes, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature, within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically however, this depends on being able to determine whether each feature is or is not present in a video clip.
The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity where dozens of research groups measure the effectiveness of their techniques on common data and using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement on where the state-of-the-art is regarding this important task, not just for one research group or for one approach, but across the spectrum. We then use this past and on-going work as a basis for highlighting the trends that are emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video