Scatteract: Automated extraction of data from scatter plots
Charts are an excellent way to convey patterns and trends in data, but they
do not facilitate further modeling of the data or close inspection of
individual data points. We present a fully automated system for extracting the
numerical values of data points from images of scatter plots. We use deep
learning techniques to identify the key components of the chart, and optical
character recognition together with robust regression to map from pixels to the
coordinate system of the chart. We focus on scatter plots with linear scales,
which already have several interesting challenges. Previous work has done fully
automatic extraction for other types of charts, but to our knowledge this is
the first approach that is fully automatic for scatter plots. Our method
performs well, achieving successful data extraction on 89% of the plots in our
test set. Comment: Submitted to ECML PKDD 2017 proceedings, 16 pages
Database support of detector operation and data analysis in the DEAP-3600 Dark Matter experiment
The DEAP-3600 detector searches for dark matter interactions on a 3.3 tonne
liquid argon target. Over nearly a decade, from start of detector construction
through the end of the data analysis phase, well over 200 scientists will have
contributed to the project. The DEAP-3600 detector will amass in excess of 900
TB of data representing more than 10 particle interactions, a few of
which could be from dark matter. At the same time, metadata exceeding 80 GB
will be generated. This metadata is crucial for organizing and interpreting the
dark matter search data and contains both structured and unstructured
information.
The scale of the data collected, the important role of metadata in
interpreting it, the number of people involved, and the long lifetime of the
project necessitate an industrialized approach to metadata management.
We describe how the CouchDB and the PostgreSQL database systems were
integrated into the DEAP detector operation and analysis workflows. This
integration provides unified, distributed access to both structured
(PostgreSQL) and unstructured (CouchDB) metadata at runtime of the data
analysis software. It also supports operational and reporting requirements.
Reliability measurement during software development
During the development of data base software for a multi-sensor tracking system, reliability was measured. The failure ratio and failure rate were found to be consistent measures. Trend lines were established from these measurements that provided good visualization of the progress on the job as a whole as well as on individual modules. Over one-half of the observed failures were due to factors associated with the individual run submission rather than with the code proper. Possible application of these findings for line management, project managers, functional management, and regulatory agencies is discussed. Steps for simplifying the measurement process and for use of these data in predicting operational software reliability are outlined.
Search Tracker: Human-derived object tracking in-the-wild through large-scale search and retrieval
Humans use context and scene knowledge to easily localize moving objects in
conditions of complex illumination changes, scene clutter and occlusions. In
this paper, we present a method to leverage human knowledge in the form of
annotated video libraries in a novel search and retrieval based setting to
track objects in unseen video sequences. For every video sequence, a document
that represents motion information is generated. Documents of the unseen video
are queried against the library at multiple scales to find videos with similar
motion characteristics. This provides us with coarse localization of objects in
the unseen video. We further adapt these retrieved object locations to the new
video using an efficient warping scheme. The proposed method is validated on
in-the-wild video surveillance datasets where we outperform state-of-the-art
appearance-based trackers. We also introduce a new challenging dataset with
complex object appearance changes. Comment: Under review with the IEEE Transactions on Circuits and Systems for
Video Technology
Automated annotation of multimedia audio data with affective labels for information management
The emergence of digital multimedia systems is creating many new opportunities for rapid access to huge content archives. In order to fully exploit these information sources, the content must be annotated with significant features. An important aspect of human interpretation of multimedia data, which is often overlooked, is the affective dimension. Such information is a potentially useful component for content-based classification and retrieval. Much of the affective information of multimedia content is contained within the audio data stream. Emotional
features can be defined in terms of arousal and valence levels. In this study, low-level audio features are extracted to calculate arousal and valence levels of
multimedia audio streams. These are then mapped onto a set of keywords with predetermined emotional interpretations. Experimental results illustrate the use of this system to assign affective annotation to multimedia data.