Physical Representation-based Predicate Optimization for a Visual Analytics Database
Querying the content of images, video, and other non-textual data sources
requires expensive content extraction methods. Modern extraction techniques are
based on deep convolutional neural networks (CNNs) and can classify objects
within images with astounding accuracy. Unfortunately, these methods are slow:
processing a single image can take about 10 milliseconds on modern GPU-based
hardware. As massive video libraries become ubiquitous, running a content-based
query over millions of video frames is prohibitive.
One promising approach to reduce the runtime cost of queries of visual
content is to use a hierarchical model, such as a cascade, where simple cases
are handled by an inexpensive classifier. Prior work has sought to design
cascades that optimize the computational cost of inference by, for example,
using smaller CNNs. However, we observe that there are critical factors besides
the inference time that dramatically impact the overall query time. Notably, by
treating the physical representation of the input image as part of our query
optimization (that is, by including image transforms, such as resolution
scaling or color-depth reduction, within the cascade), we can optimize data
handling costs and enable drastically more efficient classifier cascades.
In this paper, we propose Tahoma, which generates and evaluates many
potential classifier cascades that jointly optimize the CNN architecture and
input data representation. Our experiments on a subset of ImageNet show that
Tahoma's input transformations speed up cascades by up to 35 times. We also
find up to a 98x speedup over the ResNet50 classifier with no loss in accuracy,
and a 280x speedup if some accuracy is sacrificed.
Comment: Camera-ready version of the paper submitted to ICDE 2019. In
Proceedings of the 35th IEEE International Conference on Data Engineering
(ICDE 2019).
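The cascade idea in the abstract above, in which a cheap classifier runs on a reduced input representation and only uncertain cases reach the expensive model, can be illustrated with a minimal Python sketch. This is not Tahoma's implementation or API. The names `downscale`, `mean_brightness`, and `run_cascade`, the score thresholds, and the use of mean brightness as a stand-in for a classifier are all assumptions made for illustration.

```python
from typing import Callable, List, Tuple

# Hypothetical image type: a grayscale image as rows of intensities in [0, 1].
Image = List[List[float]]

# One cascade stage: (input transform, scoring model, low threshold, high threshold).
Stage = Tuple[Callable[[Image], Image], Callable[[Image], float], float, float]


def downscale(image: Image, factor: int) -> Image:
    """Reduce resolution by averaging non-overlapping factor x factor blocks."""
    height = len(image) // factor
    width = len(image[0]) // factor
    out: Image = []
    for r in range(height):
        row = []
        for c in range(width):
            block = [
                image[r * factor + i][c * factor + j]
                for i in range(factor)
                for j in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


def mean_brightness(image: Image) -> float:
    """Toy stand-in for a classifier score (assumption, not a real CNN)."""
    pixels = [p for row in image for p in row]
    return sum(pixels) / len(pixels)


def run_cascade(
    image: Image,
    stages: List[Stage],
    final_model: Callable[[Image], float],
) -> Tuple[bool, int]:
    """Return (label, index of the stage that decided).

    Each stage scores a transformed copy of the image. A score at or below
    `low` is a confident negative, at or above `high` a confident positive.
    Anything in between is passed on; the final model sees the full input.
    """
    for index, (transform, model, low, high) in enumerate(stages):
        score = model(transform(image))
        if score <= low:
            return False, index
        if score >= high:
            return True, index
    return final_model(image) >= 0.5, len(stages)


# A single cheap stage that looks at a 2x-downscaled image.
stages: List[Stage] = [
    (lambda img: downscale(img, 2), mean_brightness, 0.2, 0.8),
]
```

In this sketch the transform is part of the stage definition, so a search over cascades could vary the input representation together with the model, which is the joint optimization the abstract describes.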
Application of multiobjective genetic programming to the design of robot failure recognition systems
We present an evolutionary approach using multiobjective genetic programming (MOGP) to derive optimal feature extraction preprocessing stages for robot failure detection. This data-driven machine learning method is compared both with conventional (nonevolutionary) classifiers and a set of domain-dependent feature extraction methods. We conclude MOGP is an effective and practical design method for failure recognition systems with enhanced recognition accuracy over conventional classifiers, independent of domain knowledge.
Engineering Crowdsourced Stream Processing Systems
A crowdsourced stream processing system (CSP) is a system that incorporates
crowdsourced tasks in the processing of a data stream. This can be seen as
enabling crowdsourcing work to be applied on a sample of large-scale data at
high speed, or equivalently, enabling stream processing to employ human
intelligence. It also leads to a substantial expansion of the capabilities of
data processing systems. Engineering a CSP system requires the combination of
human and machine computation elements. From a general systems theory
perspective, this means taking into account inherited as well as emerging
properties from both these elements. In this paper, we position CSP systems
within a broader taxonomy, outline a series of design principles and evaluation
metrics, present an extensible framework for their design, and describe several
design patterns. We showcase the capabilities of CSP systems by performing a
case study that applies our proposed framework to the design and analysis of a
real system (AIDR) that classifies social media messages during time-critical
crisis events. Results show that compared to a pure stream processing system,
AIDR can achieve a higher data classification accuracy, while compared to a
pure crowdsourcing solution, the system makes better use of human workers by
requiring much less manual work effort.
Visual Analysis of Spatio-Temporal Event Predictions: Investigating the Spread Dynamics of Invasive Species
Invasive species are a major cause of ecological damage and commercial
losses. A current problem spreading in North America and Europe is the vinegar
fly Drosophila suzukii. Unlike other Drosophila, it infests non-rotting and
healthy fruits and is therefore of concern to fruit growers, such as vintners.
Consequently, large amounts of data about infestations have been collected in
recent years. However, there is a lack of interactive methods to investigate
this data. We employ ensemble-based classification to predict areas susceptible
to infestation by D. suzukii and bring them into a spatio-temporal context
using maps and glyph-based visualizations. Following the information-seeking
mantra, we provide a visual analysis system Drosophigator for spatio-temporal
event prediction, enabling the investigation of the spread dynamics of invasive
species. We demonstrate the usefulness of this approach in two use cases
- …