4,205 research outputs found
Efficient Iterative Processing in the SciDB Parallel Array Engine
Many scientific data-intensive applications perform iterative computations on
array data. There exist multiple engines specialized for array processing.
These engines efficiently support various types of operations, but none
includes native support for iterative processing. In this paper, we develop a
model for iterative array computations and a series of optimizations. We
evaluate the benefits of an optimized, native support for iterative array
processing on the SciDB engine and real workloads from the astronomy domain
ClusPath: a temporal-driven clustering to infer typical evolution paths
© 2015, The Author(s). We propose ClusPath, a novel algorithm for detecting general evolution tendencies in a population of entities. We show how abstract notions, such as the Swedish socio-economical model (in a political dataset) or the companies fiscal optimization (in an economical dataset) can be inferred from low-level descriptive features. Such high-level regularities in the evolution of entities are detected by combining spatial and temporal features into a spatio-temporal dissimilarity measure and using semi-supervised clustering techniques. The relations between the evolution phases are modeled using a graph structure, inferred simultaneously with the partition, by using a “slow changing world” assumption. The idea is to ensure a smooth passage for entities along their evolution paths, which catches the long-term trends in the dataset. Additionally, we also provide a method, based on an evolutionary algorithm, to tune the parameters of ClusPath to new, unseen datasets. This method assesses the fitness of a solution using four opposed quality measures and proposes a balanced compromise
Recommended from our members
On Optimal and Fair Service Allocation in Mobile Cloud Computing
This paper studies the optimal and fair service allocation for a variety of
mobile applications (single or group and collaborative mobile applications) in
mobile cloud computing. We exploit the observation that using tiered clouds,
i.e. clouds at multiple levels (local and public) can increase the performance
and scalability of mobile applications. We proposed a novel framework to model
mobile applications as a location-time workflows (LTW) of tasks; here users
mobility patterns are translated to mobile service usage patterns. We show that
an optimal mapping of LTWs to tiered cloud resources considering multiple QoS
goals such application delay, device power consumption and user cost/price is
an NP-hard problem for both single and group-based applications. We propose an
efficient heuristic algorithm called MuSIC that is able to perform well (73% of
optimal, 30% better than simple strategies), and scale well to a large number
of users while ensuring high mobile application QoS. We evaluate MuSIC and the
2-tier mobile cloud approach via implementation (on real world clouds) and
extensive simulations using rich mobile applications like intensive signal
processing, video streaming and multimedia file sharing applications. Our
experimental and simulation results indicate that MuSIC supports scalable
operation (100+ concurrent users executing complex workflows) while improving
QoS. We observe about 25% lower delays and power (under fixed price
constraints) and about 35% decrease in price (considering fixed delay) in
comparison to only using the public cloud. Our studies also show that MuSIC
performs quite well under different mobility patterns, e.g. random waypoint and
Manhattan models
A Constrained Randomization Approach to Interactive Visual Data Exploration with Subjective Feedback
Data visualization and iterative/interactive data mining are growing rapidly in attention, both in research as well as in industry. However, while there are a plethora of advanced data mining methods and lots of works in the field of visualization, integrated methods that combine advanced visualization and/or interaction with data mining techniques in a principled way are rare. We present a framework based on constrained randomization which lets users explore high-dimensional data via 'subjectively informative' two-dimensional data visualizations. The user is presented with 'interesting' projections, allowing users to express their observations using visual interactions that update a background model representing the user's belief state. This background model is then considered by a projection-finding algorithm employing data randomization to compute a new 'interesting' projection. By providing users with information that contrasts with the background model, we maximize the chance that the user encounters striking new information present in the data. This process can be iterated until the user runs out of time or until the difference between the randomized and the real data is insignificant. We present two case studies, one controlled study on synthetic data and another on census data, using the proof-of-concept tool SIDE that demonstrates the presented framework.Peer reviewe
- …