Multi-dimensional data indexing and range query processing via Voronoi diagram for internet of things

Author: Abbasi Qammer H.
Choo Kim-Kwang Raymond
Gu Zonghua
Wan Shaohua
Wang Tian
Zhao Yu
Publication venue: 'Elsevier BV'
Publication date: 01/02/2019

In a typical Internet of Things (IoT) deployment such as smart cities and Industry 4.0, the amount of sensory data collected from the physical world is significant and wide-ranging. Processing large amounts of real-time data from diverse IoT devices is challenging. For example, in an IoT environment, wireless sensor networks (WSNs) are typically used to monitor and collect data in some geographic area. Spatial range queries with location constraints to facilitate data indexing are traditionally employed in such applications, which allows the data to be queried and managed based on an SQL structure. One particular challenge is to minimize communication cost and storage requirements in multi-dimensional data indexing approaches. In this paper, we present an energy- and time-efficient multidimensional data indexing scheme, which is designed to answer range queries. Specifically, we propose data indexing methods which utilize hierarchical indexing structures, using binary space partitioning (BSP), such as kd-tree, quad-tree, k-means clustering, and Voronoi-based methods, to provide more efficient routing with less latency. Simulation results demonstrate that the Voronoi Diagram-based algorithm minimizes the average energy consumption and query response time.
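As a rough illustration of the kind of hierarchical index the abstract mentions, the following is a minimal sketch of a 2D kd-tree answering a rectangular range query. It is not the paper's scheme: the names `Node`, `build_kdtree`, and `range_query` are hypothetical, the sensor coordinates are made up, and the Voronoi-based variant and any energy or routing model are left out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Point = Tuple[float, float]


@dataclass
class Node:
    point: Point                      # sensor location stored at this node
    axis: int                         # 0 = split on x, 1 = split on y
    left: Optional["Node"] = None     # points with smaller-or-equal coordinate
    right: Optional["Node"] = None    # points with larger-or-equal coordinate


def build_kdtree(points: List[Point], depth: int = 0) -> Optional[Node]:
    """Build a kd-tree by splitting on the median, alternating axes."""
    if not points:
        return None
    axis = depth % 2
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return Node(
        point=pts[mid],
        axis=axis,
        left=build_kdtree(pts[:mid], depth + 1),
        right=build_kdtree(pts[mid + 1:], depth + 1),
    )


def range_query(node: Optional[Node], lo: Point, hi: Point) -> List[Point]:
    """Return all points p with lo[d] <= p[d] <= hi[d] in both dimensions."""
    if node is None:
        return []
    found: List[Point] = []
    if all(lo[d] <= node.point[d] <= hi[d] for d in range(2)):
        found.append(node.point)
    split = node.point[node.axis]
    # Descend only into subtrees that can overlap the query rectangle.
    if lo[node.axis] <= split:
        found += range_query(node.left, lo, hi)
    if split <= hi[node.axis]:
        found += range_query(node.right, lo, hi)
    return found


# Hypothetical sensor positions, for illustration only.
sensors = [(1, 1), (2, 5), (4, 3), (6, 7), (8, 2), (9, 9)]
tree = build_kdtree(sensors)
hits = sorted(range_query(tree, lo=(2, 2), hi=(8, 7)))
```

The pruning test on the split coordinate is what saves work: subtrees lying entirely outside the query rectangle are never visited, which in a sensor network would correspond to nodes that never need to be contacted.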

Enlighten

Dynamic load balancing in parallel KD-tree k-means

Author: Di Fatta Giuseppe
Pettinger David
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/06/2010

One of the most influential and popular data mining methods is the k-Means algorithm for cluster analysis. Techniques for improving the efficiency of k-Means have been largely explored in two main directions. The amount of computation can be significantly reduced by adopting geometrical constraints and an efficient data structure, notably a multidimensional binary search tree (KD-Tree). These techniques reduce the number of distance computations the algorithm performs at each iteration. A second direction is parallel processing, where data and computation loads are distributed over many processing nodes. However, little work has been done to provide a parallel formulation of the efficient sequential techniques based on KD-Trees. Such approaches are expected to have an irregular distribution of computation load and can suffer from load imbalance. This issue has so far limited the adoption of these efficient k-Means variants in parallel computing environments. In this work, we provide a parallel formulation of the KD-Tree based k-Means algorithm for distributed memory systems and address its load balancing issue. Three solutions have been developed and tested. Two approaches are based on a static partitioning of the data set and a third solution incorporates a dynamic load balancing policy.
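To make the static-partitioning idea concrete, here is a minimal sketch of one k-Means iteration in which the data set is split into fixed chunks, each chunk computes per-cluster partial sums, and the partial sums are then combined. It is an assumption-laden illustration, not the authors' algorithm: the chunks are processed in a plain loop standing in for separate workers, the KD-Tree pruning and the dynamic load balancing policy are omitted, and the names `nearest`, `partial_sums`, and `kmeans_step` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def nearest(p: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to p (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda j: (p[0] - centroids[j][0]) ** 2 + (p[1] - centroids[j][1]) ** 2,
    )


def partial_sums(chunk: Sequence[Point], centroids: Sequence[Point]) -> List[List[float]]:
    """Per-cluster [sum_x, sum_y, count] for one statically assigned chunk."""
    sums = [[0.0, 0.0, 0] for _ in centroids]
    for p in chunk:
        j = nearest(p, centroids)
        sums[j][0] += p[0]
        sums[j][1] += p[1]
        sums[j][2] += 1
    return sums


def kmeans_step(chunks: Sequence[Sequence[Point]], centroids: Sequence[Point]) -> List[Point]:
    """One iteration: local partial sums per chunk, then a global reduction."""
    total = [[0.0, 0.0, 0] for _ in centroids]
    for chunk in chunks:  # each chunk stands in for one processing node
        for j, (sx, sy, n) in enumerate(partial_sums(chunk, centroids)):
            total[j][0] += sx
            total[j][1] += sy
            total[j][2] += n
    # Keep the old centroid if a cluster received no points.
    return [
        (t[0] / t[2], t[1] / t[2]) if t[2] else tuple(centroids[j])
        for j, t in enumerate(total)
    ]


# Made-up data: two chunks, two initial centroids.
chunks = [[(0, 0), (0, 2)], [(10, 10), (10, 12)]]
new_centroids = kmeans_step(chunks, [(1, 1), (9, 9)])
```

With a plain scan every chunk costs the same per point, so a static split balances well. Once a KD-Tree prunes distance computations, the work per chunk depends on the data, which is the imbalance the abstract describes.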

FRIOD: a deeply integrated feature-rich interactive system for effective and efficient outlier detection

Author: Chang Liang
Fournier-Viger Philippe
Li Hongzhou
Lin Jerry Chun-Wei
Zhang Ji
Zhu Xiaodong
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 08/11/2017

In this paper, we propose a novel interactive outlier detection system called feature-rich interactive outlier detection (FRIOD), which features a deep integration of human interaction to improve detection performance and greatly streamline the detection process. A user-friendly interactive mechanism is developed to allow easy and intuitive user interaction in all the major stages of the underlying outlier detection algorithm, which include dense cell selection, location-aware distance thresholding, and final top outlier validation. By doing so, we can mitigate the major difficulty that competing outlier detection methods face in specifying key parameter values, such as the density and distance thresholds. An innovative optimization approach is also proposed to optimize the grid-based space partitioning, which is a critical step of FRIOD. Such optimization fully considers the high-quality outliers it detects with the aid of human interaction. The experimental evaluation demonstrates that FRIOD can improve the quality of the detected outliers and make the detection process more intuitive, effective, and efficient.
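The stages named in the abstract (dense cell selection, distance thresholding, ranking of top outliers) can be illustrated with a generic grid-based sketch. This is not FRIOD itself: the function `grid_outliers` and its parameters `cell_size`, `density_threshold`, and `distance_threshold` are hypothetical, the thresholds are fixed numbers here whereas the abstract describes a user choosing them interactively, and the distance test uses a single global threshold, not a location-aware one.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def grid_outliers(
    points: Sequence[Point],
    cell_size: float,
    density_threshold: int,
    distance_threshold: float,
) -> List[Point]:
    """Rank points in sparse cells that lie far from every dense cell."""

    def cell_of(p: Point) -> Tuple[int, int]:
        return (math.floor(p[0] / cell_size), math.floor(p[1] / cell_size))

    # Stage 1: count points per grid cell and select the dense cells.
    counts = Counter(cell_of(p) for p in points)
    dense = [c for c, n in counts.items() if n >= density_threshold]
    centers = [((cx + 0.5) * cell_size, (cy + 0.5) * cell_size) for cx, cy in dense]

    # Stage 2: for points outside dense cells, measure the distance to the
    # nearest dense-cell center and keep those beyond the threshold.
    scored = []
    for p in points:
        if counts[cell_of(p)] >= density_threshold:
            continue
        d = min((math.dist(p, c) for c in centers), default=math.inf)
        if d > distance_threshold:
            scored.append((d, p))

    # Stage 3: return candidates, farthest first, for final validation.
    return [p for _, p in sorted(scored, reverse=True)]


# Made-up data: a small cluster near the origin plus two stray points.
data = [(0.1, 0.1), (0.2, 0.3), (0.4, 0.2), (0.3, 0.4), (5.0, 5.0), (0.9, 1.1)]
candidates = grid_outliers(data, cell_size=1.0, density_threshold=3, distance_threshold=2.0)
```

Changing either threshold changes which points are reported, which is the parameter sensitivity the abstract says interactive selection is meant to ease.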

The Simulation Model Partitioning Problem: an Adaptive Solution Based on Self-Clustering (Extended Version)

Author: D'Angelo Gabriele
Publication venue: 'Elsevier BV'
Publication date: 04/11/2016

This paper is about partitioning in parallel and distributed simulation, that is, decomposing the simulation model into a number of components and properly allocating them to the execution units. An adaptive solution based on self-clustering, which considers both communication reduction and computational load-balancing, is proposed. The implementation of the proposed mechanism is tested using a simulation model that is challenging both in terms of structure and dynamicity. Various configurations of the simulation model and the execution environment have been considered. The obtained performance results are analyzed using a reference cost model. The results demonstrate that the proposed approach is promising and that it can reduce the simulation execution time in both parallel and distributed architectures.

arXiv.org e-Print Archive

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

ACCAMS: Additive Co-Clustering to Approximate Matrices Succinctly

Author: Ahmed Amr
Beutel Alex
Smola Alexander J.
Publication date: 31/12/2014

Matrix completion and approximation are popular tools to capture a user's preferences for recommendation and to approximate missing data. Instead of using low-rank factorization we take a drastically different approach, based on the simple insight that an additive model of co-clusterings allows one to approximate matrices efficiently. This allows us to build a concise model that, per bit of model learned, significantly beats all factorization approaches to matrix approximation. Even more surprisingly, we find that summing over small co-clusterings is more effective in modeling matrices than classic co-clustering, which uses just one large partitioning of the matrix. Occam's razor suggests that the simple structure induced by our model better captures the latent preferences and decision making processes present in the real world than classic co-clustering or matrix factorization. We provide an iterative minimization algorithm, a collapsed Gibbs sampler, theoretical guarantees for matrix approximation, and excellent empirical evidence for the efficacy of our approach. We achieve state-of-the-art results on the Netflix problem with a fraction of the model complexity.

Comment: 22 pages, under review for conference publication
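The additive idea can be sketched as follows: each co-clustering assigns rows and columns to groups and stores one value per (row group, column group) block, and the approximation of an entry is the sum of the block values it falls into across all co-clusterings. The sketch below assumes a fully observed matrix and row and column assignments that are given in advance, and it fits block values in a single greedy pass over the residual. The paper's iterative minimization and collapsed Gibbs sampler, which learn the assignments, are not reproduced, and the names `fit_stencil`, `fit_additive`, and `approximate` are hypothetical.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Stencil = Tuple[Sequence[int], Sequence[int], Matrix]  # (row groups, col groups, block values)


def fit_stencil(residual: Matrix, rows: Sequence[int], cols: Sequence[int]) -> Matrix:
    """Block means of the residual for fixed row and column assignments."""
    k_rows, k_cols = max(rows) + 1, max(cols) + 1
    sums = [[0.0] * k_cols for _ in range(k_rows)]
    counts = [[0] * k_cols for _ in range(k_rows)]
    for i, row in enumerate(residual):
        for j, v in enumerate(row):
            sums[rows[i]][cols[j]] += v
            counts[rows[i]][cols[j]] += 1
    return [
        [sums[a][b] / counts[a][b] if counts[a][b] else 0.0 for b in range(k_cols)]
        for a in range(k_rows)
    ]


def fit_additive(matrix: Matrix, assignments: Sequence[Tuple[Sequence[int], Sequence[int]]]) -> List[Stencil]:
    """Greedily fit one co-clustering after another on the remaining residual."""
    residual = [list(map(float, r)) for r in matrix]
    stencils: List[Stencil] = []
    for rows, cols in assignments:
        means = fit_stencil(residual, rows, cols)
        stencils.append((rows, cols, means))
        for i in range(len(residual)):
            for j in range(len(residual[i])):
                residual[i][j] -= means[rows[i]][cols[j]]
    return stencils


def approximate(stencils: Sequence[Stencil], i: int, j: int) -> float:
    """Approximated entry (i, j): sum of the matching block value per stencil."""
    return sum(means[rows[i]][cols[j]] for rows, cols, means in stencils)


def squared_error(matrix: Matrix, stencils: Sequence[Stencil]) -> float:
    return sum(
        (matrix[i][j] - approximate(stencils, i, j)) ** 2
        for i in range(len(matrix))
        for j in range(len(matrix[i]))
    )


# Made-up 2x2 example: a column grouping followed by a row grouping.
m = [[1, 2], [3, 5]]
one = fit_additive(m, [([0, 0], [0, 1])])
two = fit_additive(m, [([0, 0], [0, 1]), ([0, 1], [0, 0])])
```

In this toy case the second co-clustering absorbs most of what the first one leaves behind, while each stencil stores only two numbers.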

arXiv.org e-Print Archive

CiteSeerX