Search CORE

239,393 research outputs found

Finding Top-k Dominance on Incomplete Big Data Using Map-Reduce Framework

Author: Ezatpoor Payam
Publication venue: Digital Scholarship@UNLV
Publication date: 01/05/2017

Incomplete data is a major kind of multi-dimensional dataset in which missing nodes are randomly distributed across its dimensions. Retrieving information from such a dataset becomes very difficult as it grows, and finding its top-k dominant values is especially challenging. Existing algorithms that speed up this process are mostly efficient only on small incomplete datasets. One algorithm that makes top-k dominance (TKD) queries possible is the Bitmap Index Guided (BIG) algorithm. It strongly improves performance on incomplete data, but it was not designed to find top-k dominant values in incomplete big data and cannot do so as originally proposed. Several other algorithms have been proposed to answer TKD queries, such as the Skyband Based and Upper Bound Based algorithms, but their performance is also questionable. These earlier algorithms were among the first attempts to apply TKD queries to incomplete data; however, all of them either performed weakly or were not compatible with incomplete data. This thesis proposes the MapReduced Enhanced Bitmap Index Guided Algorithm (MRBIG) to address these issues. MRBIG uses the MapReduce framework to improve the performance of top-k dominance queries on huge incomplete datasets. The framework divides the work among multiple computing nodes that run independently and simultaneously to find the result. This method has achieved up to two times faster processing in finding the TKD query result compared with previously presented algorithms.
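To make the query concrete, here is a minimal Python sketch of a top-k dominance query over incomplete data, split into a map step and a reduce step. It is not MRBIG and builds no bitmap index. It assumes a common definition from the TKD literature (an object dominates another if it is no worse on every dimension both have observed and strictly better on at least one, with smaller values being better), and it simulates the "computing nodes" with plain function calls. All function names are hypothetical.

```python
import heapq
from typing import List, Optional, Sequence, Tuple

Point = Sequence[Optional[float]]  # None marks a missing value


def dominates(a: Point, b: Point) -> bool:
    """True if a dominates b on the dimensions both observe (smaller is better)."""
    strictly_better = False
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # skip dimensions that are missing in either object
        if x > y:
            return False
        if x < y:
            strictly_better = True
    return strictly_better


def map_scores(chunk: List[int], data: List[Point]) -> List[Tuple[int, int]]:
    """Map step: for each index in the chunk, count how many objects it dominates."""
    return [
        (sum(dominates(data[i], data[j]) for j in range(len(data)) if j != i), i)
        for i in chunk
    ]


def reduce_top_k(partials: List[List[Tuple[int, int]]], k: int) -> List[Tuple[int, int]]:
    """Reduce step: merge per-chunk scores and keep the k highest (ties: lower index)."""
    merged = [pair for part in partials for pair in part]
    return heapq.nlargest(k, merged, key=lambda p: (p[0], -p[1]))


def top_k_dominating(data: List[Point], k: int, n_workers: int = 2) -> List[Tuple[int, int]]:
    """Return (score, index) pairs for the k objects that dominate the most others."""
    chunks = [list(range(w, len(data), n_workers)) for w in range(n_workers)]
    partials = [map_scores(chunk, data) for chunk in chunks]  # would run in parallel
    return reduce_top_k(partials, k)
```

Each chunk is scored against the full dataset, so the map calls are independent of one another and could be handed to separate workers; only the small list of (score, index) pairs has to be merged at the end.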

University of Nevada, Las Vegas Repository

Solving DCOPs with Distributed Large Neighborhood Search

Author: Aspenberg Per
Blomgran Parmis
Braga Silva Jefferson
Dietrich Franciele
Faccin Bampi Vinicius
Hammerman Malin
Tätting Love
Publication date: 01/01/2017

The field of Distributed Constraint Optimization has gained momentum in recent years, thanks to its ability to address various applications related to multi-agent cooperation. Nevertheless, solving Distributed Constraint Optimization Problems (DCOPs) optimally is NP-hard. Therefore, in large-scale, complex applications, incomplete DCOP algorithms are necessary. Current incomplete DCOP algorithms suffer from one or more of the following limitations: they (a) find local minima without providing quality guarantees; (b) provide loose quality assessment; or (c) are unable to benefit from the structure of the problem, such as domain-dependent knowledge and hard constraints. Therefore, capitalizing on strategies from the centralized constraint solving community, we propose a Distributed Large Neighborhood Search (D-LNS) framework to solve DCOPs. The proposed framework (with its novel repair phase) provides guarantees on solution quality, refining upper and lower bounds during the iterative process, and can exploit domain-dependent structures. Our experimental results show that D-LNS outperforms other incomplete DCOP algorithms on both structured and unstructured problem instances.
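The destroy-and-repair loop that large neighborhood search is built on can be sketched in a few lines of Python. This is a centralized, single-process illustration, not D-LNS: it has no agents, no message passing, and no bound refinement. The problem encoding (binary cost functions over named variables), the exhaustive repair step, and every function name are assumptions made for the example.

```python
import itertools
import random
from typing import Callable, Dict, List, Tuple

Assignment = Dict[str, int]
Constraint = Tuple[str, str, Callable[[int, int], float]]  # (var_u, var_v, cost function)


def total_cost(assignment: Assignment, constraints: List[Constraint]) -> float:
    """Sum the cost of every binary constraint under the given assignment."""
    return sum(f(assignment[u], assignment[v]) for u, v, f in constraints)


def repair(assignment: Assignment, freed: List[str],
           domains: Dict[str, List[int]], constraints: List[Constraint]) -> Assignment:
    """Repair phase: exhaustively re-assign the freed variables, keeping the rest fixed."""
    best = dict(assignment)
    best_cost = total_cost(best, constraints)
    for values in itertools.product(*(domains[v] for v in freed)):
        candidate = dict(assignment)
        candidate.update(zip(freed, values))
        cost = total_cost(candidate, constraints)
        if cost < best_cost:  # accept only strict improvements
            best, best_cost = candidate, cost
    return best


def lns(domains: Dict[str, List[int]], constraints: List[Constraint],
        iterations: int = 200, destroy_size: int = 2, seed: int = 0) -> Tuple[Assignment, float]:
    """Alternate destroy (free a random subset) and repair for a fixed number of rounds."""
    rng = random.Random(seed)
    variables = sorted(domains)
    current = {v: domains[v][0] for v in variables}  # arbitrary starting assignment
    for _ in range(iterations):
        freed = rng.sample(variables, destroy_size)  # destroy phase
        current = repair(current, freed, domains, constraints)
    return current, total_cost(current, constraints)
```

Because the repair step only accepts strict improvements, the cost never increases from one round to the next; the size of the destroyed neighborhood controls the trade-off between the cost of each repair and the ability to escape local minima.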

arXiv.org e-Print Archive

Publikationer från Linköpings universitet

Crossref

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Efficient Algorithms for Bayesian Network Parameter Learning from Incomplete Data

Author: Broeck Guy Van den
Choi Arthur
Mohan Karthika
Pearl Judea
Publication date: 25/11/2014

We propose an efficient family of algorithms to learn the parameters of a Bayesian network from incomplete data. In contrast to textbook approaches such as EM and the gradient method, our approach is non-iterative, yields closed-form parameter estimates, and eliminates the need for inference in a Bayesian network. Our approach provides consistent parameter estimates for missing data problems that are MCAR, MAR, and in some cases, MNAR. Empirically, our approach is orders of magnitude faster than EM (as our approach requires no inference). Given sufficient data, we learn parameters that can be orders of magnitude more accurate.
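As a simple illustration of what a non-iterative, closed-form estimate looks like, the Python sketch below estimates one conditional probability table by counting only the rows in which the child and all of its parents are observed. This is plain available-case counting, which is consistent under MCAR; it is not the family of algorithms proposed in the paper, and the row format and the function name `estimate_cpt` are assumptions for the example.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[int]]  # variable name -> value, None if missing


def estimate_cpt(rows: List[Row], child: str,
                 parents: List[str]) -> Dict[Tuple[Tuple[int, ...], int], float]:
    """Estimate P(child | parents) from rows where child and parents are all observed."""
    joint: Counter = Counter()
    parent_totals: Counter = Counter()
    for row in rows:
        values = [row.get(v) for v in (*parents, child)]
        if any(v is None for v in values):
            continue  # skip rows with a missing value among the relevant variables
        parent_values = tuple(values[:-1])
        joint[(parent_values, values[-1])] += 1
        parent_totals[parent_values] += 1
    # One pass over the data, then a ratio of counts: no iteration, no inference.
    return {key: n / parent_totals[key[0]] for key, n in joint.items()}
```

Each returned key is a pair `(parent_values, child_value)`. Rows that are missing an unrelated variable still contribute, which is what distinguishes this from discarding every incomplete row.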

arXiv.org e-Print Archive

CiteSeerX

New Algorithms for $M$-Estimation of Multivariate Scatter and Location

Author: Duembgen Lutz
Nordhausen Klaus
Schuhmacher Heike
Publication venue: Elsevier BV
Publication date: 03/12/2015

We present new algorithms for $M$-estimators of multivariate scatter and location and for symmetrized $M$-estimators of multivariate scatter. The new algorithms are considerably faster than currently used fixed-point and related algorithms. The main idea is to utilize a second order Taylor expansion of the target functional and to devise a partial Newton-Raphson procedure. In connection with symmetrized $M$-estimators we work with incomplete $U$-statistics to accelerate our procedures initially.
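For context, the fixed-point baseline that the abstract compares against is usually written as follows for an $M$-estimator of scatter with a known center. The weight function $u$, the sample $x_1, \dots, x_n \in \mathbb{R}^p$, and the iteration index $k$ are notation assumed here rather than taken from the paper, and this is the textbook iteration, not the partial Newton-Raphson procedure the authors propose.

```latex
\Sigma_{k+1} \;=\; \frac{1}{n} \sum_{i=1}^{n}
    u\!\left( x_i^{\top} \Sigma_k^{-1} x_i \right) x_i x_i^{\top},
\qquad k = 0, 1, 2, \dots
```

The scatter estimate is a fixed point of this map, that is, a matrix $\Sigma$ with $\Sigma = \frac{1}{n} \sum_{i=1}^{n} u\left( x_i^{\top} \Sigma^{-1} x_i \right) x_i x_i^{\top}$.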

arXiv.org e-Print Archive

Crossref

Bern Open Repository and Information System (BORIS)

Link-Prediction Enhanced Consensus Clustering for Complex Networks

Author: Adar Eytan
Burgess Matthew
Cafarella Michael
Publication venue: Public Library of Science (PLoS)
Publication date: 04/06/2015

Many real networks that are inferred or collected from data are incomplete due to missing edges. Missing edges can be inherent to the dataset (Facebook friend links will never be complete) or the result of sampling (one may only have access to a portion of the data). The consequence is that downstream analyses that consume the network will often yield less accurate results than if the edges were complete. Community detection algorithms, in particular, often suffer when critical intra-community edges are missing. We propose a novel consensus clustering algorithm to enhance community detection on incomplete networks. Our framework utilizes existing community detection algorithms that process networks imputed by our link-prediction-based algorithm. The framework then merges their multiple outputs into a final consensus output. On average, our method boosts the performance of existing algorithms by 7% on artificial data and 17% on ego networks collected from Facebook.
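The merging step can be illustrated with a generic co-association consensus, sketched below in Python. It is not the authors' algorithm and does no link prediction: it assumes each input partition maps every node to a community label, links two nodes when they share a community in more than a chosen fraction of the partitions, and returns the connected groups. The function name and the `threshold` parameter are hypothetical.

```python
from itertools import combinations
from typing import Dict, List

Partition = Dict[str, int]  # node -> community label in one clustering run


def consensus_communities(partitions: List[Partition],
                          threshold: float = 0.5) -> List[List[str]]:
    """Group nodes that are co-clustered in more than `threshold` of the partitions."""
    nodes = sorted(partitions[0])
    parent = {n: n for n in nodes}  # union-find forest over the nodes

    def find(n: str) -> str:
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for u, v in combinations(nodes, 2):
        together = sum(p[u] == p[v] for p in partitions)
        if together / len(partitions) > threshold:
            parent[find(u)] = find(v)  # merge the two groups

    groups: Dict[str, List[str]] = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())
```

Only agreement between partitions matters here, so the label values themselves need not match across runs: community 0 in one partition and community 1 in another can describe the same group.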

arXiv.org e-Print Archive

Directory of Open Access Journals

Module extraction via query inseparability in OWL 2 QL

Author: Konev B.
Kontchakov Roman
Ludwig M.
Schneider T.
Wolter F.
Zakharyaschev Michael
Publication venue: CEUR Workshop Proceedings
Publication date: 01/01/2011

We show that deciding conjunctive query inseparability for OWL 2 QL ontologies is PSpace-hard and in ExpTime. We give polynomial-time (incomplete) algorithms and demonstrate by experiments that they can be used for practical module extraction.

Birkbeck Institutional Research Online