6,221 research outputs found
Reverse Nearest Neighbor Heat Maps: A Tool for Influence Exploration
We study the problem of constructing a reverse nearest neighbor (RNN) heat
map by finding the RNN set of every point in a two-dimensional space. Based on
the RNN set of a point, we obtain a quantitative influence (i.e., heat) for the
point. The heat map provides a global view on the influence distribution in the
space, and hence supports exploratory analyses in many applications such as
marketing and resource management. To construct such a heat map, we first
reduce it to a problem called Region Coloring (RC), which divides the space
into disjoint regions within which all the points have the same RNN set. We
then propose a novel algorithm named CREST that efficiently solves the RC
problem by labeling each region with the heat value of its containing points.
In CREST, we propose innovative techniques to avoid processing expensive RNN
queries and greatly reduce the number of region labeling operations. We perform
detailed analyses on the complexity of CREST and lower bounds of the RC
problem, and prove that CREST is asymptotically optimal in the worst case.
Extensive experiments with both real and synthetic data sets demonstrate that
CREST outperforms alternative algorithms by several orders of magnitude.Comment: Accepted to appear in ICDE 201
Computationally efficient induction of classification rules with the PMCRI and J-PMCRI frameworks
In order to gain knowledge from large databases, scalable data mining technologies are needed. Data are captured on a large scale and thus databases are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classiļ¬cation rule induction, parallelisation of classiļ¬cation rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classiļ¬cation rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classiļ¬cation rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach.are increasing at a fast pace. This leads to the utilisation of parallel computing technologies in order to cope with large amounts of data. In the area of classiļ¬cation rule induction, parallelisation of classiļ¬cation rules has focused on the divide and conquer approach, also known as the Top Down Induction of Decision Trees (TDIDT). An alternative approach to classiļ¬cation rule induction is separate and conquer which has only recently been in the focus of parallelisation. This work introduces and evaluates empirically a framework for the parallel induction of classiļ¬cation rules, generated by members of the Prism family of algorithms. All members of the Prism family of algorithms follow the separate and conquer approach
Computational determination of (3,11) and (4,7) cages
A (k,g)-graph is a k-regular graph of girth g, and a (k,g)-cage is a
(k,g)-graph of minimum order. We show that a (3,11)-graph of order 112 found by
Balaban in 1973 is minimal and unique. We also show that the order of a
(4,7)-cage is 67 and find one example. Finally, we improve the lower bounds on
the orders of (3,13)-cages and (3,14)-cages to 202 and 260, respectively. The
methods used were a combination of heuristic hill-climbing and an innovative
backtrack search
- ā¦