1,260 research outputs found
Towards a semantic and statistical selection of association rules
The increasing growth of databases raises an urgent need for more accurate
methods to better understand the stored data. In this scope, association rules
were extensively used for the analysis and the comprehension of huge amounts of
data. However, the number of generated rules is too large to be efficiently
analyzed and explored in any further process. Association rules selection is a
classical topic to address this issue, yet, new innovated approaches are
required in order to provide help to decision makers. Hence, many interesting-
ness measures have been defined to statistically evaluate and filter the
association rules. However, these measures present two major problems. On the
one hand, they do not allow eliminating irrelevant rules, on the other hand,
their abun- dance leads to the heterogeneity of the evaluation results which
leads to confusion in decision making. In this paper, we propose a two-winged
approach to select statistically in- teresting and semantically incomparable
rules. Our statis- tical selection helps discovering interesting association
rules without favoring or excluding any measure. The semantic comparability
helps to decide if the considered association rules are semantically related
i.e comparable. The outcomes of our experiments on real datasets show promising
results in terms of reduction in the number of rules
Skyline: Interactive In-Editor Computational Performance Profiling for Deep Neural Network Training
Training a state-of-the-art deep neural network (DNN) is a
computationally-expensive and time-consuming process, which incentivizes deep
learning developers to debug their DNNs for computational performance. However,
effectively performing this debugging requires intimate knowledge about the
underlying software and hardware systems---something that the typical deep
learning developer may not have. To help bridge this gap, we present Skyline: a
new interactive tool for DNN training that supports in-editor computational
performance profiling, visualization, and debugging. Skyline's key contribution
is that it leverages special computational properties of DNN training to
provide (i) interactive performance predictions and visualizations, and (ii)
directly manipulatable visualizations that, when dragged, mutate the batch size
in the code. As an in-editor tool, Skyline allows users to leverage these
diagnostic features to debug the performance of their DNNs during development.
An exploratory qualitative user study of Skyline produced promising results;
all the participants found Skyline to be useful and easy to use.Comment: 14 pages, 5 figures. Appears in the proceedings of UIST'2
Efficient Computation of Group Skyline Queries on MapReduce
Skyline query is one of the important issues indatabase research and has been applied in diverse applicationsincluding multi-criteria decision support systems and so on. Theresponse of a skyline query eliminates unnecessary tuples andreturns only the user-interested result. Traditional skyline querypicks out the outstanding tuples, based on one-to-one recordcomparisons. Some modern applications request, beyond thesingular ones, for superior combinations of records. For example,fantasy basketball is composed of 5 players, fantasy baseball of 9players, and a hackathon of several programmers. Group skylineaims at considering all the groups comprising several records,and finding out the non-dominated ones. Because of the highcomplexity, few studies have been conducted and none has beenpresented in either distributed or parallel computing. This paperis the first study that solves the group skyline in the distributedMapReduce framework. We propose the MRGS algorithm togenerate all the combinations, compute the winners at each localnode, and find out the answer globally. We further propose theMRIGS algorithm to release the bottleneck of MRGS onunbalanced computing load of nodes. Finally, we propose theMRIGS-P algorithm to prune the impossible combinations andproduce indexed and balanced MapReduce computation.Extensive experiments with NBA datasets show that MRIGS-P is6 times faster than the MRGS algorithm
- …