Towards a semantic and statistical selection of association rules
The increasing growth of databases raises an urgent need for more accurate
methods to better understand the stored data. In this scope, association rules
were extensively used for the analysis and the comprehension of huge amounts of
data. However, the number of generated rules is too large to be efficiently
analyzed and explored in any further process. Association rules selection is a
classical topic to address this issue, yet new, innovative approaches are
required to help decision makers. Hence, many interestingness
measures have been defined to statistically evaluate and filter the
association rules. However, these measures present two major problems. On the
one hand, they do not allow eliminating irrelevant rules; on the other hand,
their abundance leads to heterogeneous evaluation results, which
causes confusion in decision making. In this paper, we propose a two-winged
approach to select statistically interesting and semantically incomparable
rules. Our statistical selection helps discover interesting association
rules without favoring or excluding any measure. The semantic comparability
helps to decide whether the considered association rules are semantically
related, i.e., comparable. The outcomes of our experiments on real datasets show
promising results in terms of reduction in the number of rules.
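To make the idea of statistically evaluating a rule concrete, here is a minimal sketch that computes three widely used interestingness measures (support, confidence, and lift) for one rule. The choice of these three measures, the function name `rule_measures`, and the toy transactions are assumptions made for illustration; the abstract does not say which measures the paper uses.

```python
def rule_measures(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent.

    `transactions` is a list of sets of items (hypothetical toy data).
    """
    n = len(transactions)
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n_a = sum(1 for t in transactions if a <= t)         # contain the antecedent
    n_c = sum(1 for t in transactions if c <= t)         # contain the consequent
    n_ac = sum(1 for t in transactions if (a | c) <= t)  # contain both sides
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

support, confidence, lift = rule_measures(transactions, {"bread"}, {"milk"})
print(support, confidence, lift)
```

Different measures can rank the same rules differently, which is the heterogeneity problem the abstract describes.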
RRR: Rank-Regret Representative
Selecting the best items in a dataset is a common task in data exploration.
However, the concept of "best" lies in the eyes of the beholder: different
users may consider different attributes more important, and hence arrive at
different rankings. Nevertheless, one can remove "dominated" items and create a
"representative" subset of the data set, comprising the "best items" in it. A
Pareto-optimal representative is guaranteed to contain the best item of each
possible ranking, but it can be almost as big as the full data. A smaller representative
can be found if we relax the requirement to include the best item for every
possible user, and instead just limit the users' "regret". Existing work
defines regret as the loss in score by limiting consideration to the
representative instead of the full data set, for any chosen ranking function.
However, the score is often not a meaningful number and users may not
understand its absolute value. Sometimes small ranges in score can include
large fractions of the data set. In contrast, users do understand the notion of
rank ordering. Therefore, we instead consider the position of the items
in the ranked list when defining the regret, and propose the rank-regret
representative as the minimal subset of the data containing at least one of
the top-k items of any possible ranking function. This problem is NP-complete. We
use the geometric interpretation of items to bound their ranks on ranges of
functions and to utilize combinatorial geometry notions for developing
effective and efficient approximation algorithms for the problem. Experiments
on real datasets demonstrate that we can efficiently find small subsets with
small rank-regrets.
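The rank-based notion of regret can be sketched as follows. This is not the paper's algorithm: it only evaluates a given subset, it assumes linear ranking functions, and it checks a finite list of weight vectors rather than every possible function, so the value it returns is a lower bound on the true rank-regret. The function name `rank_regret` and the toy data are hypothetical.

```python
def rank_regret(items, subset_idx, weight_vectors):
    """Worst rank, over the given linear ranking functions, of the
    best-placed subset item (rank 1 is the top of the list)."""
    worst = 0
    for w in weight_vectors:
        # Score every item under the linear function defined by w.
        scores = [sum(wi * xi for wi, xi in zip(w, item)) for item in items]
        best_in_subset = max(scores[i] for i in subset_idx)
        # Rank = 1 + number of items scoring strictly higher.
        rank = 1 + sum(1 for s in scores if s > best_in_subset)
        worst = max(worst, rank)
    return worst


items = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.2, 0.2)]
weights = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.5)]

print(rank_regret(items, [2], weights))        # single compromise item
print(rank_regret(items, [0, 1, 2], weights))  # larger subset
```

A subset whose rank-regret is at most k contains at least one top-k item for every function checked, which matches the definition in the abstract restricted to the sampled functions.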
A search for disk-galaxy lenses in the Sloan Digital Sky Survey
We present the first automated spectroscopic search for disk-galaxy lenses,
using the Sloan Digital Sky Survey database. We follow up eight gravitational
lens candidates, selected among a sample of ~40000 candidate massive disk
galaxies, using a combination of ground-based imaging and long-slit
spectroscopy. We confirm two gravitational lens systems: one probable disk
galaxy, and one probable S0 galaxy. The remaining systems are four promising
disk-galaxy lens candidates, as well as two probable gravitational lenses whose
lens galaxy might be an S0 galaxy. The redshifts of the lenses are z_lens ~
0.1. The redshift range of the background sources is z_source ~ 0.3 - 0.7. The
systems presented here are (confirmed or candidate) galaxy-galaxy lensing
systems, that is, systems where the multiple images are faint and extended,
allowing an accurate determination of the lens galaxy mass and light
distributions without contamination from the background galaxy. Moreover, the
low redshift of the (confirmed or candidate) lens galaxies is favorable for
measuring rotation points to complement the lensing study. We estimate the
rest-frame total mass-to-light ratio within the Einstein radius for the two
confirmed lenses: we find M_tot/L_I = 5.4 +- 1.5 within 3.9 +- 0.9 kpc for SDSS
J081230.30+543650.9, and M_tot/L_I = 1.5 +- 0.9 within 1.4 +- 0.8 kpc for SDSS
J145543.55+530441.2 (all in solar units). Hubble Space Telescope or Adaptive
Optics imaging is needed to further study the systems.
Comment: ApJ, accepte