2,844 research outputs found

Towards a semantic and statistical selection of association rules

Author: Bouker Slim
Nguifo Engelbert Mephu
Saidi Rabie
Yahia Sadok Ben
Publication date: 01/01/2013

The increasing growth of databases raises an urgent need for more accurate methods to better understand the stored data. In this scope, association rules have been used extensively to analyze and comprehend huge amounts of data. However, the number of generated rules is too large to be efficiently analyzed and explored in any further process. Association rule selection is a classical approach to this issue, yet new, innovative approaches are still required to help decision makers. Hence, many interestingness measures have been defined to statistically evaluate and filter association rules. However, these measures present two major problems. On the one hand, they do not allow irrelevant rules to be eliminated; on the other hand, their abundance leads to heterogeneous evaluation results, which causes confusion in decision making. In this paper, we propose a two-winged approach to select statistically interesting and semantically incomparable rules. Our statistical selection helps discover interesting association rules without favoring or excluding any measure. The semantic comparability helps decide whether the considered association rules are semantically related, i.e., comparable. The outcomes of our experiments on real datasets show promising results in terms of reduction in the number of rules.
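For readers unfamiliar with the interestingness measures mentioned in the abstract above, here is a minimal sketch of three widely used ones (support, confidence, and lift) computed for a single rule. It is not the selection method proposed in the paper; the function name `rule_measures` and the toy transactions are assumptions made for illustration only.

```python
from typing import Dict, Iterable, List, Set


def rule_measures(
    transactions: List[Set[str]],
    antecedent: Iterable[str],
    consequent: Iterable[str],
) -> Dict[str, float]:
    """Compute support, confidence, and lift for the rule antecedent -> consequent.

    Hypothetical helper for illustration; not taken from the paper.
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)

    # Count transactions containing the antecedent, the consequent, and both.
    n_a = sum(1 for t in transactions if a <= t)
    n_c = sum(1 for t in transactions if c <= t)
    n_ac = sum(1 for t in transactions if (a | c) <= t)

    support = n_ac / n if n else 0.0
    confidence = n_ac / n_a if n_a else 0.0
    # Lift compares the rule's confidence with the consequent's base rate.
    lift = confidence / (n_c / n) if n_c else 0.0
    return {"support": support, "confidence": confidence, "lift": lift}


# Toy data (assumed): four market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
measures = rule_measures(transactions, {"bread"}, {"milk"})
print(measures)
```

Different measures can rank the same rules differently, which is the heterogeneity the abstract refers to.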

arXiv.org e-Print Archive

HAL Clermont Université

RRR: Rank-Regret Representative

Author: Asudeh Abolfazl
Das Gautam
Jagadish H. V.
Nazi Azade
Zhang Nan
Publication date: 01/01/2018

Selecting the best items in a dataset is a common task in data exploration. However, the concept of "best" lies in the eyes of the beholder: different users may consider different attributes more important, and hence arrive at different rankings. Nevertheless, one can remove "dominated" items and create a "representative" subset of the data set, comprising the "best items" in it. A Pareto-optimal representative is guaranteed to contain the best item of each possible ranking, but it can be almost as big as the full data. A smaller representative can be found if we relax the requirement to include the best item for every possible user, and instead just limit the users' "regret". Existing work defines regret as the loss in score by limiting consideration to the representative instead of the full data set, for any chosen ranking function. However, the score is often not a meaningful number and users may not understand its absolute value. Sometimes small ranges in score can include large fractions of the data set. In contrast, users do understand the notion of rank ordering. Therefore, alternatively, we consider the position of the items in the ranked list for defining the regret and propose the rank-regret representative as the minimal subset of the data containing at least one of the top-k of any possible ranking function. This problem is NP-complete. We use the geometric interpretation of items to bound their ranks on ranges of functions and to utilize combinatorial geometry notions for developing effective and efficient approximation algorithms for the problem. Experiments on real datasets demonstrate that we can efficiently find small subsets with small rank-regrets.
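To make the rank-regret notion in the abstract above concrete, the following sketch estimates the rank-regret of a candidate subset for two-attribute items. It assumes linear ranking functions with non-negative weights and only samples a finite number of them, so it gives a lower bound on the true rank-regret; it is not the approximation algorithm of the paper. The function name `rank_regret` and the sample data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def rank_regret(
    items: Sequence[Tuple[float, float]],
    subset_idx: Sequence[int],
    num_functions: int = 91,
) -> int:
    """Estimate the rank-regret of a subset over sampled linear ranking functions.

    For each sampled weight vector w, the subset's regret is the best (lowest)
    rank achieved by any of its items in the full ranking. The result is the
    worst such rank over all sampled functions.
    """
    worst = 0
    for j in range(num_functions):
        # Sweep weight directions across the non-negative quadrant.
        theta = (math.pi / 2) * j / (num_functions - 1)
        w = (math.cos(theta), math.sin(theta))
        scores = [w[0] * x + w[1] * y for x, y in items]
        best_in_subset = max(scores[i] for i in subset_idx)
        # Rank = 1 + number of items scoring strictly higher.
        rank = 1 + sum(1 for s in scores if s > best_in_subset)
        worst = max(worst, rank)
    return worst


# Toy data (assumed): four items with two attributes each.
items = [(1.0, 0.0), (0.0, 1.0), (0.6, 0.6), (0.2, 0.2)]
print(rank_regret(items, [2]))        # the single "balanced" item
print(rank_regret(items, [0, 1, 2]))  # a larger subset
```

Under these assumptions, a subset whose estimated rank-regret is at most k contains at least one top-k item for every sampled ranking function.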

arXiv.org e-Print Archive

University of Illinois at Chicago: UIC INDIGO (INtellectual property in DIGital form available online in an Open environment)

A search for disk-galaxy lenses in the Sloan Digital Sky Survey

Author: Adelman-McCarthy
Bell
Belokurov
Bolton
Castander
Chloé Féron
Covone
de Jong
Dobler
Dutton
Fassnacht
Faure
Ghosh
Inada
Jaunsen
Jens Hjorth
Johan Samsing
John P. McKean
Kassin
Kayo
Keeton
Koopmans
Limousin
Maller
Mandelbaum
Marshall
Navarro
Newton
Oguri
Peng
Percival
Salucci
Strateva
Sérsic
Trott
Tully
Winn
York
Publication venue: IOP Publishing
Publication date: 01/01/2009

We present the first automated spectroscopic search for disk-galaxy lenses, using the Sloan Digital Sky Survey database. We follow up eight gravitational lens candidates, selected among a sample of ~40000 candidate massive disk galaxies, using a combination of ground-based imaging and long-slit spectroscopy. We confirm two gravitational lens systems: one probable disk galaxy, and one probable S0 galaxy. The remaining systems are four promising disk-galaxy lens candidates, as well as two probable gravitational lenses whose lens galaxy might be an S0 galaxy. The redshifts of the lenses are z_lens ~ 0.1. The redshift range of the background sources is z_source ~ 0.3 - 0.7. The systems presented here are (confirmed or candidate) galaxy-galaxy lensing systems, that is, systems where the multiple images are faint and extended, allowing an accurate determination of the lens galaxy mass and light distributions without contamination from the background galaxy. Moreover, the low redshift of the (confirmed or candidate) lens galaxies is favorable for measuring rotation points to complement the lensing study. We estimate the rest-frame total mass-to-light ratio within the Einstein radius for the two confirmed lenses: we find M_tot/L_I = 5.4 ± 1.5 within 3.9 ± 0.9 kpc for SDSS J081230.30+543650.9, and M_tot/L_I = 1.5 ± 0.9 within 1.4 ± 0.8 kpc for SDSS J145543.55+530441.2 (all in solar units). Hubble Space Telescope or Adaptive Optics imaging is needed to further study the systems.

arXiv.org e-Print Archive

Crossref

Copenhagen University Research Information System