61 research outputs found

Join sizes, urn models and normal limiting distributions

Author: Gardy Danièle
Publication venue: Published by Elsevier B.V.
Publication date: 12/09/1994

We study some parameters of relational databases (sizes of relations obtained by a join) that can be described by generating functions in three variables, of the form $\phi(x, y, z)^d$. We model these parameters by suitable urn models and give conditions under which they asymptotically follow a Gaussian distribution.
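As a rough illustration of the urn-model viewpoint, the sketch below simulates the size of an equijoin under a deliberately simplified model that is an assumption here, not the model of the paper: each tuple of two hypothetical relations R and S draws its join-attribute value uniformly from d possible values (balls thrown into d urns), and the join size is the sum over urns of the product of the two counts. The names `join_size` and `sample_join_sizes` are illustrative only.

```python
import random
from collections import Counter


def join_size(n_r, n_s, d, rng):
    """Size of the equijoin of R and S under the simplified urn model.

    Assumption: each of the n_r tuples of R and n_s tuples of S picks its
    join-attribute value uniformly at random among d values.
    """
    r_counts = Counter(rng.randrange(d) for _ in range(n_r))
    s_counts = Counter(rng.randrange(d) for _ in range(n_s))
    # Each pair of tuples sharing a value contributes one joined tuple.
    return sum(count * s_counts[value] for value, count in r_counts.items())


def sample_join_sizes(n_r, n_s, d, trials, seed=0):
    """Draw `trials` independent join sizes with a fixed seed."""
    rng = random.Random(seed)
    return [join_size(n_r, n_s, d, rng) for _ in range(trials)]


if __name__ == "__main__":
    sizes = sample_join_sizes(n_r=200, n_s=200, d=50, trials=300)
    mean = sum(sizes) / len(sizes)
    print("empirical mean:", mean, "expected n_r * n_s / d:", 200 * 200 / 50)
```

Under this simplified model the expected join size is n_r * n_s / d; a histogram of the sampled sizes gives an informal picture of the kind of limiting behaviour the abstract refers to.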

Elsevier - Publisher Connector

B-urns

Authors: Chauvin Brigitte, Gardy Danièle, Pouyanne Nicolas, Ton-That Dai-Hai
Publication date: 22/07/2015

The fringe of a B-tree with parameter $m$ is considered as a particular Pólya urn with $m$ colors. More precisely, the asymptotic behaviour of this fringe, when the number of stored keys tends to infinity, is studied through the composition vector of the fringe nodes. We establish its typical behaviour together with the fluctuations around it. The well-known phase transition in Pólya urns has the following effect on B-trees: for $m \leq 59$, the fluctuations are asymptotically Gaussian, whereas for $m \geq 60$, the composition vector is oscillating; after scaling, the fluctuations of such an urn strongly converge to a random variable $W$. This limit is $\mathbb{C}$-valued and it does not seem to follow any classical law. Several properties of $W$ are shown: existence of exponential moments, characterization of its distribution as the solution of a smoothing equation, existence of a density with respect to the Lebesgue measure on $\mathbb{C}$, and support of $W$. Moreover, a few representations of the composition vector for various values of $m$ illustrate the different kinds of convergence.
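To make the urn mechanism concrete, here is a minimal sketch of a generic balanced two-color Pólya urn. It is not the B-tree fringe urn of the paper, whose replacement rule is not given in this abstract. The classical criterion for the phase transition compares the ratio of the second eigenvalue of the replacement matrix to the largest one with 1/2; for a balanced 2x2 matrix [[a, b], [c, d]] with common row sum S, the eigenvalues are S and a - c. The function names and the example matrices are assumptions chosen for illustration.

```python
import random


def simulate_urn(replacement, start, steps, seed=0):
    """Run a Polya urn: draw a ball with probability proportional to the
    current counts, then add replacement[color][j] balls of each color j.

    Assumption: all entries of `replacement` are non-negative integers.
    """
    rng = random.Random(seed)
    counts = list(start)
    for _ in range(steps):
        x = rng.randrange(sum(counts))
        # Locate the color of the drawn ball by walking cumulative counts.
        color, acc = 0, counts[0]
        while x >= acc:
            color += 1
            acc += counts[color]
        for j, delta in enumerate(replacement[color]):
            counts[j] += delta
    return counts


def eigenvalue_ratio_2x2(replacement):
    """Ratio (second eigenvalue) / (largest eigenvalue) for a balanced
    2x2 replacement matrix with common row sum S: the ratio is (a - c) / S."""
    (a, b), (c, d) = replacement
    balance = a + b
    if c + d != balance:
        raise ValueError("replacement matrix is not balanced")
    return (a - c) / balance


if __name__ == "__main__":
    for matrix in ([[1, 2], [2, 1]], [[3, 1], [1, 3]], [[1, 0], [0, 1]]):
        sigma = eigenvalue_ratio_2x2(matrix)
        regime = "Gaussian regime" if sigma <= 0.5 else "large-urn regime"
        print(matrix, "ratio:", sigma, regime, simulate_urn(matrix, [1, 1], 1000))
```

A ratio of at most 1/2 corresponds to Gaussian fluctuations and a ratio above 1/2 to the non-Gaussian regime; the thresholds $m \leq 59$ and $m \geq 60$ quoted above are the paper's result for the B-tree urn and are not derived by this sketch.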

arXiv.org e-Print Archive

HAL UVSQ

A Computational Model for Logical Analysis of Data

Authors: Gardy Danièle, Lardeux Frédéric, Saubion Frédéric
Publication date: 12/07/2022

Initially introduced by Peter Hammer, Logical Analysis of Data (LAD) is a methodology that aims at computing a logical justification for dividing a group of data into two groups of observations, usually called the positive and negative groups. Consider this partition into positive and negative groups as the description of a partially defined Boolean function; the data is then processed to identify a subset of attributes whose values may be used to characterize the observations of the positive group against those of the negative group. LAD constitutes an interesting rule-based learning alternative to classic statistical learning techniques and has many practical applications. Nevertheless, the computation of group characterization may be costly, depending on the properties of the data instances. A major aim of our work is to provide effective tools for speeding up the computations, by computing some a priori probability that a given set of attributes does characterize the positive and negative groups. To this effect, we propose several models for representing the data set of observations, according to the information we have on it. These models, and the probabilities they allow us to compute, are also helpful for quickly assessing some properties of the real data at hand; furthermore, they may help us to better analyze and understand the computational difficulties encountered by solving methods. Once our models have been established, the mathematical tools for computing probabilities come from Analytic Combinatorics. They allow us to express the desired probabilities as ratios of generating function coefficients, which then provide a quick computation of their numerical values. A further, long-range goal of this paper is to show that the methods of Analytic Combinatorics can help in analyzing the performance of various algorithms in LAD and related fields.
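The characterization step can be sketched as follows, under the assumption (not spelled out in the abstract) that a subset of attributes characterizes the groups when no positive observation and negative observation agree on all attributes of the subset. Observations are represented here as tuples of Boolean values, and the function names are hypothetical; the brute-force search is only meant to show why the computation can become costly.

```python
from itertools import combinations


def characterizes(positive, negative, attrs):
    """True if projecting onto `attrs` separates the two groups, i.e. no
    negative observation has the same projection as a positive one."""
    def project(row):
        return tuple(row[a] for a in attrs)

    positive_projections = {project(row) for row in positive}
    return all(project(row) not in positive_projections for row in negative)


def smallest_characterizing_subsets(positive, negative, n_attrs):
    """Brute-force search: all attribute subsets of minimum size that
    separate the groups (exponential in n_attrs in the worst case)."""
    for size in range(1, n_attrs + 1):
        found = [subset for subset in combinations(range(n_attrs), size)
                 if characterizes(positive, negative, subset)]
        if found:
            return found
    return []


if __name__ == "__main__":
    positive = [(1, 0, 1), (1, 1, 0)]
    negative = [(0, 0, 1), (0, 1, 1)]
    print(smallest_characterizing_subsets(positive, negative, 3))
```

In this toy data set the first attribute alone separates the groups, so the search stops at size one; an a priori probability of the kind described above could be used to decide which subset sizes are worth exploring before running such a search.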

arXiv.org e-Print Archive

Average cost of orthogonal range queries in multiattribute trees

Authors: Flajolet Philippe, Gardy Danièle, Puech Claude
Publication venue: HAL CCSD
Publication date: 01/01/1988

Abstract available in the attached files.

INRIA a CCSD electronic archive server

Hal-Diderot

Birthday paradox, coupon collectors, caching algorithms and self-organizing search

Authors: Flajolet Philippe, Gardy Danièle, Thimonier Loÿs
Publication venue: HAL CCSD
Publication date: 01/01/1987

Abstract available in the attached files.

INRIA a CCSD electronic archive server

Hal-Diderot

The permutation-path coloring problem on trees

Authors: Barth Dominique, Corteel Sylvie, Denise Alain, Gardy Danièle, Valencia-Pabon Mario
Publication venue: Elsevier Science B.V.
Publication date: 17/03/2003

In this paper we first show that the permutation-path coloring problem is NP-hard even for very restrictive instances: for involutions (permutations that contain only cycles of length at most two) on both binary trees and trees having only two vertices with degree greater than two, and for circular permutations (permutations that contain exactly one cycle) on trees with maximum degree greater than or equal to 4. We calculate a lower bound on the average complexity of the permutation-path coloring problem on arbitrary networks. Then we give combinatorial and asymptotic results for the permutation-path coloring problem on linear networks, showing that the average number of colors needed to color any permutation on a linear network on $n$ vertices is $n/4 + o(n)$. We extend these results and obtain an upper bound on the average complexity of the permutation-path coloring problem on arbitrary trees, with exact results in the case of generalized star trees. Finally, we explain how to extend these results to the involutions-path coloring problem on arbitrary trees.
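The $n/4 + o(n)$ estimate for linear networks can be checked empirically with a small sketch. It rests on two assumptions that are not stated in the abstract: vertex $i$ routes a directed path to vertex $\sigma(i)$ along the line, and on a line the number of colors needed equals the maximum number of paths crossing a single link in one direction (paths on a line behave like intervals). The function names are illustrative.

```python
import random


def colors_on_line(perm):
    """Maximum left-to-right load over the links of a line with vertices
    0..n-1, where vertex i sends a path to vertex perm[i].

    For a permutation the right-to-left load of each link is the same, so
    this value is taken as the number of colors (assumption: on a line the
    number of colors equals the maximum link load).
    """
    n = len(perm)
    best = 0
    for k in range(1, n):
        # Paths starting left of link (k-1, k) and ending at or right of k.
        load = sum(1 for i in range(k) if perm[i] >= k)
        best = max(best, load)
    return best


def average_colors(n, trials, seed=0):
    """Monte Carlo estimate of the average number of colors over uniform
    random permutations of size n."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        perm = list(range(n))
        rng.shuffle(perm)
        total += colors_on_line(perm)
    return total / trials


if __name__ == "__main__":
    n = 200
    print("estimated average:", average_colors(n, 50), "n/4 =", n / 4)
```

The estimate sits somewhat above n/4 for moderate n, which is compatible with an $o(n)$ correction term; the quadratic-time load computation is kept for readability and could be made incremental.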

Elsevier - Publisher Connector