Learning the dependence structure of rare events: a non-asymptotic study

Authors: Clémençon Stéphan; Goix Nicolas; Sabourin Anne
Publication date: 23/05/2015

Assessing the probability of occurrence of extreme events is a crucial issue in various fields such as finance, insurance, telecommunications and environmental sciences. In a multivariate framework, tail dependence is characterized by the so-called stable tail dependence function (STDF). Learning this structure is the keystone of multivariate extremes. Although extensive studies have proved consistency and asymptotic normality for the empirical version of the STDF, non-asymptotic bounds are still missing. The main purpose of this paper is to fill this gap. Taking advantage of adapted VC-type concentration inequalities, upper bounds are derived with the expected rate of convergence O(k^{-1/2}). The concentration tools involved in this analysis rely on a more general study of maximal deviations in low-probability regions, and thus apply directly to the classification of extreme data.

arXiv.org e-Print Archive

On Anomaly Ranking and Excess-Mass Curves

Authors: Clémençon Stéphan; Goix Nicolas; Sabourin Anne
Publication date: 01/10/2014

Learning how to rank multivariate unlabeled observations according to their degree of abnormality/novelty is a crucial problem in a wide range of applications. In practice, it generally consists in building a real-valued "scoring" function on the feature space so as to quantify to what extent observations should be considered abnormal. In the 1-d situation, measurements are generally considered "abnormal" when they are remote from central measures such as the mean or the median. Anomaly detection then relies on tail analysis of the variable of interest. Extensions to the multivariate setting are far from straightforward, and it is precisely the main purpose of this paper to introduce a novel and convenient (functional) criterion for measuring the performance of a scoring function on the anomaly ranking task, referred to as the Excess-Mass curve (EM curve). In addition, an adaptive algorithm for building a scoring function with a nearly optimal EM curve from unlabeled data X_1, ..., X_n is proposed and analyzed from a statistical perspective.

arXiv.org e-Print Archive

The biHecke monoid of a finite Coxeter group

Authors: Hivert Florent; Schilling Anne; Thiéry Nicolas M.
Publication date: 11/12/2009

The usual combinatorial model for the 0-Hecke algebra of the symmetric group is to consider the algebra (or monoid) generated by the bubble sort operators. This construction generalizes to any finite Coxeter group W. The authors previously introduced the Hecke group algebra, constructed as the algebra generated simultaneously by the bubble sort and antisort operators, and described its representation theory. In this paper, we consider instead the monoid generated by these operators. We prove that it has |W| simple and projective modules. In order to construct a combinatorial model for the simple modules, we introduce for each w in W a combinatorial module whose support is the interval [1,w] in right weak order. This module yields an algebra, whose representation theory generalizes that of the Hecke group algebra. This involves the introduction of a w-analogue of the combinatorics of descents of W and a generalization to finite Coxeter groups of blocks of permutation matrices.
Comment: 12 pages, 1 figure, submitted to FPSAC'1
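The bubble sort model mentioned in this abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the function names `bubble_sort_operator` and `generated_monoid_size` are hypothetical, and the sketch only covers the symmetric group S_n and the bubble sort operators (the antisort operators and the biHecke monoid itself are not modelled). Each operator pi_i sorts the entries at positions i and i+1 of a permutation, and the monoid is obtained by closing the generators under composition.

```python
from itertools import permutations


def bubble_sort_operator(i):
    """Return pi_i, which sorts the entries at positions i and i+1 (0-based)."""
    def pi(w):
        if w[i] > w[i + 1]:
            w = w[:i] + (w[i + 1], w[i]) + w[i + 2:]
        return w
    return pi


def generated_monoid_size(n):
    """Size of the monoid of maps on S_n generated by pi_0, ..., pi_{n-2}."""
    points = list(permutations(range(1, n + 1)))
    index = {w: k for k, w in enumerate(points)}
    # Store each operator as the tuple of indices of its images.
    gens = [
        tuple(index[bubble_sort_operator(i)(w)] for w in points)
        for i in range(n - 1)
    ]
    identity = tuple(range(len(points)))
    seen = {identity}
    frontier = [identity]
    while frontier:
        f = frontier.pop()
        for g in gens:
            h = tuple(g[x] for x in f)  # apply f first, then g
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return len(seen)


print(generated_monoid_size(3))  # 6
print(generated_monoid_size(4))  # 24
```

The sizes 6 and 24 match n!, the number of elements of the 0-Hecke monoid of S_n. The brute-force closure is only practical for very small n, since each element is stored as a map on all n! permutations.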

arXiv.org e-Print Archive

HAL - Normandie Université

Spectral gap for random-to-random shuffling on linear extensions

Authors: Ayyer Arvind; Schilling Anne; Thiéry Nicolas M.
Publication venue: 'Informa UK Limited'
Publication date: 13/10/2015

In this paper, we propose a new Markov chain which generalizes random-to-random shuffling on permutations to random-to-random shuffling on linear extensions of a finite poset of size n. We conjecture that the second largest eigenvalue of the transition matrix is bounded above by (1+1/n)(1-2/n), with equality when the poset is disconnected. This Markov chain provides a way to sample the linear extensions of the poset with a relaxation time bounded above by n^2/(n+2) and a mixing time of O(n^2 \log n). We conjecture that the mixing time is in fact O(n \log n), as for the usual random-to-random shuffling.
Comment: 16 pages, 10 figures; v2: typos fixed plus extra information in figures; v3: added explicit conjecture 2.2 + Section 3.6 on the diameter of the Markov Chain as evidence + misc minor improvements; v4: fixed bibliograph
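The eigenvalue bound and the relaxation-time bound quoted in this abstract are consistent with each other. The short derivation below is a sketch under one assumption that the abstract does not state: the relaxation time is taken to be the reciprocal of the spectral gap 1 - \lambda_2, where \lambda_2 (a symbol introduced here) denotes the second largest eigenvalue.

```latex
\[
\lambda_2 \le \Bigl(1+\frac{1}{n}\Bigr)\Bigl(1-\frac{2}{n}\Bigr)
          = 1 - \frac{1}{n} - \frac{2}{n^2}
\quad\Longrightarrow\quad
1-\lambda_2 \ge \frac{1}{n}+\frac{2}{n^2} = \frac{n+2}{n^2}
\quad\Longrightarrow\quad
t_{\mathrm{rel}} = \frac{1}{1-\lambda_2} \le \frac{n^2}{n+2}.
\]
```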

arXiv.org e-Print Archive

HAL-CentraleSupelec

Open Access Repository of IISc Research Publications

eScholarship - University of California

HAL-Rennes 1

The importance of biomass net uptake for a trace metal budget in a forest stand in north-eastern France

Authors: Gandois Laure; Nicolas Manuel; Probst Anne; VanderHeijden Gregory
Publication venue: 'Elsevier BV'
Publication date: 01/11/2010

The trace metal (TM: Cd, Cu, Ni, Pb and Zn) budget (stocks and annual fluxes) was evaluated in a forest stand (silver fir, Abies alba Miller) in north-eastern France. Trace metal concentrations were measured in different tree compartments in order to assess TM partitioning and dynamics in the trees. Inputs included bulk deposition, estimated dry deposition and weathering. Outputs were leaching and biomass exportation. Atmospheric deposition was the main input flux. The estimated dry deposition accounted for about 40% of the total trace metal deposition. The relative importance of leaching (estimated by a lumped parameter water balance model, BILJOU) and net biomass uptake (harvesting) for ecosystem exportation depended on the element. Trace metal distribution between tree compartments (stem wood and bark, branches and needles) indicated that Pb was mainly stored in the stem, whereas Zn and Ni, and to a lesser extent Cd and Cu, were translocated to aerial parts of the trees and cycled in the ecosystem. For Zn and Ni, leaching was the main output flux (>95% of the total output) and the plot budget (input–output) was negative, whereas for Pb the biomass net exportation represented 60% of the outputs and the budget was balanced. Cadmium and Cu had intermediate behaviours, with 18% and 30% of the total output relative to biomass exportation, respectively, and the budgets were negative. The net uptake by biomass was particularly important for Pb budgets, less so for Cd and Cu and not very important for Zn and Ni in such forest stands.

Crossref

Open Archive Toulouse Archive Ouverte

HAL-INSU

HAL-IRD

On the representation theory of finite J-trivial monoids

Authors: Denton Tom; Hivert Florent; Schilling Anne; Thiéry Nicolas M.
Publication date: 17/10/2010

In 1979, Norton showed that the representation theory of the 0-Hecke algebra admits a rich combinatorial description. Her constructions rely heavily on some triangularity property of the product, but do not use explicitly that the 0-Hecke algebra is a monoid algebra. The thesis of this paper is that considering the general setting of monoids admitting such a triangularity, namely J-trivial monoids, sheds further light on the topic. This is a step to use representation theory to automatically extract combinatorial structures from (monoid) algebras, often in the form of posets and lattices, both from a theoretical and computational point of view, and with an implementation in Sage. Motivated by ongoing work on related monoids associated to Coxeter systems, and building on well-known results in the semi-group community (such as the description of the simple modules or the radical), we describe how most of the data associated to the representation theory (Cartan matrix, quiver) of the algebra of any J-trivial monoid M can be expressed combinatorially by counting appropriate elements in M itself. As a consequence, this data does not depend on the ground field and can be calculated in O(n^2), if not O(nm), where n=|M| and m is the number of generators. Along the way, we construct a triangular decomposition of the identity into orthogonal idempotents, using the usual Möbius inversion formula in the semi-simple quotient (a lattice), followed by an algorithmic lifting step. Applying our results to the 0-Hecke algebra (in all finite types), we recover previously known results and additionally provide an explicit labeling of the edges of the quiver. We further explore special classes of J-trivial monoids, and in particular monoids of order preserving regressive functions on a poset, generalizing known results on the monoids of nondecreasing parking functions.
Comment: 41 pages; 4 figures; added Section 3.7.4 in version 2; incorporated comments by referee in version

arXiv.org e-Print Archive

HAL - Normandie Université

CiteSeerX

eScholarship - University of California