Revisiting Numerical Pattern Mining with Formal Concept Analysis
In this paper, we investigate the problem of mining numerical data in the
framework of Formal Concept Analysis. The usual approach is to apply a scaling
procedure, transforming numerical attributes into binary ones, which leads to
a loss of either information or efficiency, in particular w.r.t. the volume of
extracted patterns. By contrast, we propose to work directly on numerical data
in a more precise and efficient way, and we prove it. For that, the notions of
closed patterns, generators and equivalence classes are revisited in the
numerical context. Moreover, two original algorithms are proposed and used in
an evaluation involving real-world data, showing the advantages of the
present approach.
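The core notions the abstract refers to can be illustrated with a minimal sketch (illustrative code, not the paper's algorithms): an interval pattern assigns one interval per numerical attribute, and the description of a set of objects is the componentwise convex hull of their values.

```python
# Minimal sketch of interval patterns (illustrative, not the paper's
# algorithms): a pattern is one interval per numerical attribute, and
# the meet of two patterns is the componentwise convex hull.

def meet(p, q):
    """Meet of two interval patterns: smallest interval covering both."""
    return [(min(a1, b1), max(a2, b2)) for (a1, a2), (b1, b2) in zip(p, q)]

def describe(rows):
    """Closed pattern of a set of numerical rows: per-attribute [min, max]."""
    pattern = [(v, v) for v in rows[0]]
    for row in rows[1:]:
        pattern = meet(pattern, [(v, v) for v in row])
    return pattern

rows = [(5, 7), (6, 8), (4, 8)]
print(describe(rows))  # [(4, 6), (7, 8)]
```

The closure operator induced by this meet is what makes closed interval patterns and their generators well defined in the numerical setting.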
Characterization of order-like dependencies with formal concept analysis
Functional Dependencies (FDs) play a key role in many fields
of the relational database model, one of the most widely used data models.
FDs have also been applied in data analysis, data quality, knowledge
discovery and the like, but within a very limited scope, because of their
fixed semantics. To overcome this limitation, many generalizations have
been defined to relax the crisp definition of FDs. FDs and a few of their
generalizations have been characterized with Formal Concept Analysis,
which reveals itself to be an interesting unified framework for
characterizing dependencies, that is, understanding and computing them in a
formal way. In this paper, we extend this work by taking into account
order-like dependencies. Such dependencies, well defined in the database
field, consider an ordering on the domain of each attribute, and not
simply an equality relation as with standard FDs.
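A simple instance of an order-like dependency can be sketched as follows (the attribute names and the exact check are illustrative, not taken from the paper): an attribute order-determines another when the ordering on the first is never contradicted by the ordering on the second.

```python
# Illustrative sketch of one simple order-like dependency (names and the
# exact variant are hypothetical, not from the paper): a ~> b holds when
# any row whose a-value is <= another's also has a b-value that is <=.

def order_dependency_holds(rows, a, b):
    """Check whether attribute a order-determines attribute b."""
    return all(r[b] <= s[b] for r in rows for s in rows if r[a] <= s[a])

table = [{"a": 1, "b": 10}, {"a": 2, "b": 20}, {"a": 3, "b": 30}]
print(order_dependency_holds(table, "a", "b"))  # True
table.append({"a": 4, "b": 5})
print(order_dependency_holds(table, "a", "b"))  # False
```

With an equality relation in place of <=, the same check degenerates to a standard FD, which is exactly the generalization the abstract describes.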
Identifying Avatar Aliases in Starcraft 2
In electronic sports, cyberathletes conceal their online training using
different avatars (virtual identities), which allow them to avoid being
recognized by the opponents they may face in future competitions. In this
article, we propose a method to tackle this avatar alias identification
problem. Our method trains a classifier on behavioural data and processes the
confusion matrix to output label pairs which concentrate confusion. We
experimented with Starcraft 2 and report our first results.
Comment: Machine Learning and Data Mining for Sports Analytics ECML/PKDD 2015
workshop, 11 September 2015, Porto, Portugal.
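The confusion-matrix step can be sketched roughly as follows (an illustrative reconstruction, not the authors' code): symmetrize the off-diagonal confusion counts and rank avatar label pairs by them.

```python
# Illustrative reconstruction of the confusion-based pairing idea (not
# the authors' code): pairs of labels that the classifier confuses in
# both directions are candidate aliases of the same player.

def confused_pairs(cm, labels, top=3):
    """Rank label pairs by symmetrized off-diagonal confusion counts."""
    n = len(labels)
    pairs = [(cm[i][j] + cm[j][i], labels[i], labels[j])
             for i in range(n) for j in range(i + 1, n)]
    return [(a, b) for _, a, b in sorted(pairs, reverse=True)[:top]]

cm = [[50,  1, 30],   # rows: true label, columns: predicted label
      [ 2, 60,  1],
      [25,  0, 40]]
print(confused_pairs(cm, ["A", "B", "C"], top=1))  # [('A', 'C')]
```

Here avatars A and C concentrate most of the confusion, so they would be proposed as a candidate alias pair.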
The Coron System
Coron is a domain- and platform-independent, multi-purpose data mining
toolkit, which incorporates not only a rich collection of data mining
algorithms but also supports a number of auxiliary operations. To the best of
our knowledge, a data mining toolkit designed specifically for itemset
extraction and association rule generation like Coron does not exist elsewhere.
Coron also provides support for preparing and filtering data, and for
interpreting the extracted units of knowledge.
Anytime Subgroup Discovery in Numerical Domains with Guarantees
Subgroup discovery is the task of discovering patterns that accurately discriminate a class label from the others. Existing approaches can uncover such patterns either through an exhaustive or an approximate exploration of the pattern search space. However, an exhaustive exploration is generally unfeasible, whereas approximate approaches provide no guarantee bounding the error on the best pattern quality, nor on the exploration progress ("how far are we from an exhaustive search?"). We design here an algorithm for mining numerical data with three key properties w.r.t. the state of the art: (i) it progressively yields interval patterns whose quality improves over time; (ii) it can be interrupted anytime and always gives a guarantee bounding the error on the top pattern quality; and (iii) it always bounds the distance to an exhaustive exploration. After reporting experiments showing the effectiveness of our method, we discuss its generalization to other kinds of patterns.
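For background, a standard quality measure such subgroups are scored with is weighted relative accuracy (WRAcc); the sketch below shows the objective only, not the paper's anytime algorithm or its bounds.

```python
# Weighted relative accuracy (WRAcc), a standard subgroup-discovery
# quality measure; a background sketch of the objective such anytime
# miners bound, not the paper's algorithm.

def wracc(n_subgroup, n_subgroup_pos, n_total, n_total_pos):
    """WRAcc = coverage * (subgroup precision - global positive rate)."""
    coverage = n_subgroup / n_total
    return coverage * (n_subgroup_pos / n_subgroup - n_total_pos / n_total)

# A subgroup covering 20 of 100 rows, 15 of them positive,
# against a global positive rate of 30/100:
print(round(wracc(20, 15, 100, 30), 3))  # 0.09
```

An anytime guarantee in this setting bounds how far the best WRAcc found so far can be from the optimum an exhaustive search would reach.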
Mining Biclusters of Similar Values with Triadic Concept Analysis
Biclustering numerical data became a popular data-mining task in the
early 2000s, especially for analysing gene expression data. A bicluster
reflects a strong association between a subset of objects and a subset of
attributes in a numerical object/attribute data table. So-called biclusters of
similar values can be thought of as maximal sub-tables with close values. Only
a few methods address a complete, correct and non-redundant enumeration of such
patterns, which is a well-known intractable problem, and no formal framework
exists. In this paper, we establish important links between biclustering and
Formal Concept Analysis. More specifically, we show for the first time that
Triadic Concept Analysis (TCA) provides a well-suited mathematical framework
for biclustering. Interestingly, existing TCA algorithms, which usually apply
to binary data, can be used (directly or with slight modifications) after a
preprocessing step to extract maximal biclusters of similar values.
Comment: Concept Lattices and their Applications (CLA), 2011.
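The notion of a bicluster of similar values can be sketched directly (an illustrative membership check, not the TCA-based extraction): a sub-table qualifies when all its values lie within a tolerance of each other.

```python
# Illustrative check of the "bicluster of similar values" notion: a
# sub-table (rows x cols) whose values all lie within a tolerance theta
# of one another. This is only the membership test, not the TCA-based
# enumeration of maximal such sub-tables.

def is_similar_bicluster(table, rows, cols, theta):
    values = [table[r][c] for r in rows for c in cols]
    return max(values) - min(values) <= theta

table = [[1, 9, 2],
         [2, 8, 1],
         [9, 9, 9]]
print(is_similar_bicluster(table, [0, 1], [0, 2], theta=1))     # True
print(is_similar_bicluster(table, [0, 1, 2], [0, 2], theta=1))  # False
```

The hard problem the abstract addresses is enumerating all maximal sub-tables passing this test completely and without redundancy.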
On-Premise AIOps Infrastructure for a Software Editor SME: An Experience Report
Information Technology has become a critical component in various industries,
leading to an increased focus on software maintenance and monitoring. With the
complexities of modern software systems, traditional maintenance approaches
have become insufficient. The concept of AIOps has emerged to enhance
predictive maintenance using Big Data and Machine Learning capabilities.
However, exploiting AIOps requires addressing several challenges related to the
complexity of data and incident management. Commercial solutions exist, but
they may not be suitable for certain companies due to high costs, data
governance issues, and limitations in covering private software. This paper
investigates the feasibility of implementing on-premise AIOps solutions by
leveraging open-source tools. We introduce a comprehensive AIOps infrastructure
that we have successfully deployed in our company, and we provide the rationale
behind different choices that we made to build its various components.
In particular, we provide insights into our approach and criteria for selecting
a data management system, and we explain its integration. Our experience can be
beneficial for companies seeking to internally manage their software
maintenance processes with a modern AIOps approach.
AIOps Solutions for Incident Management: Technical Guidelines and A Comprehensive Literature Review
The management of modern IT systems poses unique challenges, necessitating
scalability, reliability, and efficiency in handling extensive data streams.
Traditional methods, reliant on manual tasks and rule-based approaches, prove
inefficient for the substantial data volumes and alerts generated by IT
systems. Artificial Intelligence for IT Operations (AIOps) has emerged as a
solution, leveraging advanced analytics such as machine learning and big data to
enhance incident management. AIOps detects and predicts incidents, identifies
root causes, and automates healing actions, improving quality and reducing
operational costs. However, despite its potential, the AIOps domain is still in
its early stages, decentralized across multiple sectors, and lacking
standardized conventions. Research and industrial contributions are distributed
without consistent frameworks for data management, target problems,
implementation details, requirements, and capabilities. This study proposes an
AIOps terminology and taxonomy, establishing a structured incident management
procedure and providing guidelines for constructing an AIOps framework. The
research also categorizes contributions based on criteria such as incident
management tasks, application areas, data sources, and technical approaches.
The goal is to provide a comprehensive review of technical and research aspects
of AIOps for incident management, aiming to structure knowledge, identify gaps,
and establish a foundation for future developments in the field.
Computing Functional Dependencies with Pattern Structures
The treatment of many-valued data with FCA has been achieved by means of scaling. This method has some drawbacks, since the size of the resulting formal contexts usually depends on the number of different values present in a table, which can be very large.
Pattern structures have been shown to handle many-valued data, offering a viable and sound alternative to scaling for representing and analyzing sets of many-valued data with FCA.
Functional dependencies have already been handled with FCA using the binarization of a table, that is, creating a formal context out of a set of data. Unfortunately, although this method is standard and simple, it has an important drawback: the resulting context is
quadratic in the number of objects w.r.t. the original set of data.
In this paper, we examine how the functional dependencies that hold in a set of data can be extracted using pattern structures. This allows building an equivalent concept lattice while avoiding the binarization step, and thus yields better concept representation and computation.
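For background, whether a single FD holds can be checked by partitioning rows on the left-hand side (an illustrative check; the paper's contribution is computing all FDs via pattern structures, without binarization).

```python
# Background sketch: a functional dependency X -> Y holds when rows
# agreeing on X also agree on Y (illustrative check only; the paper
# computes all such FDs via pattern structures).

from collections import defaultdict

def fd_holds(rows, x, y):
    """Check X -> Y: each X-value group must map to a single Y value."""
    groups = defaultdict(set)
    for row in rows:
        groups[tuple(row[a] for a in x)].add(tuple(row[a] for a in y))
    return all(len(ys) == 1 for ys in groups.values())

rows = [{"city": "Lyon", "zip": 69000}, {"city": "Lyon", "zip": 69000},
        {"city": "Nancy", "zip": 54000}]
print(fd_holds(rows, ["city"], ["zip"]))  # True
rows.append({"city": "Lyon", "zip": 69001})
print(fd_holds(rows, ["city"], ["zip"]))  # False
```

The naive binarization mentioned above builds a context over all pairs of rows, hence the quadratic blow-up the pattern-structure approach avoids.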
SeqScout: Using a Bandit Model to Discover Interesting Subgroups in Labeled Sequences
It is extremely useful to exploit labeled datasets not only to learn models but also to improve our understanding of a domain and its available target classes. The so-called subgroup discovery task has been considered for a long time. It concerns the discovery of patterns or descriptions whose sets of supporting objects have interesting properties, e.g., they characterize or discriminate a given target class. Though many subgroup discovery algorithms have been proposed for transactional data, discovering subgroups within labeled sequential data, and thus searching for descriptions as sequential patterns, has been much less studied. In that context, exhaustive exploration strategies cannot be used for real-life applications and we have to look for heuristic approaches. We propose the algorithm SeqScout to discover interesting subgroups (w.r.t. a chosen quality measure) from labeled sequences of itemsets. It is a new sampling algorithm that mines discriminant sequential patterns using a multi-armed bandit model. It is an anytime algorithm that, for a given budget, finds a collection of local optima in the search space of descriptions, and thus subgroups. It requires light configuration and is independent of the quality measure used for pattern scoring. Furthermore, it is fairly simple to implement. We provide qualitative and quantitative experiments on several datasets to illustrate its added value.
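The multi-armed bandit ingredient can be sketched with a plain UCB1 arm selection (illustrative only; SeqScout's actual sampling and generalization steps are not reproduced here).

```python
# Illustrative UCB1 arm selection, the classic multi-armed bandit rule;
# SeqScout's actual sampling/generalization steps are not reproduced.

import math

def ucb1(counts, means, t):
    """Pick the arm maximizing empirical mean plus exploration bonus."""
    return max(range(len(counts)),
               key=lambda i: means[i] + math.sqrt(2 * math.log(t) / counts[i]))

# Toy run: three candidate patterns with fixed (hidden) qualities.
quality = [0.2, 0.8, 0.5]
counts, means = [1, 1, 1], quality[:]  # one initial evaluation per arm
for t in range(2, 100):
    arm = ucb1(counts, means, t)
    reward = quality[arm]  # deterministic stand-in for a sampled quality
    means[arm] += (reward - means[arm]) / (counts[arm] + 1)
    counts[arm] += 1
best = max(range(3), key=lambda i: means[i])
print(best)  # 1: the highest-quality arm is identified
```

In SeqScout the arms are candidate sequential patterns and the reward is the chosen quality measure on a sampled generalization, which is what makes the search anytime and measure-agnostic.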