144 research outputs found
A Constraint Programming Approach for Mining Sequential Patterns in a Sequence Database
Constraint-based pattern discovery is at the core of numerous data mining
tasks. Patterns are extracted with respect to a given set of constraints
(frequency, closedness, size, etc.). In the context of sequential pattern
mining, a large number of dedicated techniques have been developed for solving
particular classes of constraints. The aim of this paper is to investigate the
use of Constraint Programming (CP) to model and mine sequential patterns in a
sequence database. Our CP approach offers a natural way to combine, within a
single framework, a large set of constraints coming from various origins.
Experiments show the feasibility and the interest of our approach.
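To make the constraints named in this abstract concrete, here is a minimal sketch of checking a frequency and a size constraint on one candidate sequential pattern. It is a plain Python check, not the CP model of the paper. The function names, the toy database, and the simplification to sequences of single items are all assumptions made for illustration.

```python
from typing import List, Sequence


def is_subsequence(pattern: Sequence[str], sequence: Sequence[str]) -> bool:
    """True if the items of `pattern` occur in `sequence` in the same order
    (gaps between matched items are allowed)."""
    remaining = iter(sequence)
    # `item in remaining` consumes the iterator up to the first match,
    # so later items can only match further to the right.
    return all(item in remaining for item in pattern)


def support(pattern: Sequence[str], database: List[Sequence[str]]) -> int:
    """Number of database sequences that contain `pattern` as a subsequence."""
    return sum(1 for seq in database if is_subsequence(pattern, seq))


def satisfies(pattern: Sequence[str], database: List[Sequence[str]],
              min_support: int, max_size: int) -> bool:
    """Conjunction of a frequency constraint and a size constraint."""
    return len(pattern) <= max_size and support(pattern, database) >= min_support


# Hypothetical toy database of four sequences.
database = [list("abcb"), list("acb"), list("bab"), list("ca")]

print(support(["a", "b"], database))          # 3
print(satisfies(["a", "b"], database, 3, 2))  # True
```

Adding a further constraint amounts to adding another conjunct to `satisfies`, which is the kind of combination the abstract describes.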
Discovering Knowledge using a Constraint-based Language
Discovering pattern sets or global patterns is an attractive issue for the
pattern mining community, as such patterns provide useful information. By
combining local patterns that satisfy a joint meaning, this approach produces
higher-level patterns that are more useful to the data analyst than the usual
local patterns, while reducing the number of patterns. In parallel, recent work
investigating relationships between data mining and constraint programming (CP)
shows that the CP paradigm is a suitable framework to model and mine such
patterns in a declarative and generic way. We present a constraint-based
language which enables us to define queries addressing pattern sets and global
patterns. The usefulness of such a declarative approach is highlighted by
several examples drawn from clustering based on associations. This language has
been implemented in the CP framework. Comment: 12 pages
Cost Function Networks to Solve Large Computational Protein Design Problems
International audience
Crossing Boundaries: Tapestry Within the Context of the 21st Century
International audience. Graphical model processing is a central problem in artificial intelligence. The optimization of the combined cost of a network of local cost functions federates a variety of famous problems including CSP, SAT and Max-SAT but also optimization in stochastic variants such as Markov Random Fields and Bayesian networks. Exact solving methods for these problems typically include branch and bound and local inference-based bounds. In this paper we are interested in understanding when and how dynamic programming based optimization can be used to efficiently enforce soft local consistencies on Global Cost Functions, defined as parameterized families of cost functions of unbounded arity. Enforcing local consistencies in cost function networks is performed by applying so-called Equivalence Preserving Transformations (EPTs) to the cost functions. These EPTs may transform global cost functions and make them intractable to optimize. We identify as tractable projection-safe those global cost functions whose optimization is and remains tractable after applying the EPTs used for enforcing arc consistency. We also provide new classes of cost functions that are tractable projection-safe thanks to dynamic programming. We show that dynamic programming can either be directly used inside filtering algorithms, defining polynomially DAG-filterable cost functions, or emulated by arc consistency filtering on a Berge-acyclic network of bounded-arity cost functions, defining Berge-acyclic network-decomposable cost functions. We give examples of such cost functions and we provide a systematic way to define decompositions from existing decomposable global constraints. These two approaches to enforcing consistency in global cost functions are then embedded in a solver for extensive experiments that confirm the feasibility and efficiency of our proposal.
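As a rough illustration of why dynamic programming applies to acyclic networks of bounded-arity cost functions, the sketch below computes the minimum combined cost of a chain of unary and binary cost functions. It is a generic textbook recurrence, not the filtering algorithm or decomposition of the paper. The function name and the toy cost tables are assumptions.

```python
from typing import List


def chain_min_cost(unary: List[List[int]], binary: List[List[List[int]]]) -> int:
    """Minimum of sum_i unary[i][x_i] + sum_i binary[i][x_i][x_{i+1}]
    over all assignments, for variables arranged in a chain.

    unary[i][a]     : cost of assigning value a to variable i
    binary[i][a][b] : cost of assigning a to variable i and b to variable i+1
    """
    # best[a] = cheapest cost of the prefix ending with the current variable = a
    best = list(unary[0])
    for i in range(1, len(unary)):
        best = [
            unary[i][b] + min(best[a] + binary[i - 1][a][b] for a in range(len(best)))
            for b in range(len(unary[i]))
        ]
    return min(best)


# Hypothetical toy instance: three Boolean variables, and a penalty of 2
# whenever two neighbouring variables take different values.
unary = [[0, 1], [2, 0], [0, 3]]
differ = [[0, 2], [2, 0]]
binary = [differ, differ]

print(chain_min_cost(unary, binary))  # 2
```

The running time is linear in the number of variables and quadratic in the domain size, whereas enumerating all assignments is exponential in the number of variables.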
Closed-Pattern: A Global Constraint for Mining Closed Frequent Patterns
National audience. Mining closed frequent patterns is one of the major challenges in data mining. Recent work on pattern mining has highlighted the value of using constraints for declarative mining. These approaches have proven very attractive for their flexibility, but the use of a large number of reified constraints and auxiliary variables poses a serious problem when processing large databases. In this paper, we present a global constraint named ClosedPattern, which captures the particular semantics of closed patterns in order to solve this problem efficiently, without resorting to reified constraints. We propose a filtering algorithm for the ClosedPattern constraint that maintains domain consistency (DC) in polynomial time and space.
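For readers unfamiliar with the notion, a frequent itemset is closed when no proper superset has the same support; equivalently, it equals the intersection of all transactions that contain it. The sketch below checks this directly on a toy transaction database. It is not the ClosedPattern filtering algorithm; the function names and the data are assumptions.

```python
from typing import FrozenSet, List


def cover(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> List[FrozenSet[str]]:
    """Transactions that contain every item of `itemset`."""
    return [t for t in transactions if itemset <= t]


def closure(itemset: FrozenSet[str], transactions: List[FrozenSet[str]]) -> FrozenSet[str]:
    """Largest itemset shared by all transactions covering `itemset`."""
    covered = cover(itemset, transactions)
    if not covered:
        # Convention chosen for this sketch: an uncovered itemset is returned as-is.
        return frozenset(itemset)
    return frozenset.intersection(*covered)


def is_closed_frequent(itemset: FrozenSet[str], transactions: List[FrozenSet[str]],
                       min_support: int) -> bool:
    """Frequent (support >= min_support) and equal to its own closure."""
    support = len(cover(itemset, transactions))
    return support >= min_support and closure(itemset, transactions) == itemset


# Hypothetical toy database of four transactions.
transactions = [frozenset("abc"), frozenset("ab"), frozenset("abd"), frozenset("cd")]

print(is_closed_frequent(frozenset("ab"), transactions, 2))  # True
print(is_closed_frequent(frozenset("a"), transactions, 2))   # False: every transaction with a also has b
```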
A tensor based hyper-heuristic for nurse rostering
Nurse rostering is a well-known, highly constrained scheduling problem that requires assigning shifts to nurses while satisfying a variety of constraints. Exact algorithms may fail to produce high-quality solutions, hence (meta)heuristics are commonly preferred as solution methods, and these are often designed and tuned for specific (groups of) problem instances. Hyper-heuristics have emerged as general search methodologies that mix and manage a predefined set of low-level heuristics while solving computationally hard problems. In this study, we describe an online learning hyper-heuristic for nurse rostering that employs a data science technique and is capable of self-improvement via tensor analysis. The proposed approach is evaluated on a well-known nurse rostering benchmark consisting of a diverse collection of instances obtained from different hospitals across the world. The empirical results indicate the success of the tensor-based hyper-heuristic, which improves upon the best-known solutions for four of the instances.