Statistical strategies for pruning all the uninteresting association rules
We propose a general framework to describe formally the
problem of capturing the intensity of implication for
association rules through statistical metrics.
In this framework we present properties that influence the
interestingness of a rule, analyze the conditions that
lead a measure to perform a perfect prune in a single step,
and define a final proper order to sort the surviving
rules. We discuss why none of the currently employed
measures can capture objective interestingness on its own, and
why only a combination of some of them, applied in a multi-step fashion,
can be reliable. In contrast, we propose a new, simple modification
of the Pearson coefficient that meets all the necessary
requirements. We statistically infer a suitable cut-off
threshold for this new metric by empirically describing its
distribution function through simulation. Final experiments
demonstrate the effectiveness of our proposal.
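As a point of reference for the abstract above, the following is a minimal sketch of the standard (unmodified) Pearson coefficient, also called the phi coefficient, for a rule A → B over a set of transactions. It is not the modified metric the authors propose, which the abstract does not specify. The function name `phi_coefficient`, the representation of transactions as Python sets, and the convention of returning 0.0 when the coefficient is undefined are all assumptions made for illustration.

```python
import math


def phi_coefficient(transactions, antecedent, consequent):
    """Standard Pearson (phi) correlation between itemsets A and B.

    transactions: list of sets of items (assumed representation).
    antecedent, consequent: sets of items forming the rule A -> B.
    """
    n = len(transactions)
    # Counts of transactions containing A, B, and both A and B.
    n_a = sum(antecedent <= t for t in transactions)
    n_b = sum(consequent <= t for t in transactions)
    n_ab = sum((antecedent | consequent) <= t for t in transactions)

    denom = math.sqrt(n_a * (n - n_a) * n_b * (n - n_b))
    if denom == 0:
        # Assumed convention: the coefficient is undefined when A or B
        # occurs in none or all of the transactions; report 0.0 here.
        return 0.0
    return (n * n_ab - n_a * n_b) / denom


transactions = [
    {"bread", "butter"},
    {"bread", "butter"},
    {"milk"},
    {"milk"},
]
print(phi_coefficient(transactions, {"bread"}, {"butter"}))  # 1.0
print(phi_coefficient(transactions, {"bread"}, {"milk"}))    # -1.0
```

Values close to 1 indicate that A and B tend to occur together, values close to 0 indicate near independence, and negative values indicate that they tend to exclude each other.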
Discovering unbounded episodes in sequential data
One basic goal in the analysis of time-series data is
to find frequent interesting episodes, i.e., collections
of events occurring frequently together in the input sequence.
Most widely known works decide the interestingness of an episode from a
fixed user-specified window width or interval that bounds the
subsequent sequential association rules.
In this paper we present a more intuitive definition that,
in turn, allows interesting episodes to grow during mining without any
user-specified help. A convenient algorithm to
efficiently discover the proposed unbounded episodes is also implemented.
Experimental results confirm that our approach is useful
and advantageous.
Characterization of concept lattices for ordered contexts
The discovery of frequent sequential patterns in an ordered collection
of data, such as sequential databases or time-series data, is an important issue
in several contexts. In this paper, we employ formal concept
analysis to develop the notion of closure for these sequential
patterns and to characterize the concept lattice of the ordered
contexts. The proposed concept lattice will serve as a model
for the patterns extracted in the context of sequential databases
by a recent algorithm (CloSpan, \cite{Clospan}). Finally,
we show how our model can also be used to derive other kinds of
structured patterns, such as the closed
set of episodes in the context of time-series data \cite{Toivonen}.
Thus, a suitable transformation of the sequential patterns
in the concepts of the lattice gives rise to the most representative
set of parallel and serial closed episodes.
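To make the notion of closure for sequential patterns more concrete, here is a minimal sketch under simplifying assumptions. It uses the common definition of a closed pattern (no proper super-sequence with the same support), restricts each event to a single item, and checks closedness only against a given list of candidates. It is not the CloSpan algorithm and not the lattice construction of the paper; the names `is_subsequence`, `support`, and `closed_patterns` are hypothetical.

```python
def is_subsequence(pattern, sequence):
    """True if pattern occurs in sequence in order (gaps allowed)."""
    it = iter(sequence)
    # Each membership test consumes the iterator up to the match,
    # so the items of pattern must appear in the same order.
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of input sequences that contain pattern as a subsequence."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def closed_patterns(candidates, database):
    """Keep candidates with no longer candidate super-sequence of equal support."""
    closed = []
    for p in candidates:
        s = support(p, database)
        has_equal_super = any(
            len(q) > len(p)
            and is_subsequence(p, q)
            and support(q, database) == s
            for q in candidates
        )
        if not has_equal_super:
            closed.append(p)
    return closed


database = [("a", "b", "c"), ("a", "c"), ("a", "b", "c")]
candidates = [("a",), ("a", "c"), ("a", "b"), ("a", "b", "c"), ("b",)]
print(closed_patterns(candidates, database))
# [('a', 'c'), ('a', 'b', 'c')]
```

In this toy database, `("a",)` is not closed because `("a", "c")` appears in exactly the same three sequences, so the shorter pattern carries no extra information.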
Horn axiomatizations for sequential data
We propose a notion of deterministic association rules for ordered data. We prove that our proposed rules can be formally justified by a purely logical characterization, namely, a natural notion of empirical Horn approximation for ordered data which involves background Horn conditions; these ensure the consistency of the propositional theory obtained with the ordered context. The whole framework resorts to concept lattice models from Formal Concept Analysis, adapted to ordered contexts. We also discuss a general method to mine these rules that can be easily incorporated into any algorithm for mining closed sequences, of which there are already some in the literature.
A lattice-based method for structural analysis
In this paper we revisit the foundations of formal concept analysis for ordered contexts of [4]. From the theoretical point of view, the obtained lattice has proved to be a proper unifying framework for reasoning about different sequential mining tasks: from the discovery of partial orders to the clustering of input sequences. Here we show how these results on sequences can be naturally extended to the mining of any partial order structure. We empirically validate the approach by testing it on real-world data. Our experimental evaluation shows that this lattice-based method is an intuitive tool for analyzing acyclic structures.