37,331 research outputs found
Evaluation and optimization of frequent association rule based classification
Deriving useful and interesting rules from a data mining system is an essential and important task. Problems
such as the discovery of random and coincidental patterns or patterns with no significant values, and the
generation of a large volume of rules from a database commonly occur. Work on sustaining the interestingness
of rules generated by data mining algorithms is being actively pursued. This
paper presents a systematic way to evaluate the association rules discovered by frequent itemset mining algorithms,
combining common data mining and statistical interestingness measures, and outlines an appropriate sequence of usage. The experiments are performed using a number of real-world datasets that represent diverse characteristics of data/items, and a detailed evaluation of the rule sets is provided. Empirical results show that, with a proper combination of data mining and statistical analysis, the framework is capable of eliminating a large number of non-significant, redundant and contradictory rules while preserving relatively valuable rules with high accuracy and coverage when used in the classification problem. Moreover, the results reveal important characteristics of mining frequent itemsets and the impact of the confidence measure on the classification task.
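As a rough illustration of what "interestingness measures" can look like in practice, the sketch below computes support, confidence and lift for a class-association rule and applies a simple threshold filter. The abstract does not say which measures or thresholds the paper uses, so the function names, the toy transactions and the cut-off values are all assumptions made for this example.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(items <= t for t in transactions) / len(transactions)


def rule_measures(antecedent, consequent, transactions):
    """Support, confidence and lift of the rule antecedent -> consequent."""
    a, c = frozenset(antecedent), frozenset(consequent)
    s_a = support(a, transactions)
    s_c = support(c, transactions)
    s_ac = support(a | c, transactions)
    confidence = s_ac / s_a if s_a > 0 else 0.0
    lift = confidence / s_c if s_c > 0 else 0.0
    return {"support": s_ac, "confidence": confidence, "lift": lift}


def keep_rule(measures, min_confidence=0.6, min_lift=1.0):
    """Hypothetical filter: confident rules that beat independence (lift > 1)."""
    return measures["confidence"] >= min_confidence and measures["lift"] > min_lift


# Toy transactions (invented for illustration); "yes"/"no" play the role of class labels.
TOY = [frozenset(t) for t in (
    {"a", "b", "yes"},
    {"a", "yes"},
    {"b", "no"},
    {"a", "b", "yes"},
)]

if __name__ == "__main__":
    for ante in ({"a"}, {"b"}):
        m = rule_measures(ante, {"yes"}, TOY)
        print(sorted(ante), "-> yes", m, "keep" if keep_rule(m) else "drop")
```

On this toy data the rule {a} -> yes passes the filter, while {b} -> yes is dropped because its lift is below 1 even though its confidence clears the threshold. A statistical significance test could be chained after such a filter, but which test the paper uses is not stated in the abstract.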
Image mining: trends and developments
Advances in image acquisition and storage technology have led to tremendous growth in very large and detailed image databases. These images, if analyzed, can reveal useful information to human users. Image mining deals with the extraction of implicit knowledge, image data relationships, or other patterns not explicitly stored in the images. Image mining is more than just an extension of data mining to the image domain. It is an interdisciplinary endeavor that draws upon expertise in computer vision, image processing, image retrieval, data mining, machine learning, databases, and artificial intelligence. In this paper, we examine the research issues in image mining and current developments in the field, particularly image mining frameworks, state-of-the-art techniques, and systems. We also identify some future research directions for image mining.
Quantitative Redundancy in Partial Implications
We survey the different properties of an intuitive notion of redundancy, as a
function of the precise semantics given to the notion of partial implication.
The final version of this survey will appear in the Proceedings of the Int.
Conf. Formal Concept Analysis, 2015.
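To make the idea of redundancy among partial implications concrete, the sketch below checks one simple sufficient condition: a rule X0 -> Y0 has confidence at least that of X1 -> Y1 in every dataset whenever X1 is a subset of X0 and X0 ∪ Y0 is a subset of X1 ∪ Y1. This follows directly from the monotonicity of support. The abstract does not spell out the survey's own definitions, so this condition, the function names and the toy data are assumptions chosen for illustration only.

```python
def confidence(antecedent, consequent, transactions):
    """Confidence of the partial implication antecedent -> consequent."""
    a = frozenset(antecedent)
    both = a | frozenset(consequent)
    n_a = sum(a <= t for t in transactions)
    n_both = sum(both <= t for t in transactions)
    return n_both / n_a if n_a else 0.0


def is_implied_by(rule0, rule1):
    """True if rule0 = (X0, Y0) is redundant given rule1 = (X1, Y1)
    under the sufficient condition X1 <= X0 and X0 | Y0 <= X1 | Y1."""
    x0, y0 = map(frozenset, rule0)
    x1, y1 = map(frozenset, rule1)
    return x1 <= x0 and (x0 | y0) <= (x1 | y1)


# Toy transactions, invented for illustration.
DATA = [frozenset(t) for t in (
    {"a", "b", "c"},
    {"a", "b"},
    {"a"},
    {"a", "b", "c"},
)]

general = ({"a"}, {"b", "c"})    # a -> bc
specific = ({"a", "b"}, {"c"})   # ab -> c

if __name__ == "__main__":
    print(is_implied_by(specific, general))
    print(confidence(*general, DATA), confidence(*specific, DATA))
```

Here ab -> c is redundant given a -> bc: on any dataset its confidence cannot be lower, so reporting both adds no information under this notion of redundancy.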
From data towards knowledge: Revealing the architecture of signaling systems by unifying knowledge mining and data mining of systematic perturbation data
Genetic and pharmacological perturbation experiments, such as deleting a gene
and monitoring gene expression responses, are powerful tools for studying
cellular signal transduction pathways. However, it remains a challenge to
automatically derive knowledge of a cellular signaling system at a conceptual
level from systematic perturbation-response data. In this study, we explored a
framework that unifies knowledge mining and data mining approaches towards the
goal. The framework consists of the following automated processes: 1) applying
an ontology-driven knowledge mining approach to identify functional modules
among the genes responding to a perturbation in order to reveal potential
signals affected by the perturbation; 2) applying a graph-based data mining
approach to search for perturbations that affect a common signal with respect
to a functional module; and 3) revealing the architecture of a signaling system
by organizing signaling units into a hierarchy based on their relationships.
Applying this framework to a compendium of yeast perturbation-response data, we
have successfully recovered many well-known signal transduction pathways; in
addition, our analysis has led to many hypotheses regarding the yeast signal
transduction system; finally, our analysis automatically organized perturbed
genes as a graph reflecting the architecture of the yeast signaling system.
Importantly, this framework transformed molecular findings from a gene level to
a conceptual level, which readily can be translated into computable knowledge
in the form of rules regarding the yeast signaling system, such as "if genes
involved in MAPK signaling are perturbed, genes involved in pheromone responses
will be differentially expressed".