A review of associative classification mining
Associative classification mining is a promising approach in data mining that utilizes
association rule discovery techniques to construct classification systems, also known as
associative classifiers. In the last few years, a number of associative classification algorithms
have been proposed, e.g. CPAR, CMAR, MCAR, MMAC and others. These algorithms
employ several different rule discovery, rule ranking, rule pruning, rule prediction and rule
evaluation methods. This paper focuses on surveying and comparing the state-of-the-art associative
classification techniques with regard to the above criteria. Finally, future directions in associative
classification, such as incremental learning and mining low-quality data sets, are also
highlighted in this paper.
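To make the stages named in this abstract concrete, the following is a minimal Python sketch of a generic associative classifier. It is not an implementation of CPAR, CMAR, MCAR or MMAC. The function names, the thresholds, the toy weather data and the ranking order (confidence, then support, then rule length) are all assumptions chosen for illustration, and rule pruning is left out for brevity.

```python
from collections import Counter
from itertools import combinations


def mine_rules(rows, labels, min_support=0.2, min_confidence=0.6, max_len=2):
    """Rule discovery: enumerate small itemsets and keep class association rules."""
    n = len(rows)
    itemset_counts = Counter()  # occurrences of each itemset
    rule_counts = Counter()     # occurrences of each (itemset, label) pair
    for items, label in zip(rows, labels):
        for size in range(1, max_len + 1):
            for itemset in combinations(sorted(items), size):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1
    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n
        confidence = count / itemset_counts[itemset]
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))
    # Rule ranking (assumed order): confidence, then support, then shorter rules.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0]), r[0]))
    return rules


def predict(rules, items, default):
    """Rule prediction: the highest-ranked matching rule decides the class."""
    for itemset, label, _, _ in rules:
        if set(itemset) <= set(items):
            return label
    return default


# Hypothetical toy data: each row is a set of "attribute=value" items.
rows = [
    {"outlook=sunny", "windy=no"},
    {"outlook=sunny", "windy=yes"},
    {"outlook=rain", "windy=yes"},
    {"outlook=rain", "windy=no"},
    {"outlook=sunny", "windy=no"},
]
labels = ["play", "play", "stay", "stay", "play"]

rules = mine_rules(rows, labels)
prediction = predict(rules, {"outlook=rain", "windy=no"}, default="play")
```

Published algorithms differ mainly in how each of these stages is realised, which is the comparison the survey is concerned with.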
Fast pixelated detectors in scanning transmission electron microscopy. Part II: post acquisition data processing, visualisation, and structural characterisation
Fast pixelated detectors incorporating direct electron detection (DED)
technology are increasingly being regarded as universal detectors for scanning
transmission electron microscopy (STEM), capable of imaging under multiple
modes of operation. However, several issues remain around the post acquisition
processing and visualisation of the often very large multidimensional STEM
datasets produced by them. We discuss these issues and present open source
software libraries to enable efficient processing and visualisation of such
datasets. Throughout, we provide examples of the analysis methodologies
presented, utilising data from a 256×256 pixel Medipix3 hybrid DED
detector, with a particular focus on the STEM characterisation of the
structural properties of materials. These include the techniques of virtual
detector imaging; higher order Laue zone analysis; nanobeam electron
diffraction; and scanning precession electron diffraction. In the latter, we
demonstrate nanoscale lattice parameter mapping with a fractional precision
≤6×10⁻⁴ (0.06%).
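As an illustration of the virtual detector imaging mentioned above, the sketch below sums each diffraction pattern over an annular mask to produce one image value per scan position. It is a plain-Python toy under stated assumptions, not the open source libraries the authors present: the function name, the nested-list data layout and the tiny 3×3 patterns are hypothetical.

```python
import math


def virtual_detector_image(data, inner, outer, centre):
    """Sum each diffraction pattern over the annulus inner <= r < outer.

    data[y][x] is the 2D diffraction pattern (a list of rows) recorded at
    scan position (y, x); centre is the (row, column) of the direct beam.
    """
    cy, cx = centre
    image = []
    for scan_row in data:
        out_row = []
        for pattern in scan_row:
            total = 0
            for ky, detector_row in enumerate(pattern):
                for kx, counts in enumerate(detector_row):
                    radius = math.hypot(ky - cy, kx - cx)
                    if inner <= radius < outer:
                        total += counts
            out_row.append(total)
        image.append(out_row)
    return image


# Hypothetical 1x2 scan with 3x3 detector frames.
unscattered = [[0, 0, 0], [0, 5, 0], [0, 0, 0]]  # all counts in the central pixel
scattered = [[1, 1, 1], [1, 0, 1], [1, 1, 1]]    # all counts at higher angles
data = [[unscattered, scattered]]

bright_field = virtual_detector_image(data, 0.0, 0.5, centre=(1, 1))
dark_field = virtual_detector_image(data, 0.5, 2.0, centre=(1, 1))
```

Changing only the mask radii turns the same dataset into a bright-field or an annular dark-field image, which is what allows one pixelated detector to serve several imaging modes after acquisition.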
Automated Segmentation of Large 3D Images of Nervous Systems Using a Higher-order Graphical Model
This thesis presents a new mathematical model for segmenting volume images. The model is an energy function defined on the state space of all possibilities to remove or preserve splitting faces from an initial over-segmentation of the 3D image into supervoxels. It decomposes into potential functions that are learned automatically from a small amount of empirical training data. The learning is based on features of the distribution of gray values in the volume image and on features of the geometry and topology of the supervoxel segmentation. To be able to extract these features from large 3D images that consist of several billion voxels, a new algorithm is presented that constructs a suitable representation of the geometry and topology of volume segmentations in a block-wise fashion, in log-linear runtime (in the number of voxels) and in parallel, using only a prescribed amount of memory. At the core of this thesis is the optimization problem of finding, for a learned energy function, a segmentation with minimal energy. This optimization problem is difficult because the energy function consists of 3rd and 4th order potential functions that are not submodular. For sufficiently small problems with 10,000 degrees of freedom, it can be solved to global optimality using Mixed Integer Linear Programming. For larger models with 10,000,000 degrees of freedom, an approximate optimizer is proposed and compared to state-of-the-art alternatives. Using these new techniques and a unified data structure for multi-variate data and functions, a complete processing chain for segmenting large volume images, from the restoration of the raw volume image to the visualization of the final segmentation, has been implemented in C++. 
Results are shown for an application in neuroscience, namely the segmentation of a part of the inner plexiform layer of rabbit retina in a volume image of 2048 x 1792 x 2048 voxels that was acquired by means of Serial Block Face Scanning Electron Microscopy (Denk and Horstmann, 2004) with a resolution of 22nm x 22nm x 30nm. Both the quality of the automated segmentation and the improvement over a simpler model that does not take geometric context into account are confirmed by a quantitative comparison with the gold standard.
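The scale of this volume can be checked with a short calculation. The sketch below uses only the dimensions and resolution quoted in the abstract. Two things are assumptions: that the three resolution values map onto the three axes in the order given, and that each voxel is stored in one byte.

```python
shape_voxels = (2048, 1792, 2048)  # volume size quoted in the abstract
voxel_size_nm = (22, 22, 30)       # resolution quoted in the abstract (axis order assumed)

# Total number of voxels in the volume.
n_voxels = shape_voxels[0] * shape_voxels[1] * shape_voxels[2]

# Physical extent along each axis, in micrometres.
extent_um = tuple(n * s / 1000 for n, s in zip(shape_voxels, voxel_size_nm))

# Raw size in GiB, assuming one byte per voxel (8-bit grey values).
size_gib = n_voxels / 2**30
```

This gives about 7.5 billion voxels, consistent with "several billion voxels", covering roughly 45 µm x 39 µm x 61 µm. Under the one-byte assumption the raw image alone occupies 7 GiB, which helps explain why the thesis stresses block-wise processing with a prescribed amount of memory.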
Image processing and understanding based on graph similarity testing: algorithm design and software development
Image processing and understanding is a key task in the human visual system. Among all related topics, content-based image retrieval and classification is the most typical and important problem. Successful image retrieval and classification models require an effective fundamental step of image representation and feature extraction. While traditional methods are not capable of capturing all structural information in the image, using a graph to represent the image is not only biologically plausible but also has certain advantages.
Graphs have been widely used in image-related applications. Traditional graph-based image analysis models include pixel-based graph-cut techniques for image segmentation, low-level and high-level image feature extraction based on graph statistics, and other related approaches which utilize the idea of graph similarity testing. To compare images through their graph representations, a graph similarity testing algorithm is essential. Most of the existing graph similarity measurement tools are not designed for generic tasks such as image classification and retrieval, and some other models are either not scalable or not always effective. Graph spectral theory is a powerful analytical tool for capturing and representing the structural information of a graph, but applying it to image understanding remains a challenge.
In this dissertation, we focus on developing fast and effective image analysis models based on spectral graph theory and other graph-related mathematical tools. We first propose a fast graph similarity testing method based on the idea of the heat content and the mathematical theory of diffusion over manifolds. We then demonstrate the ability of our similarity testing model by comparing random graphs and power law graphs. Based on our graph analysis model, we develop a graph-based image representation and understanding framework. We first propose the image heat content feature and then discuss several approaches to further improve the model. The first component in our improved framework is a novel graph generation model. The proposed model greatly reduces the size of the traditional pixel-based image graph representation and is shown to still be effective in representing an image. Meanwhile, we propose and discuss several low-level and high-level image features based on spectral graph information, including oscillatory image heat content, weighted eigenvalues and weighted heat content spectrum. Experiments show that the proposed models are invariant to non-structural changes in images and perform well in standard image classification benchmarks. Furthermore, our image features are robust to small distortions and changes of viewpoint. The model is also capable of capturing important structural information in the image and performs well alone or in combination with other traditional techniques. We then introduce two real-world software development projects that use graph-based image processing techniques. Finally, we discuss the pros, cons and the intuition of our proposed model by demonstrating the properties of the proposed image feature and the correlation between different image features.
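The abstract does not spell out how the heat content is defined, so the following Python sketch relies on one common formulation as an assumption: the combinatorial Laplacian restricted to interior nodes (boundary nodes held at zero), with the heat content taken as the sum of all entries of the heat kernel exp(-tL). The dissertation may well use a different Laplacian or boundary choice, and its fast method is not reproduced here. All function names are hypothetical, and the matrix exponential is computed naively with a Taylor series plus scaling and squaring.

```python
def mat_mul(a, b):
    """Multiply two dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def heat_kernel(laplacian, t, terms=30, squarings=10):
    """Approximate exp(-t * L) with a truncated Taylor series and repeated squaring."""
    n = len(laplacian)
    scale = t / (2 ** squarings)
    step = [[-scale * laplacian[i][j] for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in mat_mul(term, step)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def heat_content(adjacency, interior, t):
    """Sum of heat-kernel entries over interior nodes (assumed definition)."""
    degree = [sum(row) for row in adjacency]
    # Laplacian restricted to interior nodes; degrees still count boundary edges.
    lap = [[(degree[u] if u == v else 0) - adjacency[u][v] for v in interior]
           for u in interior]
    return sum(sum(row) for row in heat_kernel(lap, t))


# Path graph 0-1-2-3 with the two end nodes treated as boundary.
path = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
q = heat_content(path, interior=[1, 2], t=0.5)
```

For this path graph the restricted Laplacian has the all-ones vector as an eigenvector with eigenvalue 1, so the heat content is 2·exp(-t). Evaluating such a curve at several values of t gives a signature that could be compared between two graphs.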
Proportion frequency occurrence count with bat algorithm (FOCBA) for rule optimization and mining of proportion equivalence fuzzy constraint class association rules (PEFCARs)
Fuzzy Class Association Rules (FCARs) play an important role in decision support systems and have thus been extensively studied. Mining the important rules in FCARs is a very difficult task, so the Enhanced Equivalence Fuzzy Class Rule tree (EEFCR-tree) algorithm is proposed in this work. However, a major weakness of the FCARs Miner is that when the number of constrained rules in a given class dominates the total constrained rules, its performance becomes slower than the normal method. To solve this problem, this paper proposes a Proportion of Constraint Class Estimation (PPCE) algorithm for mining Enhanced Proportion Equivalence Fuzzy Constraint Class Association Rules (EPEFCARs) in order to reduce memory usage and run time and to improve accuracy. Then, Proportion Frequency Occurrence count with Bat Algorithm (PFOCBA) is proposed for pruning rules that must satisfy the class constraints. Finally, an efficient algorithm is proposed for mining PEFCARs rules. Experimental results show that the proposed EPEFCR-tree algorithm is more efficient than the Enhanced Equivalence Fuzzy Class Rule tree (EEFCR-tree) and Novel Equivalence Fuzzy Class Rule tree (NECR-tree) Miners. Results are measured in terms of run time, accuracy and memory usage. Experiments show that the proposed method is faster than existing methods.
A modified multi-class association rule for text mining
Classification and association rule mining are significant tasks in data mining. Integrating association rule discovery and classification in data mining brings us an approach known as associative classification. One common shortcoming of existing Association Classifiers is the huge number of rules produced in order to obtain high classification accuracy. This study proposes a Modified Multi-class Association Rule Mining (mMCAR) method that consists of three procedures: rule discovery, rule pruning and group-based class assignment. The rule discovery and rule pruning procedures are designed to reduce the number of classification rules. On the other hand, the group-based class assignment procedure contributes to improving the classification accuracy. Experiments on the structured and unstructured text datasets obtained from the UCI and Reuters repositories are performed in order to evaluate the proposed Association Classifier. The proposed mMCAR classifier is benchmarked against the traditional classifiers and existing Association Classifiers. Experimental results indicate that the proposed Association Classifier, mMCAR, produced high accuracy with a smaller number of classification rules. For the structured dataset, mMCAR achieves an average accuracy of 84.24%, compared with 84.23% for MCAR. Even though the difference in classification accuracy is small, the proposed mMCAR uses only 50 rules for the classification while its benchmark method involves 60 rules. On the other hand, mMCAR is on par with MCAR when the unstructured dataset is used. Both classifiers produce 89% accuracy, but mMCAR uses fewer rules for the classification. This study contributes to the text mining domain, as automatic classification of huge and widely distributed textual data could facilitate the text representation and retrieval processes.
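The structured-dataset comparison reported above can be restated as a short calculation. The sketch below uses only the four figures quoted in the abstract; the variable names are illustrative.

```python
# Figures quoted in the abstract for the structured dataset.
mmcar = {"accuracy": 84.24, "rules": 50}
mcar = {"accuracy": 84.23, "rules": 60}

# Difference in accuracy, in percentage points.
accuracy_gain = round(mmcar["accuracy"] - mcar["accuracy"], 2)

# Relative reduction in the number of classification rules.
rule_reduction = (mcar["rules"] - mmcar["rules"]) / mcar["rules"]
```

On these figures the accuracy difference is 0.01 percentage points while the rule set shrinks by one sixth (about 16.7%), so the reported benefit lies in the smaller rule set and not in the accuracy itself.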