630 research outputs found

    DATA MINING AND IMAGE CLASSIFICATION USING GENETIC PROGRAMMING

    Get PDF
    Genetic programming (GP), a capable machine learning and search method, motivated by Darwinian-evolution, is an evolutionary learning algorithm which automatically evolves computer programs in the form of trees to solve problems. This thesis studies the application of GP for data mining and image processing. Knowledge discovery and data mining have been widely used in business, healthcare, and scientific fields. In data mining, classification is supervised learning that identifies new patterns and maps the data to predefined targets. A GP based classifier is developed in order to perform these mappings. GP has been investigated in a series of studies to classify data; however, there are certain aspects which have not formerly been studied. We propose an optimized GP classifier based on a combination of pruning subtrees and a new fitness function. An orthogonal least squares algorithm is also applied in the training phase to create a robust GP classifier. The proposed GP classifier is validated by 10-fold cross validation. Three areas were studied in this thesis. The first investigation resulted in an optimized genetic-programming-based classifier that directly solves multi-class classification problems. Instead of defining static thresholds as boundaries to differentiate between multiple labels, our work presents a method of classification where a GP system learns the relationships among experiential data and models them mathematically during the evolutionary process. Our approach has been assessed on six multiclass datasets. The second investigation was to develop a GP classifier to segment and detect brain tumors on magnetic resonance imaging (MRI) images. The findings indicated the high accuracy of brain tumor classification provided by our GP classifier. The results confirm the strong ability of the developed technique for complicated image classification problems. The third was to develop a hybrid system for multiclass imbalanced data classification using GP and SMOTE which was tested on satellite images. The finding showed that the proposed approach improves both training and test results when the SMOTE technique is incorporated. We compared our approach in terms of speed with previous GP algorithms as well. The analyzed results illustrate that the developed classifier produces a productive and rapid method for classification tasks that outperforms the previous methods for more challenging multiclass classification problems. We tested the approaches presented in this thesis on publicly available datasets, and images. The findings were statistically tested to conclude the robustness of the developed approaches

    Genetic Programming for Object Detection : a Two-Phase Approach with an Improved Fitness Function

    Get PDF
    This paper describes two innovations that improve the efficiency and effectiveness of a genetic programming approach to object detection problems. The approach uses genetic programming to construct object detection programs that are applied, in a moving window fashion, to the large images to locate the objects of interest. The first innovation is to break the GP search into two phases with the first phase applied to a selected subset of the training data, and a simplified fitness function. The second phase is initialised with the programs from the first phase, and uses the full set of training data with a complete fitness function to construct the final detection programs. The second innovation is to add a program size component to the fitness function. This approach is examined and compared with a neural network approach on three object detection problems of increasing difficulty. The results suggest that the innovations increase both the effectiveness and the efficiency of the genetic programming search, and also that the genetic programming approach outperforms a neural network approach for the most difficult data set in terms of the object detection accuracy

    GPU acceleration of object classification algorithms using NVIDIA CUDA

    Get PDF
    The field of computer vision has become an important part of today\u27s society, supporting crucial applications in the medical, manufacturing, military intelligence and surveillance domains. Many computer vision tasks can be divided into fundamental steps: image acquisition, pre-processing, feature extraction, detection or segmentation, and high-level processing. This work focuses on classification and object detection, specifically k-Nearest Neighbors, Support Vector Machine classification, and Viola & Jones object detection. Object detection and classification algorithms are computationally intensive, which makes it difficult to perform classification tasks in real-time. This thesis aims in overcoming the processing limitations of the above classification algorithms by offloading computation to the graphics processing unit (GPU) using NVIDIA\u27s Compute Unified Device Architecture (CUDA). The primary focus of this work is the implementation of the Viola and Jones object detector in CUDA. A multi-GPU implementation provides a speedup ranging from 1x to 6.5x over optimized OpenCV code for image sizes of 300 x 300 pixels up to 2900 x 1600 pixels while having comparable detection results. The second part of this thesis is the implementation of a multi-GPU multi-class SVM classifier. The classifier had the same accuracy as an identical implementation using LIBSVM with a speedup ranging from 89x to 263x on the tested datasets. The final part of this thesis was the extension of a previous CUDA k-Nearest Neighbor implementation by exploiting additional levels of parallelism. These extensions provided a speedup of 1.24x and 2.35x over the previous CUDA implementation. As an end result of this work, a library of these three CUDA classifiers has been compiled for use by future researchers

    An Information Theoretic Approach For Feature Selection And Segmentation In Posterior Fossa Tumors

    Get PDF
    Posterior Fossa (PF) is a type of brain tumor located in or near brain stem and cerebellum. About 55% - 70 % pediatric brain tumors arise in the posterior fossa, compared with only 15% - 20% of adult tumors. For segmenting PF tumors we should have features to study the characteristics of tumors. In literature, different types of texture features such as Fractal Dimension (FD) and Multifractional Brownian Motion (mBm) have been exploited for measuring randomness associated with brain and tumor tissues structures, and the varying appearance of tissues in magnetic resonance images (MRI). For selecting best features techniques such as neural network and boosting methods have been exploited. However, neural network cannot descirbe about the properties of texture features. We explore methods such as information theroetic methods which can perform feature selection based on properties of texture features. The primary contribution of this dissertation is investigating efficacy of different image features such as intensity, fractal texture, and level - set shape in segmentation of PF tumor for pediatric patients. We explore effectiveness of using four different feature selection and three different segmentation techniques respectively to discriminate tumor regions from normal tissue in multimodal brain MRI. Our research suggest that Kullback - Leibler Divergence (KLD) measure for feature ranking and selection and Expectation Maximization (EM) algorithm for feature fusion and tumor segmentation offer the best performance for the patient data in this study. To improve segmentation accuracy, we need to consider abnormalities such as cyst, edema and necrosis which surround tumors. In this work, we exploit features which describe properties of cyst and technique which can be used to segment it. To achieve this goal, we extend the two class KLD techniques to multiclass feature selection techniques, so that we can effectively select features for tumor, cyst and non tumor tissues. We compute segemntation accuracy by computing number of pixels segemented to total number of pixels for the best features. For automated process we integrate the inhomoheneity correction, feature selection using KLD and segmentation in an integrated EM framework. To validate results we have used similarity coefficients for computing the robustness of segmented tumor and cyst

    Parallelizing support vector machines for scalable image annotation

    Get PDF
    Machine learning techniques have facilitated image retrieval by automatically classifying and annotating images with keywords. Among them Support Vector Machines (SVMs) are used extensively due to their generalization properties. However, SVM training is notably a computationally intensive process especially when the training dataset is large. In this thesis distributed computing paradigms have been investigated to speed up SVM training, by partitioning a large training dataset into small data chunks and process each chunk in parallel utilizing the resources of a cluster of computers. A resource aware parallel SVM algorithm is introduced for large scale image annotation in parallel using a cluster of computers. A genetic algorithm based load balancing scheme is designed to optimize the performance of the algorithm in heterogeneous computing environments. SVM was initially designed for binary classifications. However, most classification problems arising in domains such as image annotation usually involve more than two classes. A resource aware parallel multiclass SVM algorithm for large scale image annotation in parallel using a cluster of computers is introduced. The combination of classifiers leads to substantial reduction of classification error in a wide range of applications. Among them SVM ensembles with bagging is shown to outperform a single SVM in terms of classification accuracy. However, SVM ensembles training are notably a computationally intensive process especially when the number replicated samples based on bootstrapping is large. A distributed SVM ensemble algorithm for image annotation is introduced which re-samples the training data based on bootstrapping and training SVM on each sample in parallel using a cluster of computers. The above algorithms are evaluated in both experimental and simulation environments showing that the distributed SVM algorithm, distributed multiclass SVM algorithm, and distributed SVM ensemble algorithm, reduces the training time significantly while maintaining a high level of accuracy in classifications.EThOS - Electronic Theses Online ServiceGBUnited Kingdo

    A Survey on Evolutionary Computation for Computer Vision and Image Analysis: Past, Present, and Future Trends

    Get PDF
    Computer vision (CV) is a big and important field in artificial intelligence covering a wide range of applications. Image analysis is a major task in CV aiming to extract, analyse and understand the visual content of images. However, imagerelated tasks are very challenging due to many factors, e.g., high variations across images, high dimensionality, domain expertise requirement, and image distortions. Evolutionary computation (EC) approaches have been widely used for image analysis with significant achievement. However, there is no comprehensive survey of existing EC approaches to image analysis. To fill this gap, this paper provides a comprehensive survey covering all essential EC approaches to important image analysis tasks including edge detection, image segmentation, image feature analysis, image classification, object detection, and others. This survey aims to provide a better understanding of evolutionary computer vision (ECV) by discussing the contributions of different approaches and exploring how and why EC is used for CV and image analysis. The applications, challenges, issues, and trends associated to this research field are also discussed and summarised to provide further guidelines and opportunities for future research

    Learning spatio-temporal representations for action recognition: A genetic programming approach

    Get PDF
    Extracting discriminative and robust features from video sequences is the first and most critical step in human action recognition. In this paper, instead of using handcrafted features, we automatically learn spatio-temporal motion features for action recognition. This is achieved via an evolutionary method, i.e., genetic programming (GP), which evolves the motion feature descriptor on a population of primitive 3D operators (e.g., 3D-Gabor and wavelet). In this way, the scale and shift invariant features can be effectively extracted from both color and optical flow sequences. We intend to learn data adaptive descriptors for different datasets with multiple layers, which makes fully use of the knowledge to mimic the physical structure of the human visual cortex for action recognition and simultaneously reduce the GP searching space to effectively accelerate the convergence of optimal solutions. In our evolutionary architecture, the average cross-validation classification error, which is calculated by an support-vector-machine classifier on the training set, is adopted as the evaluation criterion for the GP fitness function. After the entire evolution procedure finishes, the best-so-far solution selected by GP is regarded as the (near-)optimal action descriptor obtained. The GP-evolving feature extraction method is evaluated on four popular action datasets, namely KTH, HMDB51, UCF YouTube, and Hollywood2. Experimental results show that our method significantly outperforms other types of features, either hand-designed or machine-learned
    • …
    corecore