1,204 research outputs found

    Computational models and approaches for lung cancer diagnosis

    The success of treatment for patients with cancer depends on establishing an accurate diagnosis. To this end, the aim of this study is to develop novel lung cancer diagnostic models. New algorithms are proposed to analyse the biological data and extract knowledge that assists in achieving accurate diagnosis results.

    Studying elements of genetic programming for multiclass classification

    Master's thesis, Informatics Engineering (Interaction and Knowledge), Universidade de Lisboa, Faculdade de Ciências, 2018. Although Genetic Programming (GP) has been very successful in both symbolic regression and binary classification, solving many difficult problems from various domains, it requires improvements in multiclass classification, which, owing to the high complexity of this kind of problem, requires specialized classifiers. In this project, we explored a GP-based multiclass classification algorithm, M3GP [4]. The individuals in standard GP have only one node at their root, which means that their output space is R. Unlike standard GP, M3GP allows each individual to have n nodes at its root. This variation changes the output space to R^n, allowing individuals to construct clusters of samples and use cluster-based classification. Although M3GP is capable of creating interpretable models while achieving results competitive with state-of-the-art classifiers, such as Random Forests and Neural Networks, it has downsides. The focus of this project is to improve the algorithm by exploring two components: the fitness function and the selection method for the genetic operators. The original fitness function was accuracy-based. Since this kind of function does not allow a smooth evolution of the output space, we tried to improve the algorithm by exploring two distance-based fitness functions in an attempt to separate the clusters while bringing the samples closer to their respective centroids. Until now, the genetic operators in M3GP were selected with a fixed probability. Since some operators have a better effect on the fitness at different stages of the evolution, fixed probabilities allow operators to be selected at the wrong stages, slowing down the learning process. In this project, we try to evolve the probability that each genetic operator has of being chosen over the generations. At a later stage, we proposed a new crossover genetic operator for the M3GP algorithm that uses three individuals. The results show significantly better performance on the training set in half the datasets, and improved test accuracy in two datasets.
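The cluster-based classification described in this abstract can be illustrated with a minimal sketch. It assumes Euclidean distance to class centroids, which is a simplification chosen here for brevity and not necessarily the distance used by M3GP. The names `transform`, `class_centroids`, `predict`, `accuracy` and `mean_distance_to_own_centroid` are hypothetical, and the last function is only one illustrative distance-based measure, not one of the two fitness functions studied in the thesis.

```python
import math

def transform(sample, root_nodes):
    """Map one sample to R^n by evaluating each of the n root expressions."""
    return [f(sample) for f in root_nodes]

def class_centroids(points, labels):
    """Mean of the transformed points belonging to each class."""
    sums, counts = {}, {}
    for point, label in zip(points, labels):
        acc = sums.setdefault(label, [0.0] * len(point))
        for i, value in enumerate(point):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict(point, cents):
    """Assign the class whose centroid is nearest (Euclidean; an assumption)."""
    return min(cents, key=lambda label: math.dist(point, cents[label]))

def accuracy(points, labels, cents):
    """Accuracy-based fitness: fraction of samples assigned to their own class."""
    hits = sum(predict(p, cents) == y for p, y in zip(points, labels))
    return hits / len(points)

def mean_distance_to_own_centroid(points, labels, cents):
    """Illustrative distance-based measure (lower is better); hypothetical."""
    total = sum(math.dist(p, cents[y]) for p, y in zip(points, labels))
    return total / len(points)

# A toy individual with n = 2 root nodes (hypothetical expressions).
root_nodes = [lambda x: x[0] + x[1], lambda x: x[0] * x[1]]
samples = [(0, 0), (0, 1), (5, 5), (6, 5)]
labels = ["a", "a", "b", "b"]

points = [transform(s, root_nodes) for s in samples]
cents = class_centroids(points, labels)
train_accuracy = accuracy(points, labels, cents)
new_label = predict(transform((1, 1), root_nodes), cents)
```

Unlike accuracy, the distance-based measure changes continuously as the transformed points move, which is the kind of smoothness the abstract says an accuracy-based fitness lacks.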

    Hybrid ACO and SVM algorithm for pattern classification

    Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach that originated from statistical approaches. However, SVM suffers from two main problems: feature subset selection and parameter tuning. Most approaches to tuning SVM parameters discretize the continuous values of the parameters, which has a negative effect on classification performance. This study presents four algorithms for tuning the SVM parameters and selecting the feature subset, which improved SVM classification accuracy with a smaller feature subset. This is achieved by performing the SVM parameter tuning and feature subset selection processes simultaneously. Hybrid algorithms combining ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters, while the second two algorithms, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from the University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. Experimental results obtained from the proposed algorithms are better than those of other approaches in terms of classification accuracy and size of the feature subset. The average classification accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1% respectively. The average size of the feature subset is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes to a new direction for ACO that can deal with continuous and mixed-variable problems.

    A hybrid neural network and genetic programming approach to the automatic construction of computer vision systems

    Both genetic programming and neural networks are machine learning techniques that have had a wide range of success in the world of computer vision. Recently, neural networks have been able to achieve excellent results on problems that even just ten years ago would have been considered intractable, especially in the area of image classification. Additionally, genetic programming has been shown to be capable of evolving computer vision programs that classify objects in images using conventional computer vision operators. While genetic algorithms have been used to evolve neural network structures and tune the hyperparameters of said networks, this thesis explores an alternative combination of these two techniques. The author asks whether integrating trained neural networks with genetic programming, by framing said networks as components for a computer vision system evolver, would increase the final classification accuracy of the evolved classifier. The author also asks whether, if so, such a system can learn to assemble multiple simple neural networks to solve a complex problem. No claims are made to having discovered a new state-of-the-art method for classification. Instead, the main focus of this research was to learn whether it is possible to combine these two techniques in this manner. The results presented from this research indicate that such a combination does improve accuracy compared to a vision system evolved without the use of these networks.

    On learning and visualizing lexicographic preference trees

    Preferences are very important in research fields such as decision making, recommender systems and marketing. The focus of this thesis is on preferences over combinatorial domains, which are domains of objects configured with categorical attributes. For example, the domain of cars includes car objects that are constructed with values for attributes such as ‘make’, ‘year’, ‘model’, ‘color’, ‘body type’ and ‘transmission’. Different values can instantiate an attribute. For instance, values for attribute ‘make’ can be Honda, Toyota, Tesla or BMW, and attribute ‘transmission’ can have automatic or manual. To this end, this thesis studies problems of preference visualization and learning for lexicographic preference trees, graphical preference models that are often compact over complex domains of objects built of categorical attributes. Visualizing preferences is essential to provide users with insights into the process of decision making, while learning preferences from data is practically important, as it is ineffective to elicit preference models directly from users. The results obtained in this thesis fall into two parts: 1) for preference visualization, a web-based system is created that visualizes various types of lexicographic preference tree models learned by a greedy learning algorithm; 2) for preference learning, a genetic algorithm, called GA, is designed and implemented that learns a restricted type of lexicographic preference tree, called unconditional importance and unconditional preference trees, or UIUP trees for short. Experiments show that GA achieves higher accuracy than the greedy algorithm at the cost of more computational time. Moreover, a Dynamic Programming Algorithm (DPA) was devised and implemented that computes an optimal UIUP tree model in the sense that it satisfies as many examples as possible in the dataset. This novel exact algorithm (DPA) was used to evaluate the quality of models computed by GA, and it was found to reduce the factorial time complexity of the brute-force algorithm to exponential. The major contribution of this thesis to the field of machine learning and data mining is the novel learning algorithm (DPA), which is an exact algorithm. DPA searches the huge space of UIUP tree models and finds the best one, that is, the model that accurately classifies the largest number of examples in the training dataset; such a model is referred to as the optimal model in this thesis. Finally, using datasets produced from randomly generated UIUP trees, this thesis presents experimental results on the performance (e.g., accuracy and computational time) of GA compared to the existing greedy algorithm and DPA.
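As a rough illustration of how a UIUP model compares two objects, the sketch below assumes the model is given by one fixed importance order over attributes and one fixed preference order over the values of each attribute. The function names, the toy car data and the brute-force search are hypothetical; the search enumerates only importance orders while holding the value orders fixed, so it is a simplified stand-in for the factorial brute-force baseline and is not the thesis's GA or DPA.

```python
from itertools import permutations

def prefers(a, b, importance, value_order):
    """True if object a is strictly preferred to b: the most important
    attribute on which they differ decides the comparison."""
    for attr in importance:
        if a[attr] != b[attr]:
            order = value_order[attr]
            return order.index(a[attr]) < order.index(b[attr])
    return False  # identical on every attribute: no strict preference

def satisfied(examples, importance, value_order):
    """Number of examples (a, b), read as 'a is preferred to b', that the model agrees with."""
    return sum(prefers(a, b, importance, value_order) for a, b in examples)

def best_importance_order(examples, attributes, value_order):
    """Try every attribute ordering (factorial in the number of attributes)
    and keep the one satisfying the most examples. Value orders stay fixed."""
    best = max(permutations(attributes),
               key=lambda order: satisfied(examples, order, value_order))
    return list(best), satisfied(examples, best, value_order)

# Toy data (hypothetical): most preferred value listed first.
value_order = {
    "make": ["Tesla", "Honda"],
    "transmission": ["automatic", "manual"],
}
honda_auto = {"make": "Honda", "transmission": "automatic"}
tesla_manual = {"make": "Tesla", "transmission": "manual"}
tesla_auto = {"make": "Tesla", "transmission": "automatic"}

# Each pair means the first car is preferred to the second.
examples = [(honda_auto, tesla_manual), (tesla_auto, tesla_manual)]

order, score = best_importance_order(examples, ["make", "transmission"], value_order)
```

In this toy case, putting ‘transmission’ before ‘make’ satisfies both examples, whereas the reverse order satisfies only one.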