5,162 research outputs found
Data fusion by using machine learning and computational intelligence techniques for medical image analysis and classification
Data fusion is the process of integrating information from multiple sources to produce specific, comprehensive, unified data about an entity. Data fusion is categorized as low level, feature level and decision level. This research is focused on both investigating and developing feature- and decision-level data fusion for automated image analysis and classification. The common procedure for solving these problems can be described as: 1) process image for region of interest\u27 detection, 2) extract features from the region of interest and 3) create learning model based on the feature data. Image processing techniques were performed using edge detection, a histogram threshold and a color drop algorithm to determine the region of interest. The extracted features were low-level features, including textual, color and symmetrical features. For image analysis and classification, feature- and decision-level data fusion techniques are investigated for model learning using and integrating computational intelligence and machine learning techniques. These techniques include artificial neural networks, evolutionary algorithms, particle swarm optimization, decision tree, clustering algorithms, fuzzy logic inference, and voting algorithms. This work presents both the investigation and development of data fusion techniques for the application areas of dermoscopy skin lesion discrimination, content-based image retrieval, and graphic image type classification --Abstract, page v
Hybrid ACO and SVM algorithm for pattern classification
Ant Colony Optimization (ACO) is a metaheuristic algorithm that can be used to
solve a variety of combinatorial optimization problems. A new direction for ACO is to optimize continuous and mixed (discrete and continuous) variables. Support Vector Machine (SVM) is a pattern classification approach originated from statistical approaches. However, SVM suffers two main problems which include feature subset selection and parameter tuning. Most approaches related to tuning SVM parameters discretize the continuous value of the parameters which will give a negative effect on the classification performance. This study presents four algorithms for tuning the
SVM parameters and selecting feature subset which improved SVM classification accuracy with smaller size of feature subset. This is achieved by performing the SVM parameters’ tuning and feature subset selection processes simultaneously. Hybridization algorithms between ACO and SVM techniques were proposed. The first two algorithms, ACOR-SVM and IACOR-SVM, tune the SVM parameters while
the second two algorithms, ACOMV-R-SVM and IACOMV-R-SVM, tune the SVM parameters and select the feature subset simultaneously. Ten benchmark datasets from University of California, Irvine, were used in the experiments to validate the performance of the proposed algorithms. Experimental results obtained from the proposed algorithms are better when compared with other approaches in terms of classification accuracy and size of the feature subset. The average classification
accuracies for the ACOR-SVM, IACOR-SVM, ACOMV-R and IACOMV-R algorithms are 94.73%, 95.86%, 97.37% and 98.1% respectively. The average size of feature subset is eight for the ACOR-SVM and IACOR-SVM algorithms and four for the ACOMV-R and IACOMV-R algorithms. This study contributes to a new direction for ACO that can deal with continuous and mixed-variable ACO
One-Class Classification: Taxonomy of Study and Review of Techniques
One-class classification (OCC) algorithms aim to build classification models
when the negative class is either absent, poorly sampled or not well defined.
This unique situation constrains the learning of efficient classifiers by
defining class boundary just with the knowledge of positive class. The OCC
problem has been considered and applied under many research themes, such as
outlier/novelty detection and concept learning. In this paper we present a
unified view of the general problem of OCC by presenting a taxonomy of study
for OCC problems, which is based on the availability of training data,
algorithms used and the application domains applied. We further delve into each
of the categories of the proposed taxonomy and present a comprehensive
literature review of the OCC algorithms, techniques and methodologies with a
focus on their significance, limitations and applications. We conclude our
paper by discussing some open research problems in the field of OCC and present
our vision for future research.Comment: 24 pages + 11 pages of references, 8 figure
Topic Models Conditioned on Arbitrary Features with Dirichlet-multinomial Regression
Although fully generative models have been successfully used to model the
contents of text documents, they are often awkward to apply to combinations of
text data and document metadata. In this paper we propose a
Dirichlet-multinomial regression (DMR) topic model that includes a log-linear
prior on document-topic distributions that is a function of observed features
of the document, such as author, publication venue, references, and dates. We
show that by selecting appropriate features, DMR topic models can meet or
exceed the performance of several previously published topic models designed
for specific data.Comment: Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty
in Artificial Intelligence (UAI2008
Effect of nano black rice husk ash on the chemical and physical properties of porous concrete pavement
Black rice husk is a waste from this agriculture industry. It has been found that majority inorganic element in rice husk is silica. In this study, the effect of Nano from black rice husk ash (BRHA) on the chemical and physical properties of concrete pavement was investigated. The BRHA produced from uncontrolled burning at rice factory was taken. It was then been ground using laboratory mill with steel balls and steel rods. Four different grinding grades of BRHA were examined. A rice husk ash dosage of 10% by weight of binder was used throughout the experiments. The chemical and physical properties of the Nano BRHA mixtures were evaluated using fineness test, X-ray Fluorescence spectrometer (XRF) and X-ray diffraction (XRD). In addition, the compressive strength test was used to evaluate the performance of porous concrete pavement. Generally, the results show that the optimum grinding time was 63 hours. The result also indicated that the use of Nano black rice husk ash ground for 63hours produced concrete with good strengt
New techniques for Arabic document classification
Text classification (TC) concerns automatically assigning a class (category) label to
a text document, and has increasingly many applications, particularly in the domain
of organizing, for browsing in large document collections. It is typically achieved
via machine learning, where a model is built on the basis of a typically large collection
of document features. Feature selection is critical in this process, since there
are typically several thousand potential features (distinct words or terms). In text
classification, feature selection aims to improve the computational e ciency and
classification accuracy by removing irrelevant and redundant terms (features), while
retaining features (words) that contain su cient information that help with the
classification task.
This thesis proposes binary particle swarm optimization (BPSO) hybridized with
either K Nearest Neighbour (KNN) or Support Vector Machines (SVM) for feature
selection in Arabic text classi cation tasks. Comparison between feature selection
approaches is done on the basis of using the selected features in conjunction with
SVM, Decision Trees (C4.5), and Naive Bayes (NB), to classify a hold out test
set. Using publically available Arabic datasets, results show that BPSO/KNN and
BPSO/SVM techniques are promising in this domain. The sets of selected features
(words) are also analyzed to consider the di erences between the types of features
that BPSO/KNN and BPSO/SVM tend to choose. This leads to speculation concerning
the appropriate feature selection strategy, based on the relationship between
the classes in the document categorization task at hand.
The thesis also investigates the use of statistically extracted phrases of length
two as terms in Arabic text classi cation. In comparison with Bag of Words text
representation, results show that using phrases alone as terms in Arabic TC task
decreases the classification accuracy of Arabic TC classifiers significantly while combining
bag of words and phrase based representations may increase the classification
accuracy of the SVM classifier slightly
- …