An Improvement Study of the Decomposition-based Algorithm Global WASF-GA for Evolutionary Multiobjective Optimization
The convergence and the diversity of the decomposition-based evolutionary algorithm Global WASF-GA (GWASF-GA) rely on a set of weight vectors that determine the search directions for new non-dominated solutions in the objective space. Although using weight vectors whose search directions are widely distributed may lead to a well-diversified approximation of the Pareto front (PF), this may not be enough to obtain a good approximation for complicated PFs (discontinuous, non-convex, etc.). Thus, we propose to dynamically adjust the weight vectors once GWASF-GA has been run for a certain number of generations. This adjustment is aimed at re-calculating some of the weight vectors, so that search directions pointing to overcrowded regions of the PF are redirected toward parts with a lack of solutions that may be hard to approximate. We test different parameter settings of the dynamic adjustment in optimization problems with three, five, and six objectives, concluding that GWASF-GA performs better when adjusting the weight vectors dynamically than without applying the adjustment.
Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech
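As a rough illustration of this kind of adjustment, the sketch below re-generates the most crowded weight vectors near the directions that currently attract the fewest solutions. It is a minimal sketch only, not the paper's exact rule: the cosine-similarity association, the perturbation size, and the name adjust_weights are assumptions.

```python
import numpy as np

def adjust_weights(weights, solutions, frac=0.2, rng=None):
    """Illustrative re-assignment of weight vectors (not GWASF-GA's exact rule).

    weights   : (W, m) array of weight vectors on the unit simplex
    solutions : (S, m) array of current non-dominated objective vectors
    frac      : fraction of the most crowded weight vectors to re-generate
    """
    rng = np.random.default_rng(rng)
    # Associate every solution with its closest weight vector (cosine similarity).
    w_norm = weights / np.linalg.norm(weights, axis=1, keepdims=True)
    s_norm = solutions / np.linalg.norm(solutions, axis=1, keepdims=True)
    assoc = np.argmax(s_norm @ w_norm.T, axis=1)
    counts = np.bincount(assoc, minlength=len(weights))

    # Weight vectors attracting many solutions point to overcrowded regions.
    n_move = max(1, int(frac * len(weights)))
    crowded = np.argsort(counts)[-n_move:]
    sparse = np.argsort(counts)[:n_move]

    # Redirect crowded weights toward sparse regions, with a small perturbation.
    new = np.abs(weights[sparse] + rng.normal(0.0, 0.05, size=(n_move, weights.shape[1])))
    weights = weights.copy()
    weights[crowded] = new / new.sum(axis=1, keepdims=True)
    return weights
```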
MACOC: a medoid-based ACO clustering algorithm
The application of ACO-based algorithms in data mining has grown over the last few years, and several supervised and unsupervised learning algorithms have been developed using this bio-inspired approach. Most recent works concerning unsupervised learning have focused on clustering, showing the great potential of ACO-based techniques. This work presents an ACO-based clustering algorithm inspired by the ACO Clustering (ACOC) algorithm. The proposed approach restructures ACOC from a centroid-based technique into a medoid-based technique, where the properties of the search space are not necessarily known. Instead, it relies only on the information about the distances amongst data. The new algorithm, called MACOC, has been compared against well-known algorithms (K-means and Partition Around Medoids) and with ACOC. The experiments measure the accuracy of the algorithm on both synthetic datasets and real-world datasets extracted from the UCI Machine Learning Repository.
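To make the centroid-versus-medoid distinction concrete, here is a minimal sketch (a plain medoid update with assumed names, not MACOC's ant-based construction) showing that a medoid step needs nothing beyond the pairwise distance matrix:

```python
import numpy as np

def update_medoids(dist, labels, k):
    """For each cluster, pick the member that minimises the sum of distances
    to the other members. Only the pairwise distance matrix `dist` is needed,
    which is why a medoid-based variant works when coordinates (and hence
    centroids) are unavailable."""
    medoids = np.empty(k, dtype=int)
    for c in range(k):
        members = np.flatnonzero(labels == c)
        within = dist[np.ix_(members, members)]
        medoids[c] = members[within.sum(axis=1).argmin()]
    return medoids
```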
A rigorous evaluation of crossover and mutation in genetic programming
The role of crossover and mutation in Genetic Programming (GP) has been the subject of much debate since the emergence of the field. In this paper, we contribute new empirical evidence to this argument using a rigorous and principled experimental method applied to six problems common in the GP literature. The approach tunes the algorithm parameters to enable a fair and objective comparison of two different GP algorithms, the first using a combination of crossover and reproduction, and the second using a combination of mutation and reproduction. We find that crossover does not significantly outperform mutation on most of the problems examined. In addition, we demonstrate that the use of a straightforward Design of Experiments methodology is effective at tuning GP algorithm parameters.
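A Design-of-Experiments style comparison can be sketched as a full-factorial sweep over parameter levels for each operator setting, followed by a rank-based test. Everything below is illustrative: run_gp is a hypothetical stand-in for an actual GP run, the factor levels are invented, and the Mann-Whitney test is one possible choice rather than the test used in the paper.

```python
import itertools
import random
from scipy.stats import mannwhitneyu

def run_gp(problem, operator, pop_size, generations, rate, seed):
    """Placeholder for a single GP run; a real implementation would evolve a
    population on `problem` with the given operator and return the best fitness."""
    random.seed(hash((problem, operator, pop_size, generations, rate, seed)))
    return random.random()

def factorial_comparison(problem, seeds=range(30)):
    """Full-factorial sweep over tuning levels for each operator setting,
    then a rank-based test between the best configurations found."""
    levels = {"pop_size": [200, 500], "generations": [50, 100], "rate": [0.7, 0.9]}
    results = {}
    for op in ("crossover", "mutation"):
        best = None
        for combo in itertools.product(*levels.values()):
            params = dict(zip(levels, combo))
            scores = [run_gp(problem, op, seed=s, **params) for s in seeds]
            if best is None or sum(scores) > sum(best):
                best = scores
        results[op] = best
    _, p_value = mannwhitneyu(results["crossover"], results["mutation"])
    return results, p_value
```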
Searching for dark clouds in the outer galactic plane I -- A statistical approach for identifying extended red(dened) regions in 2MASS
[Abridged] Though the exact role of infrared dark clouds in the star formation process is still somewhat unclear, they seem to provide useful laboratories to study the very early stages of clustered star formation. Infrared dark clouds have been identified predominantly toward the bright inner parts of the galactic plane. The low background emission makes it more difficult to identify similar objects in mid-infrared absorption in the outer parts. This is unfortunate, because the outer Galaxy represents the only nearby region where we can study the effects of different (external) conditions on the star formation process. The aim of this paper is to identify extended red regions in the outer galactic plane based on the reddening of stars in the near-infrared. We argue that these regions appear reddened mainly due to extinction caused by molecular clouds and young stellar objects. The work presented here is used as a basis for identifying star forming regions, and in particular their very early stages. We use the Mann-Whitney U-test, in combination with a friends-of-friends algorithm, to identify extended reddened regions in the 2MASS all-sky JHK survey. We process the data on a regular grid using two different resolutions, 60" and 90". The two resolutions have been chosen because the stellar surface density varies between the crowded spiral arm regions and the sparsely populated galactic anti-center region. We identify 1320 extended red regions in the higher resolution run and 1589 in the lower resolution run. The majority of regions are associated with major molecular cloud complexes, supporting our hypothesis that the reddening is mostly due to foreground clouds and embedded objects.
Comment: Accepted for publication in A&A -- 9 pages, 5 figures (+ on-line only tables)
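The statistical core of such an approach can be sketched as follows. This is only an illustration: the grid handling and the friends-of-friends merging of flagged cells are omitted, and the function name, the choice of H-K colour, the five-star minimum, and the significance level are assumptions.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def flag_red_cells(cell_colors, reference_colors, alpha=0.05):
    """Flag grid cells whose (H-K) colour distribution is shifted redward of a
    reference field, using a one-sided Mann-Whitney U-test per cell.

    cell_colors      : list of 1-D arrays, stellar H-K colours per grid cell
    reference_colors : 1-D array of H-K colours from a nearby control field
    """
    flags = np.zeros(len(cell_colors), dtype=bool)
    for i, colors in enumerate(cell_colors):
        if len(colors) < 5:          # too few stars for a meaningful test
            continue
        _, p = mannwhitneyu(colors, reference_colors, alternative="greater")
        flags[i] = p < alpha
    return flags
```

A friends-of-friends step would then merge adjacent flagged cells into extended red regions.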
Studies of photoredox reactions on nanosize semiconductors
Light induced electron transfer (ET) from nanosize semiconductors of MoS₂ to organic electron acceptors such as 2,2'-bipyridine (bpy) and the methyl substituted 4,4',5,5'-tetramethyl-2,2'-bipyridine (tmb) was studied by static and time resolved photoluminescence spectroscopy. The kinetics of ET were varied by changing the nanocluster size (the band gap), the electron acceptor, and the polarity of the solvent. MoS₂ is an especially interesting semiconductor material as it is an indirect semiconductor in bulk form, and has a layered covalent bonding arrangement which is highly resistant to photocorrosion.
A Layer-Wise Information Reinforcement Approach to Improve Learning in Deep Belief Networks
With the advent of deep learning, the number of works proposing new methods or improving existing ones has grown exponentially in recent years. In this scenario, "very deep" models emerged, since they were expected to extract more intrinsic and abstract features while supporting better performance. However, such models suffer from the vanishing gradient problem, i.e., backpropagated values become too close to zero in their shallower layers, ultimately causing learning to stagnate. This issue was overcome in the context of convolutional neural networks by creating "shortcut connections" between layers, in a so-called deep residual learning framework. Nonetheless, a very popular deep learning technique called the Deep Belief Network still suffers from vanishing gradients when dealing with discriminative tasks. Therefore, this paper proposes the Residual Deep Belief Network, which applies layer-by-layer information reinforcement to improve feature extraction and knowledge retention, thereby supporting better discriminative performance. Experiments conducted over three public datasets demonstrate its robustness on the task of binary image classification.
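For intuition, the shortcut idea the paper builds on can be sketched in a few lines. The equal layer widths and the additive shortcut are assumptions made for the illustration; this shows residual-style information reinforcement in general, not the authors' exact DBN formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward_with_reinforcement(x, weights, biases):
    """Forward pass through a stack of sigmoid layers where each hidden
    representation is reinforced by the previous one via an additive shortcut,
    so earlier information is kept alive in deeper layers."""
    h = x
    for W, b in zip(weights, biases):
        h = sigmoid(h @ W + b) + h   # shortcut connection
    return h

# Toy usage: three 64-unit layers on a batch of 8 inputs.
rng = np.random.default_rng(0)
weights = [rng.normal(0, 0.1, (64, 64)) for _ in range(3)]
biases = [np.zeros(64) for _ in range(3)]
out = forward_with_reinforcement(rng.normal(size=(8, 64)), weights, biases)
```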
Determining appropriate approaches for using data in feature selection
Feature selection is increasingly important in data analysis and machine learning in the big data era. However, how to use the data in feature selection, i.e. using either ALL or PART of a dataset, has become a serious and tricky issue. Whilst the conventional practice of using all the data in feature selection may lead to selection bias, using part of the data may, on the other hand, lead to underestimating the relevant features under some conditions. This paper investigates these two strategies systematically in terms of reliability and effectiveness, and then determines their suitability for datasets with different characteristics. The reliability is measured by the Average Tanimoto Index and the Inter-method Average Tanimoto Index, and the effectiveness is measured by the mean generalisation accuracy of classification. The computational experiments are carried out on ten real-world benchmark datasets and fourteen synthetic datasets. The synthetic datasets are generated with a pre-set number of relevant features and varied numbers of irrelevant features and instances, and with different levels of added noise. The results indicate that the PART approach is more effective in reducing the bias when the size of a dataset is small, but starts to lose its advantage as the dataset size increases.
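One common formulation of such a Tanimoto-based reliability measure (assumed here; the paper's exact definition of the Average Tanimoto Index may differ in detail) averages the pairwise Tanimoto/Jaccard overlap between the feature subsets selected in repeated runs:

```python
from itertools import combinations

def tanimoto(a, b):
    """Tanimoto (Jaccard) index between two feature subsets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 1.0

def average_tanimoto_index(subsets):
    """Mean pairwise Tanimoto index over the feature subsets selected in
    repeated runs; values near 1 indicate a stable (reliable) selector."""
    pairs = list(combinations(subsets, 2))
    return sum(tanimoto(a, b) for a, b in pairs) / len(pairs)

# Example: subsets selected by the same method on three resamples.
print(average_tanimoto_index([{0, 2, 5}, {0, 2, 7}, {0, 2, 5, 7}]))
```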
Estimating the F1 score for learning from positive and unlabeled examples
Semi-supervised learning can be applied to datasets that contain both labeled and unlabeled instances, and can result in more accurate predictions compared to fully supervised or unsupervised learning when only limited labeled data is available. A subclass of problems, called Positive-Unlabeled (PU) learning, focuses on cases in which the labeled instances are all positive.
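With only positive and unlabeled examples, precision (and hence F1) cannot be computed directly. A widely used surrogate, due to Lee and Liu, is recall² / Pr(ŷ = 1), which is proportional to precision × recall under the assumption that the labeled positives are representative of all positives. The sketch below shows that surrogate as a general illustration; it is not necessarily the estimator proposed in this particular paper.

```python
import numpy as np

def pu_f1_surrogate(y_pred_labeled_pos, y_pred_unlabeled):
    """Lee & Liu criterion recall**2 / Pr(y_hat = 1), computable from positive
    and unlabeled predictions only; it is proportional to precision * recall,
    so it ranks classifiers similarly to the F1 score.

    y_pred_labeled_pos : 0/1 predictions on the labeled positive examples
    y_pred_unlabeled   : 0/1 predictions on the unlabeled examples
    """
    recall = np.mean(y_pred_labeled_pos)            # estimate of P(y_hat=1 | y=1)
    p_pos = np.mean(np.concatenate([y_pred_labeled_pos, y_pred_unlabeled]))
    return recall ** 2 / p_pos if p_pos > 0 else 0.0
```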
Randomized Reference Classifier with Gaussian Distribution and Soft Confusion Matrix Applied to the Improving Weak Classifiers
In this paper, the issue of building the RRC model using probability distributions other than the beta distribution is addressed. More precisely, we propose to build the RRC model using the truncated normal distribution. Heuristic procedures for the expected value and the variance of the truncated normal distribution are also proposed. The proposed approach is tested using an SCM-based model to examine the consequences of applying the truncated normal distribution in the RRC model. The experimental evaluation is performed using four different base classifiers and seven quality measures. The results show that the proposed approach is comparable to the RRC model built using the beta distribution. Moreover, for some base classifiers, the truncated-normal-based SCM algorithm turned out to be better at discovering objects coming from minority classes.
Comment: arXiv admin note: text overlap with arXiv:1901.0882
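For reference, the standard moments of a truncated normal distribution (which any such heuristic has to supply or approximate) can be obtained directly from scipy. This is only a supporting sketch, not part of the paper's procedure:

```python
from scipy.stats import truncnorm

def truncated_normal_moments(mu, sigma, lower, upper):
    """Mean and variance of a normal(mu, sigma) truncated to [lower, upper],
    using scipy's parameterisation (bounds expressed in standard deviations)."""
    a, b = (lower - mu) / sigma, (upper - mu) / sigma
    dist = truncnorm(a, b, loc=mu, scale=sigma)
    return dist.mean(), dist.var()

# Example: standard normal truncated to the unit interval [0, 1].
print(truncated_normal_moments(0.0, 1.0, 0.0, 1.0))
```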
BGrowth: an efficient approach for the segmentation of vertebral compression fractures in magnetic resonance imaging
Segmentation of medical images is a critical issue: several processes of analysis and classification rely on this segmentation. With the growing number of people presenting back pain and related problems, the automatic or semi-automatic segmentation of fractured vertebral bodies has become a challenging task. In general, those fractures present several regions with non-homogeneous intensities, and the dark regions are quite similar to the structures nearby. To overcome this challenge, in this paper we present a semi-automatic segmentation method, called Balanced Growth (BGrowth). The experimental results on a dataset with 102 crushed and 89 normal vertebrae show that our approach significantly outperforms well-known methods from the literature. We achieve an accuracy of up to 95% while keeping acceptable processing time, equivalent to that of state-of-the-art methods. Moreover, BGrowth presents the best results even with a rough (sloppy) manual annotation (seed points).
Comment: This is a pre-print of an article published in Symposium on Applied Computing. The final authenticated version is available online at https://doi.org/10.1145/3297280.329972
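For context, the family of methods BGrowth belongs to can be illustrated with a plain seeded region-growing baseline. The 4-connectivity and the intensity tolerance below are assumptions for the sketch; BGrowth's balancing strategy is not reproduced here.

```python
import numpy as np
from collections import deque

def region_grow(image, seeds, tol=0.1):
    """Plain seeded region growing on a 2-D image: starting from the seed
    pixels, absorb 4-connected neighbours whose intensity stays within `tol`
    of the running mean of the region."""
    mask = np.zeros(image.shape, dtype=bool)
    queue = deque(seeds)
    for r, c in seeds:
        mask[r, c] = True
    total, count = sum(image[r, c] for r, c in seeds), len(seeds)
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < image.shape[0] and 0 <= nc < image.shape[1]
                    and not mask[nr, nc]
                    and abs(image[nr, nc] - total / count) <= tol):
                mask[nr, nc] = True
                total, count = total + image[nr, nc], count + 1
                queue.append((nr, nc))
    return mask
```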