Lexicase selection in Learning Classifier Systems
The lexicase parent selection method selects parents by considering
performance on individual data points in random order instead of using a
fitness function based on an aggregated data accuracy. While the method has
demonstrated promise in genetic programming and more recently in genetic
algorithms, its applications in other forms of evolutionary machine learning
have not been explored. In this paper, we investigate the use of lexicase
parent selection in Learning Classifier Systems (LCS) and study its effect on
classification problems in a supervised setting. We further introduce a new
variant of lexicase selection, called batch-lexicase selection, which allows
for the tuning of selection pressure. We compare the two lexicase selection
methods with tournament and fitness proportionate selection methods on binary
classification problems. We show that batch-lexicase selection results in the
creation of more generic rules, which is favorable for generalization on future
data. We further show that batch-lexicase selection results in better
generalization in situations of partial or missing data.
Comment: Genetic and Evolutionary Computation Conference, 201
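The selection step described in this abstract can be sketched as follows. This is a minimal illustration of plain lexicase selection only, not the paper's implementation and not its batch-lexicase variant; the function name `lexicase_select` and the layout of the `errors` matrix are assumptions made for the example.

```python
import random


def lexicase_select(errors, rng=random):
    """Pick one parent index using plain lexicase selection.

    errors[i][j] is the error of individual i on data point j (lower is
    better). This layout is an assumption made for the sketch.
    """
    candidates = list(range(len(errors)))
    cases = list(range(len(errors[0])))
    rng.shuffle(cases)  # consider the data points in random order
    for case in cases:
        # Keep only the candidates that are best on this single data point.
        best = min(errors[i][case] for i in candidates)
        candidates = [i for i in candidates if errors[i][case] == best]
        if len(candidates) == 1:
            break
    # Break any remaining tie at random.
    return rng.choice(candidates)
```

Because no aggregated fitness is computed, an individual that is best on some data points can be selected even if its average accuracy is low, while an individual that is best on no data point is never selected.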
Benchmarking a Genetic Algorithm with Configurable Crossover Probability
We investigate a family of Genetic Algorithms (GAs) which
creates offspring either from mutation or by recombining two randomly chosen
parents. By scaling the crossover probability, we can thus interpolate from a
fully mutation-only algorithm towards a fully crossover-based GA. We analyze,
by empirical means, how the performance depends on the interplay of population
size and the crossover probability.
Our comparison on 25 pseudo-Boolean optimization problems reveals an
advantage of crossover-based configurations on several easy optimization tasks,
whereas the picture for more complex optimization problems is rather mixed.
Moreover, we observe that the "fast" mutation scheme with its power-law
distributed mutation strengths outperforms standard bit mutation on complex
optimization tasks when it is combined with crossover, but performs worse in
the absence of crossover.
We then take a closer look at the surprisingly good performance of the
crossover-based GAs on the well-known LeadingOnes benchmark
problem. We observe that the optimal crossover probability increases with
increasing population size. At the same time, it decreases with
increasing problem dimension, indicating that the advantages of the crossover
are not visible in the asymptotic view classically applied in runtime analysis.
We therefore argue that a mathematical investigation for fixed dimensions might
help us observe effects which are not visible when focusing exclusively on
asymptotic performance bounds.
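A GA of the kind described in this abstract, where each offspring comes either from crossover or from mutation, might look like the sketch below. Several details are assumptions rather than the paper's setup: uniform crossover, standard bit mutation with rate 1/n, plus-selection over parents and offspring, and the names `make_offspring` and `run_ga`.

```python
import random


def leading_ones(x):
    """Number of consecutive 1-bits at the start of the bit string x."""
    count = 0
    for bit in x:
        if bit != 1:
            break
        count += 1
    return count


def make_offspring(population, p_crossover, rng):
    """Create one offspring: crossover with probability p_crossover, else mutation."""
    n = len(population[0])
    if rng.random() < p_crossover:
        # Assumed operator: uniform crossover of two randomly chosen parents.
        a, b = rng.choice(population), rng.choice(population)
        return [a[i] if rng.random() < 0.5 else b[i] for i in range(n)]
    # Assumed operator: standard bit mutation, flipping each bit with rate 1/n.
    parent = rng.choice(population)
    return [1 - bit if rng.random() < 1.0 / n else bit for bit in parent]


def run_ga(n, mu, lam, p_crossover, max_gens, seed=0):
    """Return the generation in which the optimum is first found, or None."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n)] for _ in range(mu)]
    for gen in range(max_gens):
        if max(leading_ones(x) for x in population) == n:
            return gen
        offspring = [make_offspring(population, p_crossover, rng) for _ in range(lam)]
        # Assumed plus-selection: keep the mu best of parents and offspring.
        population = sorted(population + offspring, key=leading_ones, reverse=True)[:mu]
    return None
```

Setting `p_crossover` to 0 gives a mutation-only algorithm, and raising it towards 1 shifts the algorithm towards crossover, which is the interpolation the abstract describes.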
A Stochastic Online Forecast-and-Optimize Framework for Real-Time Energy Dispatch in Virtual Power Plants under Uncertainty
Aggregating distributed energy resources in power systems significantly
increases uncertainties, in particular caused by the fluctuation of renewable
energy generation. This issue has driven the necessity of widely exploiting
advanced predictive control techniques under uncertainty to ensure long-term
economics and decarbonization. In this paper, we propose a real-time
uncertainty-aware energy dispatch framework, which is composed of two key
elements: (i) A hybrid forecast-and-optimize sequential task, integrating deep
learning-based forecasting and stochastic optimization, where these two stages
are connected by the uncertainty estimation at multiple temporal resolutions;
(ii) An efficient online data augmentation scheme, jointly involving model
pre-training and online fine-tuning stages. In this way, the proposed framework
can rapidly adapt to the real-time data distribution, target uncertainties
caused by data drift, model discrepancy and environment perturbations in the
control process, and ultimately realize an optimal and
robust dispatch solution. The proposed framework won the championship in
CityLearn Challenge 2022, which provided an influential opportunity to
investigate the potential of AI application in the energy domain. In addition,
comprehensive experiments are conducted to interpret its effectiveness in the
real-life scenario of smart building energy management.
Comment: Preprint. Accepted by CIKM 2
Evolutionary Strategies for the Design of Binary Linear Codes
The design of binary error-correcting codes is a challenging optimization
problem with several applications in telecommunications and storage, which has
also been addressed with metaheuristic techniques and evolutionary algorithms.
Still, all these efforts focused on optimizing the minimum distance of
unrestricted binary codes, i.e., with no constraints on their linearity, which
is a desirable property for efficient implementations. In this paper, we
present an Evolutionary Strategy (ES) algorithm that explores only the subset
of linear codes of a fixed length and dimension. To that end, we represent the
candidate solutions as binary matrices and devise variation operators that
preserve their ranks. Our experiments show that up to length , our ES
always converges to an optimal solution with a full success rate, and the
evolved codes are all inequivalent to the Best-Known Linear Code (BKLC) given
by MAGMA. On the other hand, for larger lengths, both the success rate of the
ES as well as the diversity of the evolved codes start to drop, with the
extreme case of codes which all turn out to be equivalent to MAGMA's
BKLC.
Comment: 15 pages, 3 figures, 3 table
Big Data Optimization: Algorithmic Framework for Data Analysis Guided by Semantics
Thesis defense date: 9 November 2018. Over the past decade, the rapid growth of data generation in all domains of knowledge, such as traffic, medicine, social networks, and industry, has highlighted the need to improve the analysis of large data volumes, both to manage them more easily and to discover new relationships hidden in them.
Optimization problems, which are commonly found in current industry, are not unrelated to this trend; therefore, Multi-Objective Optimization Algorithms (MOAs) should take this new scenario into account. This means that MOAs have to deal with problems that have either various data sources (typically streaming) or huge amounts of data. These features are found in particular in Dynamic Multi-Objective Problems (DMOPs), which are related to Big Data optimization problems, mostly with regard to velocity and variability. When dealing with DMOPs, whenever changes in the environment affect the solutions of the problem (i.e., the Pareto set, the Pareto front, or both), and therefore the fitness landscape, the optimization algorithm must react and adapt the search to the new features of the problem.
Big Data analytics are long and complex processes; to simplify them, they are carried out as a series of steps. A typical analysis is composed of data collection, data manipulation, data analysis, and finally result visualization.
In the process of creating a Big Data workflow, the analyst should bear in mind the semantics of the problem domain knowledge and its data. An ontology is the standard way of describing the knowledge about a domain.
As the global target of this PhD thesis, we are interested in investigating the use of semantics in the process of Big Data analysis, focusing not only on machine learning analysis but also on optimization.
Real-time pedestrian recognition on low computational resources
Pedestrian recognition has successfully been applied to security, autonomous
cars, and aerial photographs. For most applications, pedestrian recognition on
small mobile devices is important. However, the limitations of the computing
hardware make this a challenging task. In this work, we investigate real-time
pedestrian recognition on small physical-size computers with low computational
resources for faster speed. This paper presents three methods that work on
small physical-size CPU systems. First, we improved the Local Binary Pattern
(LBP) features and Adaboost classifier. Second, we optimized the Histogram of
Oriented Gradients (HOG) and Support Vector Machine. Third, we implemented fast
Convolutional Neural Networks (CNNs). The results demonstrate that the three
methods achieved real-time pedestrian recognition at an accuracy of more than
95% and a speed of more than 5 fps on a small physical-size computational
platform with a 1.8 GHz Intel i5 CPU. Our methods can be easily applied to
small mobile devices with high compatibility and generality.
More effective randomized search heuristics for graph coloring through dynamic optimization
Dynamic optimization problems have gained significant attention in evolutionary computation as evolutionary algorithms (EAs) can easily adapt to changing environments. We show that EAs can solve the graph coloring problem for bipartite graphs more efficiently by using dynamic optimization. In our approach the graph instance is given incrementally such that the EA can reoptimize its coloring when a new edge introduces a conflict. We show that, when edges are inserted in a way that preserves graph connectivity, Randomized Local Search (RLS) efficiently finds a proper 2-coloring for all bipartite graphs. This includes graphs for which RLS and other EAs need exponential expected time in a static optimization scenario. We investigate different ways of building up the graph by popular graph traversals such as breadth-first-search and depth-first-search and analyse the resulting runtime behavior. We further show that offspring populations (e.g., a (1 + λ) RLS) lead to an exponential speedup in λ. Finally, an island model using 3 islands succeeds in an optimal time of Θ(m) on every m-edge bipartite graph, outperforming offspring populations. This is the first example where an island model guarantees a speedup that is not bounded in the number of islands.
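The incremental approach described in this abstract can be illustrated with a small sketch. The details here are assumptions, not the paper's exact algorithm: RLS flips the color of one uniformly random vertex and keeps the flip if the number of conflicting edges does not increase, and the names `rls_recolor` and `incremental_two_coloring` and the `max_steps` cap are hypothetical.

```python
import random


def count_conflicts(edges, colors):
    """Number of edges whose two endpoints share a color."""
    return sum(1 for u, v in edges if colors[u] == colors[v])


def rls_recolor(edges, colors, rng, max_steps=100_000):
    """Run RLS in place until the coloring is proper or max_steps is reached.

    Each step flips one uniformly random vertex and keeps the flip only if
    the number of conflicting edges does not increase.
    """
    current = count_conflicts(edges, colors)
    steps = 0
    while current > 0 and steps < max_steps:
        v = rng.randrange(len(colors))
        colors[v] ^= 1  # flip between color 0 and color 1
        new = count_conflicts(edges, colors)
        if new <= current:
            current = new
        else:
            colors[v] ^= 1  # revert a worsening flip
        steps += 1
    return steps


def incremental_two_coloring(n, edge_order, seed=0):
    """Insert edges one at a time and reoptimize after each insertion."""
    rng = random.Random(seed)
    colors = [0] * n
    edges = []
    total_steps = 0
    for edge in edge_order:
        edges.append(edge)
        total_steps += rls_recolor(edges, colors, rng)
    return colors, total_steps
```

Passing the edges of a path in order, for example `incremental_two_coloring(6, [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)])`, mimics an insertion order that keeps the current graph connected.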