An ensemble based on a bi-objective evolutionary spectral algorithm for graph clustering
Graph clustering is a challenging pattern recognition problem whose goal is
to identify vertex partitions with high intra-group connectivity. This paper
investigates a bi-objective problem that maximizes the number of intra-cluster
edges of a graph and minimizes the expected number of inter-cluster edges in a
random graph with the same degree sequence as the original one. The difference
between the two investigated objectives is the definition of the well-known
measure of graph clustering quality: the modularity. We introduce a spectral
decomposition hybridized with an evolutionary heuristic, called MOSpecG, to
approach this bi-objective problem and an ensemble strategy to consolidate the
solutions found by MOSpecG into a final robust partition. Computational
experiments with real and artificial LFR networks showed that the introduced
method significantly improves on the results and performance of another
bi-objective algorithm found in the literature. The
crossover operator based on the geometric interpretation of the modularity
maximization problem to match the communities of a pair of individuals was of
utmost importance for the good performance of MOSpecG. Hybridizing spectral
graph theory and intelligent systems allowed us to define significantly
high-quality community structures.
Comment: Preprint accepted for publication in Expert Systems with Applications
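As an illustration of the two quantities that the abstract above combines into the modularity, the following minimal Python sketch computes them for an unweighted, undirected graph. It assumes the standard Newman-Girvan form of modularity, in which the null-model term is the expected fraction of intra-cluster edges in a random graph with the same degree sequence. The function names, the edge-list representation, and the toy graph are hypothetical choices made for this example; this is not the MOSpecG algorithm.

```python
from collections import defaultdict


def modularity_terms(edges, membership):
    """Return (intra_fraction, expected_fraction) for a partition.

    edges: list of (u, v) pairs, each undirected edge listed once.
    membership: dict mapping each vertex to its cluster label.
    Assumes the standard Newman-Girvan null model (same degree sequence).
    """
    m = len(edges)
    degree = defaultdict(int)
    intra = 0
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if membership[u] == membership[v]:
            intra += 1  # edge with both endpoints in the same cluster

    # Total degree of the vertices in each cluster.
    cluster_degree = defaultdict(int)
    for vertex, d in degree.items():
        cluster_degree[membership[vertex]] += d

    intra_fraction = intra / m
    expected_fraction = sum((d / (2 * m)) ** 2 for d in cluster_degree.values())
    return intra_fraction, expected_fraction


def modularity(edges, membership):
    """Modularity as the difference between the two terms above."""
    intra_fraction, expected_fraction = modularity_terms(edges, membership)
    return intra_fraction - expected_fraction


# Toy graph (hypothetical): two triangles joined by a single edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
membership = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
print(modularity_terms(edges, membership))
print(modularity(edges, membership))
```

Keeping the two terms separate, rather than returning only their difference, mirrors the bi-objective view: a partition can be scored on each term independently before they are combined.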
Importance measures derived from random forests: characterisation and extension
Nowadays new technologies, and especially artificial intelligence, are more
and more established in our society. Big data analysis and machine learning,
two sub-fields of artificial intelligence, are at the core of many recent
breakthroughs in many application fields (e.g., medicine, communication,
finance, ...), including some that are strongly related to our day-to-day life
(e.g., social networks, computers, smartphones, ...). In machine learning,
significant improvements are usually achieved at the price of an increasing
computational complexity and thanks to bigger datasets. Currently, cutting-edge
models built by the most advanced machine learning algorithms have typically
become very efficient and profitable but also extremely complex. Their
complexity is such that these models are commonly seen as
black boxes providing a prediction or a decision which cannot be interpreted
or justified. Nevertheless, whether these models are used autonomously or as a
simple decision-making support tool, they are already being used in machine
learning applications where health and human life are at stake. Therefore, it
appears to be an obvious necessity not to blindly believe everything coming out
of those models without a detailed understanding of their predictions or
decisions. Accordingly, this thesis aims at improving the interpretability of
models built by a specific family of machine learning algorithms, the so-called
tree-based methods. Several mechanisms have been proposed to interpret these
models, and throughout this thesis we aim to improve their understanding, study their
properties, and define their limitations.
Comment: PhD thesis, Liège, Belgium, June 2019. Permalink:
http://hdl.handle.net/2268/23686