2 research outputs found

    An ensemble based on a bi-objective evolutionary spectral algorithm for graph clustering

    Full text link
    Graph clustering is a challenging pattern recognition problem whose goal is to identify vertex partitions with high intra-group connectivity. This paper investigates a bi-objective problem that maximizes the number of intra-cluster edges of a graph and minimizes the expected number of inter-cluster edges in a random graph with the same degree sequence as the original one. The difference between the two investigated objectives is the definition of the well-known measure of graph clustering quality: the modularity. We introduce a spectral decomposition hybridized with an evolutionary heuristic, called MOSpecG, to approach this bi-objective problem and an ensemble strategy to consolidate the solutions found by MOSpecG into a final robust partition. The results of computational experiments with real and artificial LFR networks demonstrated a significant improvement in the results and performance of the introduced method in regard to another bi-objective algorithm found in the literature. The crossover operator based on the geometric interpretation of the modularity maximization problem to match the communities of a pair of individuals was of utmost importance for the good performance of MOSpecG. Hybridizing spectral graph theory and intelligent systems allowed us to define significantly high-quality community structures.Comment: Preprint accepted for publication in Expert Systems with Application

    Importance measures derived from random forests: characterisation and extension

    Full text link
    Nowadays new technologies, and especially artificial intelligence, are more and more established in our society. Big data analysis and machine learning, two sub-fields of artificial intelligence, are at the core of many recent breakthroughs in many application fields (e.g., medicine, communication, finance, ...), including some that are strongly related to our day-to-day life (e.g., social networks, computers, smartphones, ...). In machine learning, significant improvements are usually achieved at the price of an increasing computational complexity and thanks to bigger datasets. Currently, cutting-edge models built by the most advanced machine learning algorithms typically became simultaneously very efficient and profitable but also extremely complex. Their complexity is to such an extent that these models are commonly seen as black-boxes providing a prediction or a decision which can not be interpreted or justified. Nevertheless, whether these models are used autonomously or as a simple decision-making support tool, they are already being used in machine learning applications where health and human life are at stake. Therefore, it appears to be an obvious necessity not to blindly believe everything coming out of those models without a detailed understanding of their predictions or decisions. Accordingly, this thesis aims at improving the interpretability of models built by a specific family of machine learning algorithms, the so-called tree-based methods. Several mechanisms have been proposed to interpret these models and we aim along this thesis to improve their understanding, study their properties, and define their limitations.Comment: PhD thesis, Li\`ege, Belgium, June 2019. Permalink : http://hdl.handle.net/2268/23686
    corecore