7,504 research outputs found

    Semantic distillation: a method for clustering objects by their contextual specificity

    Techniques for data mining, latent semantic analysis, contextual search of databases, etc. were developed long ago by computer scientists working on information retrieval (IR). Experimental scientists from all disciplines who have to analyse large collections of raw experimental data (astronomical, physical, biological, etc.) have developed powerful methods for their statistical analysis and for clustering, categorising, and classifying objects. Finally, physicists have developed a theory of quantum measurement, unifying the logical, algebraic, and probabilistic aspects of queries into a single formalism. The purpose of this paper is twofold: first, to show that, when formulated at an abstract level, problems from IR, from statistical data analysis, and from physical measurement theories are very similar and hence can profitably be cross-fertilised; and second, to propose a novel method of fuzzy hierarchical clustering, termed semantic distillation and strongly inspired by the theory of quantum measurement, which we developed to analyse raw data coming from various types of experiments on DNA arrays. We illustrate the method by analysing DNA array experiments and clustering the genes of the array according to their specificity. Comment: Accepted for publication in Studies in Computational Intelligence, Springer-Verlag

    Evolutionary Algorithms for Community Detection in Continental-Scale High-Voltage Transmission Grids

    Symmetry is a key concept in the study of power systems, not only because the admittance and Jacobian matrices used in power flow analysis are symmetrical, but also because previous studies have shown that some real-world power grids exhibit complex symmetries. To investigate the topological characteristics of power grids, this paper proposes the use of evolutionary algorithms for community detection, using modularity density measures on networks representing supergrids in order to discover densely connected structures. Two evolutionary approaches (generational genetic algorithm, GGA+, and modularity and improved genetic algorithm, MIGA) were applied. The results obtained on two large networks representing supergrids (the European grid and the North American grid) provide insights into both the structure of the supergrid and the topological differences between regions. Numerical and graphical results show that these evolutionary approaches clearly outperform the well-known Louvain modularity method. In particular, the average modularity obtained by GGA+ was 0.815 in the European grid and 0.827 in the North American grid. These results outperform those obtained by the MIGA and Louvain methods (0.801 and 0.766 in the European grid, and 0.813 and 0.798 in the North American grid, respectively)

    Randomized heuristics for the Capacitated Clustering Problem

    In this paper, we investigate the adaptation of the Greedy Randomized Adaptive Search Procedure (GRASP) and Iterated Greedy methodologies to the Capacitated Clustering Problem (CCP). In particular, we focus on the effect of the balance between randomization and greediness on the performance of these multi-start heuristic search methods when solving this NP-hard problem. The former is a memory-less approach that constructs independent solutions, while the latter is a memory-based method that constructs linked solutions, obtained by partially rebuilding previous ones. Both are based on the combination of greediness and randomization in the constructive process, and coupled with a subsequent local search phase. We propose these two multi-start methods and their hybridization and compare their performance on the CCP. Additionally, we propose a heuristic based on the mathematical programming formulation of this problem, which constitutes a so-called matheuristic. We also implement a classical randomized method based on simulated annealing to complete the picture of randomized heuristics. Our extensive experimentation reveals that Iterated Greedy performs better than GRASP in this problem, and improved outcomes are obtained when both methods are hybridized and coupled with the matheuristic. In fact, the hybridization is able to outperform the best approaches previously published for the CCP. This study shows that memory-based construction is an effective mechanism within multi-start heuristic search techniques

    A nonmonotone GRASP

    A greedy randomized adaptive search procedure (GRASP) is an iterative multistart metaheuristic for difficult combinatorial optimization problems. Each GRASP iteration consists of two phases: a construction phase, in which a feasible solution is produced, and a local search phase, in which a local optimum in the neighborhood of the constructed solution is sought. Repeated applications of the construction procedure yield different starting solutions for the local search, and the best overall solution is kept as the result. The GRASP local search applies iterative improvement until a locally optimal solution is found. During this phase, starting from the current solution, an improving neighbor solution is accepted and becomes the new current solution. In this paper, we propose a variant of the GRASP framework that uses a new “nonmonotone” strategy to explore the neighborhood of the current solution. We formally state the convergence of the nonmonotone local search to a locally optimal solution and illustrate the effectiveness of the resulting Nonmonotone GRASP on three classical hard combinatorial optimization problems: the maximum cut problem (MAX-CUT), the weighted maximum satisfiability problem (MAX-SAT), and the quadratic assignment problem (QAP)
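
The two-phase loop described in this abstract can be sketched on MAX-CUT, one of the problems it names. This is a minimal illustration of the standard (monotone) GRASP only, not the paper's nonmonotone variant. The function names, the restricted-candidate-list parameter `alpha`, and the edge-list format `(u, v, weight)` are assumptions made for this example.

```python
import random

def cut_value(edges, side):
    """Total weight of edges whose endpoints lie on different sides."""
    return sum(w for u, v, w in edges if side[u] != side[v])

def build_adjacency(n, edges):
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    return adj

def construct(n, adj, alpha, rng):
    """Construction phase: place one vertex at a time, drawing the
    (vertex, side) pair at random from a restricted candidate list (RCL)."""
    side = {}
    while len(side) < n:
        cands = []
        for v in range(n):
            if v in side:
                continue
            for s in (0, 1):
                # Gain = weight of edges to already placed vertices on the other side.
                gain = sum(w for u, w in adj[v] if u in side and side[u] != s)
                cands.append((gain, v, s))
        g_max = max(c[0] for c in cands)
        g_min = min(c[0] for c in cands)
        threshold = g_max - alpha * (g_max - g_min)
        rcl = [c for c in cands if c[0] >= threshold]
        _, v, s = rng.choice(rcl)
        side[v] = s
    return [side[v] for v in range(n)]

def local_search(adj, side):
    """Local search phase: flip single vertices while the cut strictly improves."""
    improved = True
    while improved:
        improved = False
        for v in range(len(side)):
            # Flipping v cuts its same-side edges and uncuts its cut edges.
            delta = sum(w if side[u] == side[v] else -w for u, w in adj[v])
            if delta > 0:
                side[v] = 1 - side[v]
                improved = True
    return side

def grasp_maxcut(n, edges, iterations=50, alpha=0.3, seed=0):
    """Repeat construction + local search and keep the best cut found."""
    rng = random.Random(seed)
    adj = build_adjacency(n, edges)
    best_side, best_val = None, float("-inf")
    for _ in range(iterations):
        side = local_search(adj, construct(n, adj, alpha, rng))
        val = cut_value(edges, side)
        if val > best_val:
            best_side, best_val = side[:], val
    return best_side, best_val

# Usage: a 4-cycle with unit weights.
cycle = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
partition, value = grasp_maxcut(4, cycle)
```

Setting `alpha=0` makes the construction purely greedy and `alpha=1` makes it purely random; intermediate values trade greediness against diversity of starting solutions.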

    Construction of near-optimal vertex clique covering for real-world networks

    We propose a method based on combining a constructive and a bounding heuristic to solve the vertex clique covering problem (CCP), where the aim is to partition the vertices of a graph into the smallest number of classes that induce cliques. The search for solutions to CCP is motivated by the analysis of social and other real-world networks, by applications in graph mining, and by the fact that CCP is one of the classical NP-hard problems. Combining the construction and the bounding heuristic helped us not only to find high-quality clique coverings but also to determine that, in the domain of real-world networks, many of the obtained solutions are optimal, while the rest are near-optimal. In addition, the method has polynomial time complexity and shows much promise for practical use. Experimental results are presented for a fairly representative benchmark of real-world data. Our test graphs include extracts of web-based social networks, including some very large ones, several well-known graphs from network science, as well as coappearance networks of literary works' characters from the DIMACS graph coloring benchmark. We also present results for synthetic pseudorandom graphs structured according to the Erdős-Rényi model and Leighton's model

    A Tabu Search Based Approach for Graph Layout

    This paper describes an automated tabu search based method for drawing general graph layouts with straight lines. To our knowledge, this is the first time tabu methods have been applied to graph drawing. We formulate the task as a multi-criteria optimization problem in which a number of metrics are combined in a weighted fitness function that measures the aesthetic quality of the graph layout. The main goal of this work is to speed up the graph layout process without sacrificing layout quality. To achieve this, we use a tabu search based method that runs for a predefined number of iterations to minimize the value of the fitness function. Tabu search always chooses the best solution in the neighbourhood. This may lead to cycling, so a tabu list is used to store moves that are not permitted, meaning that the algorithm does not revisit previous solutions for a set period of time. We evaluate the method according to the time spent drawing a graph and the quality of the drawn graphs. We give experimental results on random graphs and provide statistical evidence that our method outperforms a fast search-based drawing method (hill climbing) in execution time while producing comparably good graph layouts. We also demonstrate the method on real-world graph datasets to show that similar results can be reproduced in a real-world setting
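
The tabu mechanism described in this abstract (always move to the best neighbour, and forbid recently used moves for a set period) can be sketched on a toy problem. This is not the paper's graph layout method: the bit-flip neighbourhood, the `tenure` parameter, the aspiration rule, and the Hamming-distance cost are all assumptions chosen to keep the example short.

```python
from collections import deque

def tabu_search(cost, start, iterations=100, tenure=3):
    """Minimise cost over 0/1 vectors using single-bit flips as moves."""
    current = list(start)
    best, best_cost = current[:], cost(current)
    tabu = deque(maxlen=tenure)  # indices flipped in the last `tenure` moves
    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:]
            neighbour[i] = 1 - neighbour[i]
            c = cost(neighbour)
            # A tabu move is allowed only if it beats the best cost so far
            # (a common aspiration criterion, assumed here).
            if i not in tabu or c < best_cost:
                candidates.append((c, i, neighbour))
        if not candidates:
            break
        # Take the best admissible neighbour, even if it is worse than current.
        c, i, current = min(candidates, key=lambda t: (t[0], t[1]))
        tabu.append(i)
        if c < best_cost:
            best, best_cost = current[:], c
    return best, best_cost

# Usage: recover a hidden bit pattern by minimising the Hamming distance to it.
target = [1, 0, 1, 1, 0]
hamming = lambda x: sum(a != b for a, b in zip(x, target))
solution, solution_cost = tabu_search(hamming, [0] * 5, iterations=20)
```

Because the search keeps moving after reaching a minimum, the best solution seen so far is tracked separately from the current one.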

    ASPECT: A spectra clustering tool for exploration of large spectral surveys

    We present the novel, semi-automated clustering tool ASPECT for analysing voluminous archives of spectra. The heart of the program is a neural network in the form of Kohonen's self-organizing map. The resulting map is designed as an icon map suitable for inspection by eye. The visual analysis is supported by the option to blend in individual object properties such as redshift, apparent magnitude, or signal-to-noise ratio. In addition, the package provides several tools for the selection of special spectral types, e.g. local difference maps which reflect the deviations of all spectra from one given input spectrum (real or artificial). ASPECT is able to produce a two-dimensional topological map of a huge number of spectra. The software package enables users to browse and navigate through a huge data pool, helps them gain insight into underlying relationships between the spectra and other physical properties, and gives the big picture of the entire data set. We demonstrate the capability of ASPECT by clustering the entire data pool of 0.6 million spectra from Data Release 4 of the Sloan Digital Sky Survey (SDSS). To illustrate the quality and completeness of the results, we track objects from existing catalogues of quasars and carbon stars, and connect the SDSS spectra with morphological information from the GalaxyZoo project. Comment: 15 pages, 14 figures; accepted for publication in Astronomy and Astrophysics