197 research outputs found

    Community Detection in Networks using Bio-inspired Optimization: Latest Developments, New Results and Perspectives with a Selection of Recent Meta-Heuristics

    Detecting groups within a set of interconnected nodes is a widely addressed problem that can model a diversity of applications. Unfortunately, detecting the optimal partition of a network is a computationally demanding task, usually conducted by means of optimization methods. Among them, randomized search heuristics have been proven to be efficient approaches. This manuscript is devoted to providing an overview of community detection problems from the perspective of bio-inspired computation. To this end, we first review the recent history of this research area, placing emphasis on milestone studies contributed in the last five years. Next, we present an extensive experimental study to assess the performance of a selection of modern heuristics over weighted directed network instances. Specifically, we combine seven global search heuristics based on two different similarity metrics and eight heterogeneous search operators designed ad-hoc. We compare our methods with six different community detection techniques over a benchmark of 17 Lancichinetti-Fortunato-Radicchi network instances. Ranking statistics of the tested algorithms reveal that the proposed methods perform competitively, but the high variability of the rankings leads to the main conclusion: no clear winner can be declared. This finding aligns with community detection tools available in the literature that hinge on a sequential application of different algorithms in search for the best performing counterpart. We end our research by sharing our envisioned status of this area, for which we identify challenges and opportunities which should stimulate research efforts in years to come.

    Optimizing community detection in social networks using antlion and K-median

    Antlion Optimization (ALO) is one of the latest population-based optimization methods and has shown good performance in a variety of applications. The ALO algorithm mimics the way antlions hunt ants in nature. Community detection in social networks is key to understanding these networks. Identifying network communities can be viewed as a problem of clustering a set of nodes into communities. K-median clustering is one of the popular techniques that has been applied in clustering. The problem of clustering a network can be formalized as an optimization problem in which a quality objective function is optimized, one that captures the intuition of a cluster as a set of nodes with better internal connectivity than external connectivity. In this paper, a hybrid of antlion optimization and k-median for solving the community detection problem is proposed and named K-median Modularity ALO. Experimental results on real-life networks show that this hybrid can successfully detect an optimized community structure when modularity is used as the objective function.
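
The abstract above treats modularity as the objective to be maximized. As a rough illustration of what such an objective measures, the following minimal sketch computes Newman's modularity for a partition of an undirected, unweighted graph without self-loops. The function name `modularity` and the edge-list input format are assumptions made for this example; it is not the K-median Modularity ALO method itself.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman modularity Q of a partition (hypothetical helper).

    edges        -- list of (u, v) pairs, undirected, no self-loops
    community_of -- dict mapping every node to a community label
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")

    intra = defaultdict(int)       # edges with both endpoints in community c
    degree_sum = defaultdict(int)  # sum of node degrees inside community c
    for u, v in edges:
        cu, cv = community_of[u], community_of[v]
        degree_sum[cu] += 1
        degree_sum[cv] += 1
        if cu == cv:
            intra[cu] += 1

    # Q = sum over communities of (intra-edge fraction - expected fraction)
    return sum(intra[c] / m - (degree_sum[c] / (2 * m)) ** 2
               for c in degree_sum)
```

A search heuristic would call a function like this on each candidate partition and keep the one with the highest value. Placing all nodes in a single community always yields a modularity of zero.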

    Reading the news through its structure: new hybrid connectivity based approaches

    In this thesis a solution for the problem of identifying the structure of news published by online newspapers is presented. This problem requires new approaches and algorithms that are capable of dealing with the massive number of online publications in existence (and that will grow in the future). The fact that news documents present a high degree of interconnection makes this an interesting and hard problem to solve. The identification of the structure of the news is accomplished both by descriptive methods that expose the dimensionality of the relations between different news, and by clustering the news into topic groups. To achieve this analysis, this integrated whole was studied using different perspectives and approaches. In the identification of news clusters and structure, and after a preparatory data collection phase, where several online newspapers from different parts of the globe were collected, two newspapers were chosen in particular: the Portuguese daily newspaper Público and the British newspaper The Guardian. In the first case, it was shown how information theory (namely variation of information) combined with adaptive networks was able to identify topic clusters in the news published by the Portuguese online newspaper Público. In the second case, the structure of news published by the British newspaper The Guardian is revealed through the construction of time series of news clustered by a k-means process. After this approach, an unsupervised algorithm was developed that filters out irrelevant news published online by taking into consideration the connectivity of the news labels entered by the journalists. This novel hybrid technique is based on Q-analysis for the construction of the filtered network, followed by a clustering technique to identify the topical clusters. Presently this work uses a modularity optimisation clustering technique, but this step is general enough that other hybrid approaches can be used without losing generality.
A novel second order swarm intelligence algorithm based on Ant Colony Systems was developed for the travelling salesman problem that is consistently better than the traditional benchmarks. This algorithm is used to construct Hamiltonian paths over the news published using the eccentricity of the different documents as a measure of distance. This approach allows for an easy navigation between published stories that is dependent on the connectivity of the underlying structure. The results presented in this work show the importance of taking topic detection in large corpora as a multitude of relations and connectivities that are not in a static state. They also influence the way of looking at multi-dimensional ensembles, by showing that the inclusion of the high dimension connectivities gives better results to solving a particular problem as was the case in the clustering problem of the news published online.
In this work we address the problem of identifying the structure of news published online by newspapers and news agencies. This problem requires new approaches and algorithms capable of handling the growing number of online publications (which is expected to keep growing in the future). This fact, together with the high degree of interconnection that news items exhibit, makes the problem both interesting and hard to solve. The structure of the news system was identified both through descriptive methods that expose the dimension of the relations between different news items and through algorithms that group them into topics. To achieve this goal, it was necessary to study this complex system from different perspectives and approaches. After a preparatory data collection phase, in which several newspapers published online were collected, two newspapers in particular were chosen: Público and The Guardian.
Newspapers in different languages were chosen in order to find analysis strategies that are independent of prior knowledge about these systems. In a first analysis, an approach based on adaptive networks and information theory (namely variation of information) is used to identify news topics published in the Portuguese newspaper Público. In a second approach, we analyse the structure of the news published by the British newspaper The Guardian through the construction of time series of news, which were then clustered by a k-means process. In addition, an algorithm was developed that filters out, in an unsupervised way, irrelevant news with low connectivity to the remaining news, using Q-analysis followed by a clustering process. Presently this method uses modularity optimisation, but the technique is general enough that other hybrid approaches can be used without loss of generality. A new algorithm based on ant colony systems was also developed for the travelling salesman problem, which consistently yields better results than the traditional benchmarks. This algorithm was applied to the construction of Hamiltonian paths over the published news, using the eccentricity obtained from the connectivity of the studied system as a measure of distance between news items. This approach made it possible to build a navigation system between published news that depends on the connectivity observed in the news structure found. The results presented in this work show the importance of analysing complex systems in their multitude of relations and connectivities, which are not static and which influence the way multi-dimensional systems are traditionally viewed.
It is shown that including these extra dimensions produces better results in identifying the structure underlying the publication of online news.

    An Order-based Algorithm for Minimum Dominating Set with Application in Graph Mining

    A dominating set is a set of vertices of a graph such that all other vertices have a neighbour in the dominating set. We propose a new order-based randomised local search (RLS_o) algorithm to solve the minimum dominating set problem in large graphs. Experimental evaluation is presented for multiple types of problem instances. These instances include unit disk graphs, which represent a model of wireless networks, random scale-free networks, as well as samples from two social networks and real-world graphs studied in network science. Our experiments indicate that RLS_o performs better than both a classical greedy approximation algorithm and two metaheuristic algorithms based on ant colony optimisation and local search. The order-based algorithm is able to find small dominating sets for graphs with tens of thousands of vertices. In addition, we propose a multi-start variant of RLS_o that is suitable for solving the minimum weight dominating set problem. The application of RLS_o in graph mining is also briefly demonstrated.
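
To make the definition above concrete, here is a minimal sketch of the kind of classical greedy heuristic the abstract names as a baseline, together with a checker for the dominating-set property. It is not the order-based local search the abstract proposes. The function names and the adjacency-dictionary input are assumptions chosen for this illustration.

```python
def greedy_dominating_set(adj):
    """Greedy heuristic for a small dominating set (hypothetical helper).

    adj -- dict mapping each vertex to the set of its neighbours
           (undirected graph, so the relation is symmetric)
    """
    undominated = set(adj)
    chosen = set()
    while undominated:
        # Pick the vertex whose closed neighbourhood covers the most
        # vertices that are still undominated.
        best = max(adj, key=lambda v: len(({v} | adj[v]) & undominated))
        chosen.add(best)
        undominated -= {best} | adj[best]
    return chosen


def is_dominating(adj, candidate):
    """True if every vertex is in `candidate` or has a neighbour in it."""
    return all(v in candidate or adj[v] & candidate for v in adj)
```

The loop always terminates because any undominated vertex covers at least itself, so each iteration removes at least one vertex from `undominated`. The result is a valid dominating set but is not guaranteed to be minimum.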

    Real-time big data processing for anomaly detection : a survey

    The advent of connected devices and the omnipresence of the Internet have paved the way for intruders to attack networks, which leads to cyber-attacks, financial loss, information theft in healthcare, and cyber war. Hence, network security analytics has become an important area of concern and has gained intensive attention among researchers of late, specifically in the domain of anomaly detection in networks, which is considered crucial for network security. However, preliminary investigations have revealed that the existing approaches to detecting anomalies in networks are not effective enough, particularly in real time. The inefficacy of current approaches is mainly due to the massive volumes of data amassed through the connected devices. Therefore, it is crucial to propose a framework that effectively handles real-time big data processing and detects anomalies in networks. In this regard, this paper attempts to address the issue of detecting anomalies in real time. Accordingly, this paper surveys the state-of-the-art real-time big data processing technologies related to anomaly detection and the vital characteristics of the associated machine learning algorithms. The paper begins with an explanation of the essential contexts and taxonomy of real-time big data processing, anomaly detection, and machine learning algorithms, followed by a review of big data processing technologies. Finally, the identified research challenges of real-time big data processing in anomaly detection are discussed. © 2018 Elsevier Lt

    RecMem: Time Aware Recommender Systems Based on Memetic Evolutionary Clustering Algorithm

    Nowadays, recommendation is an important task in the decision-making process about the selection of items, especially when the item space is large, diverse, and constantly updating. A challenge for recent systems is that the preferences and interests of users change over time, and existing recommender systems do not evolve optimal clustering with sufficient accuracy over time. Moreover, the behavior history of the users is determined by their neighbours. The purpose of the time parameter for this system is to extend the time-based priority. This paper presents a time-aware recommender system, called RecMem, based on a memetic evolutionary clustering algorithm. In this system, clusters evolve over time using the memetic evolutionary algorithm, the best clusters are extracted at every timestamp, and the memetic algorithm is improved using the chaos criterion. The system provides appropriate suggestions to the user based on optimum clustering. The system uses optimal evolutionary clustering with item attributes for the cold-start item problem and demographic information for the cold-start user problem. The results show that the proposed method has an accuracy of approximately 0.95, which is more effective than existing systems.

    Graph-Transformational Swarms : A Graph-Transformational Approach to Swarm Computation

    Computer systems are becoming increasingly distributed and interconnected. Various emerging notions, such as smart grids, system of systems, industry 4.0 or cyber-physical systems, have gained more and more importance during the last few years. All of them propose to solve engineering problems by using several autonomous components that act in parallel and are interconnected, foremost using Internet technologies. These emerging concepts look very promising, but also exhibit various technical challenges. For instance, how is it possible to develop decentralized control mechanisms that produce a desired emerging behavior to solve a given task, or how to model such solutions in order to analyze their behavior in terms of complexity and correctness? These are two major questions that this thesis attempts to answer. Indeed, it provides graph-transformational swarms as a novel concept that combines the ideas and principles of swarms and swarm computing with the formal methods of graph transformation to model distributed systems. Graph-transformational swarms capture the advantages of swarms and swarm computing and of graph transformation.

    Finding maximal bicliques in bipartite networks using node similarity

    In real world complex networks, communities are usually both overlapping and hierarchical. A very important class of complex networks is the bipartite networks. Maximal bicliques are the strongest possible structural communities within them. Here we consider overlapping communities in bipartite networks and propose a method that detects an order-limited number of overlapping maximal bicliques covering the network. We formalise a measure of relative community strength by which communities can be categorised, compared and ranked. There are very few real bipartite datasets for which any external ground truth about overlapping communities is known. Here we test three such datasets. We categorise and rank the maximal biclique communities found by our algorithm according to our measure of strength. Deeper analysis of these bicliques shows they accord with ground truth and give useful additional insight. Based on this we suggest our algorithm can find true communities at the first level of a hierarchy. We add a heuristic merging stage to the maximal biclique algorithm to produce a second level hierarchy with fewer communities and obtain positive results when compared with other overlapping community detection algorithms for bipartite networks
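
As a small illustration of the structure the abstract above searches for, the following sketch checks whether a given pair of vertex sets forms a maximal biclique in a bipartite graph. It only encodes the definition; it does not implement the node-similarity method of the paper. The function names and the left-side adjacency dictionary are assumptions made for this example.

```python
def is_biclique(adj_left, left, right):
    """True if every vertex in `left` is adjacent to every vertex in `right`.

    adj_left -- dict mapping each left-side vertex to the set of
                right-side vertices it is connected to
    """
    return all(right <= adj_left[u] for u in left)


def is_maximal_biclique(adj_left, left, right):
    """True if (left, right) is a biclique that cannot be extended."""
    if not left or not right or not is_biclique(adj_left, left, right):
        return False
    # All left vertices adjacent to every member of `right`.
    left_closure = {u for u, nbrs in adj_left.items() if right <= nbrs}
    # All right vertices adjacent to every member of `left`.
    right_closure = set.intersection(*(adj_left[u] for u in left))
    return left_closure == set(left) and right_closure == set(right)
```

A biclique is maximal exactly when both closures return the sets that were passed in, which means no vertex on either side can be added without breaking full connectivity.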