197 research outputs found
Community Detection in Networks using Bio-inspired Optimization: Latest Developments, New Results and Perspectives with a Selection of Recent Meta-Heuristics
Detecting groups within a set of interconnected nodes is a widely addressed problem that can model a diversity of applications. Unfortunately, detecting the optimal partition of a network is a computationally demanding task, usually conducted by means of optimization methods. Among them, randomized search heuristics have been proven to be efficient approaches. This manuscript is devoted to providing an overview of community detection problems from the perspective of bio-inspired computation. To this end, we first review the recent history of this research area, placing emphasis on milestone studies contributed in the last five years. Next, we present an extensive experimental study to assess the performance of a selection of modern heuristics over weighted directed network instances. Specifically, we combine seven global search heuristics based on two different similarity metrics and eight heterogeneous search operators designed ad-hoc. We compare our methods with six different community detection techniques over a benchmark of 17 Lancichinetti-Fortunato-Radicchi network instances. Ranking statistics of the tested algorithms reveal that the proposed methods perform competitively, but the high variability of the rankings leads to the main conclusion: no clear winner can be declared. This finding aligns with community detection tools available in the literature that hinge on a sequential application of different algorithms in search for the best performing counterpart. We end our research by sharing our envisioned status of this area, for which we identify challenges and opportunities which should stimulate research efforts in years to come
Optimizing community detection in social networks using antlion and K-median
Antlion Optimization (ALO) is one of the latest population-based optimization methods and has performed well in a variety of applications. The ALO algorithm mimics the mechanism by which antlions hunt ants in nature. Community detection in social networks is essential to understanding the concepts of the networks. Identifying network communities can be viewed as a problem of clustering a set of nodes into communities. k-median clustering is one of the popular techniques that has been applied in clustering. The problem of clustering a network can be formalized as an optimization problem in which a quality objective function, one that captures the intuition of a cluster as a set of nodes with better internal connectivity than external connectivity, is selected to be optimized. In this paper, a hybrid of antlion optimization and k-median for solving the community detection problem is proposed and named K-median Modularity ALO. Experimental results on real-life networks show that the hybrid of antlion optimization and k-median can successfully detect an optimized community structure using modularity as the objective function
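The modularity objective mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the standard Newman-Girvan modularity for an undirected, unweighted graph without self-loops; the function name `modularity` and the edge-list and label-dictionary inputs are illustrative choices, not taken from the paper, and the sketch does not reproduce the K-median Modularity ALO search itself.

```python
from collections import defaultdict


def modularity(edges, community_of):
    """Newman-Girvan modularity Q of a partition (illustrative sketch).

    edges: list of (u, v) pairs of an undirected, unweighted graph.
    community_of: dict mapping every node to its community label.
    """
    m = len(edges)
    if m == 0:
        return 0.0
    degree = defaultdict(int)
    internal = defaultdict(int)  # edges with both endpoints in the same community
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        if community_of[u] == community_of[v]:
            internal[community_of[u]] += 1
    degree_sum = defaultdict(int)  # total degree per community
    for node, d in degree.items():
        degree_sum[community_of[node]] += d
    # Q = sum over communities of (fraction of internal edges
    #     minus the expected fraction under random rewiring).
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2
        for c in degree_sum
    )


# Two triangles joined by a single bridge edge.
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {n: "a" for n in range(6)}
q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

An optimizer would evaluate a score of this kind for each candidate partition and keep the partitions that score higher; here the two-triangle split scores higher than placing all nodes in one community.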
Reading the news through its structure: new hybrid connectivity based approaches
In this thesis a solution for the problem of identifying the structure of news published
by online newspapers is presented. This problem requires new approaches and algorithms
that are capable of dealing with the massive number of online publications in existence
(and that will grow in the future). The fact that news documents present a high degree of
interconnection makes this an interesting and hard problem to solve. The identification
of the structure of the news is accomplished both by descriptive methods that expose the
dimensionality of the relations between different news, and by clustering the news into
topic groups. To achieve this analysis this integrated whole was studied using different
perspectives and approaches.
In the identification of news clusters and structure, and after a preparatory data collection
phase, where several online newspapers from different parts of the globe were
collected, two newspapers were chosen in particular: the Portuguese daily newspaper
Público and the British newspaper The Guardian.
In the first case, it was shown how information theory (namely variation of information)
combined with adaptive networks was able to identify topic clusters in the news published
by the Portuguese online newspaper Público.
In the second case, the structure of news published by the British newspaper The
Guardian is revealed through the construction of time series of news clustered by a k-means
process. After this approach an unsupervised algorithm, that filters out irrelevant
news published online by taking into consideration the connectivity of the news labels
entered by the journalists, was developed. This novel hybrid technique is based on Q-analysis
for the construction of the filtered network followed by a clustering technique to
identify the topical clusters. Presently this work uses a modularity optimisation clustering technique but this step is general enough that other hybrid approaches can be used without
losing generality.
A novel second order swarm intelligence algorithm based on Ant Colony Systems
was developed for the travelling salesman problem that is consistently better than the
traditional benchmarks. This algorithm is used to construct Hamiltonian paths over the
news published using the eccentricity of the different documents as a measure of distance.
This approach allows for an easy navigation between published stories that is dependent
on the connectivity of the underlying structure.
The results presented in this work show the importance of taking topic detection in
large corpora as a multitude of relations and connectivities that are not in a static state.
They also influence the way of looking at multi-dimensional ensembles, by showing that
the inclusion of the high dimension connectivities gives better results to solving a particular
problem as was the case in the clustering problem of the news published online.
In this work we solve the problem of identifying the structure of news published
online by newspapers and news agencies. This problem requires new approaches and
algorithms capable of dealing with the growing number of online publications
(which is expected to keep growing in the future). This fact, together with the high
degree of interconnection that news items present, makes this an interesting
and hard problem to solve. The structure of the news system was identified
both through descriptive methods that expose the dimension of the
relations between different news items and through algorithms that cluster
them into topics. To achieve this goal it was necessary to study this
complex system from different perspectives and approaches.
After a preparatory phase of building the data corpus, in which several newspapers
published online were collected, two newspapers were chosen in particular: Público and The Guardian.
The choice of newspapers in different languages stems from the wish to find analysis strategies
that are independent of prior knowledge about these systems.
In a first analysis, an approach based on adaptive networks
and information theory (namely variation of information) is used to identify news topics
published in the Portuguese newspaper Público.
In a second approach, we analyse the structure of the news published by the British
newspaper The Guardian through the construction of time series of news. These were
then clustered through a k-means process. In addition, an algorithm was developed
that filters out, in an unsupervised way, irrelevant news items that
have low connectivity to the remaining news, using Q-analysis
followed by a clustering process. Presently this method uses modularity optimisation, but the technique is general enough that other hybrid approaches
can be used without loss of generality of the method.
A new algorithm based on ant colony systems was also developed
for solving the travelling salesman problem, which consistently gives results
better than the traditional benchmarks. This algorithm was applied to construct
Hamiltonian paths over the published news, using the eccentricity obtained
from the connectivity of the studied system as the measure of distance between news items. This
approach made it possible to build a navigation system between published news that
depends on the connectivity observed in the news structure found.
The results presented in this work show the importance of analysing
complex systems in their multitude of relations and connectivities, which are not static and which
influence the way multi-dimensional systems are traditionally viewed.
It is shown that including these extra dimensions produces better results in solving
the problem of identifying the structure underlying online news publication
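The thesis abstract above names variation of information as its information-theoretic measure. As a minimal sketch of how that quantity compares two clusterings of the same items, the following assumes the standard definition VI = H(A) + H(B) - 2 I(A; B) with natural logarithms; the function name and list-of-labels input are hypothetical, and the adaptive-network procedure of the thesis is not reproduced.

```python
import math
from collections import Counter


def variation_of_information(labels_a, labels_b):
    """Variation of information between two clusterings (in nats).

    labels_a[i] and labels_b[i] are the cluster labels that the two
    clusterings assign to item i.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("both clusterings must label the same items")
    n = len(labels_a)
    if n == 0:
        return 0.0
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))
    vi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        # Equivalent to H(A|B) + H(B|A), accumulated cell by cell.
        vi -= p_ab * (math.log(p_ab / p_a) + math.log(p_ab / p_b))
    return vi


# Same grouping under different label names: distance is zero.
vi_same = variation_of_information([0, 0, 1, 1], [1, 1, 0, 0])
# Two statistically independent two-way splits of four items.
vi_independent = variation_of_information([0, 0, 1, 1], [0, 1, 0, 1])
```

The measure is zero when the two clusterings agree up to relabelling and grows as they share less information, which is what makes it usable for comparing topic clusters.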
An Order-based Algorithm for Minimum Dominating Set with Application in Graph Mining
A dominating set is a set of vertices of a graph such that all other vertices
have a neighbour in the dominating set. We propose a new order-based randomised
local search (RLS) algorithm to solve the minimum dominating set problem in
large graphs. Experimental evaluation is presented for multiple types of
problem instances. These instances include unit disk graphs, which represent a
model of wireless networks, random scale-free networks, as well as samples from
two social networks and real-world graphs studied in network science. Our
experiments indicate that RLS performs better than both a classical greedy
approximation algorithm and two metaheuristic algorithms based on ant colony
optimisation and local search. The order-based algorithm is able to find small
dominating sets for graphs with tens of thousands of vertices. In addition, we
propose a multi-start variant of RLS that is suitable for solving the
minimum weight dominating set problem. The application of RLS in graph
mining is also briefly demonstrated
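To make the definition of a dominating set concrete, here is a minimal sketch of a greedy heuristic of the kind the abstract uses as a baseline. It is not the order-based RLS algorithm proposed in the paper; the function names and the adjacency-dictionary input are assumptions made for illustration.

```python
def greedy_dominating_set(adjacency):
    """Greedy heuristic: repeatedly pick the vertex that dominates
    the most vertices that are still undominated.

    adjacency: dict mapping each vertex to an iterable of its neighbours
    (undirected graph, so every edge appears in both directions).
    """
    undominated = set(adjacency)
    chosen = set()
    while undominated:
        # A vertex dominates itself and all of its neighbours.
        best = max(
            adjacency,
            key=lambda v: len(({v} | set(adjacency[v])) & undominated),
        )
        chosen.add(best)
        undominated -= {best} | set(adjacency[best])
    return chosen


def is_dominating(adjacency, candidate):
    """Check that every vertex is in the set or has a neighbour in it."""
    return all(
        v in candidate or any(u in candidate for u in adjacency[v])
        for v in adjacency
    )


# Star graph: the centre alone dominates every leaf.
star = {0: [1, 2, 3, 4], 1: [0], 2: [0], 3: [0], 4: [0]}
# Path graph on five vertices.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
star_set = greedy_dominating_set(star)
path_set = greedy_dominating_set(path)
```

The loop always terminates because any undominated vertex dominates at least itself, so each iteration shrinks the undominated set.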
Real-time big data processing for anomaly detection : a survey
The advent of connected devices and the omnipresence of the Internet have paved the way for intruders to attack networks, which leads to cyber-attacks, financial loss, information theft in healthcare, and cyber war. Hence, network security analytics has become an important area of concern and has gained intensive attention among researchers of late, specifically in the domain of anomaly detection in networks, which is considered crucial for network security. However, preliminary investigations have revealed that the existing approaches to detecting anomalies in networks are not effective enough, particularly for detecting them in real time. The inefficacy of current approaches is mainly due to the amassment of massive volumes of data through the connected devices. Therefore, it is crucial to propose a framework that effectively handles real-time big data processing and detects anomalies in networks. In this regard, this paper attempts to address the issue of detecting anomalies in real time. Accordingly, this paper surveys the state-of-the-art real-time big data processing technologies related to anomaly detection and the vital characteristics of associated machine learning algorithms. This paper begins with an explanation of the essential contexts and taxonomy of real-time big data processing, anomaly detection, and machine learning algorithms, followed by a review of big data processing technologies. Finally, the identified research challenges of real-time big data processing in anomaly detection are discussed. © 2018 Elsevier Lt
RecMem: Time Aware Recommender Systems Based on Memetic Evolutionary Clustering Algorithm
Nowadays, recommendation is an important task in the decision-making process for selecting items, especially when the item space is large, diverse, and constantly updating. As a challenge in recent systems, the preferences and interests of users change over time, and existing recommender systems do not evolve optimal clustering with sufficient accuracy over time. Moreover, the behavior history of the users is determined by their neighbours. The purpose of the time parameter for this system is to extend the time-based priority. This paper presents a time-aware recommender system based on a memetic evolutionary clustering algorithm, called RecMem. In this system, clusters evolve over time using the memetic evolutionary algorithm, the best clusters are extracted at every timestamp, and the memetic algorithm is improved using the chaos criterion. The system provides appropriate suggestions to the user based on optimum clustering. The system uses optimal evolutionary clustering with item attributes for the cold-start item problem and with demographic information for the cold-start user problem. The results show that the proposed method has an accuracy of approximately 0.95, which is more effective than existing systems
Graph-Transformational Swarms : A Graph-Transformational Approach to Swarm Computation
Computer systems are becoming increasingly distributed and interconnected. Various emerging notions, such as smart grids, system of systems, industry 4.0 or cyber-physical systems have gained more and more importance during the last few years. All of them propose to solve engineering problems by using several autonomous components that act in parallel and are interconnected, foremost using Internet technologies. These emerging concepts look very promising, but also exhibit various technical challenges. For instance, how is it possible to develop decentralized control mechanisms that produce a desired emerging behavior to solve a given task, or how to model such solutions in order to analyze their behavior in terms of complexity and correctness? These are two major questions that this thesis attempts to answer. Indeed, it provides graph-transformational swarms as a novel concept that combines the ideas and principles of swarms and swarm computing and the formal methods of graph transformation to model distributed systems. Graph-transformational swarms capture the advantages of swarms and swarm computing and of graph transformation
Finding maximal bicliques in bipartite networks using node similarity
In real-world complex networks, communities are usually both overlapping and hierarchical. A very important class of complex networks is bipartite networks. Maximal bicliques are the strongest possible structural communities within them. Here we consider overlapping communities in bipartite networks and propose a method that detects an order-limited number of overlapping maximal bicliques covering the network. We formalise a measure of relative community strength by which communities can be categorised, compared and ranked. There are very few real bipartite datasets for which any external ground truth about overlapping communities is known. Here we test three such datasets. We categorise and rank the maximal biclique communities found by our algorithm according to our measure of strength. Deeper analysis of these bicliques shows they accord with ground truth and give useful additional insight. Based on this we suggest our algorithm can find true communities at the first level of a hierarchy. We add a heuristic merging stage to the maximal biclique algorithm to produce a second level hierarchy with fewer communities and obtain positive results when compared with other overlapping community detection algorithms for bipartite networks
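As a small illustration of what "maximal biclique" means in the abstract above, the sketch below checks whether a pair of node sets forms one. It assumes the usual definition (every left node adjacent to every right node, with no further node that could be added on either side); the function names and the adjacency dictionary are hypothetical, and the node-similarity detection method of the paper is not reproduced.

```python
def common_neighbours(adjacency, nodes):
    """Nodes adjacent to every node in `nodes` (empty set if none given)."""
    neighbour_sets = [set(adjacency[n]) for n in nodes]
    return set.intersection(*neighbour_sets) if neighbour_sets else set()


def is_maximal_biclique(adjacency, left, right):
    """True if (left, right) is a maximal biclique of a bipartite graph.

    adjacency maps every node, on both sides, to its neighbours.
    The pair is a maximal biclique exactly when each side equals the
    set of common neighbours of the other side.
    """
    left, right = set(left), set(right)
    if not left or not right:
        return False
    return (
        common_neighbours(adjacency, left) == right
        and common_neighbours(adjacency, right) == left
    )


# Small bipartite graph: u-nodes on one side, v-nodes on the other.
graph = {
    "u1": ["v1", "v2"],
    "u2": ["v1", "v2"],
    "u3": ["v2"],
    "v1": ["u1", "u2"],
    "v2": ["u1", "u2", "u3"],
}
full_block = is_maximal_biclique(graph, {"u1", "u2"}, {"v1", "v2"})
extendable = is_maximal_biclique(graph, {"u1"}, {"v1", "v2"})
```

In this example the pair ({u1, u2}, {v1, v2}) is maximal, while ({u1}, {v1, v2}) is a biclique that can still be extended by u2, so the check rejects it.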