6,577 research outputs found
An optimisation tool for robust community detection algorithms using content and topology information
With the recent prevalence of information networks, the topic of community detection has gained much interest among researchers. In real-world networks, node attribute (content information) is also available in addition to topology information. However, the collected topology information for networks is usually noisy when there are missing edges. Furthermore, the existing community detection methods generally focus on topology information and largely ignore the content information. This makes the task of community detection for incomplete networks very challenging. A new method is proposed that seeks to address this issue and help improve the performance of the existing community detection algorithms by considering both sources of information, i.e. topology and content. Empirical results demonstrate that our proposed method is robust and can detect more meaningful community structures within networks having incomplete information, than the conventional methods that consider only topology information
Reading the news through its structure: new hybrid connectivity based approaches
In this thesis a solution for the problem of identifying the structure of news published
by online newspapers is presented. This problem requires new approaches and algorithms
that are capable of dealing with the massive number of online publications in existence
(and that will grow in the future). The fact that news documents present a high degree of
interconnection makes this an interesting and hard problem to solve. The identification
of the structure of the news is accomplished both by descriptive methods that expose the
dimensionality of the relations between different news, and by clustering the news into
topic groups. To achieve this analysis this integrated whole was studied using different
perspectives and approaches.
In the identification of news clusters and structure, and after a preparatory data collection
phase, where several online newspapers from different parts of the globe were
collected, two newspapers were chosen in particular: the Portuguese daily newspaper
PĂşblico and the British newspaper The Guardian.
In the first case, it was shown how information theory (namely variation of information)
combined with adaptive networks was able to identify topic clusters in the news published
by the Portuguese online newspaper PĂşblico.
In the second case, the structure of news published by the British newspaper The
Guardian is revealed through the construction of time series of news clustered by a kmeans
process. After this approach an unsupervised algorithm, that filters out irrelevant
news published online by taking into consideration the connectivity of the news labels
entered by the journalists, was developed. This novel hybrid technique is based on Qanalysis
for the construction of the filtered network followed by a clustering technique to
identify the topical clusters. Presently this work uses a modularity optimisation clustering technique but this step is general enough that other hybrid approaches can be used without
losing generality.
A novel second order swarm intelligence algorithm based on Ant Colony Systems
was developed for the travelling salesman problem that is consistently better than the
traditional benchmarks. This algorithm is used to construct Hamiltonian paths over the
news published using the eccentricity of the different documents as a measure of distance.
This approach allows for an easy navigation between published stories that is dependent
on the connectivity of the underlying structure.
The results presented in this work show the importance of taking topic detection in
large corpora as a multitude of relations and connectivities that are not in a static state.
They also influence the way of looking at multi-dimensional ensembles, by showing that
the inclusion of the high dimension connectivities gives better results to solving a particular
problem as was the case in the clustering problem of the news published online.Neste trabalho resolvemos o problema da identificação da estrutura das notĂcias publicadas
em linha por jornais e agĂŞncias noticiosas. Este problema requer novas abordagens e
algoritmos que sejam capazes de lidar com o número crescente de publicações em linha
(e que se espera continuam a crescer no futuro). Este facto, juntamente com o elevado
grau de interconexĂŁo que as notĂcias apresentam tornam este problema num problema
interessante e de difĂcil resolução. A identificação da estrutura do sistema de notĂcias foi
conseguido quer através da utilização de métodos descritivos que expõem a dimensão das
relações existentes entre as diferentes notĂcias, quer atravĂ©s de algoritmos de agrupamento
das mesmas em tópicos. Para atingir este objetivo foi necessário proceder a ao estudo deste
sistema complexo sob diferentes perspectivas e abordagens.
ApĂłs uma fase preparatĂłria do corpo de dados, onde foram recolhidos diversos jornais
publicados online optou-se por dois jornais em particular: O PĂşblico e o The Guardian.
A escolha de jornais em lĂnguas diferentes deve-se Ă vontade de encontrar estratĂ©gias de
análise que sejam independentes do conhecimento prévio que se tem sobre estes sistemas.
Numa primeira análise é empregada uma abordagem baseada em redes adaptativas
e teoria de informação (nomeadamente variação de informação) para identificar tópicos
noticiosos que sĂŁo publicados no jornal portuguĂŞs PĂşblico.
Numa segunda abordagem analisamos a estrutura das notĂcias publicadas pelo jornal
Britânico The Guardian atravĂ©s da construção de sĂ©ries temporais de notĂcias. Estas foram
seguidamente agrupadas através de um processo de k-means. Para além disso desenvolveuse
um algoritmo que permite filtrar de forma nĂŁo supervisionada notĂcias irrelevantes que
apresentam baixa conectividade Ă s restantes notĂcias atravĂ©s da utilização de Q-analysis
seguida de um processo de clustering. Presentemente este mĂ©todo utiliza otimização de modularidade, mas a tĂ©cnica Ă© suficientemente geral para que outras abordagens hĂbridas
possam ser utilizadas sem perda de generalidade do método.
Desenvolveu-se ainda um novo algoritmo baseado em sistemas de colĂłnias de formigas
para solução do problema do caixeiro viajante que consistentemente apresenta resultados
melhores que os tradicionais bancos de testes. Este algoritmo foi aplicado na construção
de caminhos Hamiltonianos das notĂcias publicadas utilizando a excentricidade obtida a
partir da conectividade do sistema estudado como medida da distância entre notĂcias. Esta
abordagem permitiu construir um sistema de navegação entre as notĂcias publicadas que Ă©
dependente da conectividade observada na estrutura de notĂcias encontrada.
Os resultados apresentados neste trabalho mostram a importância de analisar sistemas
complexos na sua multitude de relações e conectividades que não são estáticas e que
influenciam a forma como tradicionalmente se olha para sistema multi-dimensionais.
Mostra-se que a inclusão desta dimensões extra produzem melhores resultados na resolução
do problema de identificar a estrutura subjacente a este problema da publicação de notĂcias em linha
AI Solutions for MDS: Artificial Intelligence Techniques for Misuse Detection and Localisation in Telecommunication Environments
This report considers the application of Articial Intelligence (AI) techniques to
the problem of misuse detection and misuse localisation within telecommunications
environments. A broad survey of techniques is provided, that covers inter alia
rule based systems, model-based systems, case based reasoning, pattern matching,
clustering and feature extraction, articial neural networks, genetic algorithms, arti
cial immune systems, agent based systems, data mining and a variety of hybrid
approaches. The report then considers the central issue of event correlation, that
is at the heart of many misuse detection and localisation systems. The notion of
being able to infer misuse by the correlation of individual temporally distributed
events within a multiple data stream environment is explored, and a range of techniques,
covering model based approaches, `programmed' AI and machine learning
paradigms. It is found that, in general, correlation is best achieved via rule based approaches,
but that these suffer from a number of drawbacks, such as the difculty of
developing and maintaining an appropriate knowledge base, and the lack of ability
to generalise from known misuses to new unseen misuses. Two distinct approaches
are evident. One attempts to encode knowledge of known misuses, typically within
rules, and use this to screen events. This approach cannot generally detect misuses
for which it has not been programmed, i.e. it is prone to issuing false negatives.
The other attempts to `learn' the features of event patterns that constitute normal
behaviour, and, by observing patterns that do not match expected behaviour, detect
when a misuse has occurred. This approach is prone to issuing false positives,
i.e. inferring misuse from innocent patterns of behaviour that the system was not
trained to recognise. Contemporary approaches are seen to favour hybridisation,
often combining detection or localisation mechanisms for both abnormal and normal
behaviour, the former to capture known cases of misuse, the latter to capture
unknown cases. In some systems, these mechanisms even work together to update
each other to increase detection rates and lower false positive rates. It is concluded
that hybridisation offers the most promising future direction, but that a rule or state
based component is likely to remain, being the most natural approach to the correlation
of complex events. The challenge, then, is to mitigate the weaknesses of
canonical programmed systems such that learning, generalisation and adaptation
are more readily facilitated
State-of-the-art in aerodynamic shape optimisation methods
Aerodynamic optimisation has become an indispensable component for any aerodynamic design over the past 60 years, with applications to aircraft, cars, trains, bridges, wind turbines, internal pipe flows, and cavities, among others, and is thus relevant in many facets of technology. With advancements in computational power, automated design optimisation procedures have become more competent, however, there is an ambiguity and bias throughout the literature with regards to relative performance of optimisation architectures and employed algorithms. This paper provides a well-balanced critical review of the dominant optimisation approaches that have been integrated with aerodynamic theory for the purpose of shape optimisation. A total of 229 papers, published in more than 120 journals and conference proceedings, have been classified into 6 different optimisation algorithm approaches. The material cited includes some of the most well-established authors and publications in the field of aerodynamic optimisation. This paper aims to eliminate bias toward certain algorithms by analysing the limitations, drawbacks, and the benefits of the most utilised optimisation approaches. This review provides comprehensive but straightforward insight for non-specialists and reference detailing the current state for specialist practitioners
- …