
67 research outputs found

Minimising Entropy Changes in Dynamic Network Evolution

Author: Hancock Edwin R.
Wang Jianjia
Wilson Richard C.
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

Adaptive feature selection based on the most informative graph-based features

Author: Bai Lu
Cui Lixin
Hancock Edwin R.
Jiao Yuhang
Rossi Luca
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2017

In this paper, we propose a novel method to adaptively select the most informative and least redundant feature subset, which has strong discriminating power with respect to the target label. Unlike most traditional methods using vectorial features, our proposed approach is based on graph-based features and thus incorporates the relationships between feature samples into the feature selection process. To efficiently encapsulate the main characteristics of the graph-based features, we probe each graph structure using the steady state random walk and compute a probability distribution of the walk visiting the vertices. Furthermore, we propose a new information theoretic criterion to measure the joint relevance of different pairwise feature combinations with respect to the target feature, through the Jensen-Shannon divergence measure between the probability distributions from the random walk on different graphs. By solving a quadratic programming problem, we use the new measure to automatically locate the subset of the most informative features that have both low redundancy and strong discriminating power. Unlike most existing state-of-the-art feature selection methods, the proposed information theoretic feature selection method can accommodate both continuous and discrete target features. Experiments on the problem of P2P lending platforms in China demonstrate the effectiveness of the proposed method.
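The two building blocks named in this abstract, the steady state random walk distribution and the Jensen-Shannon divergence, can be illustrated with a minimal sketch. It assumes undirected, connected graphs on a shared vertex set, given as adjacency matrices, for which the stationary probability of a vertex is its degree divided by the total degree. The function names and the two toy graphs are hypothetical and are not taken from the paper, and the sketch does not reproduce the paper's relevance criterion or its quadratic program.

```python
import math

def steady_state(adj):
    """Stationary distribution of a random walk on an undirected graph.

    For a connected undirected graph, p(v) = degree(v) / sum of all degrees.
    """
    degrees = [sum(row) for row in adj]
    total = sum(degrees)
    return [d / total for d in degrees]

def shannon_entropy(p):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(x * math.log(x) for x in p if x > 0)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence: H((p + q) / 2) - (H(p) + H(q)) / 2."""
    mixture = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(mixture) - (shannon_entropy(p) + shannon_entropy(q)) / 2

# Two toy graphs on the same three vertices (illustrative data only).
path = [[0, 1, 0],
        [1, 0, 1],
        [0, 1, 0]]
triangle = [[0, 1, 1],
            [1, 0, 1],
            [1, 1, 0]]

p = steady_state(path)      # the walk spends more time on the middle vertex
q = steady_state(triangle)  # uniform, since every vertex has the same degree
jsd = jensen_shannon(p, q)
print(p, q, jsd)
```

With natural logarithms the divergence is symmetric and lies between 0 and log 2, which makes it convenient as a bounded dissimilarity between the walks on two graphs.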

Aston Publications Explorer

White Rose Research Online

Designing labeled graph classifiers by exploiting the Rényi entropy of the dissimilarity representation

Author: Livi Lorenzo
Publication venue: 'MDPI AG'
Publication date: 20/04/2017

Representing patterns as labeled graphs is becoming increasingly common in the broad field of computational intelligence. Accordingly, a wide repertoire of pattern recognition tools, such as classifiers and knowledge discovery procedures, are nowadays available and tested for various datasets of labeled graphs. However, the design of effective learning procedures operating in the space of labeled graphs is still a challenging problem, especially from the computational complexity viewpoint. In this paper, we present a major improvement of a general-purpose classifier for graphs, which is built on an interplay between dissimilarity representation, clustering, information-theoretic techniques, and evolutionary optimization algorithms. The improvement focuses on a specific key subroutine devised to compress the input data. We prove different theorems which are fundamental to the setting of the parameters controlling such a compression operation. We demonstrate the effectiveness of the resulting classifier by benchmarking the developed variants on well-known datasets of labeled graphs, considering as distinct performance indicators the classification accuracy, computing time, and parsimony in terms of structural complexity of the synthesized classification models. The results show state-of-the-art standards in terms of test set accuracy and a considerable speed-up in computing time.

arXiv.org e-Print Archive

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Information Theoretic Graph Kernels

Author: Bai Lu
Publication venue: University of York
Publication date: 04/05/2014

This thesis addresses the problems that arise in state-of-the-art structural learning methods for (hyper)graph classification or clustering, particularly focusing on developing novel information theoretic kernels for graphs. To this end, we commence in Chapter 3 by defining a family of Jensen-Shannon diffusion kernels, i.e., the information theoretic kernels, for (un)attributed graphs. We show that our kernels overcome the shortcomings of inefficiency (for the unattributed diffusion kernel) and discarding un-isomorphic substructures (for the attributed diffusion kernel) that arise in the R-convolution kernels. In Chapter 4, we present a novel framework of computing depth-based complexity traces rooted at the centroid vertices for graphs, which can be efficiently computed for graphs with large sizes. We show that our methods can characterize a graph in a higher dimensional complexity feature space than state-of-the-art complexity measures. In Chapter 5, we develop a novel unattributed graph kernel by matching the depth-based substructures in graphs, based on the contribution in Chapter 4. Unlike most existing graph kernels in the literature which merely enumerate similar substructure pairs of limited sizes, our method incorporates explicit local substructure correspondence into the process of kernelization. The new kernel thus overcomes the shortcoming of neglecting structural correspondence that arises in most state-of-the-art graph kernels. The novel methods developed in Chapters 3, 4, and 5 are only restricted to graphs. However, real-world data usually tends to be represented by higher order relationships (i.e., hypergraphs). To overcome the shortcoming, in Chapter 6 we present a new hypergraph kernel using substructure isomorphism tests. We show that our kernel limits tottering that arises in the existing walk and subtree based (hyper)graph kernels. In Chapter 7, we summarize the contributions of this thesis. Furthermore, we analyze the proposed methods. 
Finally, we give some suggestions for future work.

White Rose E-theses Online

Structural Data Recognition with Graph Model Boosting

Author: Miyazaki Tomo
Omachi Shinichiro
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 07/03/2017

This paper presents a novel method for structural data recognition using a large number of graph models. In general, prevalent methods for structural data recognition have two shortcomings: 1) Only a single model is used to capture structural variation. 2) Naive recognition methods are used, such as the nearest neighbor method. In this paper, we propose strengthening the recognition performance of these models as well as their ability to capture structural variation. The proposed method constructs a large number of graph models and trains decision trees using the models. This paper makes two main contributions. The first is a novel graph model that can quickly perform calculations, which allows us to construct several models in a feasible amount of time. The second contribution is a novel approach to structural data recognition: graph model boosting. Comprehensive structural variations can be captured with a large number of graph models constructed in a boosting framework, and a sophisticated classifier can be formed by aggregating the decision trees. Consequently, we can carry out structural data recognition with powerful recognition capability in the face of comprehensive structural variation. The experiments show that the proposed method achieves impressive results and outperforms existing methods on datasets from the IAM graph database repository.

arXiv.org e-Print Archive

Tohoku University Repository (TOUR) / 東北大学機関リポジトリ

Identifying the most informative features using a structurally interacting elastic net

Author: Bai Lu
Cui Lixin
Hancock Edwin R.
Wang Yue
Zhang Zhihong
Publication venue: 'Elsevier BV'
Publication date: 01/01/2018

Feature selection can efficiently identify the most informative features with respect to the target feature used in training. However, state-of-the-art vector-based methods are unable to encapsulate the relationships between feature samples into the feature selection process, thus leading to significant information loss. To address this problem, we propose a new graph-based structurally interacting elastic net method for feature selection. Specifically, we commence by constructing feature graphs that can incorporate pairwise relationship between samples. With the feature graphs to hand, we propose a new information theoretic criterion to measure the joint relevance of different pairwise feature combinations with respect to the target feature graph representation. This measure is used to obtain a structural interaction matrix where the elements represent the proposed information theoretic measure between feature pairs. We then formulate a new optimization model through the combination of the structural interaction matrix and an elastic net regression model for the feature subset selection problem. This allows us to (a) preserve the information of the original vectorial space, (b) remedy the information loss of the original feature space caused by using graph representation, and (c) promote a sparse solution and also encourage correlated features to be selected. Because the proposed optimization problem is non-convex, we develop an efficient alternating direction multiplier method (ADMM) to locate the optimal solutions. Extensive experiments on various datasets demonstrate the effectiveness of the proposed method

arXiv.org e-Print Archive

White Rose Research Online

Graph Embedding Using Frequency Filtering

Author: Bahonar Hoda
Mirzaei Abdolreza
Sadri Saeed
Wilson Richard Charles
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2021

The aim of graph embedding is to embed graphs in a vector space such that the embedded feature vectors reflect the differences and similarities of the source graphs. In this paper, a novel method named Frequency Filtering Embedding (FFE) is proposed, which uses the graph Fourier transform and frequency filtering as a graph Fourier domain operator for graph feature extraction. Frequency filtering amplifies or attenuates selected frequencies using appropriate filter functions. Here, heat, anti-heat, part-sine and identity filter sets are proposed as the filter functions. A generalized version of FFE named GeFFE is also proposed by defining pseudo-Fourier operators. This method can be considered as a general framework for formulating some previously defined invariants in other works by choosing a suitable filter bank and defining suitable pseudo-Fourier operators. Unlike previous spectral embedding methods, this flexibility allows GeFFE to adapt itself to the properties of each graph dataset and leads to superior classification accuracy relative to the others. Using the proposed part-sine filter set, whose members filter different parts of the spectrum in turn, improves the classification accuracy of the GeFFE method. Additionally, GeFFE resolves the cospectrality problem entirely in the tested datasets.
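For readers unfamiliar with the terminology, the standard textbook form of graph frequency filtering can be written as below. This is a sketch under common conventions, assuming the combinatorial Laplacian L = D - A of an undirected graph and a signal x on its vertices. The symbols U, Λ, h and t are introduced here for illustration, and the exact filter definitions used in the paper (including its anti-heat and part-sine sets) may differ.

```latex
% Eigendecomposition of the graph Laplacian (U orthonormal, \Lambda diagonal)
L = U \Lambda U^{\top}, \qquad \Lambda = \operatorname{diag}(\lambda_1, \dots, \lambda_n)

% Graph Fourier transform of a vertex signal x, and its inverse
\hat{x} = U^{\top} x, \qquad x = U \hat{x}

% Filtering: scale each frequency component by a filter function h
y = U \, h(\Lambda) \, U^{\top} x, \qquad h(\Lambda) = \operatorname{diag}\bigl(h(\lambda_1), \dots, h(\lambda_n)\bigr)

% Heat filter with scale t > 0: attenuates high frequencies (large \lambda)
h_t(\lambda) = e^{-t \lambda}
```

The identity filter corresponds to h(λ) = 1, which leaves the signal unchanged.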

Crossref

White Rose Research Online

A lightweight, graph-theoretic model of class-based similarity to support object-oriented code reuse.

Author: MacLean Angus
Publication date: 31/01/2003

The work presented in this thesis is principally concerned with the development of a method and set of tools designed to support the identification of class-based similarity in collections of object-oriented code. Attention is focused on enhancing the potential for software reuse in situations where a reuse process is either absent or informal, and the characteristics of the organisation are unsuitable, or resources unavailable, to promote and sustain a systematic approach to reuse. The approach builds on the definition of a formal, attributed, relational model that captures the inherent structure of class-based, object-oriented code. Based on code-level analysis, it relies solely on the structural characteristics of the code and the peculiarly object-oriented features of the class as an organising principle: classes, those entities comprising a class, and the intra and inter-class relationships existing between them, are significant factors in defining a two-phase similarity measure as a basis for the comparison process. Established graph-theoretic techniques are adapted and applied via this model to the problem of determining similarity between classes. This thesis illustrates a successful transfer of techniques from the domains of molecular chemistry and computer vision. Both domains provide an existing template for the analysis and comparison of structures as graphs. The inspiration for representing classes as attributed relational graphs, and the application of graph-theoretic techniques and algorithms to their comparison, arose out of a well-founded intuition that a common basis in graph-theory was sufficient to enable a reasonable transfer of these techniques to the problem of determining similarity in object-oriented code. The practical application of this work relates to the identification and indexing of instances of recurring, class-based, common structure present in established and evolving collections of object-oriented code. 
A classification so generated additionally provides a framework for class-based matching over an existing code-base, both from the perspective of newly introduced classes, and search "templates" provided by those incomplete, iteratively constructed and refined classes associated with current and on-going development. The tools and techniques developed here provide support for enabling and improving shared awareness of reuse opportunity, based on analysing structural similarity in past and ongoing development, tools and techniques that can in turn be seen as part of a process of domain analysis, capable of stimulating the evolution of a systematic reuse ethic

Open Access Institutional Repository at Robert Gordon University

Mining subjectively interesting patterns in rich data

Author: Deng Junning
Publication venue: Universiteit Gent. Faculteit Ingenieurswetenschappen en Architectuur
Publication date: 01/01/2021

Ghent University Academic Bibliography

Diffusion Wavelet Embedding: a Multi-resolution Approach for Graph Embedding in Vector Space

Author: Bahonar Hoda
Mirzaei Abdolreza
Wilson Richard C.
Publication venue: 'Elsevier BV'
Publication date: 01/09/2017

In this article, we propose a multiscale method of embedding a graph into a vector space using diffusion wavelets. At each scale, we extract a detail subspace and a corresponding lower-scale approximation subspace to represent the graph. Representative features are then extracted at each scale to provide a scale-space description of the graph. The lower scale is constructed using a super-node merging strategy based on nearest neighbor or maximum participation, and the new adjacency matrix is generated using vertex identification. This approach allows the comparison of graphs where the important structural differences may be present at varying scales. Additionally, this method can improve the differentiating power of the embedded vectors, which substantially reduces the possibility of cospectrality typical in spectral methods. The experimental results show that augmenting the graph features with the features of the abstract levels increases the graph classification accuracies in different datasets.

Crossref

White Rose Research Online