248 research outputs found
GRAPES-DD: exploiting decision diagrams for index-driven search in biological graph databases
BACKGROUND: Graphs are mathematical structures widely used for expressing relationships among elements when representing biomedical and biological information. On top of these representations, several analyses are performed. A common task is the search for one substructure within one graph, called the target. The problem is referred to as one-to-one subgraph search, and it is known to be NP-complete. Heuristics and indexing techniques can be applied to facilitate the search. Indexing techniques are also exploited when searching in a collection of target graphs, referred to as the one-to-many subgraph problem. Filter-and-verification methods that use indexing approaches quickly prune target graphs, or parts of them, that do not contain the query. The expensive verification phase is then performed only on the subset of promising targets. Indexing strategies extract graph features at a level of granularity sufficient for a powerful filtering step. Features are stored in data structures that allow efficient access. Index size, querying time and filtering power are key points for the development of efficient subgraph searching solutions.
RESULTS: An existing approach, GRAPES, has been shown to have good performance in terms of speed-up for both one-to-one and one-to-many cases. However, the index it builds is large. For this reason, we propose GRAPES-DD, a modified version of GRAPES in which the indexing structure has been replaced with a Decision Diagram. Decision Diagrams are a broad class of data structures widely used to encode and manipulate functions efficiently. Experiments on biomedical structures and synthetic graphs have confirmed our expectation, showing that GRAPES-DD substantially reduces memory utilization compared to GRAPES without worsening the searching time.
CONCLUSION: The use of Decision Diagrams for searching in biochemical and biological graphs is completely new and potentially promising thanks to their ability to compactly encode sets by exploiting their structure and regularity, and to manipulate entire sets of elements at once instead of exploring each element explicitly. Search strategies based on Decision Diagrams make indexing more affordable, for biochemical graphs and beyond, allowing us to potentially deal with huge and ever-growing collections of biochemical and biological structures.
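The filter-and-verification idea described in this abstract can be illustrated with a small sketch. This is not the GRAPES or GRAPES-DD implementation: the feature choice (label sequences of short simple paths), the plain dictionary index, the brute-force verifier, and all function names below are assumptions made for illustration only, and no Decision Diagram is involved.

```python
from itertools import permutations

def path_features(labels, adj, max_len=3):
    """Count label sequences of simple paths with at most max_len vertices."""
    counts = {}

    def extend(path):
        key = tuple(labels[v] for v in path)
        counts[key] = counts.get(key, 0) + 1
        if len(path) < max_len:
            for nxt in adj[path[-1]]:
                if nxt not in path:
                    extend(path + [nxt])

    for v in adj:
        extend([v])
    return counts

def passes_filter(query_feats, target_feats):
    """A target can contain the query only if it has every query feature
    at least as many times as the query does."""
    return all(target_feats.get(k, 0) >= c for k, c in query_feats.items())

def contains_subgraph(q_labels, q_adj, t_labels, t_adj):
    """Brute-force verification (hypothetical helper): try every injective
    mapping of query vertices to target vertices that preserves labels and edges."""
    q_nodes = list(q_adj)
    for image in permutations(t_adj, len(q_nodes)):
        m = dict(zip(q_nodes, image))
        labels_ok = all(q_labels[u] == t_labels[m[u]] for u in q_nodes)
        edges_ok = all(m[v] in t_adj[m[u]] for u in q_nodes for v in q_adj[u])
        if labels_ok and edges_ok:
            return True
    return False

def search(query, targets):
    q_labels, q_adj = query
    q_feats = path_features(q_labels, q_adj)
    # Index step: features are extracted once per target graph.
    index = {name: path_features(l, a) for name, (l, a) in targets.items()}
    # Filter step: cheap pruning of targets that cannot contain the query.
    candidates = [n for n in targets if passes_filter(q_feats, index[n])]
    # Verification step: expensive check, run only on surviving candidates.
    matches = [n for n in candidates
               if contains_subgraph(q_labels, q_adj, *targets[n])]
    return candidates, matches

# Query: a single edge between an 'A' vertex and a 'B' vertex.
query = ({0: "A", 1: "B"}, {0: [1], 1: [0]})
targets = {
    # g1: triangle A-B-C, contains the query edge.
    "g1": ({0: "A", 1: "B", 2: "C"}, {0: [1, 2], 1: [0, 2], 2: [0, 1]}),
    # g2: path A-C-B, where A and B are not adjacent.
    "g2": ({0: "A", 1: "C", 2: "B"}, {0: [1], 1: [0, 2], 2: [1]}),
}
candidates, matches = search(query, targets)
```

In this toy collection the filter alone discards `g2`, so the exponential verifier runs on a single candidate. In general the filter only rules targets out; a target that passes it may still fail verification.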
Hierarchical stochastic graphlet embedding for graph-based pattern recognition
This is the final version. Available on open access from Springer via the DOI in this record.
Despite being very successful within the pattern recognition and machine learning community, graph-based methods are often unusable with many machine learning tools. This is because most mathematical operations are not available in the graph domain. Graph embedding has been proposed as a way to tackle these difficulties: it maps graphs to a vector space and makes standard machine learning techniques applicable to them. However, it is well known that graph embedding techniques usually suffer from a loss of structural information. In this paper, given a graph, we consider its hierarchical structure for mapping it into a vector space. The hierarchical structure is constructed by topologically clustering the graph nodes and considering each cluster as a node in the upper hierarchical level. Once this hierarchical structure of the graph is constructed, we consider various configurations of its parts and use stochastic graphlet embedding (SGE) for mapping them into vector space. Broadly speaking, SGE produces a distribution of uniformly sampled low to high order graphlets as a way to embed graphs into the vector space. The coarse-to-fine structure of a graph hierarchy and the statistics obtained from the distribution of low to high order stochastic graphlets complement each other and include important structural information with varied contexts. Together, these two techniques substantially reduce the information loss usually involved in graph embedding, and so it is not a surprise that we obtain a more robust vector space embedding of graphs. This has been corroborated through a detailed experimental evaluation on various benchmark graph datasets, where we outperform the state-of-the-art methods.
European Union Horizon 2020; Ministerio de Educación, Cultura y Deporte, Spain; Generalitat de Cataluny
Algebraic graph theory in the analysis of frequency assignment problems
Frequency Assignment Problems (FAPs) arise when transmitters need to be allocated
frequencies with the aim of minimizing interference, whilst maintaining an efficient use of the radio spectrum. In this thesis FAPs are seen as generalised graph colouring problems, where transmitters are represented by vertices, and their interactions by weighted edges.
Solving FAPs often relies on known structural properties to facilitate algorithms.
When no structural information is available explicitly, obtaining it from numerical
data is difficult. This lack of structural information is a key underlying motivation
for the research work in this thesis.
If there are N transmitters to be assigned, we assume as given an N x N "influence
matrix" W with entries W_ij representing influence between transmitters i and j.
From this matrix we derive the Laplacian matrix L = D - W, where D is a diagonal
matrix whose entries d_ii are the sum of all influences acting on transmitter i.
The focus of this thesis is the study of mathematical properties of the matrix L.
We generalise certain properties of the Laplacian eigenvalues and eigenvectors that
hold for simple graphs. We also observe and discuss changes in the shape of the
Laplacian eigenvalue spectrum due to modifications of a FAP. We include a number
of computational experiments and generated simulated examples of FAPs for which
we explicitly calculate eigenvalues and eigenvectors in order to test the developed
theoretical results.
We find that the Laplacians prove useful in identifying certain types of problems,
providing a structured approach to reducing the original FAP to smaller subproblems,
hence assisting existing heuristic algorithms for solving frequency assignments.
In that sense we conclude that analysis of the Laplacians is a useful tool for a better understanding of FAPs.
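As a small illustration of the Laplacian construction L = D - W described in this abstract, the sketch below builds L from an influence matrix in plain Python. The function name and the 3 x 3 example matrix are assumptions made for illustration, not data from the thesis; the example assumes a symmetric W with a zero diagonal.

```python
def laplacian(W):
    """Return L = D - W, where D is diagonal with d_ii = sum_j W_ij."""
    n = len(W)
    L = [[-W[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        # Diagonal entry: d_ii - W_ii, with d_ii the row sum of W.
        L[i][i] = sum(W[i]) - W[i][i]
    return L

# Hypothetical influence matrix for three transmitters:
# transmitter 0 interacts with 1 (weight 2) and with 2 (weight 1).
W = [
    [0, 2, 1],
    [2, 0, 0],
    [1, 0, 0],
]
L = laplacian(W)
row_sums = [sum(row) for row in L]
```

Every row of L sums to zero by construction, so the all-ones vector is an eigenvector with eigenvalue 0. This holds for any weighted Laplacian built this way, which makes it a convenient sanity check before computing the rest of the spectrum.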
Frequent subgraph mining algorithms on weighted graphs
This thesis describes research work undertaken in the field of graph-based knowledge
discovery (or graph mining). The objective of the research is to investigate the benefits
that the concept of weighted frequent subgraph mining can offer in the context of the
graph-model-based classification. Weighted subgraphs are graphs where some of the
vertices/edges are considered to be more significant than others. How to discover
frequent sub-structures with different strengths is the main issue to be resolved in this
thesis. The main approach to addressing this issue is to integrate weight constraints into
the frequent subgraph mining process. It is suggested that the utilization of weighted
frequent subgraph mining generates more discriminative and significant subgraphs, which
will have applications in, for example, the classification and clustering of graph data.