115 research outputs found

    Hybrid Graph: A Unified Graph Representation with Datasets and Benchmarks for Complex Graphs

    Graphs are widely used to encapsulate a variety of data formats, but real-world networks often involve complex node relations beyond only being pairwise. While hypergraphs and hierarchical graphs have been developed and employed to account for the complex node relations, they cannot fully represent these complexities in practice. Additionally, though many Graph Neural Networks (GNNs) have been proposed for representation learning on higher-order graphs, they are usually only evaluated on simple graph datasets. Therefore, there is a need for a unified modelling of higher-order graphs, and a collection of comprehensive datasets with an accessible evaluation framework to fully understand the performance of these algorithms on complex graphs. In this paper, we introduce the concept of hybrid graphs, a unified definition for higher-order graphs, and present the Hybrid Graph Benchmark (HGB). HGB contains 23 real-world hybrid graph datasets across various domains such as biology, social media, and e-commerce. Furthermore, we provide an extensible evaluation framework and a supporting codebase to facilitate the training and evaluation of GNNs on HGB. Our empirical study of existing GNNs on HGB reveals various research opportunities and gaps, including (1) evaluating the actual performance improvement of hypergraph GNNs over simple graph GNNs; (2) comparing the impact of different sampling strategies on hybrid graph learning methods; and (3) exploring ways to integrate simple graph and hypergraph information. We make our source code and full datasets publicly available at https://zehui127.github.io/hybrid-graph-benchmark/. Comment: Preprint. Under review. 16 pages, 5 figures, 11 tables.
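
    As a rough illustration of point (3), integrating simple graph and hypergraph information, the sketch below stores pairwise edges and hyperedges over one node set and flattens them with clique expansion, a standard way to turn a hyperedge into pairwise edges. The `HybridGraph` class and its methods are hypothetical names chosen for this example; they are not the HGB codebase's API, and the paper's formal definition of a hybrid graph may differ.

```python
from itertools import combinations


class HybridGraph:
    """Toy container (hypothetical): pairwise edges plus hyperedges on one node set."""

    def __init__(self, num_nodes):
        self.num_nodes = num_nodes
        self.edges = set()       # each simple edge is a frozenset of two nodes
        self.hyperedges = []     # each hyperedge is a frozenset of any size

    def add_edge(self, u, v):
        self.edges.add(frozenset((u, v)))

    def add_hyperedge(self, nodes):
        self.hyperedges.append(frozenset(nodes))

    def clique_expansion(self):
        """Return the simple edges plus every node pair inside each hyperedge."""
        expanded = set(self.edges)
        for hyperedge in self.hyperedges:
            for u, v in combinations(sorted(hyperedge), 2):
                expanded.add(frozenset((u, v)))
        return expanded


g = HybridGraph(num_nodes=4)
g.add_edge(0, 1)              # one pairwise relation
g.add_hyperedge([1, 2, 3])    # one three-way relation
pairs = sorted(tuple(sorted(e)) for e in g.clique_expansion())
print(pairs)
```

    Clique expansion discards the information that nodes 1, 2 and 3 interacted jointly, which is one reason a method might keep both views instead of flattening them.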

    Beyond Flatland : exploring graphs in many dimensions

    Societies, technologies, economies, ecosystems, organisms, ... Our world is composed of complex networks: systems with many elements that interact in nontrivial ways. Graphs are natural models of these systems, and scientists have made tremendous progress in developing tools for their analysis. However, research has long focused on relatively simple graph representations and problem specifications, often discarding valuable real-world information in the process. In recent years, the limitations of this approach have become increasingly apparent, but we are just starting to comprehend how more intricate data representations and problem formulations might benefit our understanding of relational phenomena. Against this background, our thesis sets out to explore graphs in five dimensions: descriptivity, multiplicity, complexity, expressivity, and responsibility. Leveraging tools from graph theory, information theory, probability theory, geometry, and topology, we develop methods to (1) descriptively compare individual graphs, (2) characterize similarities and differences between groups of multiple graphs, (3) critically assess the complexity of relational data representations and their associated scientific culture, (4) extract expressive features from and for hypergraphs, and (5) responsibly mitigate the risks induced by graph-structured content recommendations. Thus, our thesis is naturally situated at the intersection of graph mining, graph learning, and network analysis.

    Indoor Scene Recognition for Micro Aerial Vehicles Navigation using Enhanced-GIST Descriptors

    An indoor scene recognition algorithm combining histogram of horizontal and vertical directional morphological gradient features and GIST features is proposed in this paper. The new visual descriptor is called enhanced-GIST. Three different classifiers, the k-nearest neighbour (kNN) classifier, the Naïve Bayes classifier and the support vector machine (SVM), are employed for the classification of indoor scenes into corridor, staircase or room. The evaluation was performed on two indoor scene datasets. The scene recognition algorithm consists of a training phase and a testing phase. In the training phase, GIST, CENTRIST, LBP, HODMG and enhanced-GIST feature vectors are extracted for all the training images in the datasets, and classifiers are trained on these image feature vectors and image labels (corridor-1, staircase-2 and room-3). In the testing phase, GIST, CENTRIST, LBP, HODMG and enhanced-GIST feature vectors are extracted for each unknown test image sample and classification is performed using a trained scene recognition model. The experimental results show that the indoor scene recognition algorithm employing SVM with enhanced-GIST descriptors produces very high recognition rates of 97.22 per cent and 99.33 per cent for dataset-1 and dataset-2, respectively, compared with the kNN and Naïve Bayes classifiers. In addition to its accuracy and robustness, the algorithm is suitable for real-time operations.
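
    The train-then-classify flow described above can be sketched with the simplest of the three classifiers, k-nearest neighbour. This is a minimal sketch under stated assumptions: the two-dimensional feature vectors are invented stand-ins for real descriptor vectors, `knn_predict` is a hypothetical helper, and only the label coding (corridor-1, staircase-2, room-3) comes from the abstract.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Label a query vector by majority vote among its k nearest training vectors."""
    # Pair each Euclidean distance with its label, then sort nearest first.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Hypothetical 2-D feature vectors; real descriptors are much higher-dimensional.
train_x = [(0.1, 0.2), (0.2, 0.1), (0.9, 0.8), (0.8, 0.9), (0.5, 0.1), (0.6, 0.0)]
train_y = [1, 1, 2, 2, 3, 3]  # 1 = corridor, 2 = staircase, 3 = room

pred_a = knn_predict(train_x, train_y, (0.15, 0.15), k=3)
pred_b = knn_predict(train_x, train_y, (0.85, 0.85), k=3)
print(pred_a, pred_b)
```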

    Towards a unified approach

    "Decision-making in the presence of uncertainty is a pervasive computation. Latent variable decoding—inferring hidden causes underlying visible effects—is commonly observed in nature, and it is an unsolved challenge in modern machine learning. On many occasions, animals need to base their choices on uncertain evidence; for instance, when deciding whether to approach or avoid an obfuscated visual stimulus that could be either a prey or a predator. Yet, their strategies are, in general, poorly understood. In simple cases, these problems admit an optimal, explicit solution. However, in more complex real-life scenarios, it is difficult to determine the best possible behavior. The most common approach in modern machine learning relies on artificial neural networks—black boxes that map each input to an output. This input-output mapping depends on a large number of parameters, the weights of the synaptic connections, which are optimized during learning.(...)

    Machine learning with neuroimaging data to identify autism spectrum disorder: a systematic review and meta-analysis

    Purpose: Autism Spectrum Disorder (ASD) is diagnosed through observation or interview assessments, which are time-consuming, subjective, and of questionable validity and reliability. Thus, we aimed to evaluate the role of machine learning (ML) with neuroimaging data to provide a reliable classification of ASD. Methods: A systematic search of PubMed, Scopus, and Embase was conducted to identify relevant publications. Quality Assessment of Diagnostic Accuracy Studies-2 (QUADAS-2) was used to assess the studies’ quality. A bivariate random-effects model meta-analysis was employed to evaluate the pooled sensitivity, the pooled specificity, and the diagnostic performance through the hierarchical summary receiver operating characteristic (HSROC) curve of ML with neuroimaging data in classifying ASD. Meta-regression was also performed. Results: Forty-four studies (5697 ASD and 6013 typically developing individuals [TD] in total) were included in the quantitative analysis. The pooled sensitivity for differentiating ASD from TD individuals was 86.25 (95% confidence interval [CI]: 81.24, 90.08), while the pooled specificity was 83.31 (95% CI: 78.12, 87.48), with a combined area under the HSROC curve (AUC) of 0.889. Higgins I² (> 90%) and Cochran’s Q (p < 0.0001) suggest a high degree of heterogeneity. In the bivariate model meta-regression, a higher pooled specificity was observed in studies not using a brain atlas (90.91, 95% CI [80.67, 96.00], p = 0.032). In addition, a greater pooled sensitivity was seen in studies recruiting both males and females (89.04, 95% CI [83.84, 92.72], p = 0.021) and in studies combining imaging modalities (94.12, 95% CI [85.43, 97.76], p = 0.036). Conclusion: ML with neuroimaging data is an exciting prospect in detecting individuals with ASD, but further studies are required to improve its reliability for usage in clinical practice.
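
    For readers unfamiliar with the two reported metrics, the sketch below computes sensitivity and specificity from confusion-matrix counts. The study counts are invented for illustration, and the pooling shown is naive summation of counts across studies; it is not the bivariate random-effects model used in the review, which also accounts for between-study heterogeneity.

```python
def sensitivity(tp, fn):
    """Share of true ASD cases that the classifier labels as ASD."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of typically developing cases that the classifier labels as TD."""
    return tn / (tn + fp)


# Hypothetical per-study confusion-matrix counts (not taken from the review).
studies = [
    {"tp": 40, "fn": 10, "tn": 45, "fp": 5},
    {"tp": 30, "fn": 10, "tn": 35, "fp": 15},
]

# Naive pooling: sum the counts over studies, then compute each metric once.
total = {key: sum(study[key] for study in studies) for key in ("tp", "fn", "tn", "fp")}
pooled_sens = sensitivity(total["tp"], total["fn"])
pooled_spec = specificity(total["tn"], total["fp"])
print(round(pooled_sens, 4), round(pooled_spec, 4))
```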

    Classification of Cognitive States using Task-Specific Connectivity Features

    Functional MRI (fMRI) research produces human brain activity maps that describe the average level of engagement of various brain regions during a specific task. Functional connectivity describes the interrelationship, integrated performance, and organization of these different brain regions. This study investigates functional connectivity to quantify the interactions between different brain regions engaged concurrently in a specific task. The key focus of this study was to introduce and demonstrate task-specific functional connectivity among brain regions using fMRI data and to decode cognitive states by proposing a novel classifier using connectivity features. Two connectivity models were considered: a graph-based task-specific functional connectivity and a Granger causality-transfer entropy framework. Connectivity strengths obtained among brain regions were used for cognitive state classification. The parameters of the nodal and global graph analysis from the graph-based connectivity framework and the transfer entropy values of the causal connectivity model were used as features for the cognitive state classification. The proposed model achieved an average accuracy of 95% on the StarPlus fMRI dataset and showed an improvement of 5% compared to the existing Tensor-SVD classification algorithm.
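
    One common way to obtain graph-based connectivity features is to correlate regional time series, threshold the result, and read off nodal measures such as degree. The sketch below follows that recipe as an assumption: the abstract does not say which connectivity estimator or graph measures the authors used, and the time series, threshold and function names are made up for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length time series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def connectivity_matrix(series):
    """Region-by-region correlation matrix."""
    k = len(series)
    return [[pearson(series[i], series[j]) for j in range(k)] for i in range(k)]


def node_degrees(matrix, threshold):
    """Nodal feature: number of other regions with |correlation| >= threshold."""
    k = len(matrix)
    return [
        sum(1 for j in range(k) if j != i and abs(matrix[i][j]) >= threshold)
        for i in range(k)
    ]


# Hypothetical signals for four brain regions over four time points.
regions = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [4.0, 3.0, 2.0, 1.0],
    [1.0, -1.0, 1.0, -1.0],
]
conn = connectivity_matrix(regions)
degrees = node_degrees(conn, threshold=0.8)
print(degrees)
```

    A vector such as `degrees` could then be passed to any classifier as a per-scan feature vector.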

    On the role of metaheuristic optimization in bioinformatics

    Metaheuristic algorithms are employed to solve complex and large-scale optimization problems in many different fields, from transportation and smart cities to finance. This paper discusses how metaheuristic algorithms are being applied to solve different optimization problems in the area of bioinformatics. While the text provides references to many optimization problems in the area, it focuses on those that have attracted more interest from the optimization community. Among the problems analyzed, the paper discusses in more detail the molecular docking problem, the protein structure prediction, phylogenetic inference, and different string problems. In addition, references to other relevant optimization problems are also given, including those related to medical imaging or gene selection for classification. From the previous analysis, the paper generates insights on research opportunities for the Operations Research and Computer Science communities in the field of bioinformatics