8,104 research outputs found

    A Unifying Theory for Graph Transformation

    Get PDF
    The field of graph transformation studies the rule-based transformation of graphs. An important branch is the algebraic graph transformation tradition, in which approaches are defined and studied using the language of category theory. Most algebraic graph transformation approaches (such as DPO, SPO, SqPO, and AGREE) are opinionated about the local contexts that are allowed around matches for rules, and about how replacement in context should work exactly. The approaches also differ considerably in their underlying formal theories and their general expressiveness (e.g., not all frameworks allow duplication). This dissertation proposes an expressive algebraic graph transformation approach, called PBPO+, which is an adaptation of PBPO by Corradini et al. The central contribution is a proof that PBPO+ subsumes (under mild restrictions) DPO, SqPO, AGREE, and PBPO in the important categorical setting of quasitoposes. This result allows for a more unified study of graph transformation metatheory, methods, and tools. A concrete example of this is found in the second major contribution of this dissertation: a graph transformation termination method for PBPO+, based on decreasing interpretations, and defined for general categories. By applying the proposed encodings into PBPO+, this method can also be applied for DPO, SqPO, AGREE, and PBPO

    Computational Analyses of Metagenomic Data

    Get PDF
    Metagenomics studies the collective microbial genomes extracted from a particular environment without requiring the culturing or isolation of individual genomes, addressing questions revolving around the composition, functionality, and dynamics of microbial communities. The intrinsic complexity of metagenomic data and the diversity of applications call for efficient and accurate computational methods in data handling. In this thesis, I present three primary projects that collectively focus on the computational analysis of metagenomic data, each addressing a distinct topic. In the first project, I designed and implemented an algorithm named Mapbin for reference-free genomic binning of metagenomic assemblies. Binning aims to group a mixture of genomic fragments based on their genome origin. Mapbin enhances binning results by building a multilayer network that combines the initial binning, assembly graph, and read-pairing information from paired-end sequencing data. The network is further partitioned by the community-detection algorithm, Infomap, to yield a new binning result. Mapbin was tested on multiple simulated and real datasets. The results indicated an overall improvement in the common binning quality metrics. The second and third projects are both derived from ImMiGeNe, a collaborative and multidisciplinary study investigating the interplay between gut microbiota, host genetics, and immunity in stem-cell transplantation (SCT) patients. In the second project, I conducted microbiome analyses for the metagenomic data. The workflow included the removal of contaminant reads and multiple taxonomic and functional profiling. The results revealed that the SCT recipients' samples yielded significantly fewer reads with heavy contamination of the host DNA, and their microbiomes displayed evident signs of dysbiosis. Finally, I discussed several inherent challenges posed by extremely low levels of target DNA and high levels of contamination in the recipient samples, which cannot be rectified solely through bioinformatics approaches. The primary goal of the third project is to design a set of primers that can be used to cover bacterial flagellin genes present in the human gut microbiota. Considering the notable diversity of flagellins, I incorporated a method to select representative bacterial flagellin gene sequences, a heuristic approach based on established primer design methods to generate a degenerate primer set, and a selection method to filter genes unlikely to occur in the human gut microbiome. As a result, I successfully curated a reduced yet representative set of primers that would be practical for experimental implementation

    Robustness, Heterogeneity and Structure Capturing for Graph Representation Learning and its Application

    Get PDF
    Graph neural networks (GNNs) are potent methods for graph representation learn- ing (GRL), which extract knowledge from complicated (graph) structured data in various real-world scenarios. However, GRL still faces many challenges. Firstly GNN-based node classification may deteriorate substantially by overlooking the pos- sibility of noisy data in graph structures, as models wrongly process the relation among nodes in the input graphs as the ground truth. Secondly, nodes and edges have different types in the real-world and it is essential to capture this heterogeneity in graph representation learning. Next, relations among nodes are not restricted to pairwise relations and it is necessary to capture the complex relations accordingly. Finally, the absence of structural encodings, such as positional information, deterio- rates the performance of GNNs. This thesis proposes novel methods to address the aforementioned problems: 1. Bayesian Graph Attention Network (BGAT): Developed for situations with scarce data, this method addresses the influence of spurious edges. Incor- porating Bayesian principles into the graph attention mechanism enhances robustness, leading to competitive performance against benchmarks (Chapter 3). 2. Neighbour Contrastive Heterogeneous Graph Attention Network (NC-HGAT): By enhancing a cutting-edge self-supervised heterogeneous graph neural net- work model (HGAT) with neighbour contrastive learning, this method ad- dresses heterogeneity and uncertainty simultaneously. Extra attention to edge relations in heterogeneous graphs also aids in subsequent classification tasks (Chapter 4). 3. A novel ensemble learning framework is introduced for predicting stock price movements. It adeptly captures both group-level and pairwise relations, lead- ing to notable advancements over the existing state-of-the-art. The integration of hypergraph and graph models, coupled with the utilisation of auxiliary data via GNNs before recurrent neural network (RNN), provides a deeper under- standing of long-term dependencies between similar entities in multivariate time series analysis (Chapter 5). 4. A novel framework for graph structure learning is introduced, segmenting graphs into distinct patches. By harnessing the capabilities of transformers and integrating other position encoding techniques, this approach robustly capture intricate structural information within a graph. This results in a more comprehensive understanding of its underlying patterns (Chapter 6)

    Exponential-time approximation schemes via compression

    Get PDF
    In this paper, we give a framework to design exponential-time approximation schemes for basic graph partitioning problems such as k-way cut, Multiway Cut, Steiner k-cut and Multicut, where the goal is to minimize the number of edges going across the parts. Our motivation to focus on approximation schemes for these problems comes from the fact that while it is possible to solve them exactly in 2^nn^{

    The infrared structure of perturbative gauge theories

    Get PDF
    Infrared divergences in the perturbative expansion of gauge theory amplitudes and cross sections have been a focus of theoretical investigations for almost a century. New insights still continue to emerge, as higher perturbative orders are explored, and high-precision phenomenological applications demand an ever more refined understanding. This review aims to provide a pedagogical overview of the subject. We briefly cover some of the early historical results, we provide some simple examples of low-order applications in the context of perturbative QCD, and discuss the necessary tools to extend these results to all perturbative orders. Finally, we describe recent developments concerning the calculation of soft anomalous dimensions in multi-particle scattering amplitudes at high orders, and we provide a brief introduction to the very active field of infrared subtraction for the calculation of differential distributions at colliders. © 2022 Elsevier B.V

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    Get PDF
    LIPIcs, Volume 251, ITCS 2023, Complete Volum

    Streaming Euclidean Max-Cut: Dimension vs Data Reduction

    Full text link
    Max-Cut is a fundamental problem that has been studied extensively in various settings. We design an algorithm for Euclidean Max-Cut, where the input is a set of points in Rd\mathbb{R}^d, in the model of dynamic geometric streams, where the input X[Δ]dX\subseteq [\Delta]^d is presented as a sequence of point insertions and deletions. Previously, Frahling and Sohler [STOC 2005] designed a (1+ϵ)(1+\epsilon)-approximation algorithm for the low-dimensional regime, i.e., it uses space exp(d)\exp(d). To tackle this problem in the high-dimensional regime, which is of growing interest, one must improve the dependence on the dimension dd, ideally to space complexity poly(ϵ1dlogΔ)\mathrm{poly}(\epsilon^{-1} d \log\Delta). Lammersen, Sidiropoulos, and Sohler [WADS 2009] proved that Euclidean Max-Cut admits dimension reduction with target dimension d=poly(ϵ1)d' = \mathrm{poly}(\epsilon^{-1}). Combining this with the aforementioned algorithm that uses space exp(d)\exp(d'), they obtain an algorithm whose overall space complexity is indeed polynomial in dd, but unfortunately exponential in ϵ1\epsilon^{-1}. We devise an alternative approach of \emph{data reduction}, based on importance sampling, and achieve space bound poly(ϵ1dlogΔ)\mathrm{poly}(\epsilon^{-1} d \log\Delta), which is exponentially better (in ϵ\epsilon) than the dimension-reduction approach. To implement this scheme in the streaming model, we employ a randomly-shifted quadtree to construct a tree embedding. While this is a well-known method, a key feature of our algorithm is that the embedding's distortion O(dlogΔ)O(d\log\Delta) affects only the space complexity, and the approximation ratio remains 1+ϵ1+\epsilon
    corecore