A Unifying Theory for Graph Transformation
The field of graph transformation studies the rule-based transformation of graphs. An important branch is the algebraic graph transformation tradition, in which approaches are defined and studied using the language of category theory. Most algebraic graph transformation approaches (such as DPO, SPO, SqPO, and AGREE) are opinionated about the local contexts that are allowed around matches for rules, and about how replacement in context should work exactly. The approaches also differ considerably in their underlying formal theories and their general expressiveness (e.g., not all frameworks allow duplication). This dissertation proposes an expressive algebraic graph transformation approach, called PBPO+, which is an adaptation of PBPO by Corradini et al. The central contribution is a proof that PBPO+ subsumes (under mild restrictions) DPO, SqPO, AGREE, and PBPO in the important categorical setting of quasitoposes. This result allows for a more unified study of graph transformation metatheory, methods, and tools. A concrete example of this is found in the second major contribution of this dissertation: a graph transformation termination method for PBPO+, based on decreasing interpretations, and defined for general categories. By applying the proposed encodings into PBPO+, this method can also be applied for DPO, SqPO, AGREE, and PBPO.
Computational Analyses of Metagenomic Data
Metagenomics studies the collective microbial genomes extracted from a particular environment without requiring the culturing or isolation of individual genomes, addressing questions revolving around the composition, functionality, and dynamics of microbial communities. The intrinsic complexity of metagenomic data and the diversity of applications call for efficient and accurate computational methods in data handling. In this thesis, I present three primary projects that collectively focus on the computational analysis of metagenomic data, each addressing a distinct topic.
In the first project, I designed and implemented an algorithm named Mapbin for reference-free genomic binning of metagenomic assemblies. Binning aims to group a mixture of genomic fragments based on their genome origin. Mapbin enhances binning results by building a multilayer network that combines the initial binning, assembly graph, and read-pairing information from paired-end sequencing data. The network is further partitioned by the community-detection algorithm, Infomap, to yield a new binning result. Mapbin was tested on multiple simulated and real datasets. The results indicated an overall improvement in the common binning quality metrics.
The second and third projects are both derived from ImMiGeNe, a collaborative and multidisciplinary study investigating the interplay between gut microbiota, host genetics, and immunity in stem-cell transplantation (SCT) patients. In the second project, I conducted microbiome analyses for the metagenomic data. The workflow included the removal of contaminant reads and multiple taxonomic and functional profiling. The results revealed that the SCT recipients' samples yielded significantly fewer reads with heavy contamination of the host DNA, and their microbiomes displayed evident signs of dysbiosis. Finally, I discussed several inherent challenges posed by extremely low levels of target DNA and high levels of contamination in the recipient samples, which cannot be rectified solely through bioinformatics approaches.
The primary goal of the third project is to design a set of primers that can be used to cover bacterial flagellin genes present in the human gut microbiota. Considering the notable diversity of flagellins, I incorporated a method to select representative bacterial flagellin gene sequences, a heuristic approach based on established primer design methods to generate a degenerate primer set, and a selection method to filter genes unlikely to occur in the human gut microbiome. As a result, I successfully curated a reduced yet representative set of primers that would be practical for experimental implementation.
LIPIcs, Volume 251, ITCS 2023, Complete Volume
Almost covering all the layers of hypercube with multiplicities
Given a hypercube $\mathcal{Q}^{n} := \{0,1\}^{n}$ in $\mathbb{R}^{n}$ and $k \in \{0, \dots, n\}$, the
$k$-th layer $\mathcal{Q}^{n}_{k}$ of $\mathcal{Q}^{n}$ denotes the set of all points in
$\mathcal{Q}^{n}$ whose coordinates contain exactly $k$ many ones. For a fixed
$t \in \mathbb{N}$ and $k \in \{0, \dots, n\}$, let $P \in \mathbb{R}[x_{1}, \dots, x_{n}]$
be a polynomial that has zeros of multiplicity at least $t$ at all points of
$\mathcal{Q}^{n} \setminus \mathcal{Q}^{n}_{k}$, and has zeros of multiplicity exactly
$t-1$ at all points of $\mathcal{Q}^{n}_{k}$. In this short note, we show that
$$\deg(P) \geq \max\{k, n-k\} + 2t - 2.$$
Matching the above lower bound, we give an explicit construction of a family of
hyperplanes $H_{1}, \dots, H_{m}$ in $\mathbb{R}^{n}$, where
$m = \max\{k, n-k\} + 2t - 2$, such that every point of $\mathcal{Q}^{n}_{k}$ will be
covered exactly $t-1$ times, and every other point of $\mathcal{Q}^{n}$ will be covered
at least $t$ times. Note that putting $k = 0$ and $t = 1$, we recover the much
celebrated covering result of Alon and Füredi (European Journal of
Combinatorics, 1993). Using the above family of hyperplanes, we disprove a
conjecture of Venkitesh (The Electronic Journal of Combinatorics, 2022) on
exactly covering symmetric subsets of the hypercube $\mathcal{Q}^{n}$ with hyperplanes.
To prove the above results, we introduce a new measure of complexity of a subset
of the hypercube called index complexity, which we believe will be of
independent interest. We also study a new variant of the restricted sumset
problem motivated by the ideas behind the proof of the above result.
Comment: 16 pages, substantial changes from previous version, title and
abstract changed to better reflect the content of the paper
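The covering statements in this abstract can be checked by brute force for small hypercubes. The sketch below is only an illustration, not the construction from the paper: the function names and the two hyperplane families are assumptions chosen for the example. It counts how often each point of {0,1}^n is covered, first by the coordinate hyperplanes x_i = 1 (which miss only the origin, the setting of the Alon and Füredi result) and then by a naive family of "sum of coordinates = j" hyperplanes that skips one layer.

```python
from itertools import product


def cover_counts(n, hyperplanes):
    """Map each point of {0,1}^n to the number of hyperplanes containing it.

    A hyperplane is a pair (a, b) meaning a[0]*x[0] + ... + a[n-1]*x[n-1] == b.
    """
    counts = {}
    for x in product((0, 1), repeat=n):
        counts[x] = sum(
            1
            for a, b in hyperplanes
            if sum(ai * xi for ai, xi in zip(a, x)) == b
        )
    return counts


def layer(n, k):
    """All points of {0,1}^n with exactly k coordinates equal to one."""
    return [x for x in product((0, 1), repeat=n) if sum(x) == k]


n = 4

# Coordinate hyperplanes x_i = 1: every point except the origin is covered.
coordinate_planes = [
    (tuple(1 if j == i else 0 for j in range(n)), 1) for i in range(n)
]
coordinate_counts = cover_counts(n, coordinate_planes)

# Naive family x_1 + ... + x_n = j for j != k: it uses n hyperplanes and
# is NOT claimed to be optimal; it simply avoids layer k.
k = 2
layer_planes = [((1,) * n, j) for j in range(n + 1) if j != k]
layer_counts = cover_counts(n, layer_planes)
```

Replacing the hyperplane lists with another candidate family gives a quick sanity check of "covered exactly this many times on one layer, at least that many times elsewhere" for small n.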
A tamed family of triangle-free graphs with unbounded chromatic number
We construct a hereditary class of triangle-free graphs with unbounded
chromatic number, in which every non-trivial graph either contains a pair of
non-adjacent twins or has an edgeless vertex cutset of size at most two. This
answers in the negative a question of Chudnovsky, Penev, Scott, and Trotignon.
The class is the hereditary closure of a family of (triangle-free) twincut
graphs $G_1, G_2, \ldots$ such that $G_k$ has chromatic number $k$. We also
show that every twincut graph is edge-critical.
Reconfiguration of Digraph Homomorphisms
For a fixed graph H, the H-Recoloring problem asks whether, given two homomorphisms from a graph G to H, one homomorphism can be transformed into the other by changing the image of a single vertex in each step and maintaining a homomorphism to H throughout. The most general algorithmic result for H-Recoloring so far has been proposed by Wrochna in 2014, who introduced a topological approach to obtain a polynomial-time algorithm for any undirected loopless square-free graph H. We show that the topological approach can be used to recover essentially all previous algorithmic results for H-Recoloring and that it is applicable also in the more general setting of digraph homomorphisms. In particular, we show that H-Recoloring admits a polynomial-time algorithm i) if H is a loopless digraph that does not contain a 4-cycle of algebraic girth 0 and ii) if H is a reflexive digraph that contains no triangle of algebraic girth 1 and no 4-cycle of algebraic girth 0.
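To make the problem statement concrete, here is a minimal brute-force sketch of H-Recoloring for tiny instances. It is not the topological polynomial-time algorithm discussed in the abstract: it runs a breadth-first search over all homomorphisms, which is exponential in general. The function names and the encoding (vertices of G numbered 0..n-1, arcs as ordered pairs, maps as tuples) are assumptions made for this example.

```python
from collections import deque


def symmetric(edges):
    """Turn undirected edges into a set of arcs in both directions."""
    return {(u, v) for u, v in edges} | {(v, u) for u, v in edges}


def can_reconfigure(n_g, g_arcs, h_vertices, h_arcs, start, target):
    """Decide by BFS whether `start` can be turned into `target`.

    A map f is a tuple where f[v] is the image in H of vertex v of G.
    Each step changes the image of exactly one vertex, and every
    intermediate map must be a homomorphism from G to H.
    """
    h_arcs = set(h_arcs)

    def is_hom(f):
        return all((f[u], f[v]) in h_arcs for u, v in g_arcs)

    start, target = tuple(start), tuple(target)
    if not (is_hom(start) and is_hom(target)):
        raise ValueError("start and target must both be homomorphisms")

    seen = {start}
    queue = deque([start])
    while queue:
        f = queue.popleft()
        if f == target:
            return True
        for v in range(n_g):
            for a in h_vertices:
                if a == f[v]:
                    continue
                g = f[:v] + (a,) + f[v + 1:]
                if g not in seen and is_hom(g):
                    seen.add(g)
                    queue.append(g)
    return False


# G is a single edge 0-1; homomorphisms to K3 are proper 3-colorings.
g_arcs = [(0, 1)]
k3 = symmetric([(0, 1), (1, 2), (0, 2)])
k2 = symmetric([(0, 1)])

swap_in_k3 = can_reconfigure(2, g_arcs, [0, 1, 2], k3, (0, 1), (1, 0))
swap_in_k2 = can_reconfigure(2, g_arcs, [0, 1], k2, (0, 1), (1, 0))
```

With H = K3 the two colors can be swapped through the third color, while with H = K2 every single-vertex change breaks the homomorphism, so the two maps lie in different components of the reconfiguration graph.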
Fast Macroscopic Forcing Method
The macroscopic forcing method (MFM) of Mani and Park and similar methods for
obtaining turbulence closure operators, such as the Green's function-based
approach of Hamba, recover reduced solution operators from repeated direct
numerical simulations (DNS). MFM has been used to quantify RANS-like operators
for homogeneous isotropic turbulence and turbulent channel flows. Standard
algorithms for MFM force each coarse-scale degree of freedom (i.e., degree of
freedom in the RANS space) and conduct a corresponding fine-scale simulation
(i.e., DNS), which is expensive. We combine this method with an approach
recently proposed by Schäfer and Owhadi (2023) to recover elliptic integral
operators from a polylogarithmic number of matrix-vector products. The
resulting Fast MFM introduced in this work applies sparse reconstruction to
expose local features in the closure operator and reconstructs this
coarse-grained differential operator in only a few matrix-vector products and,
correspondingly, a few MFM simulations. For flows with significant nonlocality,
the algorithm first "peels" long-range effects with dense matrix-vector
products to expose a local operator. We demonstrate the algorithm's performance
for scalar transport in a laminar channel flow and momentum transport in a
turbulent one. For these, we recover eddy diffusivity operators at 1% of the
cost of computing the exact operator via a brute-force approach for the laminar
channel flow problem and 13% for the turbulent one. We observe that we can
reconstruct these operators with an increase in accuracy by about a factor of
100 over randomized low-rank methods. We glean that for problems in which the
RANS space is reducible to one dimension, eddy diffusivity and eddy viscosity
operators can be reconstructed with reasonable accuracy using only a few
simulations, regardless of simulation resolution or degrees of freedom.
Comment: 16 pages, 10 figures. S. H. Bryngelson and F. Schäfer contributed
equally to this work
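The cost argument in this abstract (one simulation per forced degree of freedom versus a handful of matrix-vector products) can be illustrated with a toy linear-algebra sketch. The code below is not the algorithm of Schäfer and Owhadi and not Fast MFM itself: it uses a much simpler, classical probing idea that assumes the operator is exactly banded with a known bandwidth. The names `recover_brute_force` and `recover_banded` and the tridiagonal test matrix are assumptions for the example; each call to `matvec` stands in for one expensive simulation.

```python
import numpy as np


def recover_brute_force(matvec, n):
    """Recover an n-by-n operator with n products, one unit vector per column."""
    identity = np.eye(n)
    return np.column_stack([matvec(identity[:, j]) for j in range(n)])


def recover_banded(matvec, n, bandwidth):
    """Recover an operator with A[i, j] == 0 whenever |i - j| > bandwidth.

    Columns whose indices differ by a multiple of 2*bandwidth + 1 have
    disjoint row supports, so they can share one probing vector. This
    needs at most 2*bandwidth + 1 products, independent of n.
    """
    stride = 2 * bandwidth + 1
    A = np.zeros((n, n))
    for c in range(min(stride, n)):
        probe = np.zeros(n)
        probe[c::stride] = 1.0
        y = matvec(probe)
        for j in range(c, n, stride):
            lo, hi = max(0, j - bandwidth), min(n, j + bandwidth + 1)
            A[lo:hi, j] = y[lo:hi]
    return A


n = 12
# Toy local operator: a tridiagonal second-difference matrix.
T = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

calls = []


def counted_matvec(x):
    calls.append(1)  # each call stands in for one simulation
    return T @ x


A_fast = recover_banded(counted_matvec, n, bandwidth=1)
A_slow = recover_brute_force(lambda x: T @ x, n)
```

Here the banded recovery uses 3 products where the brute-force approach uses 12, and the count stays at 3 as n grows, which mirrors the resolution-independence claimed for operators with local structure.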
Computation of the von Neumann entropy of large matrices via trace estimators and rational Krylov methods
We consider the problem of approximating the von Neumann entropy of a large,
sparse, symmetric positive semidefinite matrix $A$, defined as
$\operatorname{tr}(f(A))$ where $f(x) = -x \log x$. After establishing some useful
properties of this matrix function, we consider the use of both polynomial and
rational Krylov subspace algorithms within two types of approximation methods,
namely, randomized trace estimators and probing techniques based on graph
colorings. We develop error bounds and heuristics which are employed in the
implementation of the algorithms. Numerical experiments on density matrices of
different types of networks illustrate the performance of the methods.
Comment: 32 pages, 10 figures
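A minimal sketch of the randomized trace estimator idea follows, assuming the usual definition of the von Neumann entropy, S(A) = -tr(A log A). It is not the method of the paper: for clarity, f(A) with f(x) = -x log x is formed densely from an eigendecomposition, whereas the abstract replaces that step with polynomial or rational Krylov approximations. The function names, the path-graph example, and the sample count are assumptions for the illustration.

```python
import numpy as np


def entropy_exact(A, tol=1e-12):
    """-tr(A log A) from the eigenvalues, with the convention 0 * log(0) = 0."""
    lam = np.linalg.eigvalsh(A)
    lam = lam[lam > tol]
    return float(-np.sum(lam * np.log(lam)))


def entropy_hutchinson(A, num_samples, rng, tol=1e-12):
    """Hutchinson estimate of tr(f(A)) with Rademacher probe vectors.

    f(A) is built densely here; a large-scale code would instead
    approximate the products f(A) @ z with a Krylov method.
    """
    lam, Q = np.linalg.eigh(A)
    f = np.zeros_like(lam)
    mask = lam > tol
    f[mask] = -lam[mask] * np.log(lam[mask])
    fA = (Q * f) @ Q.T  # Q diag(f) Q^T
    Z = rng.choice([-1.0, 1.0], size=(A.shape[0], num_samples))
    # Average of z^T f(A) z over the probe vectors (columns of Z).
    return float(np.mean(np.sum(Z * (fA @ Z), axis=0)))


# Density matrix of a small network: path-graph Laplacian scaled to trace 1.
n = 6
adjacency = np.eye(n, k=1) + np.eye(n, k=-1)
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
rho = laplacian / np.trace(laplacian)

rng = np.random.default_rng(0)
s_exact = entropy_exact(rho)
s_estimate = entropy_hutchinson(rho, num_samples=2000, rng=rng)
```

The estimator is unbiased, and its error shrinks like one over the square root of the number of probe vectors, which is why the choice of probes and the accuracy of each f(A) z product both matter in practice.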