7,710 research outputs found
Computational Analyses of Metagenomic Data
Metagenomics studies the collective microbial genomes extracted from a particular environment without requiring the culturing or isolation of individual genomes, addressing questions revolving around the composition, functionality, and dynamics of microbial communities. The intrinsic complexity of metagenomic data and the diversity of applications call for efficient and accurate computational methods in data handling. In this thesis, I present three primary projects that collectively focus on the computational analysis of metagenomic data, each addressing a distinct topic.
In the first project, I designed and implemented an algorithm named Mapbin for reference-free genomic binning of metagenomic assemblies. Binning aims to group a mixture of genomic fragments based on their genome origin. Mapbin enhances binning results by building a multilayer network that combines the initial binning, assembly graph, and read-pairing information from paired-end sequencing data. The network is further partitioned by the community-detection algorithm, Infomap, to yield a new binning result. Mapbin was tested on multiple simulated and real datasets. The results indicated an overall improvement in the common binning quality metrics.
The second and third projects are both derived from ImMiGeNe, a collaborative and multidisciplinary study investigating the interplay between gut microbiota, host genetics, and immunity in stem-cell transplantation (SCT) patients. In the second project, I conducted microbiome analyses for the metagenomic data. The workflow included the removal of contaminant reads and multiple taxonomic and functional profiling. The results revealed that the SCT recipients' samples yielded significantly fewer reads with heavy contamination of the host DNA, and their microbiomes displayed evident signs of dysbiosis. Finally, I discussed several inherent challenges posed by extremely low levels of target DNA and high levels of contamination in the recipient samples, which cannot be rectified solely through bioinformatics approaches.
The primary goal of the third project is to design a set of primers that can be used to cover bacterial flagellin genes present in the human gut microbiota. Considering the notable diversity of flagellins, I incorporated a method to select representative bacterial flagellin gene sequences, a heuristic approach based on established primer design methods to generate a degenerate primer set, and a selection method to filter genes unlikely to occur in the human gut microbiome. As a result, I successfully curated a reduced yet representative set of primers that would be practical for experimental implementation
UMSL Bulletin 2023-2024
The 2023-2024 Bulletin and Course Catalog for the University of Missouri St. Louis.https://irl.umsl.edu/bulletin/1088/thumbnail.jp
LIPIcs, Volume 251, ITCS 2023, Complete Volume
LIPIcs, Volume 251, ITCS 2023, Complete Volum
Optimal Sketching Bounds for Sparse Linear Regression
We study oblivious sketching for -sparse linear regression under various
loss functions such as an norm, or from a broad class of hinge-like
loss functions, which includes the logistic and ReLU losses. We show that for
sparse norm regression, there is a distribution over oblivious
sketches with rows, which is tight up to a
constant factor. This extends to loss with an additional additive
term in the upper bound. This
establishes a surprising separation from the related sparse recovery problem,
which is an important special case of sparse regression. For this problem,
under the norm, we observe an upper bound of rows, showing that sparse recovery is
strictly easier to sketch than sparse regression. For sparse regression under
hinge-like loss functions including sparse logistic and sparse ReLU regression,
we give the first known sketching bounds that achieve rows showing that
rows suffice, where
is a natural complexity parameter needed to obtain relative error bounds for
these loss functions. We again show that this dimension is tight, up to lower
order terms and the dependence on . Finally, we show that similar
sketching bounds can be achieved for LASSO regression, a popular convex
relaxation of sparse regression, where one aims to minimize
over . We show that sketching
dimension suffices and that the dependence
on and is tight.Comment: AISTATS 202
Generic multiplicative endomorphism of a field
We introduce the model-companion of the theory of fields expanded by a unary
function for a multiplicative map, which we call ACFH. Among others, we prove
that this theory is NSOP and not simple, that the kernel of the map is a
generic pseudo-finite abelian group. We also prove that if forking satisfies
existence, then ACFH has elimination of imaginaries.Comment: 34 page
Implicit Loss of Surjectivity and Facial Reduction: Theory and Applications
Facial reduction, pioneered by Borwein and Wolkowicz, is a preprocessing method that is commonly used to obtain strict feasibility in the reformulated, reduced constraint system.
The importance of strict feasibility is often addressed in the context of the convergence results for interior point methods.
Beyond the theoretical properties that the facial reduction conveys, we show that facial reduction, not only limited to interior point methods, leads to strong numerical performances in different classes of algorithms.
In this thesis we study various consequences and the broad applicability of facial reduction.
The thesis is organized in two parts.
In the first part, we show the instabilities accompanied by the absence
of strict feasibility through the lens of facially reduced systems.
In particular, we exploit the implicit redundancies, revealed by each nontrivial facial reduction step, resulting in the implicit loss of surjectivity.
This leads to the two-step facial reduction and two novel related notions of singularity.
For the area of semidefinite programming, we use these singularities to strengthen a known bound on the solution rank, the Barvinok-Pataki bound.
For the area of linear programming, we reveal degeneracies caused by the implicit redundancies.
Furthermore, we propose a preprocessing tool that uses the simplex method.
In the second part of this thesis, we continue with the semidefinite programs that do not have strictly feasible points.
We focus on the doubly-nonnegative relaxation of the binary quadratic program and a semidefinite program with a nonlinear objective function.
We closely work with two classes of algorithms, the splitting method and the Gauss-Newton interior point method.
We elaborate on the advantages in building models from facial reduction. Moreover, we develop algorithms for real-world problems including the quadratic assignment problem, the protein side-chain positioning problem, and the key rate computation for quantum key distribution.
Facial reduction continues to play an important role for
providing robust reformulated models in both the theoretical and the practical aspects, resulting in successful numerical performances
Systemic Circular Economy Solutions for Fiber Reinforced Composites
This open access book provides an overview of the work undertaken within the FiberEUse project, which developed solutions enhancing the profitability of composite recycling and reuse in value-added products, with a cross-sectorial approach. Glass and carbon fiber reinforced polymers, or composites, are increasingly used as structural materials in many manufacturing sectors like transport, constructions and energy due to their better lightweight and corrosion resistance compared to metals. However, composite recycling is still a challenge since no significant added value in the recycling and reprocessing of composites is demonstrated. FiberEUse developed innovative solutions and business models towards sustainable Circular Economy solutions for post-use composite-made products. Three strategies are presented, namely mechanical recycling of short fibers, thermal recycling of long fibers and modular car parts design for sustainable disassembly and remanufacturing. The validation of the FiberEUse approach within eight industrial demonstrators shows the potentials towards new Circular Economy value-chains for composite materials
- …