LIPIcs, Volume 251, ITCS 2023, Complete Volume
Specificity of the innate immune responses to different classes of non-tuberculous mycobacteria
Mycobacterium avium is the most common nontuberculous mycobacterium (NTM) species causing infectious disease. Here, we characterized a M. avium infection model in zebrafish larvae, and compared it to M. marinum infection, a model of tuberculosis. M. avium bacteria are efficiently phagocytosed and frequently induce granuloma-like structures in zebrafish larvae. Although macrophages can respond to both mycobacterial infections, their migration speed is faster in infections caused by M. marinum. Tlr2 is conservatively involved in most aspects of the defense against both mycobacterial infections. However, Tlr2 has a function in the migration speed of macrophages and neutrophils to infection sites with M. marinum that is not observed with M. avium. Using RNAseq analysis, we found a distinct transcriptome response in cytokine-cytokine receptor interaction for M. avium and M. marinum infection. In addition, we found differences in gene expression in metabolic pathways, phagosome formation, matrix remodeling, and apoptosis in response to these mycobacterial infections. In conclusion, we characterized a new M. avium infection model in zebrafish that can be further used in studying pathological mechanisms for NTM-caused diseases
LIPIcs, Volume 277, GIScience 2023, Complete Volume
12th International Conference on Geographic Information Science: GIScience 2023, September 12–15, 2023, Leeds, UK
No abstract available
Systematic Approaches for Telemedicine and Data Coordination for COVID-19 in Baja California, Mexico
Conference proceedings info:
ICICT 2023: 2023 The 6th International Conference on Information and Computer Technologies
Raleigh, HI, United States, March 24-26, 2023
Pages 529-542
We provide a model for the systematic implementation of telemedicine within a large evaluation center for COVID-19 in Baja California, Mexico. Our model is based on human-centric design factors and cross-disciplinary collaborations for scalable, data-driven enablement of smartphone, cellular, and video teleconsultation technologies to link hospitals, clinics, and emergency medical services for point-of-care assessments of COVID testing, and for subsequent treatment and quarantine decisions. A multidisciplinary team was rapidly created in cooperation with different institutions, including the Autonomous University of Baja California, the Ministry of Health, the Command, Communication and Computer Control Center of the Ministry of the State of Baja California (C4), Colleges of Medicine, and the College of Psychologists. Our objective is to provide information to the public, to evaluate COVID-19 in real time, and to track regional, municipal, and state-wide data in real time to inform supply chains and resource allocation in anticipation of a surge in COVID-19 cases.
https://doi.org/10.1007/978-981-99-3236-
Computational Methods for Compositional Epistasis Detection
In genetics, the term “epistasis” refers to the phenomenon that the effect of one gene
or single-nucleotide polymorphism (SNP) is dependent on the presence of others. Various
possibilities of epistasis exist, and the understanding of them is limited. In recent years,
the failure to replicate single-locus effects in genome-wide association studies (GWAS)
has motivated the exploration of epistasis in complex human disease.
This thesis is thus dedicated to the study of computational approaches for two-way
compositional epistasis (SNP-SNP interaction) detection. Epistasis of this sort is best
described by disease models, which can be simply understood as disease probability patterns
associated with the genotype combinations of SNP-pairs. Because the epistasis detection
problem requires determination of proper disease models to capture the compositional epistasis
effect, it is more complicated than a typical variable selection task.
Three projects are pursued in this thesis. The first two target epistasis that is characterized
by a set of “two-locus, two-allele, two-phenotype and complete-penetrance” (TTTC) disease
models, and the third extends to more general epistasis.
There are theoretically 2^9 = 512 TTTC disease models. For a given SNP-pair, the first step
of the problem is to find a proper TTTC model to capture its epistasis effect. It is found that
existing methods that use data to determine best-fitting disease models prior to screening
may be too greedy. Motivated by this, the first project proposes a less greedy strategy by
limiting the search of disease models to a set of prototypes. The prototypes are determined a
priori. Specifically, a distance metric is defined and used to cluster all disease models, and
then a “representative” from each cluster is selected to form the prototypes. Compared to
the existing approaches, the proposed method provides a more satisfying balance between
precision and recall in epistasis detection.
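The counting argument and the prototype idea above can be illustrated with a minimal sketch. It enumerates all 2^9 two-level models over the 3 x 3 genotype combinations of a SNP-pair and then picks a reduced set of representatives. The Hamming distance and the greedy "leader" clustering used here are assumptions chosen for illustration only; the abstract does not state which distance metric or clustering procedure the thesis uses, and the names `all_tttc_models`, `hamming`, and `leader_prototypes` are hypothetical.

```python
from itertools import product


def all_tttc_models():
    """Enumerate every assignment of low (0) or high (1) risk to the
    3 x 3 = 9 genotype combinations of a SNP-pair."""
    return list(product((0, 1), repeat=9))


def hamming(a, b):
    """Assumed distance: number of genotype cells where two models differ."""
    return sum(x != y for x, y in zip(a, b))


def leader_prototypes(models, radius):
    """Greedy 'leader' clustering (an illustrative stand-in, not the
    thesis's procedure): a model becomes a new prototype if it is farther
    than `radius` from every prototype chosen so far."""
    prototypes = []
    for m in models:
        if all(hamming(m, p) > radius for p in prototypes):
            prototypes.append(m)
    return prototypes


models = all_tttc_models()
prototypes = leader_prototypes(models, radius=2)
print(len(models))  # 512
print(len(prototypes) < len(models))
```

By construction, every model lies within `radius` of at least one prototype, so the screening step would only need to test the prototypes rather than all 512 models.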
If one uses data to determine a best-fitting disease model for a pair of SNPs, the nominal
statistical evidence of association between the SNP-pair and the disease outcome is inflated.
Therefore, the second project aims to directly correct inflation of this type. To make it feasible
for genome-wide studies, a first-order correction method is proposed that can be applied in
practice with no additional computational cost. Simulation studies on two
popular existing methods show that the correction is effective in improving
overall epistasis detection.
The TTTC disease models can be viewed as coding two risk levels, i.e., high and low risk.
Compared to them, some other disease models code multiple risk levels, which capture more
general epistasis patterns. Two methods are proposed in the third project, which are centered
on epistasis detection using multi-level risk disease models. One method is inspired by the
fused lasso under a regression-based framework, and adopts the post-model selection test to
account for inflation incurred during the disease model search. The other makes sequential
splits of the genotype combinations of a SNP-pair and uses a stopping criterion to determine
the final disease model; it then applies a first-order correction to the test
statistic to account for inflation. It is shown that the two methods, despite their
different starting frameworks, are equivalent in terms of the disease model search process.
Subsequent simulation studies show that use of multi-level disease models achieves better
detection efficiency in terms of a balance between precision and recall than the two-level ones.
In summary, it is a rather complicated task to uncover the underlying mechanism of locus
interaction effects, and endeavours are only beginning to be made. The epistasis detection
methods in this thesis are practical at the genome-wide level and complement the
single-SNP screening in genome-wide association studies. Moreover, the
first-order correction for inflation is simple and effective, which makes it valuable for
epistasis detection methods involving inflated test statistics.
Stinging the Predators: A collection of papers that should never have been published
This ebook collects academic papers and conference abstracts that were meant to be so terrible that nobody in their right mind would publish them. All were submitted to journals and conferences to expose weak or non-existent peer review and other exploitative practices. Each paper has a brief introduction. Short essays round out the collection.
New Algorithms for Fast and Economic Assembly: Advances in Transcriptome and Genome Assembly
Great efforts have been devoted to decipher the sequence composition of
the genomes and transcriptomes of diverse organisms. Continuing advances in
high-throughput sequencing technologies have led to a decline in associated
costs, facilitating a rapid increase in the amount of available genetic data. In
particular, genome studies have undergone a fundamental paradigm shift where
genome projects are no longer limited by sequencing costs, but rather by
computational problems associated with assembly. There is an urgent demand
for more efficient and more accurate methods. Most recently, “hybrid”
methods that integrate short- and long-read data have been devised to address
this need. LazyB is a new, low-cost hybrid genome assembler. It starts from a
bipartite overlap graph between long reads and restrictively filtered short-read
unitigs. This graph is translated into a long-read overlap graph. By design,
unitigs are both unique and almost free of assembly errors. As a consequence,
only a few spurious overlaps are introduced into the graph. Instead of the more
conventional approach of removing tips, bubbles, and other local features,
LazyB extracts subgraphs whose global properties approach a disjoint union of
paths in multiple steps, utilizing properties of proper interval graphs. A
prototype implementation of LazyB, entirely written in Python, not only yields
significantly more accurate assemblies of the yeast, fruit fly, and human
genomes compared to state-of-the-art pipelines, but also requires much less
computational effort. An optimized C++ implementation dubbed MuCHSALSA
further significantly reduces resource demands.
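The translation from a bipartite read-unitig graph into a long-read overlap graph can be sketched as follows. This is a simplified illustration, not LazyB's actual implementation: the input format, the function name `project_to_long_read_graph`, and the rule "two long reads overlap if they share at least one unitig" are all assumptions, and the sketch ignores alignment positions, orientation, and filtering.

```python
from collections import defaultdict
from itertools import combinations


def project_to_long_read_graph(read_to_unitigs):
    """Turn a bipartite read-unitig graph into a read-read overlap graph.

    `read_to_unitigs` maps a long-read ID to the set of unitig IDs that
    align to it (hypothetical input format). Two long reads become
    neighbours when they share at least one unitig; the edge weight is
    the number of shared unitigs.
    """
    # Invert the mapping: for each unitig, collect the reads it anchors.
    unitig_to_reads = defaultdict(set)
    for read, unitigs in read_to_unitigs.items():
        for u in unitigs:
            unitig_to_reads[u].add(read)

    # Every pair of reads anchored by the same unitig gets an edge.
    edges = defaultdict(int)
    for reads in unitig_to_reads.values():
        for a, b in combinations(sorted(reads), 2):
            edges[(a, b)] += 1
    return dict(edges)


# Toy example with made-up IDs.
bipartite = {
    "read1": {"u1", "u2"},
    "read2": {"u2", "u3"},
    "read3": {"u3"},
    "read4": {"u9"},
}
overlaps = project_to_long_read_graph(bipartite)
print(overlaps)
```

Because unitigs are described as unique and nearly error-free, a shared unitig is strong evidence that two long reads come from the same locus, which is why so few spurious edges are expected in the projected graph.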
Advances in RNA-seq have facilitated tremendous insights into the role of
both coding and non-coding transcripts. Yet, the complete and accurate
annotation of the transcriptomes of even model organisms has remained elusive.
RNA-seq produces reads significantly shorter than the average distance
between related splice events and presents high noise levels and other biases.
The computational reconstruction remains a critical bottleneck.
Ryūtō implements an extension of common splice graphs facilitating the integration
of reads spanning multiple splice sites and paired-end reads bridging distant
transcript parts. The decomposition of read coverage patterns is modeled as a
minimum-cost flow problem. Using phasing information from multi-splice and
paired-end reads, nodes with uncertain connections are decomposed step-wise
via Linear Programming.
Ryūtō's performance compares favorably with
state-of-the-art methods on both simulated and real-life datasets. Despite
ongoing research and our own contributions, progress on traditional single
sample assembly has brought no major breakthrough. Multi-sample RNA-Seq
experiments provide more information which, however, is challenging to utilize
due to the large amount of accumulating errors. An extension to Ryūtō
enables the reconstruction of consensus transcriptomes from multiple RNA-seq
data sets, incorporating consensus calling for low-level features. Benchmarks
show stable improvements with as few as 3 replicates.
Ryūtō outperforms competing approaches, providing a better and user-adjustable
sensitivity-precision trade-off. Ryūtō consistently improves assembly on
replicates, demonstrable also when mixing conditions or time series and for
differential expression analysis. Ryūtō's approach towards guided assembly is
equally unique. It allows users to adjust results based on the quality of the
guide, even for multi-sample assembly.
1 Preface
1.1 Assembly: A vast and fast evolving field
1.2 Structure of this Work
1.3 Available
2 Introduction
2.1 Mathematical Background
2.2 High-Throughput Sequencing
2.3 Assembly
2.4 Transcriptome Expression
3 From LazyB to MuCHSALSA - Fast and Cheap Genome Assembly
3.1 Background
3.2 Strategy
3.3 Data preprocessing
3.4 Processing of the overlap graph
3.5 Post Processing of the Path Decomposition
3.6 Benchmarking
3.7 MuCHSALSA – Moving towards the future
4 Ryūtō - Versatile, Fast, and Effective Transcript Assembly
4.1 Background
4.2 Strategy
4.3 The Ryūtō core algorithm
4.4 Improved Multi-sample transcript assembly with Ryūtō
5 Conclusion & Future Work
5.1 Discussion and Outlook
5.2 Summary and Conclusion
Pan-genomics and the structural diversity of plant genomes
A central task of genetics research is to uncover genotypes linked to important phenotypes. However, many genomic loci are incompletely or inaccurately represented in genetics studies, thus obscuring their function and evolution. New technology can accurately and continuously sequence large segments of genomic DNA at affordable cost and unprecedented scale, raising the possibility of complete and accurate representations of genomes across the tree of life. However, new computational methods are required to automatically finish, validate, and curate the forthcoming wave of genome assemblies enabled by these technologies. Researchers must also devise analytical approaches to comparing previously unresolved and usually repetitive genomic loci within and between species. Here, we introduce RaGOO and RagTag, new methods that leverage genome maps to automatically scaffold and improve draft genome assemblies into chromosome-scale representations. By applying these new methods to a bread wheat genome, we show how the established reference falsely collapsed functional paralogs genome-wide. In Arabidopsis thaliana, we present a new reference assembly that completely resolves all five centromeres for the first time, revealing centromere architecture, genetics, epigenetics, and evolution. Finally, we present a catalog of natural structural variants (SVs) across 100 diverse tomato accessions revealing exceptional genetic diversity via artificial introgression as well as broad and specific examples of how SVs influence molecular, domestication, and improvement phenotypes. This work underscores the potential to accelerate genetics research with complete and diverse genotype data and apply these findings to plant breeding and engineering