97 research outputs found

    A Comprehensive Survey on Generative Diffusion Models for Structured Data

    In recent years, generative diffusion models have achieved a rapid paradigm shift in deep generative models by showing groundbreaking performance across various applications. Meanwhile, structured data, encompassing tabular and time series data, has received comparatively limited attention from the deep learning research community, despite its omnipresence and extensive applications. Thus, there is still a lack of literature and reviews on structured data modelling via diffusion models, compared to other data modalities such as visual and textual data. To address this gap, we present a comprehensive review of recently proposed diffusion models in the field of structured data. First, this survey provides a concise overview of the score-based diffusion model theory, subsequently proceeding to the technical descriptions of the majority of pioneering works that used structured data in both data-driven general tasks and domain-specific applications. Thereafter, we analyse and discuss the limitations and challenges shown in existing works and suggest potential research directions. We hope this review serves as a catalyst for the research community, promoting developments in generative diffusion models for structured data. Comment: 20 pages, 1 figure, 2 tables

    BioVLAB-MMIA-NGS: MicroRNA-mRNA Integrated Analysis using High Throughput Sequencing Data

    Motivation: It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA–mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. Results: The objective of this study was to modify our widely recognized Web server for the integrated mRNA–miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called BioVLAB-MMIA-NGS, deployed both on the Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationships between miRNAs and target genes with high accuracy.

    Motility increase of adherent invasive Escherichia coli (AIEC) induced by a sub-inhibitory concentration of recombinant endolysin LysPA90

    Endolysins are bacteriophage enzymes required for the eruption of phages from inside host bacteria via the degradation of the peptidoglycan cell wall. Recombinant endolysins are increasingly being seen as potential antibacterial candidates, with a number currently undergoing clinical trials. Bacteriophage PBPA90 infecting Pseudomonas aeruginosa harbors a gene encoding an endolysin, lysPA90. Herein, recombinant LysPA90 demonstrated an intrinsic antibacterial activity against Escherichia coli in vitro. It was observed that a sub-inhibitory concentration of the recombinant protein induced the upregulation of genes related to flagella biosynthesis in a commensal E. coli strain. Increases in the number of bacterial flagella, and in motility, were experimentally substantiated. The treatment caused membrane stress, leading to the upregulation of genes rpoE, rpoH, dnaK, dnaJ, and flhC, which are upstream regulators of flagella biosynthesis. When adherent invasive Escherichia coli (AIEC) strains were treated with subinhibitory concentrations of the endolysin, bacterial adhesion and invasion into intestinal epithelial Caco-2 cells was seen to visibly increase under microscopic examination. Bacterial counting further corroborated this adhesion and invasion of AIEC strains into Caco-2 cells, with a resultant slight decrease in the viability of Caco-2 cells then being observed. Additionally, genes related to flagella expression were also upregulated in the AIEC strains. Finally, the enhanced expression of the proinflammatory cytokine genes TNF-α, IL-6, IL-8, and MCP1 in Caco-2 cells was noted after the increased invasion of the AIEC strains. While novel treatments involving endolysins offer great promise, these results highlight the need for the further exploration of possible unanticipated and unintended effects

    HTRgene: a computational method to perform the integrated analysis of multiple heterogeneous time-series data: case analysis of cold and heat stress response signaling genes in Arabidopsis

    Background: Integrated analysis that uses multiple sample gene expression data measured under the same stress can detect stress response genes more accurately than analysis of individual sample data. However, the integrated analysis is challenging since experimental conditions (strength of stress and the number of time points) are heterogeneous across multiple samples. Results: HTRgene is a computational method to perform the integrated analysis of multiple heterogeneous time-series data measured under the same stress condition. The goal of HTRgene is to identify response order preserving DEGs, defined as genes that are not only differentially expressed but whose response order is also preserved across multiple samples. The utility of HTRgene was demonstrated using 28 and 24 time-series sample gene expression data measured under cold and heat stress in Arabidopsis. HTRgene analysis successfully reproduced known biological mechanisms of cold and heat stress in Arabidopsis. Also, HTRgene showed higher accuracy in detecting the documented stress response genes than existing tools. Conclusions: HTRgene, a method to find the ordering of response time of genes that are commonly observed among multiple time-series samples, successfully integrated multiple heterogeneous time-series gene expression datasets. It can be applied to many research problems related to the integration of time series data analysis. This work, including publication costs, was supported by National Research Foundation of Korea (NRF) funded by the Ministry of Science, ICT (No. NRF-2017M3C4A7065887). This work was also supported by the Collaborative Genome Program for Fostering New Post-Genome Industry of the National Research Foundation (NRF) funded by the Ministry of Science and ICT (MSIT) (No. NRF-2014M3C9A3063541), and a grant of the Korea Health Technology R&D Project through the Korea Health Industry Development Institute (KHIDI), funded by the Ministry of Health & Welfare, Republic of Korea (grant number: HI15C3224). This work was supported for W.J. by the Agenda program (No. PJ012465032019), Rural Development Administration of the Republic of Korea.

    Insulin Sensitivity Is Retained in Mice with Endothelial Loss of Carcinoembryonic Antigen Cell Adhesion Molecule 1

    CEACAM1 regulates endothelial barrier integrity. Because insulin signaling in extrahepatic target tissues is regulated by insulin transport through the endothelium, we aimed at investigating the metabolic role of endothelial CEACAM1. To this end, we generated endothelial cell-specific Ceacam1 null mice (VECadCre+Cc1(fl/fl)) and carried out their metabolic phenotyping and mechanistic analysis by comparison to littermate controls. Hyperinsulinemic-euglycemic clamp analysis showed intact insulin sensitivity in VECadCre+Cc1(fl/fl) mice. This was associated with the absence of visceral obesity and lipolysis and normal levels of circulating non-esterified fatty acids, leptin, and adiponectin. Whereas the loss of endothelial Ceacam1 did not affect insulin-stimulated receptor phosphorylation, it reduced IRS-1/Akt/eNOS activation to lower nitric oxide production resulting from limited SHP2 sequestration. It also reduced Shc sequestration to activate NF-kappaB and increase the transcription of matrix metalloproteases, ultimately inducing plasma IL-6 and TNFalpha levels. Loss of endothelial Ceacam1 also induced the expression of the anti-inflammatory CEACAM1-4L variant in M2 macrophages in white adipose tissue. Together, this could cause endothelial barrier dysfunction and facilitate insulin transport, sustaining normal glucose homeostasis and retaining fat accumulation in adipocytes. The data assign a significant role for endothelial cell CEACAM1 in maintaining insulin sensitivity in peripheral extrahepatic target tissues

    The Integrated Genomic Landscape of Thymic Epithelial Tumors

    Thymic epithelial tumors (TETs) are one of the rarest adult malignancies. Among TETs, thymoma is the most predominant, characterized by a unique association with autoimmune diseases, followed by thymic carcinoma, which is less common but more clinically aggressive. Using multi-platform omics analyses on 117 TETs, we identify four subtypes of these tumors, defined by genomic hallmarks and an association with survival and World Health Organization histological subtype. We further demonstrate a marked prevalence of a thymoma-specific mutated oncogene, GTF2I, and explore its biological effects on multi-platform analysis. We further observe enrichment of mutations in HRAS, NRAS, and TP53. Last, we identify a molecular link between thymoma and the autoimmune disease myasthenia gravis, characterized by tumoral overexpression of muscle autoantigens and increased aneuploidy.

    Subtype-specific CpG island shore methylation and mutation patterns in 30 breast cancer cell lines

    BACKGROUND: Aberrant epigenetic modifications, including DNA methylation, are key regulators of gene activity in tumorigenesis. Breast cancer is a heterogeneous disease, and large-scale analyses indicate that tumor tissue can be distinguished from normal and benign tissues, and molecular subtypes of breast cancer from one another, based on their distinct genomic, transcriptomic, and epigenomic profiles. In this study, we used affinity-based methylation sequencing data in 30 breast cancer cell lines representing functionally distinct cancer subtypes to investigate methylation and mutation patterns at the whole genome level. RESULTS: Our analysis revealed significant differences in CpG island (CpGI) shore methylation and mutation patterns among breast cancer subtypes. In particular, the basal-like B type, a highly aggressive form of the disease, displayed distinct CpGI shore hypomethylation patterns that were significantly associated with downstream gene regulation. We determined that mutation rates at CpG sites were highly correlated with DNA methylation status and observed distinct mutation rates among the breast cancer subtypes. These findings were validated by using targeted bisulfite sequencing of differentially expressed genes (n=85) among the cell lines. CONCLUSIONS: Our results suggest that alterations in DNA methylation play critical roles in gene regulatory processes as well as in cytosine substitution rates at CpG sites in molecular subtypes of breast cancer. ELECTRONIC SUPPLEMENTARY MATERIAL: The online version of this article (doi:10.1186/s12918-016-0356-2) contains supplementary material, which is available to authorized users.