    Storing Digital Information in Long-Read DNA

    There is an urgent need for effective and cost-efficient data storage, as worldwide demand for data storage is growing rapidly. DNA has emerged as a new medium for storing digital information. Recent studies have successfully stored digital information such as text and GIF animations. Previous studies tackled technical hurdles arising from errors in DNA synthesis and sequencing, and focused on strategies that use 100‒150-bp read sizes in both synthesis and sequencing. In this paper, we suggest a novel data encoding/decoding scheme that makes use of long-read DNA (~1,000 bp). This enables accurate recovery of stored digital information with a smaller number of reads than the previous approach. This approach also reduces sequencing time.
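
As a rough illustration of what storing digital information in DNA involves, the sketch below maps every two bits to one base and then cuts the resulting sequence into reads. The 2-bits-per-base mapping, the function names, and the absence of any indexing or error correction are all assumptions made for illustration; this is not the encoding/decoding scheme proposed in the paper.

```python
# Illustrative 2-bits-per-base mapping; NOT the scheme from the paper.
BASES = "ACGT"  # assumed mapping: A=00, C=01, G=10, T=11


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four bases, most significant bit pair first."""
    return "".join(
        BASES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(seq: str) -> bytes:
    """Invert bytes_to_dna; the sequence length must be a multiple of 4."""
    if len(seq) % 4 != 0:
        raise ValueError("sequence length must be a multiple of 4")
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASES.index(base)
        out.append(byte)
    return bytes(out)


def split_into_reads(seq: str, read_len: int = 1000) -> list[str]:
    """Cut a sequence into consecutive fragments of at most read_len bases."""
    return [seq[i:i + read_len] for i in range(0, len(seq), read_len)]


print(bytes_to_dna(b"A"))  # CAAC
```

Under this toy mapping, a 750-byte payload becomes 3,000 bases, which fits in 3 fragments of 1,000 bp but needs 20 fragments of 150 bp. A real scheme would also spend bases on addressing and redundancy, so the actual counts would differ.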

    Prediction of a time-to-event trait using genome wide SNP data

    BACKGROUND: A popular objective of many high-throughput genome projects is to discover various genomic markers associated with traits and to develop statistical models that predict traits of future patients based on marker values. RESULTS: In this paper, we present a prediction method for time-to-event traits using genome-wide single-nucleotide polymorphisms (SNPs). We also propose a MaxTest for the association between a time-to-event trait and a SNP that accounts for the SNP's possible genetic models. The proposed MaxTest can help screen out nonprognostic SNPs and identify genetic models of prognostic SNPs. The performance of the proposed method is evaluated through simulations. CONCLUSIONS: In conjunction with the MaxTest, the proposed method provides more parsimonious prediction models but includes more prognostic SNPs than some naive prediction methods. The proposed method is demonstrated with real GWAS data.
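
The idea of testing a SNP under several genetic models and keeping the strongest signal can be sketched as follows. This is a minimal sketch under stated assumptions: the three codings (additive, dominant, recessive), the use of a Cox score statistic at beta = 0 with Breslow-style tie handling, and all function names are illustrative choices, and the paper's MaxTest may be defined differently.

```python
# Sketch only: the paper's MaxTest may use a different statistic and null distribution.

def code_genotype(minor_allele_count: int, model: str) -> int:
    """Recode a 0/1/2 minor-allele count under an assumed genetic model."""
    if model == "additive":
        return minor_allele_count
    if model == "dominant":
        return int(minor_allele_count >= 1)
    if model == "recessive":
        return int(minor_allele_count == 2)
    raise ValueError(f"unknown model: {model}")


def cox_score_chi2(times, events, x):
    """Cox partial-likelihood score statistic at beta = 0 for one covariate."""
    u = 0.0  # score: sum over events of (x_i - risk-set mean)
    v = 0.0  # information: sum over events of risk-set variance
    for t_i, d_i, x_i in zip(times, events, x):
        if not d_i:
            continue  # censored observations contribute only through risk sets
        risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
        mean = sum(risk) / len(risk)
        u += x_i - mean
        v += sum((r - mean) ** 2 for r in risk) / len(risk)
    return u * u / v if v > 0 else 0.0


def max_test(times, events, genotypes,
             models=("additive", "dominant", "recessive")):
    """Return the genetic model with the largest score statistic and that statistic."""
    stats = {
        m: cox_score_chi2(times, events, [code_genotype(g, m) for g in genotypes])
        for m in models
    }
    best = max(stats, key=stats.get)
    return best, stats[best]
```

Because the maximum is taken over correlated statistics, it should not be compared against a single 1-degree-of-freedom chi-square reference; a permutation-based null is one common way to calibrate such a maximum.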

    Southern Hemisphere mid- and high-latitudinal AOD, CO, NO2, and HCHO: spatiotemporal patterns revealed by satellite observations

    To assess air pollution emitted in Southern Hemisphere mid-latitudes and transported to Antarctica, we investigate the climatological mean and temporal trends in aerosol optical depth (AOD), carbon monoxide (CO), nitrogen dioxide (NO2), and formaldehyde (HCHO) columns using satellite observations. Generally, all these measurements exhibit sharp peaks over and near the three nearby inhabited continents: South America, Africa, and Australia. This pattern indicates the large emission effect of anthropogenic activities and biomass burning processes. High AOD is also found over the Southern Atlantic Ocean, probably because of the sea salt production driven by strong winds. Since the pristine Antarctic atmosphere can be polluted by transport of air pollutants from the mid-latitudes, we analyze the 10-day back trajectories that arrive at Antarctic ground stations in consideration of the spatial distribution of mid-latitudinal AOD, CO, NO2, and HCHO. We find that the influence of mid-latitudinal emission differs across Antarctic regions: western Antarctic regions show relatively more back trajectories from the mid-latitudes, while the eastern Antarctic regions do not show large intrusions of mid-latitudinal air masses. Finally, we estimate the long-term trends in AOD, CO, NO2, and HCHO during the past decade (2005-2016). While CO shows a significant negative trend, the others show overall positive trends. Seasonal and regional differences in trends are also discussed.
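
One simple way to estimate a long-term trend from a yearly series is an ordinary least-squares slope. The sketch below assumes that approach purely for illustration; the abstract does not say which trend estimator the authors used, and the sample values are synthetic placeholders, not satellite data.

```python
def linear_trend(years, values):
    """Ordinary least-squares fit of values against years; returns (slope, intercept)."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Synthetic yearly means for 2005-2016 (made-up numbers, arbitrary units).
years = list(range(2005, 2017))
values = [100.0 - 0.5 * (y - 2005) for y in years]
slope, _ = linear_trend(years, values)
print(f"trend: {slope:+.2f} units per year")
```

A negative slope corresponds to a decreasing column amount over the period. Judging whether a trend is significant additionally requires an uncertainty estimate for the slope, which this sketch omits.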

    Critical Boundary Sine-Gordon Revisited

    We revisit the exact solution of the two space-time dimensional quantum field theory of a free massless boson with a periodic boundary interaction and self-dual period. We analyze the model by using a mapping to free fermions with a boundary mass term originally suggested in ref.[22]. We find that the entire SL(2,C) family of boundary states of a single boson are boundary sine-Gordon states and we derive a simple explicit expression for the boundary state in fermion variables and as a function of sine-Gordon coupling constants. We use this expression to compute the partition function. We observe that the solution of the model has a strong-weak coupling generalization of T-duality. We then examine a class of recently discovered conformal boundary states for compact bosons with radii which are rational numbers times the self-dual radius. These have a simple expression in fermion variables. We postulate sine-Gordon-like field theories with discrete gauge symmetries for which they are the appropriate boundary states. Comment: 33 pages, 1 figure, references added, typos corrected

    Ovarian Cancer Prognostic Prediction Model Using RNA Sequencing Data

    Ovarian cancer is one of the leading causes of cancer-related deaths in gynecological malignancies. Over 70% of ovarian cancer cases are high-grade serous ovarian cancers and have high death rates due to their resistance to chemotherapy. Despite advances in surgical and pharmaceutical therapies, overall survival rates remain poor, and accurate prediction of the prognosis is difficult because of the highly heterogeneous nature of ovarian cancer. To improve the patient’s prognosis through proper treatment, we present a prognostic prediction model that integrates high-dimensional RNA sequencing data with clinical data through the following steps: gene filtration, pre-screening, gene marker selection, integrated study of selected gene markers, and prediction model building. These steps of the prognostic prediction model can be applied to other types of cancer besides ovarian cancer.

    Pathway-Driven Discovery of Rare Mutational Impact on Cancer

    Identifying driver mutations is important for understanding disease mechanisms and for future application to custom-tailored therapeutic decisions. Functional analysis of mutational impact usually focuses on the expression level of the mutated gene itself. However, a complex regulatory network may cause differential gene expression among functional neighbors of the mutated gene. We suggest a new approach for discovering rare mutations that have real impact in the context of a pathway. The philosophy of our method is to iteratively combine rare mutations until no more mutations can be added, under the condition that the combined mutational event can statistically discriminate pathway-level mRNA expression between groups with and without mutational events. Breast cancer patients with somatic mutation and mRNA expression data were analyzed by our approach. Our approach is shown to sensitively capture mutations that change pathway-level mRNA expression, concurrently discovering important mutations previously reported in breast cancer, such as TP53, PIK3CA, and RB1. In addition, out of 15,819 genes considered in breast cancer, our approach identified mutational events of 32 genes showing pathway-level mRNA expression differences.
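
The iterative combination described above can be read as a greedy forward selection, sketched below. Everything specific here is an assumption for illustration: the Welch t statistic as the discrimination measure, the `t_threshold` cutoff, the rule of adding the gene that most improves the statistic, and the data layout (a dict from gene to the set of mutated sample IDs, and a dict from sample ID to a single pathway expression score). The paper's actual test and stopping rule may differ.

```python
from statistics import mean, variance


def welch_t(a, b):
    """Welch t statistic between two groups; 0.0 if a group has fewer than 2 values."""
    if len(a) < 2 or len(b) < 2:
        return 0.0
    se2 = variance(a) / len(a) + variance(b) / len(b)
    return (mean(a) - mean(b)) / se2 ** 0.5 if se2 > 0 else 0.0


def combine_rare_mutations(mutations, pathway_score, t_threshold=2.0):
    """Greedily merge genes while the merged carrier group separates pathway scores better.

    mutations: dict gene -> set of sample IDs carrying a mutation in that gene
    pathway_score: dict sample ID -> pathway-level expression score (hypothetical summary)
    """
    selected, carriers, best_t = [], set(), 0.0
    remaining = dict(mutations)

    def score(gene):
        merged = carriers | remaining[gene]
        mutated = [v for s, v in pathway_score.items() if s in merged]
        wild = [v for s, v in pathway_score.items() if s not in merged]
        return abs(welch_t(mutated, wild))

    while remaining:
        gene = max(remaining, key=score)
        t = score(gene)
        # Stop when the best candidate is not discriminative or does not improve.
        if t < t_threshold or t <= best_t:
            break
        selected.append(gene)
        carriers |= remaining.pop(gene)
        best_t = t
    return selected, carriers, best_t
```

With this layout, two rare mutations whose carriers all show elevated pathway scores are merged into one event, while a mutation in a sample with ordinary scores is left out because adding it weakens the separation.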

    Deep Learning for Integrated Analysis of Insulin Resistance with Multi-Omics Data

    Technological advances in next-generation sequencing (NGS) have made it possible to uncover extensive and dynamic alterations in diverse molecular components and biological pathways across healthy and diseased conditions. Large amounts of multi-omics data originating from emerging NGS experiments require feature engineering, which is a crucial step in the process of predictive modeling. The underlying relationship among multi-omics features in terms of insulin resistance is not well understood. In this study, using multi-omics data on type II diabetes from the Integrative Human Microbiome Project, we applied a data-analytic approach to 10,783 features to elucidate the relationship between insulin resistance and multi-omics features, including microbiome data. To better explain the impact of microbiome features on insulin resistance classification, we used a deep neural network interpretation algorithm to estimate each microbiome feature’s contribution to the discriminative model output in the samples.

    Gene expression based prediction of prognostic outcome in ovarian cancer

    Gene expression provides rich information. It has been applied successfully to predict the prognosis of several cancers, such as breast and colon cancer. However, although ovarian cancer is the fifth leading cause of cancer death in women, precise prediction of survival outcome is not yet available. Thus, there is still an urgent need for optimized treatment decisions. Recent studies have made use of public gene expression data sources to predict the clinical outcome of ovarian cancer. Typically, a two-step approach has been tried. The first step identifies significant genes with a univariate Cox regression model. The second step provides a statistic that combines the effects of the selected genes in terms of survival risk. One drawback of the two-step approach is low reproducibility: a statistic for risk group classification built on the training set often fails to validate when it is applied to an independent data set. We applied the scheme to RNA-seq data from The Cancer Genome Atlas (TCGA) and classified patients into higher- and lower-risk groups according to prognosis. We applied a median cutoff to the classification in the existing scheme and suggest other schemes for subsequent work.
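
The second step of the two-step approach, together with the median cutoff, can be sketched as follows. The sketch assumes that the first step has already produced per-gene Cox coefficients and that the combining statistic is a coefficient-weighted sum of expression values. The gene names, coefficients, and patient values are made-up placeholders, and the weighted sum is one common choice rather than necessarily the statistic used in the work described.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values using (assumed) univariate Cox coefficients."""
    return sum(beta * expression[gene] for gene, beta in coefficients.items())


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients from step 1 and hypothetical expression profiles.
coefficients = {"GENE_A": 0.8, "GENE_B": -0.5}
patients = {
    "p1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "p2": {"GENE_A": 0.5, "GENE_B": 2.0},
    "p3": {"GENE_A": 1.0, "GENE_B": 1.0},
    "p4": {"GENE_A": 3.0, "GENE_B": 0.0},
}
scores = {pid: risk_score(expr, coefficients) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

The reproducibility concern raised above applies directly to this construction: both the coefficients and the median cutoff are estimated on one cohort, so group assignments can shift when the same rule is applied to a different cohort.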