911 research outputs found

    Evaluation of colorectal cancer subtypes and cell lines using deep learning

    Colorectal cancer (CRC) is a common cancer with a high mortality rate and a rising incidence rate in the developed world. Molecular profiling techniques have been used to better understand the variability between tumors and disease models such as cell lines. To maximize the translatability and clinical relevance of in vitro studies, the selection of optimal cancer models is imperative. We have developed a deep learning-based method to measure the similarity between CRC tumors and disease models such as cancer cell lines. Our method efficiently leverages multiomics data sets containing copy number alterations, gene expression, and point mutations and learns latent factors that describe data in lower dimensions. These latent factors represent the patterns that are clinically relevant and explain the variability of molecular profiles across tumors and cell lines. Using these, we propose refined CRC subtypes and provide best-matching cell lines to different subtypes. These findings are relevant to patient stratification and selection of cell lines for early-stage drug discovery pipelines, biomarker discovery, and target identification

    Evaluation of colorectal cancer subtypes and cell lines using deep learning

    Colorectal cancer (CRC) is a common cancer with a high mortality rate and a rising incidence rate in the developed world. Molecular profiling techniques have been used to study the variability between tumors as well as cancer models such as cell lines, but their translational value is incomplete with current methods. Moreover, first-generation computational methods for subtype classification do not make full use of multi-omics data. Drug discovery programs use cell lines as a proxy for human cancers to characterize their molecular makeup and drug response, identify relevant indications, and discover biomarkers. To maximize the translatability and clinical relevance of in vitro studies, the selection of optimal cancer models is imperative. We present a novel subtype classification method based on deep learning and apply it to classify CRC tumors using multi-omics data, and further to measure the similarity between tumors and disease models such as cancer cell lines. Multi-omics Autoencoder Integration (maui) efficiently leverages data sets containing copy number alterations, gene expression, and point mutations, and learns clinically important patterns (latent factors) across these data types. Using these latent factors, we propose a refinement of the gold-standard CRC subtypes and propose best-matching cell lines for the different subtypes. These findings are relevant for patient stratification and selection of cell lines for drug discovery pipelines, biomarker discovery, and target identification
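
The idea of matching cell lines to tumor subtypes in a latent-factor space can be illustrated with a minimal sketch. The abstract does not state which similarity measure is used, so the cosine similarity, the subtype centroids, and every name below (`best_matching_cell_lines`, the sample labels, the toy vectors) are assumptions made for illustration only. This is not the maui API.

```python
from math import sqrt


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def best_matching_cell_lines(tumor_factors, tumor_subtypes, cell_line_factors):
    """Map each subtype to the cell line most similar to that subtype's centroid.

    tumor_factors:     {tumor_id: latent-factor vector}      (hypothetical input)
    tumor_subtypes:    {tumor_id: subtype label}             (hypothetical input)
    cell_line_factors: {cell_line_id: latent-factor vector}  (hypothetical input)
    """
    # Group tumor latent vectors by subtype label.
    by_subtype = {}
    for tumor_id, subtype in tumor_subtypes.items():
        by_subtype.setdefault(subtype, []).append(tumor_factors[tumor_id])

    matches = {}
    for subtype, vectors in by_subtype.items():
        center = centroid(vectors)
        # Assumed criterion: highest cosine similarity to the subtype centroid.
        matches[subtype] = max(
            cell_line_factors,
            key=lambda name: cosine_similarity(center, cell_line_factors[name]),
        )
    return matches


# Toy data: two latent factors, three tumors, two cell lines.
tumors = {"t1": [1.0, 0.1], "t2": [0.9, 0.2], "t3": [0.1, 1.0]}
subtypes = {"t1": "A", "t2": "A", "t3": "B"}
cell_lines = {"line_x": [0.8, 0.1], "line_y": [0.0, 0.7]}
matches = best_matching_cell_lines(tumors, subtypes, cell_lines)
```

In practice the latent vectors would come from a trained model rather than being typed in by hand, and a different distance or a per-tumor nearest-neighbour rule may be more appropriate.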

    Microstructure of industrially produced low-fat Turkish white cheeses: influence of cream homogenization

    The microstructure and fat globule distribution of reduced and low fat Turkish white cheese were evaluated. Reduced and low fat cheeses were manufactured in a dairy plant from 1.5% and 0.75% fat milk, respectively, standardized with unhomogenized and homogenized cream. Homogenized and non-homogenized creams and cheese whey were analyzed for fat globule distribution, and cheese samples were also analyzed for microstructure characteristics. Homogenization of the cream decreased the size of the fat globules, so that a large number of fat particles were dispersed in the casein matrix, which improved the lubrication of the cheese microstructure. Micrographs of samples from which the fat was not removed showed a more extended matrix with a few small fat globules compared with micrographs of defatted samples. Homogenization of cream produces small fat globules, and unclustered fat globules were found in the resulting whey. These results are important for dairy processors considering cream homogenization as a processing tool at the industrial level

    Strategies for analyzing bisulfite sequencing data

    DNA methylation is one of the main epigenetic modifications in the eukaryotic genome; it has been shown to play a role in cell-type-specific regulation of gene expression, and therefore cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation over the genomes of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing. We introduce specialized short-read alignment techniques as well as pre/post-alignment quality check methods to ensure data quality. Furthermore, we discuss subsequent analysis steps after alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of regions returned by segmentation and differential methylation methods. Finally, we review software packages that implement strategies to efficiently deal with large bisulfite sequencing datasets locally and we discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis

    Strategies for analyzing bisulfite sequencing data

    DNA methylation is one of the main epigenetic modifications in the eukaryotic genome and has been shown to play a role in cell-type-specific regulation of gene expression, and therefore cell-type identity. Bisulfite sequencing is the gold standard for measuring methylation over the genomes of interest. Here, we review several techniques used for the analysis of high-throughput bisulfite sequencing. We introduce specialized short-read alignment techniques as well as pre/post-alignment quality check methods to ensure data quality. Furthermore, we discuss subsequent analysis steps after alignment. We introduce various differential methylation methods and compare their performance using simulated and real bisulfite sequencing datasets. We also discuss the methods used to segment methylomes in order to pinpoint regulatory regions. We introduce annotation methods that can be used for further classification of regions returned by segmentation or differential methylation methods. Lastly, we review software packages that implement strategies to efficiently deal with large bisulfite sequencing datasets locally and also discuss online analysis workflows that do not require any prior programming skills. The analysis strategies described in this review will guide researchers at any level to the best practices of bisulfite sequencing analysis
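
As a rough illustration of the simplest step that comes before any differential methylation analysis, the sketch below computes per-site methylation differences between two samples after a coverage filter. The input layout (a dict mapping a site to methylated and unmethylated read counts), the function name, and the `min_coverage` default are all assumptions made for this example. None of them come from the review, and the methods it compares rely on statistical tests rather than a raw difference.

```python
def methylation_differences(sample_a, sample_b, min_coverage=10):
    """Return {site: level_b - level_a} for sites covered well enough in both samples.

    Each sample is assumed to map a site key, e.g. ("chr1", 100), to a tuple
    (methylated_reads, unmethylated_reads). The methylation level of a site is
    methylated / (methylated + unmethylated).
    """
    differences = {}
    # Only sites present in both samples can be compared.
    for site in sample_a.keys() & sample_b.keys():
        meth_a, unmeth_a = sample_a[site]
        meth_b, unmeth_b = sample_b[site]
        coverage_a = meth_a + unmeth_a
        coverage_b = meth_b + unmeth_b
        # Skip low-coverage sites, whose levels are too noisy to compare.
        if coverage_a < min_coverage or coverage_b < min_coverage:
            continue
        differences[site] = meth_b / coverage_b - meth_a / coverage_a
    return differences


# Toy counts: the second site is dropped because sample_a covers it with only 4 reads.
sample_a = {("chr1", 100): (18, 2), ("chr1", 200): (3, 1)}
sample_b = {("chr1", 100): (5, 15), ("chr1", 200): (10, 10)}
diffs = methylation_differences(sample_a, sample_b)
```

A raw difference like this ignores sampling variance and biological replicates, which is where dedicated differential methylation methods come in.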

    Optimal Computation of Avoided Words

    The deviation of the observed frequency of a word $w$ from its expected frequency in a given sequence $x$ is used to determine whether or not the word is avoided. This concept is particularly useful in DNA linguistic analysis. The value of the standard deviation of $w$, denoted by $std(w)$, effectively characterises the extent of a word by its edge contrast in the context in which it occurs. A word $w$ of length $k > 2$ is a $\rho$-avoided word in $x$ if $std(w) \leq \rho$, for a given threshold $\rho < 0$. Notice that such a word may be completely absent from $x$. Hence computing all such words naïvely can be a very time-consuming procedure, in particular for large $k$. In this article, we propose an $O(n)$-time and $O(n)$-space algorithm to compute all $\rho$-avoided words of length $k$ in a given sequence $x$ of length $n$ over a fixed-sized alphabet. We also present a time-optimal $O(\sigma n)$-time and $O(\sigma n)$-space algorithm to compute all $\rho$-avoided words (of any length) in a sequence of length $n$ over an alphabet of size $\sigma$. Furthermore, we provide a tight asymptotic upper bound for the number of $\rho$-avoided words and the expected length of the longest one. We make available an open-source implementation of our algorithm. Experimental results, using both real and synthetic data, show the efficiency of our implementation
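
To make the definition concrete, here is a naive counting-based sketch, not the linear-time algorithm the abstract describes. The abstract does not spell out the expected frequency or the deviation, so the sketch assumes a definition commonly used in the avoided-words literature: with $f(\cdot)$ the number of occurrences in $x$, $w_p$ the longest proper prefix of $w$, $w_s$ its longest proper suffix and $w_i$ its infix (first and last letters removed), $E(w) = f(w_p) f(w_s) / f(w_i)$ and $std(w) = (f(w) - E(w)) / \max(\sqrt{E(w)}, 1)$. The function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def count_kmers(x, k):
    """Count every length-k substring of x."""
    return Counter(x[i:i + k] for i in range(len(x) - k + 1))


def avoided_words(x, k, rho):
    """Return {w: std(w)} for all words w of length k with std(w) <= rho.

    Assumed definitions (not stated in the abstract):
      E(w)   = f(prefix) * f(suffix) / f(infix)
      std(w) = (f(w) - E(w)) / max(sqrt(E(w)), 1)
    """
    if k <= 2:
        raise ValueError("k must be greater than 2")
    if rho >= 0:
        raise ValueError("rho must be negative")

    f_word = count_kmers(x, k)
    f_side = count_kmers(x, k - 1)   # counts of prefixes and suffixes
    f_infix = count_kmers(x, k - 2)  # counts of infixes
    alphabet = sorted(set(x))

    result = {}
    # A word can only have std(w) < 0 if E(w) > 0, i.e. if both its prefix and
    # its suffix occur in x; the word itself may be absent from x.
    for prefix, f_prefix in f_side.items():
        infix = prefix[1:]
        for letter in alphabet:
            f_suffix = f_side.get(infix + letter, 0)
            if f_suffix == 0:
                continue
            expected = f_prefix * f_suffix / f_infix[infix]
            word = prefix + letter
            std = (f_word.get(word, 0) - expected) / max(sqrt(expected), 1.0)
            if std <= rho:
                result[word] = std
    return result


# Toy sequence: "ACA" and "TCT" never occur, yet their prefixes and suffixes do.
avoided = avoided_words("CACT" * 8, 3, -1.5)
```

The sketch materialises all substring counts in hash tables, so it is only meant for small inputs; it shows why absent words still need to be considered as candidates.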

    Global identification of functional microRNA-mRNA interactions in Drosophila

    MicroRNAs (miRNAs) are key mediators of post-transcriptional gene expression silencing. So far, no comprehensive experimental annotation of functional miRNA target sites exists in Drosophila. Here, we generated a transcriptome-wide in vivo map of miRNA-mRNA interactions in Drosophila melanogaster, making use of single nucleotide resolution in Argonaute1 (AGO1) crosslinking and immunoprecipitation (CLIP) data. Absolute quantification of cellular miRNA levels presents the miRNA pool in Drosophila cell lines to be more diverse than previously reported. Benchmarking two CLIP approaches, we identify a similar predictive potential to unambiguously assign thousands of miRNA-mRNA pairs from AGO1 interaction data at unprecedented depth, achieving higher signal-to-noise ratios than with computational methods alone. Quantitative RNA-seq and sub-codon resolution ribosomal footprinting data upon AGO1 depletion enabled the determination of miRNA-mediated effects on target expression and translation. We thus provide the first comprehensive resource of miRNA target sites and their quantitative functional impact in Drosophila