Learning Latent Representations of Bank Customers With The Variational Autoencoder
Learning data representations that reflect the customers' creditworthiness
can improve marketing campaigns, customer relationship management, data and
process management or the credit risk assessment in retail banks. In this
research, we adopt the Variational Autoencoder (VAE), which has the ability to
learn latent representations that contain useful information. We show that it
is possible to steer the latent representations in the latent space of the VAE
using the Weight of Evidence and forming a specific grouping of the data that
reflects the customers' creditworthiness. Our proposed method learns a latent
representation of the data, which shows a well-defined clustering structure
capturing the customers' creditworthiness. These clusters are well suited for
the aforementioned banks' activities. Further, our methodology generalizes to
new customers, captures high-dimensional and complex financial data, and scales
to large data sets. Comment: arXiv admin note: substantial text overlap with arXiv:1806.0253
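The abstract does not say how the Weight of Evidence (WoE) is computed or how it is fed to the VAE. As a rough illustration only, the sketch below uses one common credit-scoring convention: for each group of customers, WoE is the log of the group's share of non-defaulters divided by its share of defaulters. The function name `weight_of_evidence`, the `smoothing` parameter, and the sign convention are assumptions, not details from the paper.

```python
import math
from collections import defaultdict


def weight_of_evidence(groups, defaults, smoothing=0.5):
    """Return {group: WoE} for a categorical grouping of customers.

    Assumed convention: WoE = ln(share of non-defaulters / share of defaulters).
    `defaults` holds 1 (or True) for a defaulted customer, 0 otherwise.
    `smoothing` is a hypothetical additive constant that avoids log(0);
    with smoothing=0.0 every group needs at least one customer of each kind.
    """
    good = defaultdict(int)  # non-defaulters per group
    bad = defaultdict(int)   # defaulters per group
    for g, d in zip(groups, defaults):
        if d:
            bad[g] += 1
        else:
            good[g] += 1

    keys = sorted(set(good) | set(bad))
    total_good = sum(good.values())
    total_bad = sum(bad.values())
    n_groups = len(keys)

    woe = {}
    for g in keys:
        p_good = (good[g] + smoothing) / (total_good + smoothing * n_groups)
        p_bad = (bad[g] + smoothing) / (total_bad + smoothing * n_groups)
        woe[g] = math.log(p_good / p_bad)
    return woe


# Toy data: group "A" is mostly non-defaulters, group "B" mostly defaulters.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
defaults = [0, 0, 0, 1, 0, 1, 1, 1]
print(weight_of_evidence(groups, defaults))
```

Under this convention a positive value marks a group with proportionally more non-defaulters, so the values could serve as a numeric encoding of the input features before they reach an encoder network.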
Evaluation of colorectal cancer subtypes and cell lines using deep learning
Colorectal cancer (CRC) is a common cancer with a high mortality rate and a rising incidence rate in the developed world. Molecular profiling techniques have been used to study the variability between tumors as well as cancer models such as cell lines, but their translational value is incomplete with current methods. Moreover, first-generation computational methods for subtype classification do not make full use of multi-omics data. Drug discovery programs use cell lines as a proxy for human cancers to characterize their molecular makeup and drug response, identify relevant indications and discover biomarkers. In order to maximize the translatability and the clinical relevance of in vitro studies, selection of optimal cancer models is imperative. We present a novel subtype classification method based on deep learning and apply it to classify CRC tumors using multi-omics data, and further to measure the similarity between tumors and disease models such as cancer cell lines. Multi-omics Autoencoder Integration (maui) efficiently leverages data sets containing copy number alterations, gene expression, and point mutations, and learns clinically important patterns (latent factors) across these data types. Using these latent factors, we propose a refinement of the gold-standard CRC subtypes, and propose best-matching cell lines for the different subtypes. These findings are relevant for patient stratification and selection of cell lines for drug discovery pipelines, biomarker discovery, and target identification.
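The abstract does not state how similarity between tumors and cell lines is measured in the latent space. One simple possibility, shown here purely as an assumption, is to average the latent factors of the tumors in each subtype and assign every cell line to the subtype with the nearest centroid. The helper names `subtype_centroids` and `nearest_subtype`, the subtype labels, and the choice of Euclidean distance are all hypothetical and are not taken from maui.

```python
import math
from collections import defaultdict


def subtype_centroids(latent, subtypes):
    """Average the latent vectors of the tumors belonging to each subtype."""
    sums = {}
    counts = defaultdict(int)
    for z, s in zip(latent, subtypes):
        if s not in sums:
            sums[s] = [0.0] * len(z)
        sums[s] = [a + b for a, b in zip(sums[s], z)]
        counts[s] += 1
    return {s: [v / counts[s] for v in total] for s, total in sums.items()}


def nearest_subtype(z, centroids):
    """Return the subtype whose centroid is closest to z (Euclidean distance)."""
    return min(centroids, key=lambda s: math.dist(z, centroids[s]))


# Toy latent factors for four tumors with made-up subtype labels.
tumor_latent = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [12.0, 10.0]]
tumor_subtypes = ["S1", "S1", "S2", "S2"]
centroids = subtype_centroids(tumor_latent, tumor_subtypes)

# A hypothetical cell line embedded in the same latent space.
print(nearest_subtype([1.0, 1.0], centroids))
```

A nearest-centroid rule is only a baseline; a real analysis would need the tumors and cell lines to be embedded by the same trained model.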
Integrated Multi-omics Analysis Using Variational Autoencoders: Application to Pan-cancer Classification
Different aspects of a clinical sample can be revealed by multiple types of
omics data. Integrated analysis of multi-omics data provides a comprehensive
view of patients, which has the potential to facilitate more accurate clinical
decision making. However, omics data are normally high dimensional, with a large
number of molecular features and a relatively small number of available samples
with clinical labels. The "dimensionality curse" makes it challenging to train
a machine learning model using high dimensional omics data like DNA methylation
and gene expression profiles. Here we propose an end-to-end deep learning model
called OmiVAE to extract low dimensional features and classify samples from
multi-omics data. OmiVAE combines the basic structure of variational
autoencoders with a classification network to achieve task-oriented feature
extraction and multi-class classification. The training procedure of OmiVAE
consists of an unsupervised phase without the classifier and a supervised
phase with the classifier. During the unsupervised phase, a hierarchical
cluster structure of samples can be automatically formed without the need for
labels. In the supervised phase, OmiVAE achieved an average classification
accuracy of 97.49% after 10-fold cross-validation among 33 tumour types and
normal samples, which shows better performance than other existing methods. The
OmiVAE model trained on multi-omics data outperformed the one trained on only one
type of omics data, which indicates that the complementary information from
different omics datatypes provides useful insights for biomedical tasks like
cancer classification. Comment: 7 pages, 4 figures
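The two training phases described above can be sketched as a single per-sample loss with a switch for the classifier term. This is a minimal illustration, not OmiVAE's actual objective: the squared-error reconstruction term, the weights `beta` and `alpha`, and all function names are assumptions. Only the KL divergence between a diagonal Gaussian and a standard normal is the standard VAE formula.

```python
import math


def kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ) for one sample."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, log_var))


def squared_error(x, x_hat):
    """Reconstruction term (assumed here to be a plain squared error)."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]


def two_phase_loss(x, x_hat, mu, log_var, logits=None, label=None,
                   supervised=False, beta=1.0, alpha=1.0):
    """Unsupervised phase: reconstruction + KL. Supervised phase: adds the classifier loss."""
    loss = squared_error(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)
    if supervised:
        loss += alpha * cross_entropy(logits, label)
    return loss


# Toy values standing in for encoder, decoder, and classifier outputs.
x, x_hat = [1.0, 2.0], [0.5, 2.5]
mu, log_var = [0.1, -0.2], [0.0, 0.0]
print(two_phase_loss(x, x_hat, mu, log_var))
print(two_phase_loss(x, x_hat, mu, log_var, logits=[2.0, 0.0], label=0, supervised=True))
```

In a real model these terms would be computed on batches by an automatic-differentiation framework; the plain-Python version only shows how the supervised phase extends the unsupervised objective.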
Evaluation of colorectal cancer subtypes and cell lines using deep learning
Colorectal cancer (CRC) is a common cancer with a high mortality rate and a rising incidence rate in the developed world. Molecular profiling techniques have been used to better understand the variability between tumors and disease models such as cell lines. To maximize the translatability and clinical relevance of in vitro studies, the selection of optimal cancer models is imperative. We have developed a deep learning-based method to measure the similarity between CRC tumors and disease models such as cancer cell lines. Our method efficiently leverages multiomics data sets containing copy number alterations, gene expression, and point mutations and learns latent factors that describe data in lower dimensions. These latent factors represent the patterns that are clinically relevant and explain the variability of molecular profiles across tumors and cell lines. Using these, we propose refined CRC subtypes and provide best-matching cell lines to different subtypes. These findings are relevant to patient stratification and selection of cell lines for early-stage drug discovery pipelines, biomarker discovery, and target identification.
The State of Applying Artificial Intelligence to Tissue Imaging for Cancer Research and Early Detection
Artificial intelligence represents a new frontier in human medicine that
could save more lives and reduce costs, thereby increasing accessibility.
As a consequence, the rate of advancement of AI in cancer medical imaging and
more particularly tissue pathology has exploded, opening it to ethical and
technical questions that could impede its adoption into existing systems. In
order to chart the path of AI in its application to cancer tissue imaging, we
review current work and identify how it can improve cancer pathology
diagnostics and research. In this review, we identify 5 core tasks that models
are developed for, including regression, classification, segmentation,
generation, and compression tasks. We address the benefits and challenges that
such methods face, and how they can be adapted for use in cancer prevention and
treatment. The studies reviewed in this paper represent the beginning of this
field, and future experiments will build on the foundations that we highlight.