18,542 research outputs found

Unsupervised Dimensionality Reduction for Transfer Learning

Author: Blöbaum Patrick
Hammer Barbara
Schulz Alexander
Verleysen Michel
Publication venue: Ciaco
Publication date: 01/01/2015

Blöbaum P, Schulz A, Hammer B. Unsupervised Dimensionality Reduction for Transfer Learning. In: Verleysen M, ed. Proceedings. 23rd European Symposium on Artificial Neural Networks, Computational Intelligence and Machine Learning. Louvain-la-Neuve: Ciaco; 2015: 507-512.

We investigate the suitability of unsupervised dimensionality reduction (DR) for transfer learning when the source and target domains use different representations. Essentially, unsupervised DR links the source and target domains by representing the data in a common latent space. We consider two settings: a linear DR of source and target data, which establishes correspondences between the data and a corresponding transfer, and its combination with a non-linear DR, which allows adaptation to more complex data characterised by a global non-linear structure

Publications at Bielefeld University

Exploring the spectroscopic diversity of type Ia supernovae with DRACULA: a machine learning approach

Author: Aguena M.
Busti V. C.
Camacho H.
de Souza R. S.
Fantaye Y. T.
Gieseke F.
Ishida E. E. O.
Mazzali P. A.
Sasdelli Michele
Trindade A. M. M.
Vilalta R.
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2016

The existence of multiple subclasses of type Ia supernovae (SNeIa) has been the subject of great debate in the last decade. One major challenge inevitably met when trying to infer the existence of one or more subclasses is the time-consuming and subjective process of subclass definition. In this work, we show how machine learning tools facilitate identification of subtypes of SNeIa through the establishment of a hierarchical group structure in the continuous space of spectral diversity formed by these objects. Using Deep Learning, we were able to perform such identification in a 4-dimensional feature space (+1 for time evolution), while standard Principal Component Analysis barely achieves similar results using 15 principal components. This is evidence that the progenitor system and the explosion mechanism can be described by a small number of initial physical parameters. As a proof of concept, we show that our results are in close agreement with a previously suggested classification scheme and that our proposed method can grasp the main spectral features behind the definition of such subtypes. This allows the confirmation of the velocity of lines as a first-order effect in the determination of SNIa subtypes, followed by 91bg-like events. Given the expected data deluge in the forthcoming years, our proposed approach is essential to allow a quick and statistically coherent identification of SNeIa subtypes (and outliers). All tools used in this work were made publicly available in the Python package Dimensionality Reduction And Clustering for Unsupervised Learning in Astronomy (DRACULA) and can be found within COINtoolbox (https://github.com/COINtoolbox/DRACULA).

Comment: 16 pages, 12 figures, accepted for publication in MNRA

arXiv.org e-Print Archive

HAL-IN2P3

HAL Clermont Université

Radboud Repository

ELTE Digital Institutional Repository (EDIT)

MPG.PuRe

State-of-the-art and gaps for deep learning on limited training data in remote sensing

Author: Anderson Derek T.
Ball John E.
Wei Pan
Publication date: 11/07/2018

Deep learning usually requires big data, with respect to both volume and variety. However, most remote sensing applications have only limited training data, of which a small subset is labeled. Herein, we review three state-of-the-art approaches in deep learning to combat this challenge. The first topic is transfer learning, in which some aspects of one domain, e.g., features, are transferred to another domain. The next is unsupervised learning, e.g., autoencoders, which operate on unlabeled data. The last is generative adversarial networks, which can generate realistic-looking data that can fool both a deep learning network and a human. The aim of this article is to raise awareness of this dilemma, to direct the reader to existing work and to highlight current gaps that need solving.

Comment: arXiv admin note: text overlap with arXiv:1709.0030

arXiv.org e-Print Archive

Crossref

A transfer-learning approach to feature extraction from cancer transcriptomes with deep autoencoders

Author: A Bashiri
GE Hinton
I Guyon
J Bergstra
JN Weinstein
JS Parker
K Kourou
N Srivastava
SC Schuster
SJ Pan
Y LeCun
Y Saeys
Y Xiao
Z Wang
Publication date: 18/06/2019

Published in Lecture Notes in Computer Science.

The diagnosis and prognosis of cancer are among the more challenging tasks in oncology. With the main aim of selecting the most appropriate treatments, current personalized medicine focuses on using data from heterogeneous sources to estimate the evolution of a given disease for the particular case of a certain patient. In recent years, next-generation sequencing data have boosted cancer prediction by supplying gene-expression information that has allowed diverse machine learning algorithms to supply valuable solutions to the problem of cancer subtype classification, which has surely contributed to better estimation of a patient's response to diverse treatments. However, the efficacy of these models is seriously affected by the existing imbalance between the high dimensionality of the gene expression feature sets and the number of samples available for a particular cancer type. To counteract what is known as the curse of dimensionality, feature selection and extraction methods have traditionally been applied to reduce the number of input variables present in gene expression datasets. Although these techniques work by scaling down the input feature space, the prediction performance of traditional machine learning pipelines using these feature reduction strategies remains moderate. In this work, we propose the use of the Pan-Cancer dataset to pre-train deep autoencoder architectures on a subset composed of thousands of gene expression samples of very diverse tumor types. The resulting architectures are subsequently fine-tuned on a collection of specific breast cancer samples. This transfer-learning approach aims at combining supervised and unsupervised deep learning models with traditional machine learning classification algorithms to tackle the problem of breast tumor intrinsic-subtype classification.

Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech

Crossref

Repositorio Institucional Universidad de Málaga
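
The pre-train-then-fine-tune idea in the abstract above can be illustrated with a toy sketch. This is not the authors' architecture or data: it assumes a one-unit, tied-weight linear autoencoder trained by plain gradient descent, and two small synthetic datasets that stand in for the large Pan-Cancer set (`source`) and the small breast cancer set (`target`). All function names and parameters here are hypothetical.

```python
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_data(direction, n, noise, rng):
    """Synthetic samples scattered around one line through the origin."""
    norm = dot(direction, direction) ** 0.5
    unit = [d / norm for d in direction]
    data = []
    for _ in range(n):
        t = rng.gauss(0.0, 1.0)  # position along the line
        data.append([t * u + rng.gauss(0.0, noise) for u in unit])
    return data

def loss(w, data):
    """Mean squared reconstruction error of x -> (w.x) * w."""
    total = 0.0
    for x in data:
        z = dot(w, x)  # one-dimensional code
        total += sum((xi - z * wi) ** 2 for xi, wi in zip(x, w))
    return total / len(data)

def train(w, data, epochs, lr=0.05):
    """Full-batch gradient descent on the reconstruction error."""
    w = list(w)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x in data:
            z = dot(w, x)
            r = [xi - z * wi for xi, wi in zip(x, w)]  # residual
            rw = dot(r, w)
            for k in range(len(w)):
                # d/dw_k of ||x - (w.x) w||^2
                grad[k] += -2.0 * (rw * x[k] + z * r[k])
        w = [wk - lr * gk / len(data) for wk, gk in zip(w, grad)]
    return w

rng = random.Random(0)
# Assumption: a large "source" set and a small, related "target" set.
source = make_data([1, 1, 1, 1, 1], n=200, noise=0.1, rng=rng)
target = make_data([1, 1, 1, 1, 0.2], n=20, noise=0.1, rng=rng)

w0 = [rng.uniform(-0.1, 0.1) for _ in range(5)]  # random initial weights
w_pre = train(w0, source, epochs=150)            # unsupervised pre-training
w_fine = train(w_pre, target, epochs=50)         # fine-tuning on the small set

loss_before = loss(w_pre, target)
loss_after = loss(w_fine, target)
print("target loss before fine-tuning:", loss_before)
print("target loss after fine-tuning: ", loss_after)
```

In a fuller pipeline, the learned code `dot(w_fine, x)` would be passed as a feature to a separate classifier; that step, and the deep, non-linear encoders the abstract refers to, are omitted from this sketch.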