Object Recognition from very few Training Examples for Enhancing Bicycle Maps
In recent years, data-driven methods have shown great success for extracting
information about the infrastructure in urban areas. These algorithms are
usually trained on large datasets consisting of thousands or millions of
labeled training examples. While large datasets have been published regarding
cars, very little labeled data is available for cyclists, although the appearance, point of view, and positioning of the relevant objects differ. Unfortunately,
labeling data is costly and requires a huge amount of work. In this paper, we
thus address the problem of learning with very few labels. The aim is to
recognize particular traffic signs in crowdsourced data to collect information
which is of interest to cyclists. We propose a system for object recognition
that is trained with only 15 examples per class on average. To achieve this, we
combine the advantages of convolutional neural networks and random forests to
learn a patch-wise classifier. In the next step, we map the random forest to a
neural network and transform the classifier to a fully convolutional network.
Thereby, the processing of full images is significantly accelerated and
bounding boxes can be predicted. Finally, we integrate data of the Global
Positioning System (GPS) to localize the predictions on the map. In comparison
to Faster R-CNN and other networks for object recognition or algorithms for
transfer learning, we considerably reduce the required amount of labeled data.
We demonstrate good performance on the recognition of traffic signs for cyclists as well as their localization in maps.
Comment: Submitted to IV 2018. This research was supported by the German Research Foundation (DFG) within Priority Research Programme 1894 "Volunteered Geographic Information: Interpretation, Visualization and Social Computing".
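The forest-to-network mapping described in this abstract can be illustrated with a toy sketch (our own illustration under simplifying assumptions, not the authors' code): a depth-1 decision tree is expressed as a two-layer network with hard-threshold units, one neuron per split node and one per leaf.

```python
# Minimal sketch of mapping a decision tree to a two-layer network.
# Layer 1: one neuron per split node, firing when the test x[f] > t passes.
# Layer 2: one neuron per leaf, firing when the split decision on its path agrees.

def step(z):
    """Hard-threshold activation."""
    return 1.0 if z > 0 else 0.0

def tree_to_network(feature, threshold, left_label, right_label):
    """Return a predict(x) function for a depth-1 tree (a stump)."""
    def predict(x):
        # Layer 1: split neuron encodes the test x[feature] > threshold.
        h = step(x[feature] - threshold)
        # Layer 2: leaf neurons; exactly one fires, selecting that leaf's label.
        left = step(0.5 - h)    # fires when the test failed
        right = step(h - 0.5)   # fires when the test passed
        return left * left_label + right * right_label
    return predict

# Usage: a stump that classifies x[0] > 2.0 as class 1, else class 0.
net = tree_to_network(feature=0, threshold=2.0, left_label=0, right_label=1)
print(net([3.5]))  # 1
print(net([1.0]))  # 0
```

A full forest would add one such subnetwork per tree and average the leaf outputs; replacing the hard thresholds with smooth activations is what makes the mapped network tunable by backpropagation.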
Semi-supervised Tuning from Temporal Coherence
Recent works demonstrated the usefulness of temporal coherence to regularize
supervised training or to learn invariant features with deep architectures. In
particular, enforcing smooth output changes while presenting temporally close frames from video sequences proved to be an effective strategy. In this paper
we prove the efficacy of temporal coherence for semi-supervised incremental
tuning. We show that a deep architecture, just mildly trained in a supervised
manner, can progressively improve its classification accuracy, if exposed to
video sequences of unlabeled data. The extent to which semi-supervised tuning can, in some cases, improve classification accuracy (approaching that of supervised training) is somewhat surprising. A number of control experiments pointed out the fundamental role of temporal coherence.
Comment: Under review as a conference paper at ICLR 201
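As a rough illustration of the temporal-coherence idea (our sketch, not the paper's code), the smoothness term on unlabeled video can be written as a mean squared difference between the model's outputs on consecutive frames:

```python
# Toy temporal-coherence penalty: on unlabeled video, penalize changes in the
# model's output between consecutive frames, encouraging temporally smooth
# predictions (illustration only; the paper's exact loss may differ).

def temporal_coherence_loss(outputs):
    """Mean squared difference between outputs of consecutive frames.

    outputs: list of per-frame output vectors (e.g. class scores).
    """
    total = 0.0
    for prev, cur in zip(outputs, outputs[1:]):
        total += sum((a - b) ** 2 for a, b in zip(prev, cur))
    return total / (len(outputs) - 1)

# Usage: identical consecutive outputs incur zero penalty ...
print(temporal_coherence_loss([[0.9, 0.1], [0.9, 0.1]]))  # 0.0
# ... while an abrupt change between frames is penalized.
print(temporal_coherence_loss([[1.0, 0.0], [0.0, 1.0]]))  # 2.0
```

During semi-supervised tuning such a term is minimized on unlabeled sequences, alongside (or in place of) a supervised loss on the few labeled examples.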
Domain Mapping and Deep Learning from Multiple MRI Clinical Datasets for Prediction of Molecular Subtypes in Low Grade Gliomas
Brain tumors such as low grade gliomas (LGG) are classified molecularly, which requires the surgical collection of tissue samples. The pre-surgical or non-operative identification of LGG molecular type could improve patient counseling and treatment decisions. However, radiographic approaches to LGG molecular classification are currently lacking, as clinicians are unable to reliably predict LGG molecular type using magnetic resonance imaging (MRI) studies. Machine learning approaches may improve the prediction of LGG molecular classification through MRI; however, the development of these techniques requires large annotated datasets. Merging clinical data from different hospitals is needed to increase case numbers, but the use of different scanners and settings can affect the results, and simply combining the data into a large dataset often has a significant negative impact on performance. This calls for efficient domain adaptation methods. Despite some previous studies on domain adaptation, mapping MR images from different datasets to a common domain without affecting the subtle molecular-biomarker information has not been reported yet. In this paper, we propose an effective domain adaptation method based on the Cycle Generative Adversarial Network (CycleGAN). The dataset is further enlarged by augmenting more MRIs using another GAN approach. Further, to tackle the issue of brain tumor segmentation, which requires time and anatomical expertise to draw an exact boundary around the tumor, we use a tight bounding box as a strategy. Finally, an efficient deep feature learning method, a multi-stream convolutional autoencoder (CAE) with feature fusion, is proposed for the prediction of molecular subtypes (1p/19q codeletion and IDH mutation). The experiments were conducted on a total of 161 patients with FLAIR and contrast-enhanced T1-weighted (T1ce) MRIs from two institutions in the USA and France. The proposed scheme achieves a test accuracy of 74.81% on 1p/19q codeletion and 81.19% on IDH mutation, a marked improvement over the results obtained without domain mapping. The approach also performs comparably to several state-of-the-art methods.
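The cycle-consistency idea behind CycleGAN, which this abstract relies on for domain mapping, can be sketched in miniature (our toy illustration with scalar "images" and hypothetical linear generators, not the paper's models): a mapping G takes data from domain A to B, F maps back, and the cycle loss |F(G(x)) - x| keeps the round trip close to the original.

```python
# Toy sketch of CycleGAN's cycle-consistency loss with scalar "images".
# G maps domain A -> B, F maps B -> A; the loss measures how well the
# A -> B -> A round trip reconstructs the input.

def cycle_consistency_loss(G, F, xs):
    """Mean absolute reconstruction error of the A -> B -> A round trip."""
    return sum(abs(F(G(x)) - x) for x in xs) / len(xs)

# Hypothetical linear "generators": G scales and shifts intensities; F_good
# undoes it exactly, so the cycle loss is zero, while an imperfect inverse
# leaves a residual the training would push down.
G = lambda x: 2.0 * x + 1.0
F_good = lambda y: (y - 1.0) / 2.0
F_bad = lambda y: y / 2.0

print(cycle_consistency_loss(G, F_good, [0.0, 1.0, 2.0]))  # 0.0
print(cycle_consistency_loss(G, F_bad, [0.0, 1.0, 2.0]))   # 0.5
```

In the full method this term, combined with adversarial losses in both domains, is what lets images be mapped to a common domain without paired examples.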
Application of Deep Learning to Digital Signal Processing
This work reviews the concept of deep learning, presenting the different architectures that fall under the concept and how they relate to artificial neural networks (ANNs), both in their architecture and in their type of training. In doing so, it builds a taxonomy that explains and classifies some of the networks that make up deep learning.
Based on this review of architectures and applications, an application was developed in MATLAB using two of the most common deep learning architectures, in which the distinctive behavior of these architectures can be observed. This corroborates the applicability of deep learning to digital image processing, obtaining good classification results. (Undergraduate thesis, Mechatronics Engineering)
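The convolutional architectures this review classifies all build on one core operation; as a minimal, library-free sketch (ours, not the thesis's MATLAB code), a valid-mode 2D convolution over an image patch looks like this:

```python
# Minimal pure-Python sketch of 2D convolution (really cross-correlation, as
# in most deep learning libraries), the core operation of the convolutional
# architectures discussed in the review. Illustration only.

def conv2d(image, kernel):
    """Valid-mode 2D cross-correlation of a 2D list with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    out = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            row.append(sum(image[i + di][j + dj] * kernel[di][dj]
                           for di in range(kh) for dj in range(kw)))
        out.append(row)
    return out

# Usage: a 3x3 image filtered with a 2x2 averaging kernel.
image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[0.25, 0.25],
          [0.25, 0.25]]
print(conv2d(image, kernel))  # [[3.0, 4.0], [6.0, 7.0]]
```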
Multifaceted Analysis of Fine-Tuning in Deep Model for Visual Recognition
In recent years, convolutional neural networks (CNNs) have achieved
impressive performance for various visual recognition scenarios. CNNs trained
on large labeled datasets can not only obtain significant performance on most
challenging benchmarks but also provide powerful representations, which can be
used for a wide range of other tasks. However, the requirement of massive
amounts of data to train deep neural networks is a major drawback of these
models, as the data available is usually limited or imbalanced. Fine-tuning
(FT) is an effective way to transfer knowledge learned in a source dataset to a
target task. In this paper, we introduce and systematically investigate several
factors that influence the performance of fine-tuning for visual recognition.
These factors include parameters for the retraining procedure (e.g., the
initial learning rate of fine-tuning), the distribution of the source and
target data (e.g., the number of categories in the source dataset, the distance
between the source and target datasets) and so on. We quantitatively and
qualitatively analyze these factors, evaluate their influence, and present many
empirical observations. The results reveal insights into how fine-tuning changes CNN parameters and provide useful, evidence-backed intuitions about how to implement fine-tuning for computer vision tasks.
Comment: Accepted by ACM Transactions on Data Science
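One of the retraining factors the abstract mentions, the learning rate used during fine-tuning, is often handled by giving the pretrained layers a smaller step size than the freshly initialized head. A toy sketch (our names and values, not the paper's code) of such a per-group SGD update:

```python
# Toy sketch of per-parameter-group learning rates in fine-tuning: pretrained
# "backbone" weights move more slowly than the newly added "head" weights.
# Single scalar parameters and one SGD step, for illustration only.

def sgd_step(params, grads, lrs):
    """One SGD update with a per-parameter-group learning rate."""
    return {name: params[name] - lrs[name] * grads[name] for name in params}

params = {"backbone": 1.0, "head": 1.0}
grads = {"backbone": 0.5, "head": 0.5}
lrs = {"backbone": 0.125, "head": 0.5}  # backbone updated 4x more slowly

params = sgd_step(params, grads, lrs)
print(params)  # {'backbone': 0.9375, 'head': 0.75}
```

The same pattern appears in mainstream frameworks as optimizer "parameter groups", each carrying its own learning rate.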