6,255 research outputs found
Self-paced Convolutional Neural Network for Computer Aided Detection in Medical Imaging Analysis
Tissue characterization has long been an important component of Computer
Aided Diagnosis (CAD) systems for automatic lesion detection and further
clinical planning. Motivated by the superior performance of deep learning
methods on various computer vision problems, there has been increasing work
applying deep learning to medical image analysis. However, the development of a
robust and reliable deep learning model for computer-aided diagnosis is still
highly challenging due to the combination of the high heterogeneity in the
medical images and the relative lack of training samples. Specifically,
annotation and labeling of the medical images is much more expensive and
time-consuming than other applications and often involves manual labor from
multiple domain experts. In this work, we propose a multi-stage, self-paced
learning framework utilizing a convolutional neural network (CNN) to classify
Computed Tomography (CT) image patches. The key contribution of this approach
is that we augment the size of training samples by refining the unlabeled
instances with a self-paced learning CNN. By implementing the framework on high
performance computing servers including the NVIDIA DGX1 machine, we obtained
the experimental result, showing that the self-pace boosted network
consistently outperformed the original network even with very scarce manual
labels. The performance gain indicates that applications with limited training
samples such as medical image analysis can benefit from using the proposed
framework.Comment: accepted by 8th International Workshop on Machine Learning in Medical
Imaging (MLMI 2017
Hierarchical Attention Network for Visually-aware Food Recommendation
Food recommender systems play an important role in assisting users to
identify the desired food to eat. Deciding what food to eat is a complex and
multi-faceted process, which is influenced by many factors such as the
ingredients, appearance of the recipe, the user's personal preference on food,
and various contexts like what had been eaten in the past meals. In this work,
we formulate the food recommendation problem as predicting user preference on
recipes based on three key factors that determine a user's choice on food,
namely, 1) the user's (and other users') history; 2) the ingredients of a
recipe; and 3) the descriptive image of a recipe. To address this challenging
problem, we develop a dedicated neural network based solution Hierarchical
Attention based Food Recommendation (HAFR) which is capable of: 1) capturing
the collaborative filtering effect like what similar users tend to eat; 2)
inferring a user's preference at the ingredient level; and 3) learning user
preference from the recipe's visual images. To evaluate our proposed method, we
construct a large-scale dataset consisting of millions of ratings from
AllRecipes.com. Extensive experiments show that our method outperforms several
competing recommender solutions like Factorization Machine and Visual Bayesian
Personalized Ranking with an average improvement of 12%, offering promising
results in predicting user preference for food. Codes and dataset will be
released upon acceptance
PYRO-NN: Python Reconstruction Operators in Neural Networks
Purpose: Recently, several attempts were conducted to transfer deep learning
to medical image reconstruction. An increasingly number of publications follow
the concept of embedding the CT reconstruction as a known operator into a
neural network. However, most of the approaches presented lack an efficient CT
reconstruction framework fully integrated into deep learning environments. As a
result, many approaches are forced to use workarounds for mathematically
unambiguously solvable problems. Methods: PYRO-NN is a generalized framework to
embed known operators into the prevalent deep learning framework Tensorflow.
The current status includes state-of-the-art parallel-, fan- and cone-beam
projectors and back-projectors accelerated with CUDA provided as Tensorflow
layers. On top, the framework provides a high level Python API to conduct FBP
and iterative reconstruction experiments with data from real CT systems.
Results: The framework provides all necessary algorithms and tools to design
end-to-end neural network pipelines with integrated CT reconstruction
algorithms. The high level Python API allows a simple use of the layers as
known from Tensorflow. To demonstrate the capabilities of the layers, the
framework comes with three baseline experiments showing a cone-beam short scan
FDK reconstruction, a CT reconstruction filter learning setup, and a TV
regularized iterative reconstruction. All algorithms and tools are referenced
to a scientific publication and are compared to existing non deep learning
reconstruction frameworks. The framework is available as open-source software
at \url{https://github.com/csyben/PYRO-NN}. Conclusions: PYRO-NN comes with the
prevalent deep learning framework Tensorflow and allows to setup end-to-end
trainable neural networks in the medical image reconstruction context. We
believe that the framework will be a step towards reproducible researchComment: V1: Submitted to Medical Physics, 11 pages, 7 figure
Two-Stage Transfer Learning for Heterogeneous Robot Detection and 3D Joint Position Estimation in a 2D Camera Image using CNN
Collaborative robots are becoming more common on factory floors as well as
regular environments, however, their safety still is not a fully solved issue.
Collision detection does not always perform as expected and collision avoidance
is still an active research area. Collision avoidance works well for fixed
robot-camera setups, however, if they are shifted around, Eye-to-Hand
calibration becomes invalid making it difficult to accurately run many of the
existing collision avoidance algorithms. We approach the problem by presenting
a stand-alone system capable of detecting the robot and estimating its
position, including individual joints, by using a simple 2D colour image as an
input, where no Eye-to-Hand calibration is needed. As an extension of previous
work, a two-stage transfer learning approach is used to re-train a
multi-objective convolutional neural network (CNN) to allow it to be used with
heterogeneous robot arms. Our method is capable of detecting the robot in
real-time and new robot types can be added by having significantly smaller
training datasets compared to the requirements of a fully trained network. We
present data collection approach, the structure of the multi-objective CNN, the
two-stage transfer learning training and test results by using real robots from
Universal Robots, Kuka, and Franka Emika. Eventually, we analyse possible
application areas of our method together with the possible improvements.Comment: 6+n pages, ICRA 2019 submissio
- …