4,687 research outputs found
Deep Lesion Graphs in the Wild: Relationship Learning and Organization of Significant Radiology Image Findings in a Diverse Large-scale Lesion Database
Radiologists in their daily work routinely find and annotate significant abnormalities on a large number of radiology images. Such abnormalities, or lesions, have been collected over years and stored in hospitals' picture archiving and communication systems. However, they are basically unsorted and lack semantic annotations such as type and location. In this paper, we aim to organize and explore them by learning a deep feature representation for each lesion. A large-scale and comprehensive dataset, DeepLesion, is introduced for this task. DeepLesion contains bounding boxes and size measurements of over 32K lesions. To model their similarity relationships, we leverage multiple sources of supervision, including lesion types, self-supervised location coordinates, and sizes. These require little manual annotation effort yet describe useful attributes of the lesions. A triplet network is then used to learn lesion embeddings, with a sequential sampling strategy to capture their hierarchical similarity structure. Experiments show promising qualitative and quantitative results on lesion retrieval, clustering, and classification. The learned embeddings can be further employed to build a lesion graph for various clinically useful applications. We propose algorithms for intra-patient lesion matching and missing annotation mining, and experimental results validate their effectiveness.
Comment: Accepted by CVPR 2018. DeepLesion URL added.
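As a rough illustration of the triplet objective such an embedding network optimizes (the sequential sampling strategy and feature extractor are the paper's own; the margin, dimensions, and random tensors below are illustrative assumptions), a minimal PyTorch sketch:

```python
import torch
import torch.nn.functional as F

def triplet_loss(anchor, positive, negative, margin=0.5):
    # Hinge-form triplet loss: require d(anchor, negative) to exceed
    # d(anchor, positive) by at least `margin`.
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()

# Toy usage with random 256-d "lesion" embeddings (batch of 8).
a, p, n = (torch.randn(8, 256) for _ in range(3))
print(triplet_loss(a, p, n))
```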
Context Embedding Networks
Low-dimensional embeddings that capture the main variations of interest in collections of data are important for many applications. One way to construct these embeddings is to acquire estimates of similarity from the crowd. However, similarity is a multi-dimensional concept that varies from individual to individual. Existing models for learning embeddings from the crowd typically make simplifying assumptions, such as that all individuals estimate similarity using the same criteria, that the list of criteria is known in advance, or that crowd workers are not influenced by the data they see. To overcome these limitations we introduce Context Embedding Networks (CENs). In addition to learning interpretable embeddings from images, CENs also model worker biases for different attributes along with the visual context, i.e., the visual attributes highlighted by a set of images. Experiments on two noisy crowd-annotated datasets show that modeling both worker bias and visual context results in more interpretable embeddings than existing approaches.
Comment: CVPR 2018 spotlight.
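One plausible way to picture the worker-bias component (the actual CEN formulation may differ; the gating scheme, dimensions, and names below are assumptions for illustration, not the paper's model) is a shared embedding modulated by per-worker attribute weights:

```python
import torch
import torch.nn as nn

class WorkerBiasedEmbedding(nn.Module):
    # A shared image embedding gated by per-worker attribute weights:
    # a simplified stand-in for the worker-bias idea, not the CEN model.
    def __init__(self, n_workers, in_dim=512, dim=16):
        super().__init__()
        self.embed = nn.Linear(in_dim, dim)           # shared embedding
        self.worker_w = nn.Embedding(n_workers, dim)  # per-worker weights

    def forward(self, feats, worker_id):
        z = self.embed(feats)                           # (B, dim)
        gate = torch.sigmoid(self.worker_w(worker_id))  # (B, dim) in (0, 1)
        return z * gate  # each worker emphasizes different attributes
```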
Survey on Evaluation Methods for Dialogue Systems
In this paper we survey the methods and concepts developed for the evaluation of dialogue systems. Evaluation is a crucial part of the development process. Often, dialogue systems are evaluated by means of human evaluations and questionnaires; however, this tends to be very cost- and time-intensive. Thus, much work has been put into finding methods that reduce the involvement of human labour. In this survey, we present the main concepts and methods. To this end, we differentiate between the various classes of dialogue systems (task-oriented dialogue systems, conversational dialogue systems, and question-answering dialogue systems). We cover each class by introducing the main technologies developed for its dialogue systems and then presenting the evaluation methods for that class.
Boundary Attention Mapping (BAM): Fine-grained saliency maps for segmentation of Burn Injuries
Burn injuries can result from mechanisms such as thermal, chemical, and electrical insults. A prompt and accurate assessment of burns is essential for deciding definitive clinical treatments. Currently, the primary approach to burn assessment, via visual and tactile observation, is approximately 60%-80% accurate. The gold standard is biopsy, and a close second is non-invasive methods such as Laser Doppler Imaging (LDI), which has up to 97% accuracy in predicting burn severity and the required healing time. In this paper, we introduce a machine learning pipeline for assessing burn severities and segmenting the regions of skin affected by burns. Segmenting 2D colour images of burns allows injured versus non-injured skin to be delineated, clearly marking the extent and boundaries of the localized burn/region-of-interest, even during remote monitoring of a burn patient. We trained a convolutional neural network (CNN) to classify four severities of burns. We then built a saliency mapping method, Boundary Attention Mapping (BAM), that utilises this trained CNN to accurately localize and segment the burn regions in skin burn images. We demonstrated the effectiveness of our proposed pipeline through extensive experiments and evaluations using two datasets: 1) a larger skin burn image dataset consisting of 1,684 skin burn images spanning four burn severities, and 2) an LDI dataset consisting of 184 skin burn images with their associated LDI scans. The CNN trained on the first dataset achieved an average F1-score of 78% and a micro/macro-average ROC of 85% in classifying the four burn severities. Moreover, a comparison between the BAM results and the LDI results for measuring injury boundaries showed that the segmentations generated by our method achieved 91.60% accuracy, 78.17% sensitivity, and 93.37% specificity.
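BAM itself is the paper's contribution; as a generic baseline for the kind of CNN-derived saliency it builds on, here is a minimal gradient-saliency sketch in PyTorch (the model and target class are placeholders, and this is not the BAM algorithm):

```python
import torch

def gradient_saliency(model, image, target_class):
    # Per-pixel saliency: gradient magnitude of the target-class score
    # w.r.t. the input image. A simple baseline, not the BAM method.
    # `image` is assumed to be a (C, H, W) tensor.
    model.eval()
    image = image.clone().requires_grad_(True)
    score = model(image.unsqueeze(0))[0, target_class]
    score.backward()
    return image.grad.abs().max(dim=0).values  # (H, W) saliency map
```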
MSKdeX: Musculoskeletal (MSK) decomposition from an X-ray image for fine-grained estimation of lean muscle mass and muscle volume
Musculoskeletal diseases such as sarcopenia and osteoporosis are major obstacles to health during aging. Although dual-energy X-ray absorptiometry (DXA) and computed tomography (CT) can be used to evaluate musculoskeletal conditions, frequent monitoring is difficult due to cost and accessibility (as well as high radiation exposure in the case of CT). We propose a method (named MSKdeX) to estimate fine-grained muscle properties from a plain X-ray image, a low-cost, low-radiation, and highly accessible imaging modality, through musculoskeletal decomposition leveraging fine-grained segmentation in CT. We train a multi-channel quantitative image translation model to decompose an X-ray image into projections of CTs of individual muscles, from which lean muscle mass and muscle volume are inferred. We propose the object-wise intensity-sum loss, a simple yet surprisingly effective metric that is invariant to muscle deformation and projection direction, utilizing information in CT and X-ray images collected from the same patient. While our method is essentially an unpaired image-to-image translation, we also exploit the rigidity of bone, which provides paired data through 2D-3D rigid registration, adding strong pixel-wise supervision to the unpaired training. In an evaluation using a 539-patient dataset, the proposed method significantly outperformed conventional methods: the average Pearson correlation coefficient between the predicted and CT-derived ground-truth metrics increased from 0.460 to 0.863. We believe our method opens up a new approach to musculoskeletal diagnosis and has the potential to be extended to broader applications in multi-channel quantitative image translation tasks. Our source code will be released soon.
Comment: MICCAI 2023 early acceptance (12 pages and 6 figures).
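A minimal sketch of the object-wise intensity-sum idea as described in the abstract, assuming the model outputs one projection image per muscle channel and that per-muscle intensity sums can be derived from the patient's CT (the shapes and names are illustrative, not the authors' code):

```python
import torch

def intensity_sum_loss(pred_projections, ct_muscle_sums):
    # Object-wise intensity-sum loss, per the abstract: the total
    # intensity of each predicted per-muscle projection should match
    # the per-muscle sum derived from the patient's CT, a quantity
    # invariant to muscle deformation and projection direction.
    # Assumed shapes: pred_projections (B, M, H, W), ct_muscle_sums (B, M).
    pred_sums = pred_projections.sum(dim=(2, 3))
    return (pred_sums - ct_muscle_sums).abs().mean()
```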
Ensemble of Loss Functions to Improve Generalizability of Deep Metric Learning methods
Deep Metric Learning (DML) learns a non-linear semantic embedding from input data that brings similar pairs together while keeping dissimilar data apart. To this end, many different methods have been proposed over the last decade, with promising results in various applications. The success of a DML algorithm greatly depends on its loss function. However, no loss function is perfect, and each addresses only some aspects of an optimal similarity embedding. Moreover, the generalizability of DML to unseen categories at test time is an important matter that is not considered by existing loss functions. To address these challenges, we propose novel approaches to combining different losses built on top of a shared deep feature extractor. The proposed ensemble of losses enforces the deep model to extract features that are consistent with all losses. Since the selected losses are diverse and each emphasizes different aspects of an optimal semantic embedding, our combining methods yield a considerable improvement over any individual loss and generalize well to unseen categories. There is no limitation on the choice of loss functions, and our methods can work with any set of existing ones. Moreover, they can optimize each loss function, as well as its weight, in an end-to-end paradigm with no need to adjust any hyper-parameters. We evaluate our methods on popular machine vision datasets in conventional Zero-Shot-Learning (ZSL) settings. The results are very encouraging and show that our methods outperform all baseline losses by a large margin on all datasets.
Comment: 27 pages, 12 figures.
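One common way to combine several losses with learnable weights end-to-end is log-variance (uncertainty) weighting; the paper's exact combination rule may differ, so the sketch below is an assumption-labeled illustration rather than the authors' method:

```python
import torch
import torch.nn as nn

class LossEnsemble(nn.Module):
    # Combines several metric-learning losses over a shared embedding
    # using learnable log-variance weights (one standard scheme; the
    # paper's exact combination rule may differ).
    def __init__(self, loss_fns):
        super().__init__()
        self.loss_fns = loss_fns  # list of callables: (emb, labels) -> scalar
        self.log_vars = nn.Parameter(torch.zeros(len(loss_fns)))

    def forward(self, embeddings, labels):
        total = embeddings.new_zeros(())
        for lv, loss_fn in zip(self.log_vars, self.loss_fns):
            # exp(-lv) down-weights a loss; the +lv term keeps lv bounded.
            total = total + torch.exp(-lv) * loss_fn(embeddings, labels) + lv
        return total
```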
Hypergraph Convolutional Networks for Fine-grained ICU Patient Similarity Analysis and Risk Prediction
The Intensive Care Unit (ICU) is one of the most important parts of a hospital; it admits critically ill patients and provides continuous monitoring and treatment. Various patient outcome prediction methods have been developed to assist healthcare professionals in clinical decision-making. Existing methods focus on measuring the similarity between patients using deep neural networks to capture the hidden feature structures. However, higher-order relationships are ignored, such as those among patient characteristics (e.g., diagnosis codes) and their causal effects on downstream clinical predictions.
In this paper, we propose a novel Hypergraph Convolutional Network that represents non-pairwise relationships among diagnosis codes in a hypergraph to capture the hidden feature structures, so that fine-grained patient similarity can be calculated for personalized mortality risk prediction. Evaluation using the publicly available eICU Collaborative Research Database indicates that our method achieves superior performance over state-of-the-art models on mortality risk prediction. Moreover, the results of several case studies demonstrate the effectiveness of constructing graph networks in providing good transparency and robustness in decision-making.
Comment: 7 pages, 2 figures, submitted to IEEE BIBM 202
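For reference, the standard hypergraph convolution propagation rule (the HGNN form that networks like this typically build on; the paper's exact variant may differ) can be sketched as follows:

```python
import torch

def hypergraph_conv(X, H, Theta):
    # One hypergraph convolution step in the common HGNN form:
    #   X' = relu(Dv^{-1/2} H De^{-1} H^T Dv^{-1/2} X Theta)
    # X: (N, F) node features, H: (N, E) incidence matrix,
    # Theta: (F, F') learnable weights; hyperedge weights taken as 1.
    Dv = H.sum(dim=1).clamp(min=1)   # node degrees
    De = H.sum(dim=0).clamp(min=1)   # hyperedge degrees
    A = (H / De) @ H.t()             # H De^{-1} H^T
    A = A * Dv.pow(-0.5).unsqueeze(0) * Dv.pow(-0.5).unsqueeze(1)
    return torch.relu(A @ X @ Theta)
```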
SdCT-GAN: Reconstructing CT from Biplanar X-Rays with Self-driven Generative Adversarial Networks
Computed Tomography (CT) is a medical imaging modality that can generate more informative 3D images than 2D X-rays. However, this advantage comes at the expense of more radiation exposure, higher costs, and longer acquisition time. Hence, the reconstruction of 3D CT images from a limited number of 2D X-rays has gained significant importance as an economical alternative. Nevertheless, existing methods primarily prioritize minimizing pixel/voxel-level intensity discrepancies, often neglecting the preservation of textural details in the synthesized images. This oversight directly impacts the quality of the reconstructed images and thus affects clinical diagnosis. To address these deficits, this paper presents a new self-driven generative adversarial network model (SdCT-GAN), which is motivated to pay more attention to image details by introducing a novel auto-encoder structure in the discriminator. In addition, a Sobel Gradient Guider (SGG) is applied throughout the model, through which edge information from the input 2D X-ray image can be integrated. Moreover, the LPIPS (Learned Perceptual Image Patch Similarity) evaluation metric is adopted, which can quantitatively evaluate the fine contours and textures of reconstructed images better than existing metrics. Finally, the qualitative and quantitative results of the empirical studies justify the power of the proposed model compared to mainstream state-of-the-art baselines.
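LPIPS has a reference implementation on PyPI (`lpips`); a minimal usage sketch, assuming inputs are scaled to [-1, 1] as the package expects (the random tensors stand in for a reconstructed slice and its ground truth):

```python
import torch
import lpips  # reference implementation: pip install lpips

# LPIPS compares deep features of two images; lower scores mean the
# images are perceptually more similar.
loss_fn = lpips.LPIPS(net='alex')           # AlexNet backbone
img0 = torch.rand(1, 3, 256, 256) * 2 - 1   # inputs scaled to [-1, 1]
img1 = torch.rand(1, 3, 256, 256) * 2 - 1
print(loss_fn(img0, img1).item())
```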