Weakly-Supervised Semantic Segmentation of Ships Using Thermal Imagery
The United States coastline spans 95,471 miles, a distance that cannot be
effectively patrolled or secured by manual human effort alone. Unmanned Aerial
Vehicles (UAVs) equipped with infrared cameras and deep-learning based
algorithms represent a more efficient alternative for identifying and
segmenting objects of interest, namely ships. However, standard approaches to
training these algorithms require large-scale datasets of densely labeled
infrared maritime images. Such datasets are not publicly available and manually
annotating every pixel in a large-scale dataset would incur an extreme labor
cost. In this work we demonstrate that, in the context of segmenting ships in
infrared imagery, weakly-supervising an algorithm with sparsely labeled data
can drastically reduce data labeling costs with minimal impact on system
performance. We apply weakly-supervised learning to an unlabeled dataset of
7055 infrared images sourced from the Naval Air Warfare Center Aircraft
Division (NAWCAD). We find that by sparsely labeling only 32 points per image,
weakly-supervised segmentation models can still effectively detect and segment
ships, with a Jaccard score of up to 0.756.
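The two quantities at the heart of this abstract, a loss computed only at sparsely labeled points and the Jaccard (intersection-over-union) score, can be sketched as follows. This is a minimal illustration, assuming a per-pixel sigmoid ship-probability map; the function names and the dict-of-points label format are assumptions for illustration, not the paper's actual code.

```python
import numpy as np

def partial_cross_entropy(logits, point_labels):
    """Binary cross-entropy evaluated only at sparsely labeled points
    (e.g. 32 per image); every unlabeled pixel is simply ignored."""
    probs = 1.0 / (1.0 + np.exp(-logits))          # sigmoid ship-probability map
    losses = []
    for (r, c), y in point_labels.items():         # y is 0 (background) or 1 (ship)
        p = np.clip(probs[r, c], 1e-7, 1.0 - 1e-7)
        losses.append(-(y * np.log(p) + (1 - y) * np.log(1.0 - p)))
    return float(np.mean(losses))

def jaccard(pred_mask, true_mask):
    """Intersection-over-union between two boolean segmentation masks."""
    inter = np.logical_and(pred_mask, true_mask).sum()
    union = np.logical_or(pred_mask, true_mask).sum()
    return inter / union if union else 1.0
```

Because the loss touches only the labeled coordinates, annotation cost scales with the number of clicked points rather than the number of pixels, which is the trade-off the abstract quantifies.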
Background Adaptive Faster R-CNN for Semi-Supervised Convolutional Object Detection of Threats in X-Ray Images
Recently, progress has been made in the supervised training of Convolutional
Object Detectors (e.g. Faster R-CNN) for threat recognition in carry-on luggage
using X-ray images. This is part of the Transportation Security
Administration's (TSA's) mission to protect air travelers in the United States.
While more training data with threats may reliably improve performance for this
class of deep algorithm, it is expensive to stage in realistic contexts. By
contrast, data from the real world can be collected quickly with minimal cost.
In this paper, we present a semi-supervised approach for threat recognition
which we call Background Adaptive Faster R-CNN. This approach is a training
method for two-stage object detectors which uses Domain Adaptation methods from
the field of deep learning. The data sources described earlier constitute two
"domains": a hand-collected data domain of images with threats, and a
real-world domain of images assumed without threats. Two domain discriminators,
one for discriminating object proposals and one for image features, are
adversarially trained to prevent the encoding of domain-specific information.
Without this penalty, a Convolutional Neural Network (CNN) can learn to
identify domains
based on superficial characteristics, and minimize a supervised loss function
without improving its ability to recognize objects. For the hand-collected
data, only object proposals and image features from backgrounds are used. The
losses for these domain-adaptive discriminators are added to the Faster R-CNN
losses of images from both domains. This can reduce threat detection false
alarm rates by matching the statistics of extracted features from
hand-collected backgrounds to real-world data. Performance improvements are
demonstrated on two independently-collected datasets of labeled threats.
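The adversarial objective described above is commonly implemented with a gradient-reversal update: the discriminator descends on the domain-classification loss while the feature extractor receives the negated gradient, so features from the two domains become indistinguishable. Below is a minimal NumPy sketch of one such update on a linear discriminator; the function name, the linear discriminator, and the λ weighting are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adversarial_step(feats, domains, w, lam=0.1, lr=0.05):
    """One gradient-reversal update on a linear domain discriminator.

    feats:   (N, D) feature vectors pooled from both domains
    domains: (N,) labels, 0 = hand-collected, 1 = real-world
    w:       (D,) discriminator weights
    Returns the updated (feats, w).
    """
    logits = feats @ w
    p = sigmoid(logits)
    err = p - domains                          # d(BCE)/d(logit) for each sample
    grad_w = feats.T @ err / len(domains)      # discriminator gradient
    grad_f = np.outer(err, w) / len(domains)   # gradient flowing into features
    w = w - lr * grad_w                        # discriminator minimises domain loss
    feats = feats + lr * lam * grad_f          # reversed sign: features maximise it
    return feats, w
```

In the full system this reversed-gradient term is added, per the abstract, to the standard Faster R-CNN losses for images from both domains, once for proposal-level features and once for image-level features.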
Tackling the Incomplete Annotation Issue in Universal Lesion Detection Task By Exploratory Training
Universal lesion detection (ULD) has great value for clinical practice, as it
aims to detect various types of lesions in multiple organs on medical images.
Deep
learning methods have shown promising results but demand large volumes of
annotated data for training. However, annotating medical images is costly and
requires specialized knowledge. The diverse forms and contrasts of objects in
medical images make full annotation even more challenging, resulting in
incomplete annotations. Directly training ULD detectors on such datasets can
yield suboptimal results. Pseudo-label-based methods examine the training data
and mine unlabelled objects for retraining, an approach that has proven
effective for tackling this issue. Presently, top-performing methods rely on a
dynamic
label-mining mechanism operating at the mini-batch level. However, the model's
performance varies across iterations, leading to inconsistent quality in the
mined labels and limiting the achievable performance gains. Inspired
by the observation that deep models learn concepts with increasing complexity,
we introduce an exploratory training scheme to assess the reliability of
mined lesions over time. Specifically, we use a teacher-student detection
model as the basis, where the teacher's predictions are combined with incomplete
annotations to train the student. Additionally, we design a prediction bank to
record high-confidence predictions. Each sample is trained several times,
allowing us to get a sequence of records for each sample. If a prediction
consistently appears in the record sequence, it is likely to be a true object,
otherwise it is likely noise. This serves as a crucial criterion for selecting
reliable mined lesions for retraining. Our experimental results substantiate
that the proposed framework surpasses state-of-the-art methods on two medical
image datasets, demonstrating its superior performance.
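The prediction-bank idea above, keeping a per-sample record of high-confidence predictions and promoting only those that recur across training rounds, can be sketched as follows. The class name, thresholds, and the tuple representation of boxes are assumptions for illustration; the paper's actual implementation may differ.

```python
from collections import defaultdict

class PredictionBank:
    """Records high-confidence teacher predictions for each sample across
    training rounds; a prediction that recurs consistently is promoted to a
    reliable mined label, while one-off detections are discarded as noise."""

    def __init__(self, conf_thresh=0.9, min_hits=3):
        self.conf_thresh = conf_thresh          # score cut-off for recording
        self.min_hits = min_hits                # rounds a box must recur in
        self.records = defaultdict(list)        # sample_id -> per-round box sets

    def update(self, sample_id, predictions):
        """predictions: iterable of (box, score); box is any hashable key."""
        keep = {box for box, score in predictions if score >= self.conf_thresh}
        self.records[sample_id].append(keep)

    def reliable(self, sample_id):
        """Boxes seen in at least min_hits recorded rounds for this sample."""
        counts = defaultdict(int)
        for round_boxes in self.records[sample_id]:
            for box in round_boxes:
                counts[box] += 1
        return {box for box, c in counts.items() if c >= self.min_hits}
```

The set returned by `reliable` is what would be merged with the incomplete ground-truth annotations when retraining the student, implementing the "consistently appears in the record sequence" criterion from the abstract.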