5 research outputs found

    Weakly-Supervised Semantic Segmentation of Ships Using Thermal Imagery

    Full text link
    The United States coastline spans 95,471 miles; a distance that cannot be effectively patrolled or secured by manual human effort alone. Unmanned Aerial Vehicles (UAVs) equipped with infrared cameras and deep-learning based algorithms represent a more efficient alternative for identifying and segmenting objects of interest - namely, ships. However, standard approaches to training these algorithms require large-scale datasets of densely labeled infrared maritime images. Such datasets are not publicly available and manually annotating every pixel in a large-scale dataset would have an extreme labor cost. In this work we demonstrate that, in the context of segmenting ships in infrared imagery, weakly-supervising an algorithm with sparsely labeled data can drastically reduce data labeling costs with minimal impact on system performance. We apply weakly-supervised learning to an unlabeled dataset of 7055 infrared images sourced from the Naval Air Warfare Center Aircraft Division (NAWCAD). We find that by sparsely labeling only 32 points per image, weakly-supervised segmentation models can still effectively detect and segment ships, with a Jaccard score of up to 0.756

    Background Adaptive Faster R-CNN for Semi-Supervised Convolutional Object Detection of Threats in X-Ray Images

    Full text link
    Recently, progress has been made in the supervised training of Convolutional Object Detectors (e.g. Faster R-CNN) for threat recognition in carry-on luggage using X-ray images. This is part of the Transportation Security Administration's (TSA's) mission to protect air travelers in the United States. While more training data with threats may reliably improve performance for this class of deep algorithm, it is expensive to stage in realistic contexts. By contrast, data from the real world can be collected quickly with minimal cost. In this paper, we present a semi-supervised approach for threat recognition which we call Background Adaptive Faster R-CNN. This approach is a training method for two-stage object detectors which uses Domain Adaptation methods from the field of deep learning. The data sources described earlier make two "domains": a hand-collected data domain of images with threats, and a real-world domain of images assumed without threats. Two domain discriminators, one for discriminating object proposals and one for image features, are adversarially trained to prevent encoding domain-specific information. Without this penalty a Convolutional Neural Network (CNN) can learn to identify domains based on superficial characteristics, and minimize a supervised loss function without improving its ability to recognize objects. For the hand-collected data, only object proposals and image features from backgrounds are used. The losses for these domain-adaptive discriminators are added to the Faster R-CNN losses of images from both domains. This can reduce threat detection false alarm rates by matching the statistics of extracted features from hand-collected backgrounds to real world data. Performance improvements are demonstrated on two independently-collected datasets of labeled threats

    Tackling the Incomplete Annotation Issue in Universal Lesion Detection Task By Exploratory Training

    Full text link
    Universal lesion detection has great value for clinical practice as it aims to detect various types of lesions in multiple organs on medical images. Deep learning methods have shown promising results, but demanding large volumes of annotated data for training. However, annotating medical images is costly and requires specialized knowledge. The diverse forms and contrasts of objects in medical images make fully annotation even more challenging, resulting in incomplete annotations. Directly training ULD detectors on such datasets can yield suboptimal results. Pseudo-label-based methods examine the training data and mine unlabelled objects for retraining, which have shown to be effective to tackle this issue. Presently, top-performing methods rely on a dynamic label-mining mechanism, operating at the mini-batch level. However, the model's performance varies at different iterations, leading to inconsistencies in the quality of the mined labels and limits their performance enhancement. Inspired by the observation that deep models learn concepts with increasing complexity, we introduce an innovative exploratory training to assess the reliability of mined lesions over time. Specifically, we introduce a teacher-student detection model as basis, where the teacher's predictions are combined with incomplete annotations to train the student. Additionally, we design a prediction bank to record high-confidence predictions. Each sample is trained several times, allowing us to get a sequence of records for each sample. If a prediction consistently appears in the record sequence, it is likely to be a true object, otherwise it may just a noise. This serves as a crucial criterion for selecting reliable mined lesions for retraining. Our experimental results substantiate that the proposed framework surpasses state-of-the-art methods on two medical image datasets, demonstrating its superior performance
    corecore