2 research outputs found

    ARUBA: An Architecture-Agnostic Balanced Loss for Aerial Object Detection

    Deep neural networks tend to replicate the biases of their training datasets. In object detection, this bias takes the form of various imbalances, such as class, background-foreground, and object-size imbalance. In this paper, we define the size of an object as the number of pixels it covers in an image, and size imbalance as the over-representation of objects of certain sizes in a dataset. We aim to address the problem of size imbalance in drone-based aerial image datasets. Existing methods for addressing size imbalance rely on architectural changes that use multiple scales of images or feature maps to detect objects of different sizes. We instead propose a novel ARchitectUre-agnostic BAlanced Loss (ARUBA) that can be applied as a plugin on top of any object detection model. It follows a neighborhood-driven approach inspired by the ordinality of object size. We evaluate the effectiveness of our approach through comprehensive experiments on aerial datasets such as HRSC2016, DOTAv1.0, DOTAv1.5, and VisDrone, obtaining consistent improvements in performance.
    Comment: Accepted to WACV 2023
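
    The abstract does not spell out the loss itself, so the following is a minimal sketch of how a plugin, neighborhood-driven size re-weighting could look, assuming objects are grouped into pixel-count bins weighted by smoothed inverse frequency. The function names, binning scheme, and smoothing kernel are illustrative assumptions, not the paper's actual formulation.

```python
import torch
import torch.nn.functional as F

def size_bin_weights(bin_counts: torch.Tensor,
                     kernel=(0.25, 0.5, 0.25)) -> torch.Tensor:
    """Inverse-frequency weights per size bin, with counts smoothed over
    neighbouring bins so that ordinally close sizes get similar weights."""
    k = torch.tensor(kernel, dtype=torch.float32).view(1, 1, -1)
    counts = bin_counts.float().view(1, 1, -1)
    smoothed = F.conv1d(counts, k, padding=k.shape[-1] // 2).view(-1)
    smoothed = smoothed.clamp_min(1e-6)  # guard against empty bins
    # Over-represented bins get proportionally smaller weights.
    return smoothed.sum() / (smoothed * smoothed.numel())

def size_balanced_loss(per_object_loss: torch.Tensor,
                       object_sizes: torch.Tensor,
                       bin_edges: torch.Tensor,
                       weights: torch.Tensor) -> torch.Tensor:
    """Plugin step: scale each object's detection loss by the weight of
    the size bin its pixel count falls into. `bin_edges` must have one
    fewer entry than `bin_counts`."""
    bins = torch.bucketize(object_sizes, bin_edges)
    return (weights[bins] * per_object_loss).mean()
```

    In use, the bin counts would be computed once over the training set, and size_balanced_loss would wrap the detector's existing per-object loss, which is what would make such a re-weighting architecture-agnostic.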

    Defending Deep Neural Networks against Structural Perturbations

    Deep learning has had a tremendous impact on the field of computer vision. However, the deployment of such algorithms in real-world environments hinges on their robustness to noise. This thesis focuses on testing the robustness of a model against naturally occurring structural perturbations, to ensure the safety of deep learning in domains such as facial recognition, automated driving, and object detection. To this end, given a dataset, we propose a systematic way to defend against such attacks. We evaluate various hyperparameters of a model and analyse the results to ascertain their causal effect on robustness. Subsequently, we propose a method to boost the stability of a model via training-set augmentation, also known as adversarial training. The augmentation relies on a coreset set-cover approach to cover six independent structural perturbations and their combinations. We introduce and compare two strategies that trade time against data without compromising model robustness or performance. We also analyse the effect of adversarial training on the decision boundary of the model. This work focuses primarily on image classification, and we believe that our algorithm works independently of the model architecture. The approach is put to the test on different datasets and model architectures and compared against state-of-the-art defences against structural perturbations.
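
    The abstract names the ingredients (six structural perturbations, their combinations, a coreset set-cover selection) but not the algorithm, so the following is a minimal sketch under those assumptions: toy Pillow perturbations, a candidate pool of singletons and pairs, and a greedy set cover, all of which are illustrative guesses rather than the thesis's actual method.

```python
import itertools
import random
from PIL import Image, ImageFilter

def occlude(im):
    """Toy occlusion: paste a grey patch over a fixed region."""
    im = im.copy()
    im.paste((128, 128, 128), (10, 10, 40, 40))
    return im

# Six stand-in structural perturbations (illustrative choices only).
PERTURBATIONS = {
    "blur":      lambda im: im.filter(ImageFilter.GaussianBlur(2)),
    "rotate":    lambda im: im.rotate(10),
    "translate": lambda im: im.transform(im.size, Image.Transform.AFFINE,
                                         (1, 0, 5, 0, 1, 5)),
    "shear":     lambda im: im.transform(im.size, Image.Transform.AFFINE,
                                         (1, 0.2, 0, 0, 1, 0)),
    "scale":     lambda im: im.resize((im.width // 2, im.height // 2)).resize(im.size),
    "occlude":   occlude,
}

def greedy_set_cover(universe, candidates):
    """Greedily pick perturbation combinations until every base
    perturbation is covered at least once."""
    covered, chosen = set(), []
    while covered != universe:
        best = max(candidates, key=lambda c: len(set(c) - covered))
        chosen.append(best)
        covered |= set(best)
    return chosen

names = list(PERTURBATIONS)
# Candidate pool: every single perturbation plus every pair.
candidates = [(n,) for n in names] + list(itertools.combinations(names, 2))
combos = greedy_set_cover(set(names), candidates)

def augment(image):
    """Augment one training image with a randomly chosen covering combo."""
    for name in random.choice(combos):
        image = PERTURBATIONS[name](image)
    return image
```

    The time-vs-data tradeoff mentioned above would then surface in the candidate pool: richer combinations cover more perturbation interactions per augmented image but cost more to generate.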