ARUBA: An Architecture-Agnostic Balanced Loss for Aerial Object Detection
Deep neural networks tend to reciprocate the bias of their training dataset.
In object detection, the bias exists in the form of various imbalances such as
class, background-foreground, and object size. In this paper, we define the size
of an object as the number of pixels it covers in an image, and size imbalance as
the over-representation of objects of certain sizes in a dataset. We aim to
address the problem of size imbalance in drone-based aerial image datasets.
Existing methods for solving size imbalance are based on architectural changes
that utilize multiple scales of images or feature maps for detecting objects of
different sizes. We, on the other hand, propose a novel ARchitectUre-agnostic
BAlanced Loss (ARUBA) that can be applied as a plugin on top of any object
detection model. It follows a neighborhood-driven approach inspired by the
ordinality of object size. We evaluate the effectiveness of our approach
through comprehensive experiments on aerial datasets such as HRSC2016,
DOTAv1.0, DOTAv1.5 and VisDrone, and obtain consistent improvements in
performance.

Comment: Accepted to WACV 202
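The abstract describes a neighborhood-driven reweighting that exploits the ordinality of object size: nearby sizes should share statistics, so rare sizes borrow frequency from their neighbors before inverse-frequency weights are computed. The paper does not give the exact formulation here, so the following is only a minimal sketch of that idea; the function name, the log-size binning, the Gaussian neighborhood kernel, and the parameters `num_bins` and `sigma` are all assumptions for illustration, not ARUBA's actual loss.

```python
import numpy as np

def neighborhood_size_weights(object_sizes, num_bins=10, sigma=1.0):
    """Hypothetical sketch of a neighborhood-driven size reweighting.

    Bins object sizes (pixel counts) on a log scale, smooths the bin
    histogram over neighboring bins (size is ordinal, so adjacent bins
    share statistics), and returns inverse-frequency weights per object.
    """
    log_sizes = np.log(np.asarray(object_sizes, dtype=float))
    hist, edges = np.histogram(log_sizes, bins=num_bins)

    # Gaussian smoothing kernel over neighboring size bins.
    centers = np.arange(num_bins)
    kernel = np.exp(-0.5 * ((centers[:, None] - centers[None, :]) / sigma) ** 2)
    kernel /= kernel.sum(axis=1, keepdims=True)
    smoothed = kernel @ hist

    # Inverse-frequency weights, normalized so the bin weights sum to num_bins.
    weights = 1.0 / np.maximum(smoothed, 1e-6)
    weights *= num_bins / weights.sum()

    # Map each object to the weight of its size bin.
    bin_idx = np.clip(np.digitize(log_sizes, edges[1:-1]), 0, num_bins - 1)
    return weights[bin_idx]
```

Such per-object weights could then scale a standard detection loss, which is what makes the scheme architecture-agnostic: no multi-scale feature pyramid is required, only a reweighting of existing loss terms.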
Defending Deep Neural Networks against Structural Perturbations
Deep learning has had a tremendous impact on the field of computer vision. However, the deployment of such algorithms in real-world environments hinges upon their robustness to noise. This thesis focuses on testing the robustness of a model against naturally occurring structural perturbations to ensure the safety of deep learning in domains such as facial recognition, automated driving, and object detection. To this end, given a dataset, we propose a systematic way to defend against such attacks. We evaluate various hyperparameters of a model and analyse the results to ascertain their causal effect on robustness. Subsequently, we propose a method to boost the stability of a model via training-set augmentation, also known as adversarial training. The augmentation relies on a coreset-based set-cover approach to cover six independent structural perturbations and their combinations. We introduce and compare two different strategies that trade off time against data without compromising model robustness or performance. We also analyse the effect of adversarial training on the decision boundary of the model. This work primarily focuses on image classification, and we believe that our algorithm works independently of the model architecture. The approach is put to the test on different datasets and model architectures and compared against state-of-the-art defences against structural perturbations.
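The abstract mentions a coreset-based set-cover approach for choosing which perturbed copies to add during adversarial training. The thesis does not spell out the algorithm here, so the sketch below only illustrates the generic greedy set-cover step that such an approach could build on: pick augmentation "recipes" (single perturbations or combinations) until every base perturbation is covered. The function name, the `max_combo` parameter, and the perturbation names in the usage are hypothetical.

```python
from itertools import combinations

def greedy_perturbation_cover(perturbations, max_combo=2):
    """Hypothetical sketch: greedily select a small set of augmentation
    recipes (subsets of perturbations applied together) so that every
    base perturbation appears in at least one chosen recipe."""
    universe = set(perturbations)
    # Candidate recipes: all single perturbations and combinations up to max_combo.
    candidates = [frozenset(c)
                  for r in range(1, max_combo + 1)
                  for c in combinations(perturbations, r)]
    covered, chosen = set(), []
    while covered != universe:
        # Greedy step: take the recipe covering the most uncovered perturbations.
        best = max(candidates, key=lambda s: len(s - covered))
        chosen.append(best)
        covered |= best
    return chosen
```

With six base perturbations and pairwise combinations allowed, the greedy loop covers everything with three recipes, which is the kind of time-vs-data trade-off the abstract alludes to: fewer augmented copies per image at the cost of combining perturbations.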