5 research outputs found
LAD-RCNN: A Powerful Tool for Livestock Face Detection and Normalization
With the demand for standardized large-scale livestock farming and the
development of artificial intelligence technology, much research on animal
face recognition has been carried out on pigs, cattle, sheep and other
livestock. Face recognition consists of three sub-tasks: face detection, face
normalization and face identification. Most animal face recognition studies
focus on face detection and face identification. Animals are often
uncooperative when taking photos, so the collected animal face images are often
in arbitrary directions. The use of non-standard images may significantly
reduce the performance of a face recognition system. However, there is no study
on normalizing animal face images with arbitrary directions. In this
study, we developed a light-weight angle detection and region-based
convolutional network (LAD-RCNN) containing a new rotation angle coding method
that can detect the rotation angle and the location of an animal face in a
single stage. LAD-RCNN has a frame rate of 72.74 FPS (including all steps) on a
single GeForce RTX 2080 Ti GPU. LAD-RCNN has been evaluated on multiple datasets,
including a goat dataset and goat infrared images. Evaluation results show that the
AP of face detection was more than 95% and the deviation between the detected
rotation angle and the ground-truth rotation angle was less than 0.036 (i.e.
6.48°) on all the test datasets. This shows that LAD-RCNN has excellent
performance on livestock face and direction detection, and is therefore
very suitable for livestock face detection and normalization. Code is available
at https://github.com/SheepBreedingLab-HZAU/LAD-RCNN/
Comment: 8 figures, 5 tables
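The abstract equates an angle deviation of 0.036 with 6.48 degrees, which is consistent with angles expressed on a scale where 1.0 corresponds to 180 degrees. That scaling is an inference from the arithmetic only; the abstract does not describe the rotation angle coding itself. Under that assumption, the sketch below converts a normalized angle to degrees and rotates a point about the face centre by the negative of a detected angle, which is one simple way to bring a rotated face back to upright. All function names here are hypothetical and are not taken from the LAD-RCNN code.

```python
import math


def normalized_to_degrees(value, full_scale_deg=180.0):
    """Convert a normalized angle to degrees (assumes 1.0 maps to 180 degrees)."""
    return value * full_scale_deg


def rotate_point(x, y, cx, cy, angle_deg):
    """Rotate (x, y) about (cx, cy) by angle_deg using the standard rotation matrix."""
    t = math.radians(angle_deg)
    dx, dy = x - cx, y - cy
    return (cx + dx * math.cos(t) - dy * math.sin(t),
            cy + dx * math.sin(t) + dy * math.cos(t))


def upright_point(x, y, cx, cy, detected_angle_deg):
    """Undo a detected rotation by rotating through the opposite angle."""
    return rotate_point(x, y, cx, cy, -detected_angle_deg)


if __name__ == "__main__":
    # The reported bound of 0.036 corresponds to about 6.48 degrees.
    print(normalized_to_degrees(0.036))
    # A landmark rotated by 30 degrees about the centre is mapped back.
    rotated = rotate_point(10.0, 0.0, 0.0, 0.0, 30.0)
    print(upright_point(rotated[0], rotated[1], 0.0, 0.0, 30.0))
```

In practice the whole image crop would be resampled rather than individual points, but the same rotation applies to every pixel coordinate.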
Long-Range Correlation Supervision for Land-Cover Classification from Remote Sensing Images
Long-range dependency modeling has been widely considered in modern deep
learning based semantic segmentation methods, especially those designed for
large-size remote sensing images, to compensate for the intrinsic locality of
standard convolutions. However, in previous studies, the long-range dependency,
modeled with an attention mechanism or transformer model, has been based on
unsupervised learning, instead of explicit supervision from the objective
ground truth. In this paper, we propose a novel supervised long-range
correlation method for land-cover classification, called the supervised
long-range correlation network (SLCNet), which is shown to be superior to the
currently used unsupervised strategies. In SLCNet, pixels sharing the same
category are considered highly correlated and those having different categories
are less relevant, which can be easily supervised by the category consistency
information available in the ground truth semantic segmentation map. Under such
supervision, the recalibrated features are more consistent for pixels of the
same category and more discriminative for pixels of other categories,
regardless of their proximity. To complement the detailed information lacking
in the global long-range correlation, we introduce an auxiliary adaptive
receptive field feature extraction module, parallel to the long-range
correlation module in the encoder, to capture finely detailed feature
representations for multi-size objects in multi-scale remote sensing images. In
addition, we apply multi-scale side-output supervision and a hybrid loss
function as local and global constraints to further boost the segmentation
accuracy. Experiments were conducted on three remote sensing datasets. Compared
with advanced segmentation methods from the computer vision, medicine, and
remote sensing communities, SLCNet achieved state-of-the-art performance
on all the datasets.
Comment: 14 pages, 11 figures
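The supervision signal described in the abstract, where pixels of the same category count as highly correlated and pixels of different categories as less relevant, can be sketched as a binary target matrix derived from the ground-truth label map. The example below is a minimal illustration under assumptions: the function names are hypothetical, and binary cross-entropy is used only as a plausible stand-in, since the abstract mentions a hybrid loss without specifying it.

```python
import math


def category_correlation(labels):
    """Build an N x N target matrix from a flat list of per-pixel class ids.

    Entry (i, j) is 1.0 when pixels i and j share a category, else 0.0.
    """
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


def binary_cross_entropy(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted and target correlations."""
    total, count = 0.0, 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
            total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
            count += 1
    return total / count


if __name__ == "__main__":
    labels = [0, 0, 1]  # three pixels: two of class 0, one of class 1
    target = category_correlation(labels)
    uninformed = [[0.5] * 3 for _ in range(3)]  # a prediction with no preference
    print(target)
    print(binary_cross_entropy(uninformed, target))
```

Because the target depends only on label equality, two pixels of the same class are supervised as correlated regardless of how far apart they are, which matches the "regardless of their proximity" point in the abstract.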
FAST ROTATED BOUNDING BOX ANNOTATIONS FOR OBJECT DETECTION
Traditionally, object detection models use a large amount of annotated data, and axis-aligned bounding boxes (AABBs) are often chosen as the image annotation technique for both training and predictions. The purpose of annotating the objects in the images is to indicate the regions of interest with the corresponding labels. Accurate object annotations help computer vision models to understand the distinct patterns of the image features so that they can recognize and localize different classes of objects. However, AABBs are often a poor fit for elongated object instances. It’s also challenging to localize objects with AABBs in densely packed aerial images because of overlapping adjacent bounding boxes. Alternatively, using rectangular annotations that can be oriented diagonally, also known as rotated bounding boxes (RBBs), can provide a much tighter fit for elongated objects and reduce the potential bounding box overlap between adjacent objects. However, RBBs are much more time-consuming and tedious to annotate than AABBs for large datasets.
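To make the fit argument concrete, the sketch below computes the corners of a rotated box and the smallest axis-aligned box that encloses it, then compares their areas. It only illustrates the geometry; the `(cx, cy, w, h, angle)` parameterization and the function names are assumptions for this example and are not taken from the annotation tool described here.

```python
import math


def rbb_corners(cx, cy, w, h, angle_deg):
    """Return the four corners of a w x h box centred at (cx, cy), rotated by angle_deg."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    offsets = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]


def enclosing_aabb(points):
    """Return (xmin, ymin, xmax, ymax) of the smallest axis-aligned box around points."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)


def aabb_area(box):
    """Area of an (xmin, ymin, xmax, ymax) box."""
    xmin, ymin, xmax, ymax = box
    return (xmax - xmin) * (ymax - ymin)


if __name__ == "__main__":
    # An elongated 100 x 10 object lying diagonally at 45 degrees.
    corners = rbb_corners(0.0, 0.0, 100.0, 10.0, 45.0)
    print("RBB area: ", 100.0 * 10.0)
    print("AABB area:", aabb_area(enclosing_aabb(corners)))
```

For this diagonal 100 x 10 object, the enclosing axis-aligned box covers roughly six times the area of the rotated box, so most of the AABB is background.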
In this work, we propose a novel annotation tool named FastRoLabelImg (Fast Rotated LabelImg) for producing high-quality RBB annotations with low time and effort. The tool generates accurate RBB proposals for objects of interest as the annotator makes progress through the dataset. It can also adapt available AABBs to generate RBB proposals. Furthermore, a multipoint box drawing system is provided to reduce manual RBB annotation time compared to existing methods. Across three diverse datasets, we show that the proposal generation methods can achieve a maximum of 88.9% manual workload reduction. Through a participant study, we also show that our proposed manual annotation method is twice as fast as the existing system with the same accuracy. Lastly, we publish the RBB annotations for two public datasets in order to motivate future research that will contribute to developing more competent object detection algorithms capable of RBB predictions.