Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks
State-of-the-art object detection networks depend on region proposal
algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN
have reduced the running time of these detection networks, exposing region
proposal computation as a bottleneck. In this work, we introduce a Region
Proposal Network (RPN) that shares full-image convolutional features with the
detection network, thus enabling nearly cost-free region proposals. An RPN is a
fully convolutional network that simultaneously predicts object bounds and
objectness scores at each position. The RPN is trained end-to-end to generate
high-quality region proposals, which are used by Fast R-CNN for detection. We
further merge RPN and Fast R-CNN into a single network by sharing their
convolutional features---using the recently popular terminology of neural
networks with 'attention' mechanisms, the RPN component tells the unified
network where to look. For the very deep VGG-16 model, our detection system has
a frame rate of 5fps (including all steps) on a GPU, while achieving
state-of-the-art object detection accuracy on PASCAL VOC 2007, 2012, and MS
COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015
competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning
entries in several tracks. Code has been made publicly available.
Comment: Extended tech repor
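To illustrate how a scored set of region proposals might be cut down to the 300 per image mentioned in the abstract above, here is a minimal pure-Python sketch. It is not the authors' released code: the function names `iou` and `select_proposals`, the `(x1, y1, x2, y2)` box format, and the overlap threshold of 0.7 are assumptions made for this example. It ranks boxes by objectness score and greedily drops any box that overlaps an already kept box too strongly.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_proposals(boxes, scores, max_proposals=300, iou_threshold=0.7):
    """Return indices of the kept boxes, highest objectness score first.

    max_proposals=300 follows the abstract; iou_threshold=0.7 is an
    assumed value for this sketch.
    """
    if max_proposals <= 0:
        return []
    # Visit candidates from the highest score to the lowest.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    for i in order:
        # Keep a box only if it does not overlap a kept box too strongly.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
            if len(kept) == max_proposals:
                break
    return kept
```

For example, given the boxes `(0, 0, 10, 10)`, `(0, 0, 10, 9)`, and `(20, 20, 30, 30)` with scores 0.9, 0.8, and 0.7, the second box has an IoU of 0.9 with the first and is suppressed, so the call returns the indices of the first and third boxes.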
Multispecies Fruit Flower Detection Using a Refined Semantic Segmentation Network
In fruit production, critical crop management decisions are guided by bloom intensity, i.e., the number of flowers present in an orchard. Despite its importance, bloom intensity is still typically estimated by means of human visual inspection. Existing automated computer vision systems for flower identification are based on hand-engineered techniques that work only under specific conditions and with limited performance. This letter proposes an automated technique for flower identification that is robust to uncontrolled environments and applicable to different flower species. Our method relies on an end-to-end residual convolutional neural network (CNN) that represents the state of the art in semantic segmentation. To enhance its sensitivity to flowers, we fine-tune this network using a single dataset of apple flower images. Since CNNs tend to produce coarse segmentations, we employ a refinement method to better distinguish between individual flower instances. Without any preprocessing or dataset-specific training, experimental results on images of apple, peach, and pear flowers acquired under different conditions demonstrate the robustness and broad applicability of our method.
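As a rough illustration of how a binary flower mask could be turned into a bloom-intensity estimate, the sketch below counts 4-connected foreground regions in a 2D mask. This is not the refinement method described in the abstract above: the function name `count_regions` and the nested-list mask format are assumptions made for this example, and flowers that touch in the mask are counted as a single region.

```python
from collections import deque


def count_regions(mask):
    """Count 4-connected regions of truthy pixels in a 2D list of rows."""
    rows = len(mask)
    cols = len(mask[0]) if rows else 0
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] and not seen[r][c]:
                # New region found: flood-fill it with a breadth-first search.
                count += 1
                seen[r][c] = True
                queue = deque([(r, c)])
                while queue:
                    y, x = queue.popleft()
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count
```

A mask with a two-pixel blob in the top-left corner, a two-pixel blob on the right edge, and an isolated pixel in the bottom-left corner yields a count of three.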
DeepSolarEye: Power Loss Prediction and Weakly Supervised Soiling Localization via Fully Convolutional Networks for Solar Panels
The impact of soiling on solar panels is an important and well-studied
problem in the renewable energy sector. In this paper, we present the first
convolutional neural network (CNN) based approach for solar panel soiling and
defect analysis. Our approach takes an RGB image of a solar panel and
environmental factors as inputs to predict power loss, soiling localization,
and soiling type. In computer vision, localization is a complex task which
typically requires manually labeled training data such as bounding boxes or
segmentation masks. Our proposed approach consists of four specialized stages
that completely avoid localization ground truth and need only panel images
with power loss labels for training. The impact regions obtained from
the predicted localization masks are classified into soiling types using
webly supervised learning. To improve the localization capabilities of CNNs, we
introduce a novel bi-directional input-aware fusion (BiDIAF) block that
reinforces the input at different levels of CNN to learn input-specific feature
maps. Our empirical study shows that BiDIAF improves the power loss prediction
accuracy by about 3% and localization accuracy by about 4%. Our end-to-end
model yields further improvement of about 24% on localization when learned in a
weakly supervised manner. Our approach is generalizable and showed promising
results on web-crawled solar panel images. Our system has a frame rate of 22
fps (including all steps) on an NVIDIA TitanX GPU. Additionally, we collected a
first-of-its-kind dataset for solar panel image analysis consisting of 45,000+
images.
Comment: Accepted for publication at WACV 201