A novel approach to recognition of the detected moving objects in non-stationary background using heuristics and colour measurements : a thesis presented in partial fulfilment of the requirements for the degree of Master of Engineering at Massey University, Albany, New Zealand
Computer vision has become a growing area of research that involves two fundamental steps: object detection and object recognition. These two steps have been applied in real-world scenarios such as video surveillance systems, traffic cameras for counting cars, and more specific tasks such as detecting faces and recognizing facial expressions. Humans have a vision system that provides sophisticated ways to detect and recognize objects. Colour perception, depth of view and past experience help us determine the class of an object from its size, shape and the context of the environment. Detecting moving objects on a non-stationary background and recognizing the class of the detected objects are tasks that have been approached in many different ways. However, the accuracy and efficiency of current object detection methods remain limited by high computation times and memory-intensive approaches. Similarly, object recognition has been approached in many ways but lacks a perceptive methodology for recognizing objects.
This thesis presents an improved algorithm for detecting moving objects on a non-stationary background, together with a new method for object recognition. Detection begins by extracting SURF features to identify unique keypoints in the first frame. Each keypoint is then located in a subsequent frame using cross-correlation, yielding an optical-flow estimate. Outliers are rejected by using the keypoints to compute the global pixel shift caused by camera motion, which isolates the points that belong to moving objects. These points are grouped into clusters by the proposed improved clustering algorithm, whose clustering function adapts the search radius around a feature point by taking the average Euclidean distance between all feature points into account. Each detected object is then processed through colour measurement and heuristics. The heuristics provide context about the surroundings, so the class of an object is recognized from its size, shape and the environment it is in, giving object recognition a perceptive approach.
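The outlier-rejection step described above could be sketched as follows. This is an illustrative reconstruction only: the function name, the median-based estimate of the global shift, and the tolerance parameter are assumptions, not the thesis's exact implementation.

```python
import statistics

def moving_points(flows, tol=2.0):
    """flows: per-keypoint (dx, dy) displacements between two frames.
    The global shift caused by camera motion is estimated as the median
    flow vector; keypoints whose flow deviates from it by more than
    tol pixels are kept as candidate moving-object points."""
    gx = statistics.median(dx for dx, _ in flows)
    gy = statistics.median(dy for _, dy in flows)
    return [i for i, (dx, dy) in enumerate(flows)
            if ((dx - gx) ** 2 + (dy - gy) ** 2) ** 0.5 > tol]
```

With five keypoints following the camera shift and one on a moving object, only the deviating keypoint survives the rejection.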
Results from the proposed method show successful detection of moving objects in various scenes with dynamic backgrounds, achieving an object-detection efficiency of over 95% for both indoor and outdoor scenes. The average processing time was around 16.5 seconds, which includes the time taken both to detect objects and to recognize them. The heuristic and colour-based object recognition methodology achieved an efficiency of over 97%.
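The adaptive clustering function described in the abstract, which derives its search radius from the average Euclidean distance between feature points, could be sketched roughly as below; the grouping strategy and function name are assumptions for illustration, not the thesis's exact algorithm.

```python
import math

def adaptive_cluster(points):
    """Group 2-D feature points, using the average pairwise Euclidean
    distance over all points as the adaptive search radius."""
    if len(points) < 2:
        return [points[:]] if points else []
    # Average Euclidean distance between all feature-point pairs.
    dists = [math.dist(p, q) for i, p in enumerate(points)
             for q in points[i + 1:]]
    radius = sum(dists) / len(dists)
    clusters = []
    for p in points:
        for c in clusters:
            # Join the first cluster containing a point within the radius.
            if any(math.dist(p, q) <= radius for q in c):
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters
```

Two tight groups of keypoints far apart thus fall into two clusters without any manually tuned radius.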
Bayesian Modeling of Dynamic Scenes for Object Detection
Abstract—Accurate detection of moving objects is an important precursor to stable tracking or recognition. In this paper, we present an object detection scheme that has three innovations over existing approaches. First, the model of the intensities of image pixels as independent random variables is challenged, and it is asserted that useful correlation exists in the intensities of spatially proximal pixels. This correlation is exploited to sustain high levels of detection accuracy in the presence of dynamic backgrounds. By using a nonparametric density estimation method over a joint domain-range representation of image pixels, multimodal spatial uncertainties and complex dependencies between the domain (location) and range (color) are directly modeled, and the background is represented as a single probability density. Second, temporal persistence is proposed as a detection criterion. Unlike previous approaches, which detect objects by building adaptive models of the background alone, the foreground is also modeled to augment the detection of objects (without explicit tracking), since objects detected in the preceding frame contain substantial evidence for detection in the current frame. Finally, the background and foreground models are used competitively in a MAP-MRF decision framework that stresses spatial context as a condition for detecting interesting objects; the posterior function is maximized efficiently by finding the minimum cut of a capacitated graph. Experimental validation of the proposed method is performed and presented on a diverse set of dynamic scenes.
Index Terms—Object detection, kernel density estimation, joint domain-range, MAP-MRF estimation.
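The joint domain-range density at the heart of this background model can be sketched in miniature as below. This is a minimal pure-Python illustration with product Gaussian kernels and assumed bandwidths; the paper's actual estimator, kernel, and bandwidth selection may differ.

```python
import math

def kde_score(pixel, samples, sigma=(8.0, 8.0, 10.0, 10.0, 10.0)):
    """Nonparametric likelihood of one pixel under a joint domain-range
    background model. pixel = (x, y, r, g, b); samples are background
    observations in the same 5-D space; sigma holds per-dimension
    kernel bandwidths."""
    total = 0.0
    for s in samples:
        # Product of per-dimension Gaussian kernels.
        k = 1.0
        for v, sv, sg in zip(pixel, s, sigma):
            k *= math.exp(-0.5 * ((v - sv) / sg) ** 2) / (sg * math.sqrt(2 * math.pi))
        total += k
    return total / len(samples)
```

A pixel close to past background observations in both location and color scores high; a pixel far from them scores near zero and becomes a foreground candidate, subject to the competing foreground model and MAP-MRF decision described above.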
Natural Adversarial Objects
Although state-of-the-art object detection methods have shown compelling
performance, models often are not robust to adversarial attacks and
out-of-distribution data. We introduce a new dataset, Natural Adversarial
Objects (NAO), to evaluate the robustness of object detection models. NAO
contains 7,934 images and 9,943 objects that are unmodified and representative
of real-world scenarios, but cause state-of-the-art detection models to
misclassify with high confidence. The mean average precision (mAP) of
EfficientDet-D7 drops by 74.5% when evaluated on NAO compared to the standard
MSCOCO validation set.
Moreover, by comparing a variety of object detection architectures, we find
that better performance on MSCOCO validation set does not necessarily translate
to better performance on NAO, suggesting that robustness cannot be simply
achieved by training a more accurate model.
We further investigate why examples in NAO are difficult to detect and
classify. Patch-shuffling experiments reveal that models are overly
sensitive to local texture. Additionally, using integrated gradients and
background replacement, we find that the detection model relies on pixel
information within the bounding box and is insensitive to the background
context when predicting class labels. NAO can be downloaded at
https://drive.google.com/drive/folders/15P8sOWoJku6SSEiHLEts86ORfytGezi8
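The patch-shuffling probe mentioned in the abstract can be sketched as below: an image is cut into non-overlapping blocks that are randomly rearranged, destroying global shape while preserving local texture. The grid size and function name are illustrative assumptions, not the paper's exact protocol.

```python
import random

def shuffle_patches(img, patch, seed=0):
    """Shuffle non-overlapping patch x patch blocks of a 2-D image
    (a list of rows). A texture-biased model keeps its prediction on
    the shuffled image; a shape-reliant one does not."""
    h, w = len(img), len(img[0])
    blocks = []
    for i in range(0, h - h % patch, patch):
        for j in range(0, w - w % patch, patch):
            blocks.append([row[j:j + patch] for row in img[i:i + patch]])
    rng = random.Random(seed)
    rng.shuffle(blocks)
    out = [[0] * w for _ in range(h)]
    idx = 0
    for i in range(0, h - h % patch, patch):
        for j in range(0, w - w % patch, patch):
            for di in range(patch):
                for dj in range(patch):
                    out[i + di][j + dj] = blocks[idx][di][dj]
            idx += 1
    return out
```

The shuffled image contains exactly the same pixel values as the original, only spatially rearranged at the block level.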
Hyp-OW: Exploiting Hierarchical Structure Learning with Hyperbolic Distance Enhances Open World Object Detection
Open World Object Detection (OWOD) is a challenging and realistic task that
extends beyond the scope of the standard object detection task. It involves
detecting both known and unknown objects while integrating learned knowledge
for future tasks. However, the level of "unknownness" varies significantly
depending on the context. For example, a tree is typically considered part of
the background in a self-driving scene, but it may be significant in a
household context. We argue that this contextual information should already be
embedded within the known classes. In other words, there should be a semantic
or latent structure relationship between the known and unknown items to be
discovered. Motivated by this observation, we propose Hyp-OW, a method that
learns and models hierarchical representation of known items through a
SuperClass Regularizer. Leveraging this representation allows us to effectively
detect unknown objects using a similarity distance-based relabeling module.
Extensive experiments on benchmark datasets demonstrate the effectiveness of
Hyp-OW, achieving improvement in both known and unknown detection (up to 6
percent). These findings are particularly pronounced in our newly designed
benchmark, where a strong hierarchical structure exists between known and
unknown objects. Our code can be found at
https://github.com/boschresearch/Hyp-OW
Comment: Accepted at AAAI 2024 || keywords: Open World Object Detection, Hyperbolic Distance, Unknown Detection, Deformable Transformers, Hierarchical Representation Learning
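The hyperbolic distance that Hyp-OW builds on is typically the Poincaré-ball metric, shown below in its standard form; this is a generic sketch of that metric, not necessarily the paper's exact formulation or curvature setting.

```python
import math

def poincare_dist(u, v):
    """Distance between two points strictly inside the unit Poincare
    ball: d(u, v) = arccosh(1 + 2|u-v|^2 / ((1-|u|^2)(1-|v|^2))).
    Distances grow rapidly toward the boundary, which is what lets
    tree-like (hierarchical) structure embed with low distortion."""
    uu = sum(x * x for x in u)
    vv = sum(x * x for x in v)
    duv = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.acosh(1 + 2 * duv / ((1 - uu) * (1 - vv)))
```

Points near the origin behave like roots of a hierarchy and points near the boundary like leaves, so superclass structure over known categories can be encoded radially.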
Moving Object Detection and Tracking in Open-Air Test Bed
In mobile and ubiquitous computing environments, acquiring contextual information about a user's situation is necessary to provide useful services. Although the definition of user context may change with the situation or the service used, contextual information about who, where, and when is considered essential. We have built a test bed with multiple sensors: floor pressure sensors, RFID (radio frequency identification) tag systems, and cameras, to carry out experiments that detect the positions of users and track their movement. The conventional camera-based background subtraction method was used for moving object detection and tracking. In this paper, we propose knowledge application and parameter adaptation in the background subtraction method, and results are presented showing that the proposed method decreases detection errors.
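The conventional background subtraction baseline that the paper adapts can be sketched as below: a running-average background model plus a per-pixel difference threshold. The learning rate and threshold here are the kind of parameters the paper proposes to adapt; their names and values are illustrative assumptions.

```python
def update_background(bg, frame, alpha=0.05):
    """Running-average background model for grayscale frames
    (lists of rows): bg <- (1 - alpha) * bg + alpha * frame."""
    return [[(1 - alpha) * b + alpha * f for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]

def foreground_mask(bg, frame, thresh=25):
    """Mark pixels whose absolute difference from the background
    exceeds the threshold as foreground (1), else background (0)."""
    return [[1 if abs(f - b) > thresh else 0 for b, f in zip(brow, frow)]
            for brow, frow in zip(bg, frame)]
```

Slowly varying illumination is absorbed into the background over time, while a person stepping into view produces a large difference and is flagged as foreground.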
Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery
A robust and fast automatic moving object detection and tracking system is
essential to characterize target objects and extract spatial and temporal
information for applications including video surveillance systems, urban
traffic monitoring and navigation, and robotics. In this dissertation, I
present a collaborative Spatial Pyramid Context-aware moving object detection
and tracking (SPCT) system. The proposed visual tracker is composed of one
master tracker, which usually relies on visual object features, and two
auxiliary trackers based on object temporal motion information that are called
dynamically to assist the master tracker. SPCT utilizes image spatial context
at different levels to make the video tracking system resistant to occlusion
and background noise, and to improve target localization accuracy and
robustness. We chose a pre-selected set of seven complementary feature
channels, including RGB color, intensity and a spatial pyramid of HoG, to
encode object color, shape and spatial layout information. We exploit the
integral histogram as a building block to meet the
demands of real-time performance. A novel fast algorithm is presented to
accurately evaluate spatially weighted local histograms in constant time
complexity using an extension of the integral histogram method. Different
techniques are explored to efficiently compute integral histogram on GPU
architecture and applied for fast spatio-temporal median computations and 3D
face reconstruction texturing. We propose a multi-component framework based on
semantic fusion of motion information with a projected building-footprint map
to significantly reduce the false alarm rate in urban scenes with many tall
structures. Experiments on the extensive VOTC2016 benchmark dataset and aerial
video confirm that combining complementary tracking cues in an intelligent
fusion framework enables persistent tracking for Full Motion Video and Wide
Aerial Motion Imagery.
Comment: PhD Dissertation (162 pages)
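The integral histogram building block mentioned above can be sketched as per-bin 2-D prefix sums, so the histogram of any rectangular region comes from four corner lookups per bin in constant time. The binning scheme and function names here are illustrative assumptions, not the dissertation's GPU implementation.

```python
def integral_histogram(img, bins=4, vmax=256):
    """Per-bin 2-D prefix sums over a grayscale image (list of rows,
    values in [0, vmax)). ih[b][i][j] counts pixels of bin b in the
    top-left i x j sub-rectangle."""
    h, w = len(img), len(img[0])
    ih = [[[0] * (w + 1) for _ in range(h + 1)] for _ in range(bins)]
    for b in range(bins):
        for i in range(h):
            for j in range(w):
                hit = 1 if img[i][j] * bins // vmax == b else 0
                ih[b][i + 1][j + 1] = (hit + ih[b][i][j + 1]
                                       + ih[b][i + 1][j] - ih[b][i][j])
    return ih

def region_hist(ih, top, left, bottom, right):
    """Histogram of rows [top, bottom) x cols [left, right),
    computed in O(bins) from four corner lookups per bin."""
    return [p[bottom][right] - p[top][right] - p[bottom][left] + p[top][left]
            for p in ih]
```

Once the prefix sums are built, local histograms for every candidate window cost the same regardless of window size, which is what makes real-time spatially weighted histogram tracking feasible.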