    Visual road following using intrinsic images

    We present a real-time vision-based road-following method for mobile robots in outdoor environments. The approach combines an image-processing method that retrieves illumination-invariant images with an efficient path-following algorithm. The method allows a mobile robot to autonomously navigate along pathways of different types under adverse lighting conditions using monocular vision.
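    The abstract does not name the specific intrinsic-image technique, but a common way to obtain an illumination-invariant image from a single colour frame is to project log-chromaticity coordinates along a camera-dependent invariant direction. The sketch below assumes that approach; the angle and file names are illustrative, not the paper's values.

```python
# Minimal sketch: illumination-invariant grayscale via log-chromaticity
# projection. The invariant angle is camera-dependent and assumed known;
# 40 degrees here is purely illustrative.
import cv2
import numpy as np

def illumination_invariant(bgr, theta_deg=40.0):
    img = bgr.astype(np.float64) + 1.0          # avoid log(0)
    b, g, r = cv2.split(img)
    chi1 = np.log(r / g)                        # 2-D log-chromaticity,
    chi2 = np.log(b / g)                        # green as reference channel
    theta = np.deg2rad(theta_deg)
    # Project onto the direction along which illumination changes cancel
    inv = chi1 * np.cos(theta) + chi2 * np.sin(theta)
    return cv2.normalize(inv, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)

frame = cv2.imread("road.png")                  # hypothetical input frame
cv2.imwrite("invariant.png", illumination_invariant(frame))
```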

    Advances in deep learning methods for pavement surface crack detection and identification with visible light visual images

    Compared with NDT and health-monitoring methods for cracks in engineering structures, surface crack detection and identification based on visible-light images is non-contact, with the advantages of high speed, low cost, and high precision. Firstly, typical public pavement (and concrete) crack data sets were collected, and the characteristics of the sample images as well as the random variable factors, including environment, noise, and interference, were summarized. Subsequently, the advantages and disadvantages of the three main crack identification approaches (i.e., hand-crafted feature engineering, machine learning, and deep learning) were compared. Finally, from the aspects of model architecture, testing performance, and prediction effectiveness, the development and progress of typical deep learning models that can be easily deployed on embedded platforms, including self-built CNNs, transfer learning (TL), and encoder-decoder (ED) models, were reviewed. The benchmark tests show that: 1) real-time pixel-level crack identification is now achievable on embedded platforms: the average detection time per image sample is less than 100 ms with either the ED method (i.e., FPCNet) or the TL method based on InceptionV3, and can be reduced to less than 10 ms with the TL method based on MobileNet (a lightweight backbone network); 2) in terms of accuracy, over 99.8% can be reached on CCIC, whose samples are easily identified by human eyes, while on SDNET2018, some of whose samples are difficult to identify, FPCNet reaches 97.5% and the TL method is close to 96.1%. To the best of our knowledge, this paper is the first to comprehensively summarize the public pavement crack data sets and to review and evaluate the performance and effectiveness of deep learning methods for surface crack detection and identification on embedded platforms.
    Comment: 15 pages, 14 figures, 11 tables
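    As a rough illustration of the transfer-learning setup the survey describes, the sketch below builds a binary crack/no-crack patch classifier on a frozen ImageNet-pretrained MobileNet backbone. The head layers, input resolution, and training configuration are assumptions for illustration, not the reviewed models' exact settings.

```python
# Hedged sketch of a TL crack classifier: frozen MobileNet backbone plus a
# small binary head. Layer sizes and optimizer are illustrative assumptions.
import tensorflow as tf

base = tf.keras.applications.MobileNet(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False                        # keep ImageNet features fixed

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(1, activation="sigmoid"),   # P(crack)
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])
model.summary()
```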

    Spatial-Temporal Deep Embedding for Vehicle Trajectory Reconstruction from High-Angle Video

    Spatial-temporal map (STMap)-based methods have shown great potential for processing high-angle videos for vehicle trajectory reconstruction, which can meet the needs of various data-driven modeling and imitation learning applications. In this paper, we develop a Spatial-Temporal Deep Embedding (STDE) model that imposes parity constraints at both the pixel and instance levels to generate instance-aware embeddings for vehicle stripe segmentation on STMaps. At the pixel level, each pixel is encoded with its 8-neighbor pixels at different ranges, and this encoding subsequently guides a neural network to learn the embedding mechanism. At the instance level, a discriminative loss function is designed to pull pixels belonging to the same instance closer together and to push the mean values of different instances far apart in the embedding space. The output of the spatial-temporal affinity is then optimized by the mutex-watershed algorithm to obtain the final clustering results. On segmentation metrics, our model outperforms five other baselines that have been used for STMap processing and shows robustness under the influence of shadows, static noise, and overlapping. The model is applied to process all public NGSIM US-101 videos to generate complete vehicle trajectories, indicating good scalability and adaptability. Last but not least, the strengths of the scanline method with STDE and future directions are discussed. Code, the STMap dataset, and video trajectories are made publicly available in the online repository. GitHub link: shorturl.at/jklT0
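    The instance-level term can be pictured as a pull/push loss of the kind used for instance embeddings: pixels are pulled toward their instance mean, and instance means are pushed apart. The numpy sketch below illustrates that idea under assumed margins delta_v and delta_d; the paper's exact formulation may differ.

```python
# Numpy sketch of a pull/push discriminative loss over pixel embeddings.
# delta_v and delta_d are assumed margins, not the paper's values.
import numpy as np

def discriminative_loss(emb, labels, delta_v=0.5, delta_d=1.5):
    ids = np.unique(labels)
    means = np.stack([emb[labels == i].mean(axis=0) for i in ids])
    # Pull term: hinge on each pixel's distance to its own instance mean
    pull = np.mean([
        np.mean(np.maximum(
            np.linalg.norm(emb[labels == i] - means[k], axis=1) - delta_v,
            0.0) ** 2)
        for k, i in enumerate(ids)])
    # Push term: hinge on pairwise distances between instance means
    push = 0.0
    if len(ids) > 1:
        d = np.linalg.norm(means[:, None] - means[None, :], axis=2)
        off_diag = ~np.eye(len(ids), dtype=bool)
        push = np.mean(np.maximum(delta_d - d[off_diag], 0.0) ** 2)
    return pull + push

emb = np.random.rand(100, 8)                  # 100 pixels, 8-D embeddings
labels = np.random.randint(0, 3, size=100)    # 3 hypothetical vehicle stripes
print(discriminative_loss(emb, labels))
```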

    Real-time object detection using monocular vision for low-cost automotive sensing systems

    This work addresses the problem of real-time object detection in automotive environments using monocular vision. The focus is on real-time feature detection, tracking, depth estimation using monocular vision and, finally, object detection by fusing visual saliency and depth information. Firstly, a novel feature detection approach is proposed for extracting stable and dense features even in images with a very low signal-to-noise ratio. This methodology is based on image gradients, which are redefined to take account of noise as part of their mathematical model. Each gradient is based on a vector connecting a negative to a positive intensity centroid, where both centroids are symmetric about the centre of the area for which the gradient is calculated. Multiple gradient vectors define a feature, with its strength being proportional to the underlying gradient vector magnitude. The evaluation of the Dense Gradient Features (DeGraF) shows superior performance over other contemporary detectors in terms of keypoint density, tracking accuracy, illumination invariance, rotation invariance, noise resistance and detection time. The DeGraF features form the basis for two new approaches that perform dense 3D reconstruction from a single vehicle-mounted camera. The first approach tracks DeGraF features in real time while performing image stabilisation with minimal computational cost. This means that, despite camera vibration, the algorithm can accurately predict the real-world coordinates of each image pixel in real time by comparing each motion vector to the ego-motion vector of the vehicle. The performance of this approach has been compared to different 3D reconstruction methods in order to determine their accuracy, depth-map density, noise resistance and computational complexity. The second approach proposes the use of local frequency analysis of gradient features for estimating relative depth. This novel method is based on the fact that DeGraF gradients can accurately measure local image variance with sub-pixel accuracy. It is shown that the local frequency by which the centroid oscillates around the gradient window centre is proportional to the depth of each gradient centroid in the real world. The lower computational complexity of this methodology comes at the expense of depth-map accuracy as the camera velocity increases, but it is at least five times faster than the other evaluated approaches. This work also proposes a novel technique for deriving visual saliency maps using Division of Gaussians (DIVoG). In this context, saliency maps express how different each image pixel is from its surrounding pixels across multiple pyramid levels. This approach is shown to be both fast and accurate when evaluated against other state-of-the-art approaches. Subsequently, the saliency information is combined with depth information to identify salient regions close to the host vehicle. The fused map allows faster detection of high-risk areas where obstacles are likely to exist. As a result, existing object detection algorithms, such as the Histogram of Oriented Gradients (HOG), can execute at least five times faster. In conclusion, through a step-wise approach, computationally expensive algorithms have been optimised or replaced by novel methodologies to produce a fast object detection system that is aligned to the requirements of the automotive domain.
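    To make the centroid-based gradient concrete, the sketch below computes, for a single window, the vector from the dark-intensity centroid to the bright-intensity centroid and uses its magnitude as the gradient strength. This is one illustrative reading of the DeGraF description (it does not enforce the symmetry constraint mentioned above), not the authors' implementation.

```python
# Illustrative sketch of a centroid-based gradient within one window:
# vector from the "negative" (dark) centroid to the "positive" (bright)
# centroid; its magnitude serves as the gradient strength.
import numpy as np

def centroid_gradient(window):
    h, w = window.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)
    pos = window.astype(np.float64)
    neg = pos.max() - pos                     # dark mass as complement
    # Intensity-weighted centroids of the bright and dark mass
    cp = np.array([np.average(xs, weights=pos), np.average(ys, weights=pos)])
    cn = np.array([np.average(xs, weights=neg), np.average(ys, weights=neg)])
    vec = cp - cn                             # gradient vector
    return vec, np.linalg.norm(vec)           # direction and strength

patch = np.tile(np.linspace(0, 255, 9), (9, 1))   # horizontal intensity ramp
print(centroid_gradient(patch))
```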

    Biologically inspired composite image sensor for deep field target tracking

    The use of nonuniform image sensors in mobile computer vision applications can be an effective solution when computational burden is problematic. Nonuniform image sensors are still in their infancy and as such have not been fully investigated for their unique qualities, nor have they been extensively applied in practice. In this dissertation, a system has been developed that can perform vision tasks in both the far field and the near field. In order to accomplish this, a new and novel image sensor system has been developed. Inspired by the biological aspects of the visual systems found in both falcons and primates, a composite multi-camera sensor was constructed. The sensor provides an expandable visual range and excellent depth of field, and produces a single compact output image based on the log-polar retinal-cortical mapping that occurs in primates. This mapping provides scale- and rotation-tolerant processing which, in turn, supports the mitigation of the perspective distortion found in strictly Cartesian sensor systems. Furthermore, the scale-tolerant representation of objects moving on trajectories parallel to the sensor's optical axis allows for fast acquisition and tracking of objects moving at high rates of speed. In order to investigate how effective this combination would be for object detection and tracking in both the near and far field, the system was tuned for the application of vehicle detection and tracking from a moving platform. Finally, it was shown that the capture of license plate information in an autonomous fashion could easily be accomplished by extracting the information contained in the mapped log-polar representation space. The novel composite log-polar deep-field image sensor opens new horizons for computer vision. The current work demonstrates features that can benefit applications beyond high-speed vehicle tracking for driver assistance and license plate capture. Some of the future applications envisioned include obstacle detection for high-speed trains, computer-assisted aircraft landing, and computer-assisted spacecraft docking.
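    A log-polar mapping of the kind the sensor emits can be sketched with OpenCV's warpPolar, which turns radial scale changes (an object approaching along the optical axis) into simple shifts in the output image. The centre, radius, and file names below are plain assumptions, not the dissertation's calibration.

```python
# Minimal sketch of a log-polar (retinal-cortical) warp using OpenCV.
import cv2

frame = cv2.imread("road_scene.png")                 # hypothetical input
h, w = frame.shape[:2]
center = (w / 2.0, h / 2.0)                          # assumed optical centre
max_radius = min(center)
# In the log-polar image, scale changes along the optical axis become
# horizontal shifts, which simplifies tracking of approaching objects.
logpolar = cv2.warpPolar(
    frame, (w, h), center, max_radius,
    cv2.WARP_POLAR_LOG + cv2.INTER_LINEAR)
cv2.imwrite("logpolar.png", logpolar)
```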

    Advanced traffic video analytics for robust traffic accident detection

    Automatic traffic accident detection is an important task in traffic video analysis due to its key applications in developing intelligent transportation systems. Reducing the time delay between the occurrence of an accident and the dispatch of the first responders to the scene may help lower the mortality rate and save lives. Since 1980, many approaches have been presented for the automatic detection of incidents in traffic videos. In this dissertation, some challenging problems for accident detection in traffic videos are discussed, and a new framework is presented in order to automatically detect single-vehicle and intersection traffic accidents in real time. First, a new foreground detection method is applied in order to detect the moving vehicles and subtract the ever-changing background in traffic video frames captured by static or non-stationary cameras. For traffic videos captured during the day, cast shadows degrade the performance of foreground detection and road segmentation. A novel cast shadow detection method is therefore presented to detect and remove the shadows cast by moving vehicles as well as the shadows cast by static objects on the road. Second, a new method is presented to detect the region of interest (ROI), which uses the locations of the moving vehicles and initial road samples and extracts discriminating features to segment the road region. After detecting the ROI, the moving direction of the traffic is estimated based on the rationale that crashed vehicles often make rapid changes of direction. Lastly, single-vehicle traffic accidents and trajectory conflicts are detected using a first-order logic decision-making system. The experimental results using publicly available videos and a dataset provided by the New Jersey Department of Transportation (NJDOT) demonstrate the feasibility of the proposed methods. Additionally, the main challenges and future directions are discussed regarding (i) improving the performance of the foreground segmentation, (ii) reducing the computational complexity, and (iii) detecting other types of traffic accidents.
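    The dissertation proposes its own foreground and shadow detection methods; as a point of reference, the sketch below shows the standard OpenCV pipeline for the same front-end step, where the MOG2 background subtractor also labels shadow pixels (value 127) so they can be thresholded away.

```python
# Hedged sketch: background subtraction with shadow removal via OpenCV's
# MOG2 subtractor. This illustrates the front-end idea only; it is not
# the dissertation's proposed method.
import cv2

cap = cv2.VideoCapture("traffic.mp4")        # hypothetical input video
bg = cv2.createBackgroundSubtractorMOG2(detectShadows=True)

while True:
    ok, frame = cap.read()
    if not ok:
        break
    mask = bg.apply(frame)                   # 255 = foreground, 127 = shadow
    _, fg = cv2.threshold(mask, 200, 255, cv2.THRESH_BINARY)  # drop shadows
    cv2.imshow("foreground", fg)
    if cv2.waitKey(1) == 27:                 # Esc to quit
        break
cap.release()
```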

    Pedestrian Detection and Tracking in Video Surveillance System: Issues, Comprehensive Review, and Challenges

    Pedestrian detection and monitoring in a surveillance system are critical for numerous utility areas, which encompass unusual event detection, human gait analysis, congestion or crowded-vicinity evaluation, gender classification, fall detection in elderly humans, etc. Researchers' primary focus is to develop surveillance systems that can work in a dynamic environment, but there are major issues and challenges involved in designing such systems. These challenges occur at three different levels of pedestrian detection: video acquisition, human detection, and tracking. The challenges in acquiring video include illumination variation, abrupt motion, complex backgrounds, shadows, object deformation, etc. Human detection and tracking challenges include varied poses, occlusion, tracking in crowd-dense areas, etc. These challenges result in a lower recognition rate. A brief summary of surveillance systems, along with comparisons of pedestrian detection and tracking techniques in video surveillance, is presented in this chapter. The publicly available pedestrian benchmark databases as well as future research directions in pedestrian detection are also discussed.
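    As a baseline of the kind such reviews compare, the sketch below runs OpenCV's classical HOG pedestrian detector with its default pretrained people model. The parameters are OpenCV defaults, not anything from the chapter.

```python
# Minimal sketch: classical HOG + linear SVM pedestrian detection baseline.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

img = cv2.imread("street.png")               # hypothetical input frame
boxes, weights = hog.detectMultiScale(img, winStride=(8, 8), scale=1.05)
for (x, y, w, h) in boxes:
    cv2.rectangle(img, (int(x), int(y)), (int(x + w), int(y + h)),
                  (0, 255, 0), 2)            # draw one box per detection
cv2.imwrite("pedestrians.png", img)
```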

    Computer vision in target pursuit using a UAV

    Research in target pursuit using Unmanned Aerial Vehicles (UAVs) has gained attention in recent years, primarily due to the decrease in cost and increase in demand for small UAVs in many sectors. In computer vision, target pursuit is a complex problem, as it involves solving many sub-problems that are typically concerned with the detection, tracking and following of the object of interest. At present, the majority of related existing methods are developed using computer simulation under the assumption of ideal environmental factors, while the remaining few practical methods are mainly developed to track and follow simple objects with monochromatic colours and very little texture variance. Current research on this topic is lacking in practical vision-based approaches. Thus the aim of this research is to fill the gap by developing a real-time algorithm capable of following a person continuously given only a photo input. As this research considers the whole procedure as an autonomous system, the drone is activated automatically upon receiving a photo of a person through Wi-Fi. This means that the whole system can be triggered by simply emailing a single photo from any device anywhere. This is done by first implementing image fetching to automatically connect to Wi-Fi, download the image and decode it. Then, human detection is performed to extract a template from the upper body of the person, and the intended target is acquired using both human detection and template matching. Finally, target pursuit is achieved by tracking the template continuously while sending motion commands to the drone. In the target pursuit system, detection is mainly accomplished using a proposed human detection method that is capable of detecting, extracting and segmenting the human body figure robustly from the background without prior training. This involves detecting the face, head and shoulders separately, mainly using gradient maps. Tracking is mainly accomplished using a proposed generic, non-learning template matching method, which combines intensity template matching with a colour histogram model and employs a three-tier system for template management. A flight controller is also developed; it supports three types of controls: keyboard, mouse and text messages. Furthermore, the drone is programmed with three different modes: standby, sentry and search. To improve the detection and tracking of colour objects, this research has also proposed several colour-related methods. One of them is a colour model for colour detection which consists of three colour components: hue, purity and brightness. Hue represents the colour angle, purity represents the colourfulness and brightness represents intensity. The model can be represented in three different geometric shapes: sphere, hemisphere and cylinder; each of these shapes also has two variations. Experimental results have shown that the target pursuit algorithm is capable of identifying and following the target person robustly given only a photo input. This is evidenced by the live tracking and mapping of the intended targets wearing different clothing in both indoor and outdoor environments. Additionally, the various methods developed in this research could enhance the performance of practical vision-based applications, especially in the detection and tracking of objects.
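    An illustrative reading of the hue/purity/brightness colour model described above: hue as the colour angle on opponent axes, purity as the chroma relative to the grey axis, and brightness as the mean intensity. The exact definitions (and the sphere, hemisphere and cylinder variants) are the thesis's own; this sketch is only an assumption-laden approximation.

```python
# Illustrative hue / purity / brightness decomposition of an RGB pixel.
import numpy as np

def hue_purity_brightness(rgb):
    r, g, b = (float(c) / 255.0 for c in rgb)
    brightness = (r + g + b) / 3.0                       # mean intensity
    # Opponent-style axes give the colour angle (hue)
    alpha = r - 0.5 * (g + b)
    beta = (np.sqrt(3.0) / 2.0) * (g - b)
    hue = np.degrees(np.arctan2(beta, alpha)) % 360.0    # colour angle
    purity = np.hypot(alpha, beta)                       # colourfulness
    return hue, purity, brightness

print(hue_purity_brightness((200, 40, 40)))              # a reddish pixel
```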

    Aerial LaneNet: Lane Marking Semantic Segmentation in Aerial Imagery using Wavelet-Enhanced Cost-sensitive Symmetric Fully Convolutional Neural Networks

    Knowledge of the placement and appearance of lane markings is a prerequisite for the creation of high-precision maps, which are necessary for autonomous driving, infrastructure monitoring, lane-wise traffic management, and urban planning. Lane markings are one of the important components of such maps, as they convey the rules of the road to drivers. While these rules are learned by humans, an autonomous vehicle must be taught to learn them in order to localize itself. Therefore, accurate and reliable lane marking semantic segmentation in imagery of roads and highways is needed to achieve such goals. We use airborne imagery, which can capture a large area in a short period of time, and introduce an aerial lane marking dataset. In this work, we propose a Symmetric Fully Convolutional Neural Network enhanced by a Wavelet Transform in order to automatically carry out lane marking segmentation in aerial imagery. Because the problem is heavily unbalanced in terms of the number of lane-marking pixels compared with background pixels, we use a customized loss function as well as a new type of data augmentation step. We achieve very high accuracy in the pixel-wise localization of lane markings without using third-party information. We introduce the first high-quality dataset used within our experiments, which contains a broad range of situations and classes of lane markings representative of current transportation systems. This dataset will be made publicly available and can hence be used as a benchmark dataset for future algorithms within this domain.
    Comment: IEEE TGRS 2018 - Accepted
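    The cost-sensitive idea can be sketched as a class-weighted pixel-wise cross-entropy, where the rare lane-marking class is weighted more heavily than the background. The weight value and loss form below are illustrative assumptions, not the paper's exact customized loss.

```python
# Hedged sketch: class-weighted binary cross-entropy for a heavily
# unbalanced segmentation problem (few positive lane-marking pixels).
import numpy as np

def weighted_bce(pred, target, w_pos=50.0, eps=1e-7):
    """Pixel-wise BCE with a heavier weight on the positive class."""
    pred = np.clip(pred, eps, 1.0 - eps)     # numerical safety for log()
    loss = -(w_pos * target * np.log(pred)
             + (1.0 - target) * np.log(1.0 - pred))
    return loss.mean()

pred = np.random.rand(256, 256)              # hypothetical network output
target = (np.random.rand(256, 256) > 0.99).astype(np.float64)  # ~1% positives
print(weighted_bce(pred, target))
```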