
    Vision based road lane detection system for vehicles guidance

    Driver support systems are among the most important features of modern vehicles for ensuring driver safety and reducing road accidents. Road lane detection, or road boundary detection, is one of the most complex and challenging of these tasks: it includes localizing the road and determining the relative position between the vehicle and the road. This paper presents a vision system that uses an on-board camera looking outwards from the windshield. The system acquires the front view from the vehicle-mounted camera and detects the lanes through a small number of processing steps. The lanes are extracted using the Hough transform, with a pair of hyperbolas fitted to the lane edges. The proposed lane detection system can be applied to both painted and unpainted roads, and to curved as well as straight roads, under different weather conditions. It does not require any extra information such as lane width, time to lane crossing, or offset from the lane centre, and neither camera calibration nor coordinate transformation is needed. The system was evaluated under varying illumination and shadow effects on various road types without speed limits, and demonstrated robust performance in detecting road lanes under these different conditions.
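
    As a rough illustration of the pipeline described above, the sketch below runs edge detection followed by a probabilistic Hough transform on a single front-view frame. It is a minimal sketch only: the function name, thresholds, and region-of-interest choice are assumptions for illustration, and the hyperbola-fitting stage mentioned in the abstract is not reproduced.

```python
import cv2
import numpy as np

def detect_lane_segments(frame_bgr):
    """Rough lane-line candidates from a single front-view frame.

    Simplified steps: grayscale -> Canny edges -> restrict to the lower
    part of the image -> probabilistic Hough transform for line segments.
    """
    gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 0)
    edges = cv2.Canny(blurred, 50, 150)

    # Keep only the lower half of the image, where the road usually is.
    h, w = edges.shape
    mask = np.zeros_like(edges)
    mask[h // 2:, :] = 255
    edges = cv2.bitwise_and(edges, mask)

    # Probabilistic Hough transform: returns (x1, y1, x2, y2) segments.
    segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180,
                               threshold=40, minLineLength=40, maxLineGap=20)
    return [] if segments is None else segments.reshape(-1, 4)
```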

    Novel Aggregated Solutions for Robust Visual Tracking in Traffic Scenarios

    This work proposes novel approaches for object tracking in challenging scenarios such as severe occlusion, deteriorated vision, and long-range multi-object re-identification. All of these solutions are based only on the image sequence captured by a monocular camera and do not require additional sensors. Experiments on standard benchmarks demonstrate improved state-of-the-art performance for these approaches. Because the presented approaches are designed for efficiency, they can run at real-time speed.

    Weighted and filtered mutual information: A Metric for the automated creation of panoramas from views of complex scenes

    To contribute a novel approach in the field of image registration and panorama creation, this algorithm foregoes any scene knowledge, requiring only modest scene overlap and an acceptable amount of entropy within each overlapping view. The weighted and filtered mutual information (WFMI) algorithm has been developed for multiple stationary, color, surveillance video camera views and relies on color gradients for feature correspondence. It is a novel extension of well-established maximization of mutual information (MMI) algorithms. Where MMI algorithms are typically applied to high-altitude photography and medical imaging (scenes with relatively simple shapes and affine relationships between views), the WFMI algorithm has been designed for scenes with occluded objects and significant parallax variation between non-affine related views. Despite these typically non-affine surveillance scenarios, searching the affine space for a homography is a practical assumption that provides computational efficiency and accurate results, even with complex scene views. The WFMI algorithm can perfectly register affine views, performs exceptionally well with near-affine related views, and in complex scene views (well beyond affine constraints) it provides an accurate estimate of the overlap regions between the views. The WFMI algorithm uses simple calculations (vector-field color gradients, Laplacian filtering, and feature histograms) to generate the WFMI metric and provide the optimal affine relationship. The algorithm is unique compared with typical MMI algorithms and modern registration algorithms because it avoids almost all a priori knowledge and calculations, while still providing an accurate or useful estimate for realistic scenes. With mutual information weighting and the Laplacian filtering operation, the WFMI algorithm overcomes the failures of typical MMI algorithms in scenes where complex or occluded shapes do not provide sufficiently large peaks in the mutual information maps to determine the overlap region. The work has so far been applied to individual video frames, and it is shown that future work could easily extend the algorithm to use motion information or temporal frame registration to handle scenes with smaller overlap regions, lower entropy, or even more significant parallax and occlusion variation between views.
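
    For readers unfamiliar with the underlying score, the sketch below computes the plain mutual-information value that MMI-style registration maximizes, from a joint histogram of two candidate overlap regions. It is a minimal illustration under assumed inputs (equally sized grayscale patches, an illustrative bin count); the gradient weighting and Laplacian filtering that distinguish WFMI are intentionally omitted.

```python
import numpy as np

def mutual_information(patch_a, patch_b, bins=32):
    """Mutual information between two equally sized grayscale patches.

    A candidate overlap between two views can be scored by the MI of the
    pixel intensities; a higher score suggests a better alignment.
    """
    hist_2d, _, _ = np.histogram2d(patch_a.ravel(), patch_b.ravel(),
                                   bins=bins, range=[[0, 256], [0, 256]])
    pxy = hist_2d / hist_2d.sum()      # joint distribution
    px = pxy.sum(axis=1)               # marginal for patch_a
    py = pxy.sum(axis=0)               # marginal for patch_b
    px_py = np.outer(px, py)
    nonzero = pxy > 0                  # avoid log(0)
    return float(np.sum(pxy[nonzero] * np.log(pxy[nonzero] / px_py[nonzero])))
```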

    Enhanced Augmented Reality Framework for Sports Entertainment Applications

    Augmented Reality (AR) superimposes virtual information on real-world data, such as displaying useful information on videos or images of a scene. This dissertation presents an Enhanced AR (EAR) framework for displaying useful information on images of a sports game. The challenge in such applications is robust object detection and recognition, which becomes even harder under strong sunlight. We address the case where a captured image is degraded by strong sunlight. The developed framework includes an image enhancement technique that improves the accuracy of the subsequent player and face detection. Image enhancement is followed by player detection, face detection, player recognition, and display of the players' personal information. First, an algorithm based on Multi-Scale Retinex (MSR) is proposed for image enhancement. For player and face detection, we use the adaptive boosting (AdaBoost) algorithm with Haar-like features for both feature selection and classification. The player face recognition algorithm uses AdaBoost with Linear Discriminant Analysis (LDA) for feature selection and a nearest-neighbour classifier for classification. The framework can be deployed in any sport where a viewer captures images, and the display of player-specific information enhances the end-user experience. Detailed experiments are performed on 2096 diverse images captured using a digital camera and a smartphone. The images contain players in different poses, expressions, and illuminations. The face recognition module requires player faces to be frontal or within roughly ±35° of pose variation. The work demonstrates the great potential of computer vision based approaches for the future development of AR applications.
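
    To make the enhancement stage concrete, the sketch below shows a generic single-channel Multi-Scale Retinex: the log image minus the log of Gaussian-blurred copies, averaged over several scales. The scales, the log-domain formulation, and the output rescaling are generic assumptions for illustration; the dissertation's exact MSR variant and colour-restoration step are not reproduced here.

```python
import cv2
import numpy as np

def multi_scale_retinex(gray, sigmas=(15, 80, 250)):
    """Generic MSR on one channel: average of log(image) - log(blurred image).

    Each sigma gives a single-scale Retinex; MSR averages the scales.
    The result is rescaled to 0-255 for display.
    """
    img = gray.astype(np.float64) + 1.0          # avoid log(0)
    msr = np.zeros_like(img)
    for sigma in sigmas:
        blurred = cv2.GaussianBlur(img, (0, 0), sigma)
        msr += np.log(img) - np.log(blurred)
    msr /= len(sigmas)
    msr = (msr - msr.min()) / (msr.max() - msr.min() + 1e-12)
    return (msr * 255).astype(np.uint8)
```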

    Person re-Identification over distributed spaces and time

    Replicating the human visual system and the cognitive abilities that the brain uses to process the information it receives is an area of substantial scientific interest. With the prevalence of video surveillance cameras, a portion of this scientific drive has gone into providing useful automated counterparts to human operators. A prominent task in visual surveillance is that of matching people between disjoint camera views, or re-identification. This allows operators to locate people of interest and to track people across cameras, and can be used as a precursory step to multi-camera activity analysis. However, due to the contrasting conditions between camera views and their effects on the appearance of people, re-identification is a non-trivial task. This thesis proposes solutions for reducing the visual ambiguity in observations of people between camera views. The thesis first looks at a method for mitigating the effects of differing lighting conditions between camera views on the appearance of people, building on work that models inter-camera illumination from known pairs of images. A Cumulative Brightness Transfer Function (CBTF) is proposed to estimate the mapping of colour brightness values from limited training samples. Unlike previous methods that use a mean-based representation for a set of training samples, the cumulative nature of the CBTF retains colour information from under-represented samples in the training set. Additionally, the bi-directionality of the mapping function is explored to maximise re-identification accuracy by ensuring samples are accurately mapped between cameras. Secondly, an extension to the CBTF framework is proposed that addresses the issue of changing lighting conditions within a single camera. Because the CBTF requires manually labelled training samples, it is limited to static lighting conditions and is less effective if the lighting changes. The proposed Adaptive CBTF (A-CBTF) differs from previous approaches that either do not consider lighting change over time or rely on camera transition time information to update. By utilising contextual information drawn from the background in each camera view, an estimate of the lighting change within a single camera can be made. This background lighting model allows colour information to be mapped back to the original training conditions and thus removes the need for retraining. Thirdly, a novel reformulation of re-identification as a ranking problem is proposed. Previous methods use a score based on a direct distance measure of set features to form a correct/incorrect match result. Rather than offering an operator a single outcome, the ranking paradigm gives the operator a ranked list of possible matches and lets them make the final decision. By utilising a Support Vector Machine (SVM) ranking method, a weighting on the appearance features can be learned that capitalises on the fact that not all image features are equally important to re-identification. Additionally, an Ensemble-RankSVM is proposed to address scalability issues by separating the training samples into smaller subsets and boosting the trained models. Finally, the thesis looks at a practical application of the ranking paradigm in a real-world setting. The system encompasses both the re-identification stage and the precursory extraction and tracking stages to form an aid for CCTV operators. Segmentation and detection are combined to extract relevant information from the video, while several matching techniques are combined with temporal priors to form a more comprehensive overall matching criterion. The effectiveness of the proposed approaches is tested on datasets obtained from a variety of challenging environments including offices, apartment buildings, airports and outdoor public spaces.
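
    As a rough illustration of the cumulative idea behind the CBTF, the sketch below accumulates brightness histograms over all corresponding training patches from two cameras before matching their cumulative distributions, rather than averaging per-pair mappings. It is a minimal single-channel sketch with illustrative names and parameters; the bi-directional mapping and the adaptive (A-CBTF) extension discussed above are not shown.

```python
import numpy as np

def cumulative_btf(patches_cam_a, patches_cam_b, levels=256):
    """Estimate a brightness mapping from camera A to camera B.

    Histograms are accumulated over *all* training patches (the cumulative
    step), then the cumulative distributions are matched to give a lookup
    table f: [0, 255] -> [0, 255].
    """
    hist_a = np.zeros(levels)
    hist_b = np.zeros(levels)
    for pa, pb in zip(patches_cam_a, patches_cam_b):
        hist_a += np.bincount(pa.ravel(), minlength=levels)[:levels]
        hist_b += np.bincount(pb.ravel(), minlength=levels)[:levels]

    cdf_a = np.cumsum(hist_a) / hist_a.sum()
    cdf_b = np.cumsum(hist_b) / hist_b.sum()

    # For each brightness value in A, find the B value with the closest CDF.
    mapping = np.searchsorted(cdf_b, cdf_a).clip(0, levels - 1)
    return mapping.astype(np.uint8)   # apply as: mapped = mapping[patch_a]
```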

    Wide View and Line Filter for Enhanced Image Gradient Computation and Edge Determination

    Edge determination is a challenging yet crucial step in the object detection process for images. It is the first step in a multi-step process, serving as the foundation for all subsequent operations, and its accuracy directly affects the success of any later processing and of the final detection. The difficulty of edge detection stems from a variety of factors, including noise, image sharpness, orientation, empirical parameters, and computational complexity. Many traditional kernel-based operators excel at tackling one of these problems but trade off their ability to handle the others. For example, the popular Sobel operator uses a horizontal and a vertical constant high-pass kernel, the pair being intended to overcome the directional bias of single-direction operators. In theory the two kernels combined can estimate edge orientation with a certain degree of accuracy; however, the effectiveness of the two directional operators is not justified, so the operator amounts to an arbitrary combination of the two kernels. Additionally, its high-pass nature amplifies noise, and the constant structure of its kernels means it cannot adapt to varying light intensities in the photo. This makes it difficult to localize edges and to identify them accurately in the first place. Two new gradient detection kernels based on the two-dimensional high-order Taylor series expansion were proposed and constructed in MATLAB with the goal of tackling most of these problems. The key idea of the first kernel is to use a wide range of the pixels in view to suppress noise, thereby improving the gradient intensities of edges. The second kernel builds on the first, retaining its noise suppression while also addressing degraded and low-contrast edge boundaries; in principle, it can detect smooth lines in the presence of discontinuities and poor image quality. The filter architecture allows for precise gradient calculation, edge detection, and orientation determination to within one degree of the true value for signal-to-noise ratios above 0.75.
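
    For context, the sketch below shows the conventional Sobel baseline the abstract argues against: per-pixel gradient magnitude and orientation from the constant horizontal and vertical kernels. The proposed Taylor-series kernels are not specified in enough detail in the abstract to reproduce, so only this baseline is illustrated; the function name is an assumption.

```python
import cv2
import numpy as np

def sobel_gradient(gray):
    """Baseline Sobel gradients: per-pixel magnitude and orientation.

    gx and gy come from the constant 3x3 horizontal/vertical kernels;
    orientation is returned in degrees, measured from the x-axis.
    """
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    magnitude = np.hypot(gx, gy)
    orientation = np.degrees(np.arctan2(gy, gx))
    return magnitude, orientation
```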

    Adherent raindrop detection and removal in video

    Raindrops adhering to a windscreen or window glass can significantly degrade the visibility of a scene. Detecting and removing raindrops will therefore benefit many computer vision applications, particularly outdoor surveillance systems and intelligent vehicle systems. In this paper, a method that automatically detects and removes adherent raindrops is introduced. The core idea is to exploit the local spatio-temporal derivatives of raindrops. First, the method detects raindrops based on the motion and the temporal intensity derivatives of the input video. Second, based on the observation that some areas of a raindrop completely occlude the scene while the remaining areas occlude it only partially, the method handles the two types of area separately. Partially occluding areas are restored by retrieving as much information about the scene as possible, namely by solving a blending function over the detected partially occluding areas using the temporal intensity change. Completely occluding areas are recovered using a video completion technique. Experimental results on various real videos show the effectiveness of the proposed method.
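
    As a rough illustration of the temporal-derivative cue used in the detection stage, the sketch below flags pixels whose intensity changes little over a short stack of frames, which is how adherent raindrops tend to behave relative to the moving scene behind them. The threshold and function name are illustrative assumptions, and the paper's full detector, blending, and video-completion stages are not shown.

```python
import numpy as np

def static_region_mask(frames, deriv_thresh=2.0):
    """Flag pixels with consistently small temporal intensity derivatives.

    `frames` is a sequence of grayscale frames stacked as (T, H, W).
    Adherent raindrops change much more slowly than the scene behind
    them, so a low mean absolute temporal derivative is a raindrop cue.
    """
    stack = np.asarray(frames, dtype=np.float64)
    temporal_deriv = np.abs(np.diff(stack, axis=0))   # (T-1, H, W)
    mean_deriv = temporal_deriv.mean(axis=0)          # (H, W)
    return mean_deriv < deriv_thresh                  # candidate raindrop pixels
```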