
    Shadow Detection and Removal in Single-Image Using Paired Regions

    Get PDF
    A shadow appears on an area when light from a source cannot reach it because an object obstructs the path. Shadows are sometimes helpful, providing useful information about objects, but they can also degrade image quality or corrupt the information an image conveys. For correct image interpretation, it is therefore important to detect shadows and restore the occluded information. Shadows also cause problems in computer vision applications such as segmentation, object detection, and object counting, which is why shadow detection and removal is a common pre-processing task. We propose a simple method to detect and remove shadows from a single image. The proposed method begins by selecting the shadow image, and a pre-processing step focuses attention on the shadow part. An image-classification stage then distinguishes shadow from non-shadow pixels, so that shadow and non-shadow regions of the image can be labelled. Once a shadow is detected, the detection results are refined by image matting, and the shadow-free image is recovered by relighting the shadow region to match the non-shadow region. Examination of a number of examples indicates that this method yields a significant improvement over previous methods.
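
    The paper's pipeline pairs shadow and non-shadow regions and refines labels with image matting; the sketch below illustrates only the basic detect-and-relight idea under simplified assumptions (a float RGB image, a hypothetical intensity threshold; the function name and default are illustrative, not from the paper).

```python
import numpy as np

def remove_shadow_simple(img, thresh=0.4):
    """Illustrative shadow removal: label dark pixels as shadow, then
    relight them channel-wise by the non-shadow/shadow mean ratio.
    `img` is a float RGB array in [0, 1]; `thresh` is an assumed
    intensity cutoff. Assumes both shadow and non-shadow pixels exist."""
    luminance = img.mean(axis=2)           # crude luminance proxy
    shadow = luminance < thresh            # shadow / non-shadow labels
    out = img.copy()
    for c in range(img.shape[2]):          # per-channel relighting
        sh_mean = img[..., c][shadow].mean()
        ns_mean = img[..., c][~shadow].mean()
        out[..., c][shadow] *= ns_mean / max(sh_mean, 1e-6)
    return np.clip(out, 0.0, 1.0), shadow
```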

    3D optical metrology by digital moiré: Pixel-wise calibration refinement, grid removal, and temporal phase unwrapping

    Get PDF
    Fast, accurate three-dimensional (3D) optical metrology has diverse applications in object and environment modelling. Structured-lighting techniques allow non-contact 3D surface-shape measurement by projecting patterns of light onto an object surface, capturing images of the deformed patterns, and computing the 3D surface geometry from the captured 2D images. However, motion artifacts can still be a problem with high-speed surface motion, especially with increasing demand for higher measurement resolution and accuracy. To avoid motion artifacts, fast 2D image acquisition of the projected patterns is required. Fast multi-pattern projection and minimization of the number of projected patterns are two approaches for dynamic object measurement. To achieve a higher rate of switching frames, fast multi-pattern projection techniques require costly projector hardware modification or new designs of projection systems to increase the projection rate beyond the capabilities of off-the-shelf projectors. Even if these disadvantages (higher cost, complex hardware) were acceptable, and even if the rate of acquisition achievable with current systems were fast enough to avoid errors, minimizing the number of captured frames would still further reduce the effect of object motion on measurement accuracy and enable capture of higher object dynamics. Development of an optical 3D metrology method that minimizes the number of projected patterns while maintaining accurate 3D surface-shape measurement of objects with continuous and discontinuous surface geometry has remained a challenge.
    Capture of a single image frame instead of multiple frames would be advantageous for measuring moving or deforming objects. Since accurate measurement generally requires multiple phase-shifted images, embedding multiple patterns into a single projected composite pattern is one approach to achieve accurate single-frame 3D surface-shape measurement. The main limitations of existing single-frame methods based on composite patterns are poor resolution, a small range of gray-level intensity due to the collection of multiple patterns in one image, and degradation of the extracted patterns caused by the modulation and demodulation processes applied to the captured composite-pattern image. To benefit from the advantages of multi-pattern projection of phase-shifted fringes and of single-frame techniques, without combining phase-shifted patterns into one frame, digital moiré was used. Moiré patterns are generated by projecting a grid pattern onto the object, capturing a single frame, and, in a post-process, superimposing a synthetic grid of the same frequency as in the captured image. Phase shifting is carried out as a post-process by digitally shifting the synthetic grid across the captured image. The useful moiré patterns, which contain object-shape information, are contaminated with high-frequency grid lines that must be removed. After grid removal, computation of a phase map, and phase-to-height mapping, the 3D object shape can be computed. Digital moiré thus provides an opportunity to decrease the number of projected patterns. However, previous attempts to apply digital phase-shifting moiré to 3D surface-shape measurement have had significant limitations.
    To address the limitation of previous system-calibration techniques based on direct measurement of optical-setup parameters, a moiré-wavelength-based phase-to-height mapping system-calibration method was developed. The moiré-wavelength refinement performs pixel-wise computation of the moiré wavelength based on the measured height (depth). In measurement of a flat plate at different depths, the range of root-mean-square (RMS) error was reduced from 0.334–0.828 mm, using a single global wavelength across all pixels, to 0.204–0.261 mm using the new pixel-wise moiré-wavelength refinement. To address the limitations of previous grid-removal techniques (precise mechanical grid translation, multiple-frame capture, moiré-pattern blurring, and measurement artifacts), a new grid-removal technique was developed for single-frame digital moiré using combined stationary wavelet and Fourier transforms (SWT-FFT). This approach removes high-frequency grid lines, both straight and curved, without moiré-pattern artifacts, blurring, or degradation, and improves on previous techniques. To address the high number of projected patterns and captured images required by temporal phase unwrapping (TPU) in fringe projection, and the low signal-to-noise ratio of the extended phase map of TPU in digital moiré, improved methods using two-image and three-image TPU in digital phase-shifting moiré were developed. For measurement of a pair of hemispherical objects, each with a true radius of 50.80 mm, by two-image TPU digital moiré, least-squares spheres fitted to the measured 3D point clouds had errors of 0.03 mm and 0.06 mm, respectively (sphere-fitting standard deviations 0.15 mm and 0.14 mm), and the centre-to-centre distance between the hemispheres had an error of 0.19 mm. The number of captured images required by this new method is one third of that for three-wavelength heterodyne temporal phase unwrapping by fringe projection, which would be advantageous in measuring dynamic objects, either moving or deforming.
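
    As a rough illustration of digital phase-shifting moiré as described above, the sketch below superimposes four digitally shifted synthetic grids on a single captured frame and computes the wrapped phase with the standard four-step formula. A plain Gaussian low-pass stands in for the thesis's SWT-FFT grid removal, and `grid_freq` and `sigma` are assumed parameters, not values from the work.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def moire_wrapped_phase(captured, grid_freq, sigma=8.0):
    """Digital phase-shifting moire, illustrative sketch.
    `captured` is a single grayscale frame containing the deformed
    projected grid; `grid_freq` is the grid frequency in cycles/pixel
    along x. Four synthetic grids shifted by pi/2 are superimposed
    digitally; a Gaussian low-pass suppresses the high-frequency grid
    (a stand-in for SWT-FFT grid removal); the wrapped phase then
    follows from the standard four-step formula."""
    h, w = captured.shape
    x = np.arange(w)[None, :]
    moire = []
    for k in range(4):
        synthetic = 0.5 + 0.5 * np.cos(2 * np.pi * grid_freq * x + k * np.pi / 2)
        moire.append(gaussian_filter(captured * synthetic, sigma))
    i1, i2, i3, i4 = moire
    return np.arctan2(i4 - i2, i1 - i3)   # wrapped phase, to be unwrapped
```

    Phase-to-height mapping and unwrapping (spatial or temporal, as in the two- and three-image TPU methods) would follow this step.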

    Detecting single-trial EEG evoked potential using a wavelet domain linear mixed model: application to error potentials classification

    Full text link
    Objective. The main goal of this work is to develop a model for multi-sensor signals, such as MEG or EEG signals, that accounts for inter-trial variability and is suitable for the corresponding binary classification problems. An important constraint is that the model be simple enough to handle small and unbalanced datasets, as often encountered in BCI-type experiments. Approach. The method involves a linear mixed-effects statistical model, the wavelet transform, and spatial filtering, and aims at the characterization of localized discriminant features in multi-sensor signals. After a discrete wavelet transform and spatial filtering, a projection onto the relevant wavelet and spatial-channel subspaces is used for dimension reduction. The projected signals are then decomposed as the sum of a signal of interest (i.e. discriminant) and background noise, using a very simple Gaussian linear mixed model. Main results. Thanks to the simplicity of the model, the corresponding parameter-estimation problem is simplified. Robust estimates of class-covariance matrices are obtained from small sample sizes, and an effective Bayes plug-in classifier is derived. The approach is applied to the detection of error potentials in multichannel EEG data, in a very unbalanced situation (detection of rare events). Classification results prove the relevance of the proposed approach in such a context. Significance. The combination of a linear mixed model, wavelet transform, and spatial filtering for EEG classification is, to the best of our knowledge, an original approach, which is proven to be effective. This paper improves on earlier results on similar problems, and the three main ingredients all play an important role.
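
    A minimal sketch of the overall recipe, wavelet features followed by a Gaussian Bayes plug-in classifier, is given below using `pywt` and NumPy. The variance-based coefficient selection and the shared regularized covariance are simplifications standing in for the paper's wavelet/spatial subspace projection and mixed-model covariance estimates; all names and defaults are assumptions.

```python
import numpy as np
import pywt

def dwt_features(trials, wavelet="db4", level=4, keep=32):
    """Per-trial features: DWT of each channel, concatenated, then
    truncated to the `keep` largest-variance coefficients (a crude
    stand-in for the paper's subspace projection).
    `trials` has shape (n_trials, n_channels, n_samples)."""
    feats = []
    for trial in trials:
        coeffs = [np.concatenate(pywt.wavedec(ch, wavelet, level=level))
                  for ch in trial]
        feats.append(np.concatenate(coeffs))
    feats = np.asarray(feats)
    order = np.argsort(feats.var(axis=0))[::-1][:keep]
    return feats[:, order]

def bayes_plugin(train_x, train_y, test_x, reg=1e-3):
    """Gaussian Bayes plug-in classifier with a shared, regularized
    covariance (LDA-style); the paper's mixed model yields more robust
    covariance estimates than this simple pooled estimate."""
    classes = np.unique(train_y)
    means = {c: train_x[train_y == c].mean(axis=0) for c in classes}
    centered = np.vstack([train_x[train_y == c] - means[c] for c in classes])
    icov = np.linalg.inv(np.cov(centered.T) + reg * np.eye(train_x.shape[1]))
    priors = {c: np.mean(train_y == c) for c in classes}
    scores = np.stack([
        -0.5 * np.einsum("ij,jk,ik->i", test_x - means[c], icov, test_x - means[c])
        + np.log(priors[c])
        for c in classes])
    return classes[np.argmax(scores, axis=0)]
```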

    Mapping and monitoring forest remnants: a multiscale analysis of spatio-temporal data

    Get PDF
    KEYWORDS: Landsat, time series, machine learning, semideciduous Atlantic forest, Brazil, wavelet transforms, classification, change detection
    Forests play a major role in important global matters such as the carbon cycle, climate change, and biodiversity. Forests also influence soil and water dynamics, with major consequences for ecological relations and decision-making. One basic requirement for quantifying and modelling these processes is the availability of accurate maps of forest cover. Data acquisition and analysis at appropriate scales is the keystone to achieving the mapping accuracy needed for the development and reliable use of ecological models. The current and upcoming production of high-resolution data sets, plus the ever-growing time series that have been collected since the 1970s, must be effectively explored. Missing values and distortions further complicate the analysis of these data. Thus, integration and proper analysis are of utmost importance for environmental research. New conceptual models in environmental sciences, like the perception of multiple scales, require the development of effective implementation techniques.
    This thesis presents new methodologies to map and monitor forests over large, highly fragmented areas with complex land-use patterns. The use of temporal information is extensively explored to distinguish natural forests from other land-cover types that are spectrally similar. In chapter 4, novel schemes based on multiscale wavelet analysis are introduced, which enabled effective preprocessing of long time series of Landsat data and improved their applicability to environmental assessment. In chapter 5, the produced time series, as well as other information on spectral and spatial characteristics, were used to classify forested areas in an experiment relating a number of combinations of attribute features. Feature sets were defined based on expert knowledge and on data-mining techniques, to be input to traditional and machine-learning algorithms for pattern recognition, viz. maximum likelihood, univariate and multivariate decision trees, and neural networks. The results showed that maximum-likelihood classification using temporal texture descriptors extracted with wavelet transforms was the most accurate for classifying the semideciduous Atlantic forest in the study area. In chapter 6, a multiscale approach to digital change detection was developed to deal with multisensor and noisy remotely sensed images. Changes were extracted according to size classes, minimising the effects of geometric and radiometric misregistration. Finally, in chapter 7, an automated procedure for GIS updating based on feature extraction, segmentation, and classification was developed to monitor the remnants of semideciduous Atlantic forest. The procedure showed significant improvements over post-classification comparison and direct multidate classification based on artificial neural networks.
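
    As an illustration of wavelet-based preprocessing of a noisy per-pixel time series (e.g. an NDVI trajectory from Landsat), the sketch below applies standard soft-threshold denoising with `pywt`; the thesis's multiscale schemes are more elaborate, and the wavelet, level, and threshold rule here are assumptions.

```python
import numpy as np
import pywt

def wavelet_smooth(series, wavelet="db4", level=3):
    """Illustrative multiscale preprocessing of a noisy 1D time series:
    soft-threshold the detail coefficients and reconstruct."""
    coeffs = pywt.wavedec(series, wavelet, level=level, mode="periodization")
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745     # noise scale from finest level
    thresh = sigma * np.sqrt(2 * np.log(len(series)))  # universal threshold
    coeffs = [coeffs[0]] + [pywt.threshold(c, thresh, mode="soft")
                            for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet, mode="periodization")[: len(series)]
```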

    Detecting, Tracking, And Recognizing Activities In Aerial Video

    Get PDF
    In this dissertation, we address the problem of detecting humans and vehicles, tracking them in crowded scenes, and finally determining their activities in aerial video. Even though this is a well-explored problem in the field of computer vision, many challenges still remain when one is presented with realistic data. These challenges include large camera motion, strong scene parallax, fast object motion, large object density, strong shadows, and insufficiently large action datasets. Therefore, we propose a number of novel methods based on exploiting scene constraints from the imagery itself to aid in the detection and tracking of objects. We show, via experiments on several datasets, that superior performance is achieved with the use of the proposed constraints.
    First, we tackle the problem of detecting moving, as well as stationary, objects in scenes that contain parallax and shadows. We do this on both regular aerial video and the new and challenging domain of wide-area surveillance. This problem poses several challenges: large camera motion, strong parallax, a large number of moving objects, a small number of pixels on target, single-channel data, and low video frame rate. We propose a method for detecting moving and stationary objects that overcomes these challenges, and evaluate it on the CLIF and VIVID datasets. In order to find moving objects, we use median background modelling, which requires few frames to obtain a workable model and is very robust when a large number of objects move through the scene while the model is being constructed. We then remove false detections caused by parallax and registration errors using gradient information from the background image.
    Relying merely on motion to detect objects in aerial video may not be sufficient to provide complete information about the observed scene. First of all, objects that are permanently stationary may be of interest as well, for example to determine how long a particular vehicle has been parked at a certain location. Secondly, moving vehicles that are being tracked through the scene may sometimes stop and remain stationary at traffic lights and railroad crossings. These prolonged periods of non-motion make it very difficult for the tracker to maintain the identities of the vehicles. Therefore, there is a clear need for a method that can detect stationary pedestrians and vehicles in UAV imagery. This is a challenging problem due to the small number of pixels on target, which makes it difficult to distinguish objects from background clutter and results in a much larger search space. We propose a method for constraining the search based on a number of geometric constraints obtained from the metadata. Specifically, we obtain the orientation of the ground-plane normal, the orientation of the shadows cast by out-of-plane objects in the scene, and the relationship between object heights and the sizes of their corresponding shadows. We utilize this information in a geometry-based shadow and ground-plane-normal blob detector, which provides an initial estimate of the locations of shadow-casting out-of-plane (SCOOP) objects in the scene. These SCOOP candidate locations are then classified as either human or clutter using a combination of wavelet features and a Support Vector Machine. Additionally, we combine regular SCOOP and inverted SCOOP candidates to obtain vehicle candidates. We show impressive results on sequences from the VIVID and CLIF datasets, and provide comparative quantitative and qualitative analysis. We also show that the SCOOP detection method can be extended to automatically estimate the orientation of the shadow in the image without relying on metadata, which is useful in cases where metadata is either unavailable or erroneous.
    Simply detecting objects in every frame does not provide sufficient understanding of the nature of their existence in the scene. It may be necessary to know how the objects have travelled through the scene over time and which areas they have visited. Hence, there is a need to maintain the identities of the objects across different time instances. The task of object tracking can be very challenging in videos that have low frame rate, high density, and a very large number of objects, as is the case in the WAAS data. Therefore, we propose a novel method for tracking a large number of densely moving objects in aerial video. In order to keep the complexity of the tracking problem manageable when dealing with a large number of objects, we divide the scene into grid cells, solve the tracking problem optimally within each cell using bipartite graph matching, and then link the tracks across the cells. Besides tractability, grid cells also allow us to define a set of local scene constraints, such as road orientation and object context. We use these constraints as part of the cost function for solving the tracking problem, which allows us to track fast-moving objects in low-frame-rate videos.
    In addition to moving through the scene, the humans that are present may be performing individual actions that should be detected and recognized by the system. A number of different approaches exist for action recognition in both aerial and ground-level video. One of the requirements of the majority of these approaches is the existence of a sizeable dataset of examples of a particular action from which a model of the action can be constructed. Such a luxury is not always possible in aerial scenarios, since it may be difficult to fly a large number of missions to observe a particular event multiple times. Therefore, we propose a method for recognizing human actions in aerial video from as few examples as possible (a single example in the extreme case). We use the bag-of-words action representation and a one-vs-all multi-class classification framework. We assume that most of the classes have many examples, and construct Support Vector Machine models for each class. Then, we use the Support Vector Machines trained for classes with many examples to improve the decision function of the Support Vector Machine trained with few examples, via late weighted fusion of decision values.
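
    Two ingredients of the pipeline above lend themselves to short sketches: a per-pixel median background model, and optimal bipartite matching of detections within one grid cell (here with plain Euclidean cost; the dissertation's cost function also includes road orientation and object context). Function names and the gating threshold are illustrative.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def median_background(frames):
    """Median background model: per-pixel median over a short frame
    stack; robust even when many objects move during model building."""
    return np.median(np.stack(frames), axis=0)

def match_in_cell(prev_pts, curr_pts, max_dist=30.0):
    """Bipartite matching of detections within one grid cell.
    `prev_pts` and `curr_pts` are (m, 2) and (n, 2) float arrays of
    centroids; returns index pairs whose distance passes a
    hypothetical gating threshold."""
    cost = np.linalg.norm(prev_pts[:, None, :] - curr_pts[None, :, :], axis=2)
    rows, cols = linear_sum_assignment(cost)   # optimal assignment per cell
    return [(r, c) for r, c in zip(rows, cols) if cost[r, c] < max_dist]
```

    Linking the per-cell matches across cell boundaries, as the dissertation describes, would then stitch these local assignments into full tracks.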

    Discovering Regularity in Point Clouds of Urban Scenes

    Full text link
    Despite the apparent chaos of the urban environment, cities are actually replete with regularity. From the grid of streets laid out over the earth, to the lattice of windows thrown up into the sky, periodic regularity abounds in the urban scene. Just as salient, though less uniform, are the self-similar branching patterns of trees and vegetation that line streets and fill parks. We propose novel methods for discovering these regularities in 3D range scans acquired by a time-of-flight laser sensor. The applications of this regularity information are broad, and we present two original algorithms. The first exploits the efficiency of the Fourier transform for the real-time detection of periodicity in building facades. Periodic regularity is discovered online by doing a plane sweep across the scene and analyzing the frequency space of each column in the sweep. The simplicity and online nature of this algorithm allow it to be embedded in scanner hardware, making periodicity detection a built-in feature of future 3D cameras. We demonstrate the usefulness of periodicity in view registration, compression, segmentation, and facade reconstruction. The second algorithm leverages the hierarchical decomposition and locality in space of the wavelet transform to find stochastic parameters for procedural models that succinctly describe vegetation. These procedural models facilitate the generation of virtual worlds for architecture, gaming, and augmented reality. The self-similarity of vegetation can be inferred using multi-resolution analysis to discover the underlying branching patterns. We present a unified framework of these tools, enabling the modeling, transmission, and compression of high-resolution, accurate, and immersive 3D images.
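
    A minimal sketch of the Fourier-based periodicity test on one sweep column is shown below, assuming the column has been reduced to a 1D occupancy signal (e.g. point counts per height bin); the function name and binning are assumptions, not the paper's interface.

```python
import numpy as np

def dominant_period(column_occupancy, min_freq_bin=1):
    """Detect periodicity in one sweep-plane column (illustrative).
    `column_occupancy` is a 1D signal such as point counts per height
    bin; the dominant non-DC Fourier peak gives the repeat period."""
    signal = column_occupancy - column_occupancy.mean()  # drop the DC term
    spectrum = np.abs(np.fft.rfft(signal))
    peak = min_freq_bin + np.argmax(spectrum[min_freq_bin:])
    return len(column_occupancy) / peak   # period, in bins
```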

    An Approach for Object Tracking in Video Sequences

    Get PDF
    In the recent past there has been a significant increase in the number of applications effectively utilizing digital video, owing to less costly yet superior devices. This upsurge in video acquisition has led to a huge growth of data, which is quite impossible to handle manually. Therefore, an automated means of processing these videos is indispensable. In this thesis one such attempt has been made, to track objects in videos. Object tracking comprises two closely related processes: object detection, followed by tracking of the detected objects. Algorithms for these two processes are proposed in this thesis. Simple object detection algorithms compare a static background frame at the pixel level with the current frame of a video. Existing methods in this domain first try to detect objects and then remove the shadows associated with them, which is a two-stage process. The proposed approach combines both stages into a single stage. Two different algorithms are proposed for object detection: the first to model the background and the second to extract the objects and remove shadows from them. Initially, from the first few frames, the nature of each pixel is determined as stationary or non-stationary, and a background model is developed considering only the stationary pixels. Subsequently, a local thresholding technique is used to extract objects and discard shadows. After successfully detecting all foreground objects, two different algorithms are proposed for tracking the objects and updating the background model. The first algorithm suggests a centroid-searching technique, where the centroid in the current frame is estimated from the previous frame. Its accuracy is verified by comparing the entropy of dual-tree complex wavelet coefficients in the bounding boxes of both frames. If the estimate becomes inaccurate, a dynamic window is utilized to search for the accurate centroid. The second algorithm updates the background using a randomized updating scheme. Both stages of the proposed tracking model are simulated with various recorded videos. Simulation results are compared with recent schemes to show the superiority of the model.
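
    The background-modelling stage described above might look roughly like the following sketch: pixels with low temporal variance over the first few frames are marked stationary and averaged into the background, and a simple deviation rule stands in for the thesis's local thresholding (both thresholds are hypothetical values, not from the thesis).

```python
import numpy as np

def build_background(frames, var_thresh=25.0):
    """Classify each pixel as stationary or non-stationary from the
    first few frames and build the background from stationary pixels
    only. `frames` is a (t, h, w) grayscale stack; non-stationary
    pixels are left as NaN (illustrative simplification)."""
    frames = np.asarray(frames, dtype=np.float64)
    stationary = frames.var(axis=0) < var_thresh      # low temporal variance
    background = np.where(stationary, frames.mean(axis=0), np.nan)
    return background, stationary

def foreground_mask(frame, background, k=2.5):
    """Thresholding sketch: a pixel is foreground when it deviates from
    the background by more than k times the median deviation (a
    stand-in for the thesis's local thresholding rule)."""
    diff = np.abs(frame - background)
    scale = np.nanmedian(diff) + 1e-6
    return diff > k * scale
```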

    Sonar image interpretation for sub-sea operations

    Get PDF
    Mine Counter-Measure (MCM) missions are conducted to neutralise underwater explosives. Automatic Target Recognition (ATR) assists operators by increasing the speed and accuracy of data review. ATR embedded on vehicles enables adaptive missions, which increase the speed of data acquisition. This thesis addresses three challenges: the speed of data processing, the robustness of ATR to environmental conditions, and the large quantities of data required to train an algorithm. The main contribution of this thesis is a novel ATR algorithm. The algorithm uses features derived from the projection of 3D boxes to produce a set of 2D templates. The template responses are independent of grazing angle, range, and target orientation. Integer skewed integral images are derived to accelerate the calculation of the template responses. The algorithm is compared to the Haar cascade algorithm. For a single model of sonar and cylindrical targets, the algorithm reduces the Probability of False Alarm (PFA) by 80% at a Probability of Detection (PD) of 85%. When the algorithm is trained on target data from another model of sonar, the PD is only 6% lower, even though no representative target data was used for training. The second major contribution is an adaptive ATR algorithm that uses local sea-floor characteristics to address the problem of ATR robustness with respect to the local environment. A dual-tree wavelet decomposition of the sea-floor and a Markov Random Field (MRF) based graph-cut algorithm are used to segment the terrain. A Neural Network (NN) is then trained to filter ATR results based on the local sea-floor context. It is shown, for the Haar cascade algorithm, that the PFA can be reduced by 70% at a PD of 85%. The speed of data processing is addressed using novel pre-processing techniques. The standard three-class MRF for sonar image segmentation is formulated using graph cuts; consequently, a 1.2-million-pixel image is segmented in 1.2 seconds. Additionally, local estimation of class models is introduced to remove range-dependent variation in segmentation quality. Finally, an A* graph search is developed to remove the surface return, a line of saturated pixels often detected as false alarms by ATR. The A* search identifies the surface return in 199 of the 220 images tested, with a runtime of 2.1 seconds. The algorithm is robust to the presence of ripples and rocks.
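
    The speed of the template responses rests on integral images, which reduce any box sum to four lookups. The sketch below shows only the standard axis-aligned case; the thesis's integer skewed integral images generalize the same trick to the skewed boxes produced by projecting 3D templates.

```python
import numpy as np

def integral_image(img):
    """Summed-area table with a zero border row/column, so box sums
    can be read with four lookups."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1))
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] in O(1) via the integral image.
    Axis-aligned case only; the skewed generalization used in the
    thesis is not reproduced here."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]
```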

    Novel Video Completion Approaches and Their Applications

    Get PDF
    Video completion refers to automatically restoring damaged or removed objects in a video sequence, with applications ranging from sophisticated removal of undesired static or dynamic objects from video to correction of missing or corrupted video frames in old movies and synthesis of new video frames to add, modify, or generate a new visual story. The video completion problem can be solved using texture synthesis and/or data interpolation to fill in the holes of the sequence from the boundary inward. This thesis makes a distinction between still-image completion and video completion: the latter requires visually pleasing consistency that takes the temporal information into account. Based on their underlying concepts, video completion techniques are categorized as inpainting-based or texture-synthesis-based, and we present a bandlet-transform-based technique for each category. The proposed inpainting-based technique is a 3D volume regularization scheme that takes advantage of bandlet bases to exploit anisotropic regularities when reconstructing a damaged video. The proposed exemplar-based approach, on the other hand, performs video completion using precise patch fusion in the bandlet domain instead of patch replacement. The video completion task is then extended to two important applications in video restoration. First, we develop automatic video text detection and removal that benefits from the proposed inpainting scheme and a novel video text detector. Second, we propose a novel video super-resolution technique that employs the inpainting algorithm spatially, in conjunction with an effective structure tensor generated using bandlet geometry. The experimental results show good performance of the proposed video inpainting method and demonstrate the effectiveness of bandlets in video completion tasks. The proposed video text detector and the video super-resolution scheme also perform well in comparison with existing methods.
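
    For flavour, the sketch below performs one greedy step of classical exemplar-based completion (Criminisi-style patch copying); the thesis's method instead fuses patches in the bandlet domain, so this is only a simplified stand-in with assumed patch size and search stride.

```python
import numpy as np

def fill_one_patch(img, mask, half=4):
    """One greedy step of exemplar-based completion on a grayscale
    float image. `mask` is True where pixels are missing. Picks a
    target patch containing missing pixels, finds the best fully-known
    source patch, and copies the missing pixels (illustrative only)."""
    h, w = img.shape
    holes = np.argwhere(mask)
    target = next((p for p in holes
                   if half <= p[0] < h - half and half <= p[1] < w - half), None)
    if target is None:
        return img, mask                      # nothing fillable
    ty, tx = target
    t_img = img[ty - half:ty + half + 1, tx - half:tx + half + 1]
    t_msk = mask[ty - half:ty + half + 1, tx - half:tx + half + 1]
    best, best_cost = None, np.inf
    for sy in range(half, h - half, 2):       # coarse stride for speed
        for sx in range(half, w - half, 2):
            s_msk = mask[sy - half:sy + half + 1, sx - half:sx + half + 1]
            if s_msk.any():
                continue                      # source must be fully known
            s_img = img[sy - half:sy + half + 1, sx - half:sx + half + 1]
            cost = ((s_img - t_img)[~t_msk] ** 2).sum()
            if cost < best_cost:
                best, best_cost = s_img, cost
    if best is not None:
        t_img[t_msk] = best[t_msk]            # copy missing pixels in place
        mask[ty - half:ty + half + 1, tx - half:tx + half + 1] = False
    return img, mask
```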