
    Supervised evolutionary learning: Using histograms of gradients and particle swarm optimization for pedestrian detection and tracking in infrared image sequences

    Pedestrian detection and tracking in images has recently become a major topic in image processing and statistical pattern recognition, and evolutionary learning-based approaches can substantially improve performance in this setting. Existing pedestrian tracking/detection methods suffer from low detection accuracy, high processing time, and uncertain responses, so researchers are looking for new processing models that can accurately track a person's position while moving. This study presents a hybrid algorithm for the automatic detection of pedestrian position. Unlike methods that analyse visible images, it examines the thermal and infrared components of walking pedestrians, and combines a neural network with maximum learning capability, a wavelet kernel (wavelet transform), and particle swarm optimization (PSO) to find the parameters of the learner model. Gradient histograms are highly effective for feature extraction in infrared images, and the neural network achieves its goal (pedestrian detection and tracking) by maximizing learning. Despite allowing maximum learning, the proposed method trains quickly. Results on several data sets in this field have been analysed and indicate a negligible error in tracking pedestrian movements in infrared sequences. The use of neural networks is recommended for their precision, with their hyperparameters selected via evolutionary algorithms.
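The abstract's core idea of tuning a learner's hyperparameters with PSO can be sketched generically. This is not the authors' implementation: the two-dimensional quadratic below is a hypothetical stand-in for a validation-error surface over two hyperparameters (say, a wavelet-kernel scale and a regularization strength), and all names are illustrative.

```python
import random

random.seed(0)  # deterministic for illustration

def pso_minimize(objective, bounds, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5):
    """Minimize `objective` over box `bounds` with a basic particle swarm:
    each particle is pulled toward its personal best and the global best."""
    dim = len(bounds)
    pos = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # clamp to the search box
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]), bounds[d][1])
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Stand-in "validation error" with its minimum at hyperparameters (1.0, 0.5).
err = lambda p: (p[0] - 1.0) ** 2 + (p[1] - 0.5) ** 2
best, best_err = pso_minimize(err, [(0.0, 2.0), (0.0, 1.0)])
```

In practice the objective would retrain or re-evaluate the network at each candidate hyperparameter vector, which is why the abstract stresses fast training.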

    Pedestrian Detection in Infrared Outdoor Images Based on Atmospheric Situation Estimation

    Observation in absolute darkness and in daytime under any atmospheric situation is one of the advantages of thermal imaging systems. Despite the increasing use of these systems, analysing thermal images remains difficult because of the variable appearance of pedestrians and atmospheric conditions. In this paper, an efficient method is proposed for detecting pedestrians in outdoor thermal images that adapts to variable atmospheric situations. In the first step, the type of atmospheric situation is estimated from global features of the thermal image; then, for each situation, a dedicated algorithm is applied for pedestrian detection. To do this, thermal images are divided into three classes of atmospheric situation: a) fine, such as sunny weather; b) bad, such as rainy and hazy weather; c) hot, such as hot summer days when pedestrians are darker than the background. A three-level 2-Dimensional Double-Density Dual-Tree Discrete Wavelet Transform (2D DD DT DWT) is applied to the input images, and the energy of the third-level low-frequency coefficients is used as the discriminating feature for atmospheric situation identification. A feed-forward neural network (FFNN) classifier is trained on this feature vector to determine the category of atmospheric situation. Finally, a predetermined algorithm relevant to that category is applied for pedestrian detection. The proposed method achieves high performance: the accuracy of pedestrian detection on two popular databases exceeds 99%.
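The feature pipeline here is "multi-level wavelet decomposition, then energy of the coarsest approximation band". As a minimal sketch of that idea only, the code below substitutes a plain Haar approximation for the paper's double-density dual-tree transform; the function names and the toy image are illustrative, not from the paper.

```python
def haar2d_ll(img):
    """One Haar analysis level: return only the low-low (approximation)
    sub-band, i.e. the average of each non-overlapping 2x2 block."""
    h, w = len(img), len(img[0])
    return [[(img[2 * r][2 * c] + img[2 * r][2 * c + 1]
              + img[2 * r + 1][2 * c] + img[2 * r + 1][2 * c + 1]) / 4.0
             for c in range(w // 2)] for r in range(h // 2)]

def ll_energy_feature(img, levels=3):
    """Energy (sum of squared coefficients) of the level-`levels`
    approximation band, used as one global scalar describing the image."""
    ll = img
    for _ in range(levels):
        ll = haar2d_ll(ll)
    return sum(v * v for row in ll for v in row)

# An 8x8 constant "hot" image keeps its mean through every averaging level,
# so after three levels a single coefficient of 10 remains (energy 100).
feat = ll_energy_feature([[10.0] * 8 for _ in range(8)])
```

In the paper's setup, this scalar (or a small vector of such energies) feeds an FFNN that picks which of the three situation-specific detectors to run.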

    Adaptive detection and tracking using multimodal information

    This thesis describes work on fusing data from multiple sources of information, and focuses on two main areas: adaptive detection and adaptive object tracking in automated vision scenarios. The work on adaptive object detection explores a new paradigm in dynamic parameter selection, choosing thresholds for object detection so as to maximise agreement between pairs of sources. Object tracking, a complementary technique to object detection, is also explored in a multi-source context, and an efficient framework for robust tracking, termed the Spatiogram Bank tracker, is proposed as a means to overcome the difficulties of traditional histogram tracking. In addition to theoretical analysis of the proposed methods, specific example applications are given for both the detection and the tracking aspects, using thermal infrared and visible-spectrum video data as well as other multi-modal information sources.
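A spatiogram extends a plain histogram by attaching spatial statistics to each bin, which is what lets spatiogram-style trackers distinguish targets that a colour histogram would confuse. A minimal second-order spatiogram sketch (diagonal covariance only; the function and the 2x2 toy image are illustrative, not the thesis's Spatiogram Bank formulation):

```python
def spatiogram(img, n_bins=4, max_val=256):
    """Second-order spatiogram of a grayscale image: for each intensity bin,
    the pixel count plus the spatial mean and (diagonal) variance of the
    coordinates of the pixels falling in that bin."""
    acc = {b: {"n": 0, "sx": 0.0, "sy": 0.0, "sxx": 0.0, "syy": 0.0}
           for b in range(n_bins)}
    for y, row in enumerate(img):
        for x, v in enumerate(row):
            b = min(int(v * n_bins / max_val), n_bins - 1)
            s = acc[b]
            s["n"] += 1
            s["sx"] += x; s["sy"] += y
            s["sxx"] += x * x; s["syy"] += y * y
    out = {}
    for b, s in acc.items():
        if s["n"] == 0:
            continue  # empty bins carry no spatial statistics
        n = s["n"]
        mx, my = s["sx"] / n, s["sy"] / n
        out[b] = (n, (mx, my), (s["sxx"] / n - mx * mx, s["syy"] / n - my * my))
    return out

# Dark pixels occupy the left column of this 2x2 image: the spatiogram records
# not just "two dark pixels" but that they cluster at x = 0.
sg = spatiogram([[0, 200], [0, 200]])
```

A bank of such descriptors over different bin counts or channels gives a richer, more discriminative target model than a single histogram.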

    Analysis of infrared polarisation signatures for vehicle detection

    Thermal radiation emitted from objects within a scene tends to be partially polarised in a direction parallel to the surface normal, to an extent governed by properties of the surface material. This thesis investigates whether vehicle detection algorithms can be improved by measuring polarisation state in addition to intensity in the long-wave infrared. Knowledge about the polarimetric properties of scenes guides the development of histogram-based and cluster-based descriptors, which are used in a traditional classification framework. The best-performing histogram-based method, the Polarimetric Histogram, which forms a descriptor from the polarimetric vehicle signature, is shown to outperform the standard Histogram of Oriented Gradients descriptor, which uses intensity imagery alone. These descriptors then lead to a novel clustering algorithm which, at a false positive rate of 10^−2, is shown to improve upon the Polarimetric Histogram descriptor, increasing the true positive rate from 0.19 to 0.63. In addition, a multi-modal detection framework is presented that combines thermal intensity hotspot and polarimetric hotspot detections with a local motion detector. Through the combination of these detectors, the false positive rate is shown to be reduced compared to that of the individual detectors in isolation.
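The polarisation state that such descriptors are built on is conventionally recovered per pixel from intensity measurements at several polarizer orientations via the linear Stokes parameters. As a hedged sketch of that standard computation (a textbook four-orientation scheme, not the thesis's specific sensor model):

```python
import math

def stokes_dolp_aop(i0, i45, i90, i135):
    """Linear Stokes parameters from intensities measured through polarizers
    at 0, 45, 90 and 135 degrees, then the degree of linear polarization
    (DoLP) and angle of polarization (AoP, radians)."""
    s0 = 0.5 * (i0 + i45 + i90 + i135)   # total intensity
    s1 = i0 - i90                        # horizontal vs vertical preference
    s2 = i45 - i135                      # diagonal preference
    dolp = math.hypot(s1, s2) / s0 if s0 else 0.0
    aop = 0.5 * math.atan2(s2, s1)
    return s0, dolp, aop

# Fully horizontally polarized light: all intensity passes at 0 degrees,
# none at 90, half at each diagonal.
s0, dolp, aop = stokes_dolp_aop(1.0, 0.5, 0.0, 0.5)
```

Per-pixel DoLP and AoP maps are what a "polarimetric hotspot" detector or a Polarimetric-Histogram-style descriptor would then be computed over.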

    A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community

    In recent years, deep learning (DL), a re-branding of neural networks (NNs), has risen to the top in numerous areas, namely computer vision (CV), speech recognition, natural language processing, etc. Whereas remote sensing (RS) possesses a number of unique challenges, primarily related to sensors and applications, RS inevitably draws from many of the same theories as CV, e.g., statistics, fusion, and machine learning, to name a few. This means that the RS community should be aware of, if not at the leading edge of, advancements like DL. Herein, we provide the most comprehensive survey of state-of-the-art RS DL research. We also review recent new developments in the DL field that can be used in DL for RS. Namely, we focus on theories, tools and challenges for the RS community. Specifically, we focus on unsolved challenges and opportunities as they relate to (i) inadequate data sets, (ii) human-understandable solutions for modelling physical phenomena, (iii) Big Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and learning algorithms for spectral, spatial and temporal data, (vi) transfer learning, (vii) an improved theoretical understanding of DL systems, (viii) high barriers to entry, and (ix) training and optimizing DL models. Comment: 64 pages, 411 references. To appear in Journal of Applied Remote Sensing.

    Target classification in multimodal video

    The presented thesis focuses on enhancing scene segmentation and target recognition methodologies via the mobilisation of contextual information. The algorithms developed to achieve this goal utilise multi-modal sensor information collected across varying scenarios, from controlled indoor sequences to challenging rural locations. The sensors are chiefly colour-band and long-wave infrared (LWIR), enabling persistent surveillance capabilities across all environments. In developing effective algorithms towards these goals, key obstacles are identified and examined: recovering background scene structure from foreground object 'clutter'; employing contextual foreground knowledge to circumvent training a classifier when labelled data is not readily available; creating a labelled LWIR dataset to train a convolutional neural network (CNN) based object classifier; and the viability of spatial context for long-range target classification when big-data solutions are not enough. For an environment displaying frequent foreground clutter, such as a busy train station, we propose an algorithm exploiting foreground object presence to segment underlying scene structure that is rarely visible. If such a location is outdoors and surveyed by an infrared (IR) and visible-band camera set-up, scene context and contextual knowledge transfer allow reasonable class predictions to be determined for thermal signatures within the scene. Furthermore, a labelled LWIR image corpus is created to train an infrared object classifier using a CNN approach. The trained network demonstrates an effective classification accuracy of 95% over 6 object classes. However, performance is not sustained for IR targets acquired at long range, where low signal quality causes classification accuracy to drop. This is addressed by mobilising spatial context to adjust network class scores, restoring robust classification capability.
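The abstract does not spell out how spatial context adjusts the network's class scores, so the following is only one plausible sketch: a Bayesian-style fusion that multiplies the classifier's softmax scores by a location-dependent class prior and renormalises. All names, class labels and numbers below are hypothetical.

```python
def context_rescore(class_scores, context_prior):
    """Fuse classifier scores with a spatial-context prior (e.g. 'people
    appear on footpaths, vehicles on roads') by elementwise product,
    then renormalise so the result is again a distribution."""
    fused = {c: class_scores[c] * context_prior.get(c, 1e-6)
             for c in class_scores}
    z = sum(fused.values())
    return {c: v / z for c, v in fused.items()}

# A low-quality long-range IR chip is ambiguous between 'person' and 'pole',
# but the footpath location context favours 'person'.
scores = {"person": 0.45, "pole": 0.40, "vehicle": 0.15}
prior = {"person": 0.6, "pole": 0.2, "vehicle": 0.2}
post = context_rescore(scores, prior)
```

The appeal of a product rule is that context can only re-weight hypotheses the classifier already entertains; it cannot invent a class the network scored at zero.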

    Methods for multi-spectral image fusion: identifying stable and repeatable information across the visible and infrared spectra

    Fusion of images captured from different viewpoints is a well-known challenge in computer vision with many established approaches and applications; however, if the observations are captured by sensors also separated by wavelength, this challenge is compounded significantly. This dissertation presents an investigation into the fusion of visible and thermal image information from two front-facing sensors mounted side-by-side. The primary focus of this work is the development of methods that enable us to map and overlay multi-spectral information; the goal is to establish a combined image in which each pixel contains both colour and thermal information. Pixel-level fusion of these distinct modalities is approached using computational stereo methods; the focus is on the viewpoint alignment and correspondence search/matching stages of processing. Frequency domain analysis is performed using a method called phase congruency. An extensive investigation of this method is carried out with two major objectives: to identify predictable relationships between the elements extracted from each modality, and to establish a stable representation of the common information captured by both sensors. Phase congruency is shown to be a stable edge detector and repeatable spatial similarity measure for multi-spectral information; this result forms the basis for the methods developed in the subsequent chapters of this work. The feasibility of automatic alignment with sparse feature-correspondence methods is investigated. It is found that conventional methods fail to match inter-spectrum correspondences, motivating the development of an edge orientation histogram (EOH) descriptor which incorporates elements of the phase congruency process. A cost function, which incorporates the outputs of the phase congruency process and the mutual information similarity measure, is developed for computational stereo correspondence matching. 
An evaluation of the proposed cost function shows it to be an effective similarity measure for multi-spectral information.
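Mutual information, one ingredient of the cost function above, is a standard similarity measure for multi-modal matching because it rewards any consistent statistical relationship between patches rather than equal pixel values. A minimal histogram-based estimate (the quantisation scheme and toy patches are illustrative, not the dissertation's exact formulation):

```python
import math
from collections import Counter

def mutual_information(a, b, n_bins=8, max_val=256):
    """Mutual information (in nats) between two equally sized patches,
    estimated from the joint histogram of quantised intensities."""
    q = lambda v: min(int(v * n_bins / max_val), n_bins - 1)
    pairs = [(q(x), q(y)) for x, y in zip(a, b)]
    n = len(pairs)
    pxy = Counter(pairs)                       # joint counts
    px = Counter(i for i, _ in pairs)          # marginal counts, patch a
    py = Counter(j for _, j in pairs)          # marginal counts, patch b
    mi = 0.0
    for (i, j), c in pxy.items():
        p = c / n
        # p * log(p / (p_x * p_y)), with the marginals expressed as counts
        mi += p * math.log(p * n * n / (px[i] * py[j]))
    return mi

# Thermal and visible patches need not agree in value: a consistent mapping
# between them (here, inverted contrast) still yields maximal MI.
vis = [0, 0, 255, 255, 0, 255, 0, 255]
ir = [255, 255, 0, 0, 255, 0, 255, 0]
score = mutual_information(vis, ir)
```

This value-agnostic behaviour is exactly why MI pairs well with a structural cue like phase congruency inside a stereo matching cost.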

    Multi-Modality Human Action Recognition

    Human action recognition is useful in many application areas, e.g. video surveillance, human-computer interaction (HCI), video retrieval, gaming and security, and has recently become an active research topic in computer vision and pattern recognition. A number of action recognition approaches have been proposed; however, most are designed for RGB image sequences, where the action data is collected by an RGB/intensity camera. Recognition performance is therefore sensitive to the occlusion, background and lighting conditions of the image sequences. If more information is provided along with the image sequences, so that data sources other than RGB video can be utilized, human actions can be better represented and recognized by the designed computer vision system.

    In this dissertation, multi-modality human action recognition is studied. On one hand, we introduce the study of multi-spectral action recognition, which involves information from spectra beyond the visible, e.g. infrared and near infrared. Action recognition in individual spectra is explored and new methods are proposed; cross-spectral action recognition is then also investigated and novel approaches are proposed in our work. On the other hand, since depth imaging technology has made significant progress recently and depth information can be captured simultaneously with RGB video, depth-based human action recognition is also investigated. I first propose a method combining different types of depth data to recognize human actions. Then a thorough evaluation is conducted of spatiotemporal interest point (STIP) based features for depth-based action recognition. Finally, I advocate the study of fusing different features for depth-based action analysis. Moreover, human depression recognition is studied by combining a facial appearance model with a facial dynamics model.