
    Box-level Segmentation Supervised Deep Neural Networks for Accurate and Real-time Multispectral Pedestrian Detection

    Effective fusion of complementary information captured by multi-modal sensors (visible and infrared cameras) enables robust pedestrian detection under various surveillance situations (e.g. daytime and nighttime). In this paper, we present a novel box-level segmentation supervised learning framework for accurate and real-time multispectral pedestrian detection by incorporating features extracted in visible and infrared channels. Specifically, our method takes pairs of aligned visible and infrared images with easily obtained bounding box annotations as input and estimates accurate prediction maps to highlight the existence of pedestrians. It offers two major advantages over existing anchor-box-based multispectral detection methods. Firstly, it overcomes the hyperparameter setting problem that arises during the training phase of anchor-box-based detectors and can obtain more accurate detection results, especially for small and occluded pedestrian instances. Secondly, it is capable of generating accurate detection results using small-size input images, improving computational efficiency for real-time autonomous driving applications. Experimental results on the KAIST multispectral dataset show that our proposed method outperforms state-of-the-art approaches in terms of both accuracy and speed.
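
    A rough sketch of the two-stream idea, fusing visible and infrared features into a per-pixel prediction map supervised by masks filled from the bounding boxes, is given below; the layer sizes, module names, and loss choice are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch: separate encoders for the visible and infrared
# channels, concatenated features, and a single-channel prediction map
# highlighting pedestrian regions. Sizes are illustrative only.
import torch
import torch.nn as nn

class TwoStreamFusionNet(nn.Module):
    def __init__(self):
        super().__init__()
        def encoder(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            )
        self.vis_enc = encoder(3)   # RGB stream
        self.ir_enc = encoder(1)    # infrared stream
        # 1x1 conv fuses the concatenated streams into one prediction map
        self.head = nn.Conv2d(128, 1, 1)

    def forward(self, vis, ir):
        fused = torch.cat([self.vis_enc(vis), self.ir_enc(ir)], dim=1)
        return torch.sigmoid(self.head(fused))  # per-pixel pedestrian score

# Box annotations become segmentation supervision by filling each box with 1s.
def boxes_to_mask(boxes, h, w):
    mask = torch.zeros(1, 1, h, w)
    for x1, y1, x2, y2 in boxes:
        mask[..., y1:y2, x1:x2] = 1.0
    return mask

model = TwoStreamFusionNet()
vis, ir = torch.rand(1, 3, 128, 160), torch.rand(1, 1, 128, 160)
pred = model(vis, ir)                       # (1, 1, 32, 40) downsampled map
target = boxes_to_mask([(40, 30, 60, 90)], 128, 160)
target_ds = nn.functional.interpolate(target, size=pred.shape[-2:])
loss = nn.functional.binary_cross_entropy(pred, target_ds)
```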

    Fusion of Heterogeneous Earth Observation Data for the Classification of Local Climate Zones

    This paper proposes a novel framework for fusing multi-temporal, multispectral satellite images and OpenStreetMap (OSM) data for the classification of local climate zones (LCZs). Feature stacking is the most commonly used data fusion method, but its main drawback is that it does not account for the heterogeneity of multimodal optical images and OSM data. The proposed framework processes the two data sources separately and then combines them at the model level through two fusion models (the land use fusion model and the building fusion model), which fuse optical images with the land use and building layers of OSM data, respectively. In addition, a new approach to detecting building incompleteness in OSM data is proposed. The proposed framework was trained and tested using data from the 2017 IEEE GRSS Data Fusion Contest and further validated on an additional test set of manually labeled samples in Munich and New York. Experimental results indicate that, compared to a feature-stacking-based baseline framework, the proposed framework is effective in fusing optical images with OSM data for the classification of LCZs, with high generalization capability on a large scale. The classification accuracy of the proposed framework outperforms the baseline framework by more than 6% and 2% on the test set of the 2017 IEEE GRSS Data Fusion Contest and the additional test set, respectively. In addition, the proposed framework is less sensitive to spectral diversities of optical satellite images and thus achieves more stable classification performance than state-of-the-art frameworks.
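
    The contrast between the feature-stacking baseline and model-level fusion can be illustrated with the following hedged scikit-learn sketch on synthetic stand-ins for the image and OSM features; the paper's actual fusion models are learned networks, and probability averaging is just one simple way to combine per-source models.

```python
# Minimal sketch contrasting feature stacking with model-level fusion,
# using synthetic stand-ins for the image and OSM features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X_img = rng.normal(size=(500, 20))   # multispectral image features (synthetic)
X_osm = rng.normal(size=(500, 5))    # OSM land use/building features (synthetic)
y = rng.integers(0, 17, size=500)    # 17 LCZ classes

# Baseline: stack heterogeneous features into one vector per sample.
stacked = RandomForestClassifier(random_state=0).fit(
    np.hstack([X_img, X_osm]), y)

# Model-level fusion: train one model per source, then average class
# probabilities, letting each source be handled by its own model first.
m_img = RandomForestClassifier(random_state=0).fit(X_img, y)
m_osm = RandomForestClassifier(random_state=0).fit(X_osm, y)
fused_proba = (m_img.predict_proba(X_img) + m_osm.predict_proba(X_osm)) / 2
fused_pred = m_img.classes_[fused_proba.argmax(axis=1)]
```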

    A Review on Data Fusion of Multidimensional Medical and Biomedical Data

    Data fusion aims to provide a more accurate description of a sample than any single source of data alone, while also reducing the uncertainty of the results by combining data from multiple sources. Both goals improve the characterization of samples and may in turn improve clinical diagnosis and prognosis. In this paper, we present an overview of the advances achieved over the last decades in data fusion approaches in the context of the medical and biomedical fields. We collected approaches for interpreting multiple sources of data in different combinations: image to image, image to biomarker, spectra to image, spectra to spectra, spectra to biomarker, and others. We found that the most prevalent combination is image-to-image fusion and that most data fusion approaches were applied together with deep learning or machine learning methods.

    Perception of Unstructured Environments for Autonomous Off-Road Vehicles

    Autonomous vehicles require perception as a necessary prerequisite for controllable and safe interaction, in order to sense and understand their environment. Perception for structured indoor and outdoor environments covers economically lucrative areas such as autonomous passenger transport or industrial robotics, while the perception of unstructured environments is strongly underrepresented in the field of environment perception research. The unstructured environments analyzed here pose a particular challenge, since the natural, grown geometries they contain usually exhibit no homogeneous structure and are dominated by similar textures and objects that are hard to separate. This complicates both the sensing of these environments and their interpretation, so perception methods must be designed and optimized specifically for this application domain. In this dissertation, novel and optimized perception methods for unstructured environments are proposed and combined into a holistic, three-level pipeline for autonomous off-road vehicles: low-level, mid-level, and high-level perception. The proposed classical and machine learning (ML) perception methods complement each other. Moreover, the combination of perception and validation methods at each level enables reliable perception of a potentially unknown environment, combining loosely and tightly coupled validation methods to ensure a sufficient yet flexible assessment of the proposed perception methods. All methods were developed as individual modules within the perception and validation pipeline proposed in this work, and their flexible combination enables different pipeline designs for a variety of off-road vehicles and use cases as required. Low-level perception provides a tightly coupled confidence assessment for raw 2D and 3D sensor data in order to detect sensor failures and to guarantee sufficient accuracy of the sensor data. In addition, novel calibration and registration approaches for multi-sensor perception systems are presented which use only the structure of the environment to register the acquired sensor data: a semi-automatic approach for registering multiple 3D Light Detection and Ranging (LiDAR) sensors, and a confidence-based framework that combines different registration methods and enables the registration of sensors with different measurement principles. Here, the combination of several registration methods validates the registration results in a tightly coupled manner. Mid-level perception enables the 3D reconstruction of unstructured environments with two methods for estimating disparity from stereo images: a classical, correlation-based method for hyperspectral images that requires only a limited amount of test and validation data, and a second method that estimates disparity from grayscale images with convolutional neural networks (CNNs). Novel disparity error metrics and an evaluation toolbox for the 3D reconstruction of stereo images complement the proposed disparity estimation methods and enable their loosely coupled validation. High-level perception focuses on the interpretation of individual 3D point clouds for traversability analysis, object detection, and obstacle avoidance. A domain transfer analysis of state-of-the-art methods for semantic 3D segmentation yields recommendations for achieving the most accurate possible segmentation in new target domains without generating new training data. The presented training approach for CNN-based 3D segmentation methods can further reduce the required amount of training data. Methods for explainable artificial intelligence, applied before and after modeling, enable a loosely coupled validation of the proposed high-level methods through dataset assessment and model-agnostic explanations of CNN predictions. Remediation of contaminated sites and military logistics are the two main use cases in unstructured environments addressed in this work. These application scenarios also show how the gap between the development of individual methods and their integration into the processing chain of autonomous off-road vehicles, with localization, mapping, planning, and control, can be closed. In summary, the proposed pipeline offers flexible perception solutions for autonomous off-road vehicles, and the accompanying validation ensures accurate and trustworthy perception of unstructured environments.
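
    As a simple illustration of the classical, correlation-based disparity estimation mentioned above, the following sketch applies block matching with a sum-of-absolute-differences cost to a rectified grayscale stereo pair; the window size, disparity range, and synthetic data are assumptions for demonstration, not the dissertation's actual method.

```python
# Illustrative block-matching disparity estimation on a rectified
# grayscale stereo pair; parameters are arbitrary demonstration values.
import numpy as np

def block_matching_disparity(left, right, max_disp=16, win=5):
    h, w = left.shape
    half = win // 2
    disp = np.zeros((h, w), dtype=np.float32)
    for y in range(half, h - half):
        for x in range(half + max_disp, w - half):
            patch = left[y-half:y+half+1, x-half:x+half+1]
            # Sum of absolute differences over candidate disparities
            costs = [np.abs(patch - right[y-half:y+half+1,
                                          x-d-half:x-d+half+1]).sum()
                     for d in range(max_disp)]
            disp[y, x] = np.argmin(costs)   # winner-takes-all disparity
    return disp

left = np.random.rand(40, 60).astype(np.float32)
right = np.roll(left, -4, axis=1)           # synthetic 4-pixel shift
print(block_matching_disparity(left, right)[20, 40])  # ~4 expected
```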

    A Linear Combination of Heuristics Approach to Spatial Sampling Hyperspectral Data for Target Tracking

    Persistent surveillance of the battlespace results in better battlespace awareness, which aids in obtaining air superiority, winning battles, and saving friendly lives. Although hyperspectral imagery (HSI) data has proven useful for discriminating targets, it presents many challenges as a tool for persistent surveillance. A new sensor under development has the potential to overcome these challenges and transform our persistent surveillance capability by providing HSI data for a limited number of pixels and grayscale video for the remainder. The challenge in exploiting this new sensor is determining where the HSI data in the sensor's field of view will be the most useful. The approach taken is to use a utility function with components of equal dispersion, periodic polling, missed measurements, and predictive probability of association error (PPAE). The relative importance, or optimal weighting, of the different types of targets of interest (TOI) is determined by a genetic algorithm using a multi-objective problem formulation. Experiments show that using the utility function with equal weighting results in superior target tracking compared to any individual component by itself, and that the equal weighting is close to the optimal solution. The new sensor is successfully exploited, resulting in improved persistent surveillance.
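
    A minimal sketch of the linear-combination idea follows: per-pixel heuristic scores are combined by a weighted sum, with the weights either set equal (the baseline found to work well) or supplied by the genetic algorithm. The component arrays, names, and pixel budget below are illustrative assumptions.

```python
# Hedged sketch of scoring pixels by a linear combination of heuristics;
# the four rows stand in for the equal dispersion, periodic polling,
# missed-measurement, and PPAE components, each scaled to [0, 1].
import numpy as np

def utility(components, weights):
    """Weighted sum of per-pixel heuristic scores.

    components: (n_heuristics, n_pixels) array, each row scaled to [0, 1]
    weights:    (n_heuristics,) array found by the GA (or all-equal baseline)
    """
    return weights @ components

rng = np.random.default_rng(1)
components = rng.random((4, 1000))          # 4 heuristics over 1000 pixels
equal_w = np.full(4, 0.25)                  # equal weighting baseline
scores = utility(components, equal_w)
# Allocate the limited HSI pixel budget to the highest-utility locations.
hsi_budget = 50
chosen = np.argsort(scores)[-hsi_budget:]
```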

    An object-based approach for mapping forest structural types based on low-density LiDAR and multispectral imagery

    [EN] Mapping forest structure variables provides important information for the estimation of forest biomass, carbon stocks, and pasture suitability, and for wildfire risk prevention and control. Optimizing the prediction models for these variables requires an adequate stratification of the forest landscape in order to create specific models for each structural type or stratum. This paper aims to propose and validate an object-oriented classification methodology based on low-density LiDAR data (0.5 points/m²) available at the national level, together with WorldView-2 and Sentinel-2 multispectral imagery, to categorize Mediterranean forests into generic structural types. After preprocessing the data sets, the area was segmented using a multiresolution algorithm; features describing the 3D vertical structure were extracted from the LiDAR data, and spectral and texture features from the satellite images. After feature selection, objects were classified into the following structural classes: grasslands, shrubs, forest (without shrubs), mixed forest (trees and shrubs), and dense young forest. Four classification algorithms (C4.5 decision trees, random forest, k-nearest neighbour, and support vector machine) were evaluated using cross-validation techniques. The results show that the integration of low-density LiDAR and multispectral imagery provides a set of complementary features that improve the results (90.75% overall accuracy), and that object-oriented classification techniques are efficient for the stratification of Mediterranean forest areas into structural- and fuel-related categories. Further work will focus on the creation and validation of different prediction models adapted to the various strata. This work was supported by the Spanish Ministerio de Economía y Competitividad and FEDER under grant number CGL2013-46387-C2-1-R, and by the Fondo de Garantía Juvenil under contract number PEJ-2014-A-45358. Ruiz Fernández, L.Á.; Recio Recio, J.A.; Crespo-Peremarch, P.; Sapena, M. (2018). An object-based approach for mapping forest structural types based on low-density LiDAR and multispectral imagery. Geocarto International, 33(5), 443-457. https://doi.org/10.1080/10106049.2016.1265595
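
    The evaluation step, comparing the four classifiers by cross-validation on stacked LiDAR and spectral/texture features, might look roughly like the following scikit-learn sketch; the synthetic features, class count, and DecisionTreeClassifier as a stand-in for C4.5 are assumptions.

```python
# Minimal sketch of the evaluation step: compare several classifiers by
# cross-validation on stacked LiDAR-structure and spectral/texture features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier  # stand-in for C4.5

rng = np.random.default_rng(0)
X_lidar = rng.normal(size=(300, 10))   # 3D vertical-structure features
X_spec = rng.normal(size=(300, 15))    # spectral and texture features
X = np.hstack([X_lidar, X_spec])       # complementary feature set
y = rng.integers(0, 5, size=300)       # 5 structural classes

classifiers = {
    "C4.5-like tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
    "k-NN": KNeighborsClassifier(n_neighbors=5),
    "SVM": SVC(),
}
for name, clf in classifiers.items():
    acc = cross_val_score(clf, X, y, cv=5).mean()
    print(f"{name}: {acc:.3f} mean CV accuracy")
```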

    QUEST Hierarchy for Hyperspectral Face Recognition

    Face recognition is an attractive biometric due to the ease with which photographs of the human face can be acquired and processed. The non-intrusive nature of many surveillance systems permits face recognition applications to be used in a myriad of environments. Despite decades of impressive research in this area, face recognition still struggles with variations in illumination, pose, and expression, not to mention the larger challenge of willful circumvention. The integration of supporting contextual information in a fusion hierarchy known as QUalia Exploitation of Sensor Technology (QUEST) is a novel approach for hyperspectral face recognition that results in performance advantages and a robustness not seen in leading face recognition methodologies. This research demonstrates a method for the exploitation of hyperspectral imagery and the intelligent processing of contextual layers of spatial, spectral, and temporal information. The approach illustrates the benefit of integrating the spatial and spectral domains of imagery for the automatic extraction and integration of novel soft biometric features. The QUEST methodology for face recognition yields an engineering advantage in both performance and efficiency compared to leading and classical face recognition techniques. An interactive environment for the testing and expansion of this recognition framework is also provided.

    Transformation Based Ensembles for Time Series Classification

    Until recently, the vast majority of data mining research on time series classification (TSC) focused on alternative distance measures for 1-Nearest Neighbour (1-NN) classifiers based either on the raw data or on compressed or smoothed versions of it. Despite the extensive evidence in favour of 1-NN classifiers with Euclidean or Dynamic Time Warping distance, there has also been a flurry of recent research publications proposing new classification algorithms for TSC. Generally, these classifiers describe different ways of incorporating summary measures in the time domain into more complex classifiers. Our hypothesis is that the easiest way to gain improvement on TSC problems is simply to transform the data into an alternative space where the discriminatory features are more easily detected. To test this hypothesis, we perform a range of benchmarking experiments in the time domain, before evaluating nearest neighbour classifiers on data transformed into the power spectrum, the autocorrelation function, and the principal component space. We demonstrate that on some problems there is a dramatic improvement in the accuracy of classifiers built on the transformed data over classifiers built in the time domain, but also that there is wide variance in accuracy for a particular classifier built on different data transforms. To overcome this variability, we propose a simple transformation-based ensemble and demonstrate that it improves on the performance, and reduces the variability, of classifiers built in the time domain alone. Our advice to a practitioner with a real-world TSC problem is to try transforms before developing a complex classifier; it is the easiest way to get a potentially large increase in accuracy, and it may provide further insights into the underlying relationships that characterise the problem.
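
    A minimal sketch of the transformation-based ensemble idea, assuming synthetic data: a 1-NN classifier is built on the raw series, the power spectrum, and the autocorrelation function, and their predictions are combined by majority vote. This illustrates the approach rather than reproducing the authors' exact implementation.

```python
# Transform-then-classify sketch: one 1-NN classifier per data space,
# combined by majority vote over the per-transform predictions.
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def power_spectrum(X):
    return np.abs(np.fft.rfft(X, axis=1)) ** 2

def autocorrelation(X, max_lag=20):
    Xc = X - X.mean(axis=1, keepdims=True)
    return np.stack([(Xc[:, :-lag] * Xc[:, lag:]).mean(axis=1)
                     for lag in range(1, max_lag + 1)], axis=1)

rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 64)); y_train = rng.integers(0, 2, 100)
X_test = rng.normal(size=(20, 64))

transforms = [lambda X: X, power_spectrum, autocorrelation]
votes = []
for tf in transforms:
    clf = KNeighborsClassifier(n_neighbors=1).fit(tf(X_train), y_train)
    votes.append(clf.predict(tf(X_test)))
votes = np.stack(votes)                     # (n_transforms, n_test)
# Majority vote across the per-transform predictions
ensemble_pred = np.array([np.bincount(col).argmax() for col in votes.T])
```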