
    Optimization of Automatic Target Recognition with a Reject Option Using Fusion and Correlated Sensor Data

    This dissertation examines the optimization of automatic target recognition (ATR) systems when a rejection option is included. First, a comprehensive review of the literature inclusive of ATR assessment, fusion, correlated sensor data, and classifier rejection is presented. An optimization framework for the fusion of multiple sensors is then developed. This framework identifies preferred fusion rules and sensors along with rejection and receiver operating characteristic (ROC) curve thresholds without the use of explicit misclassification costs as required by a Bayes' loss function. This optimization framework is the first to integrate both vertical warfighter output label analysis and horizontal engineering confusion matrix analysis. In addition, optimization is performed for the true positive rate, which incorporates the time required by classification systems. The mathematical programming framework is used to assess different fusion methods and to characterize correlation effects both within and across sensors. A synthetic classifier fusion-testing environment is developed by controlling the correlation levels of generated multivariate Gaussian data. This synthetic environment is used to demonstrate the utility of the optimization framework and to assess the performance of fusion algorithms as correlation varies. The mathematical programming framework is then applied to collected radar data. This radar fusion experiment optimizes Boolean and neural network fusion rules across four levels of sensor correlation. Comparisons are presented for the maximum true positive rate and the percentage of feasible thresholds to assess system robustness. Empirical evidence suggests ATR performance may improve by reducing the correlation within and across polarimetric radar sensors. Sensitivity analysis shows ATR performance is affected by the number of forced looks, prior probabilities, the maximum allowable rejection level, and the acceptable error rates.
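    The synthetic fusion-testing idea can be illustrated with a short, hypothetical Python sketch: correlated multivariate Gaussian scores stand in for two sensors, and a Boolean OR rule with a simple reject band is swept over thresholds. The class means, correlation level, reject band, and thresholds below are illustrative assumptions, not the dissertation's framework.

        import numpy as np

        rng = np.random.default_rng(0)

        def simulate_sensors(n, mean, rho):
            # Two correlated sensor scores per object (correlation rho).
            cov = np.array([[1.0, rho], [rho, 1.0]])
            return rng.multivariate_normal(mean, cov, size=n)

        targets = simulate_sensors(5000, mean=[1.5, 1.5], rho=0.6)   # target class scores
        clutter = simulate_sensors(5000, mean=[0.0, 0.0], rho=0.6)   # non-target class scores

        def or_fusion(scores, thr, reject_band=0.25):
            # Reject when both sensors sit inside the band around the threshold,
            # otherwise declare "target" if either sensor exceeds the threshold.
            keep = np.abs(scores - thr).max(axis=1) > reject_band
            declared = (scores > thr).any(axis=1)
            return declared[keep], keep.mean()

        for thr in (0.5, 0.75, 1.0):
            tp, cov_t = or_fusion(targets, thr)
            fp, cov_c = or_fusion(clutter, thr)
            print(f"thr={thr:.2f}  TPR={tp.mean():.3f}  FPR={fp.mean():.3f}  "
                  f"coverage={(cov_t + cov_c) / 2:.3f}")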

    Deep Architectures and Ensembles for Semantic Video Classification

    This work addresses the problem of accurate semantic labelling of short videos. To this end, a multitude of different deep nets is evaluated, ranging from traditional recurrent neural networks (LSTM, GRU) and temporally agnostic networks (FV, VLAD, BoW) to fully connected neural networks with mid-stage AV fusion, among others. Additionally, we propose a residual architecture-based DNN for video classification, with state-of-the-art classification performance at significantly reduced complexity. Furthermore, we propose four new approaches to diversity-driven multi-net ensembling, one based on a fast correlation measure and three incorporating a DNN-based combiner. We show that significant performance gains can be achieved by ensembling diverse nets and we investigate the factors contributing to high diversity. Based on the extensive YouTube-8M dataset, we provide an in-depth evaluation and analysis of their behaviour. We show that the performance of the ensemble is state-of-the-art, achieving the highest accuracy on the YouTube-8M Kaggle test data. The performance of the ensemble of classifiers was also evaluated on the HMDB51 and UCF101 datasets, showing that the resulting method achieves comparable accuracy to state-of-the-art methods using similar input features.
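    As a deliberately simplified illustration of a DNN-based ensemble combiner, the PyTorch sketch below stacks per-model class probabilities and learns a small fully connected mapping to final scores. The layer sizes, class count, and module name are assumptions, not the architecture from the paper.

        import torch
        import torch.nn as nn

        class EnsembleCombiner(nn.Module):
            """Learns to combine class probabilities produced by several base nets."""

            def __init__(self, num_models, num_classes, hidden=512):
                super().__init__()
                self.net = nn.Sequential(
                    nn.Linear(num_models * num_classes, hidden),
                    nn.ReLU(),
                    nn.Linear(hidden, num_classes),
                )

            def forward(self, per_model_probs):
                # per_model_probs: (batch, num_models, num_classes)
                return self.net(per_model_probs.flatten(start_dim=1))

        # Toy usage with 4 hypothetical base nets and a placeholder label vocabulary.
        combiner = EnsembleCombiner(num_models=4, num_classes=1000)
        probs = torch.rand(8, 4, 1000).softmax(dim=-1)
        print(combiner(probs).shape)    # torch.Size([8, 1000])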

    Cooperative sensing of spectrum opportunities

    The reliability and availability of sensing information gathered from local spectrum sensing (LSS) by a single Cognitive Radio are strongly affected by the propagation conditions, the period of sensing, and the geographical position of the device. For this reason, cooperative spectrum sensing (CSS) has been widely proposed in order to improve LSS performance by using cooperation between Secondary Users (SUs). The goal of this chapter is to provide a general analysis of CSS for cognitive radio networks (CRNs). Firstly, the theoretical system model for centralized CSS is introduced, together with a preliminary discussion on several fusion rules and operative modes. Moreover, three main aspects of CSS that substantially differentiate the theoretical model from realistic application scenarios are analyzed: (i) the presence of spatiotemporal correlation between decisions by different SUs; (ii) the possible mobility of SUs; and (iii) the nonideality of the control channel between the SUs and the Fusion Center (FC). For each aspect, a possible practical solution for network organization is presented, showing that, in particular for the first two aspects, cluster-based CSS, in which sensing SUs are properly chosen, can mitigate the impact of such realistic assumptions.
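    For readers unfamiliar with the fusion rules mentioned above, the short sketch below illustrates the standard hard-decision rules (OR, AND, and k-out-of-n) a Fusion Center can apply to binary reports from the SUs; it is a toy illustration, not the chapter's full CSS model.

        import numpy as np

        def fuse(decisions, rule="majority", k=None):
            # decisions: binary vector of local SU decisions (1 = "channel occupied").
            votes = int(np.sum(decisions))
            n = len(decisions)
            if rule == "or":            # most sensitive: a single vote is enough
                return int(votes >= 1)
            if rule == "and":           # most conservative: all SUs must agree
                return int(votes == n)
            if rule == "k-out-of-n":    # general counting rule
                return int(votes >= (k if k is not None else n // 2 + 1))
            if rule == "majority":
                return int(votes > n / 2)
            raise ValueError(f"unknown rule: {rule}")

        local = np.array([1, 0, 1, 1, 0])   # decisions reported by five SUs
        print(fuse(local, "or"), fuse(local, "and"), fuse(local, "majority"))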

    Multimodal fusion architectures for pedestrian detection

    Pedestrian detection provides a crucial functionality in many human-centric applications, such as video surveillance, urban scene analysis, and autonomous driving. Recently, multimodal pedestrian detection has received extensive attention since the fusion of complementary information captured by visible and infrared sensors enables robust human target detection under both daytime and nighttime scenes. In this chapter, we systematically evaluate the performance of different multimodal fusion architectures in order to identify the optimal solutions for pedestrian detection. We make two important observations: (1) it is useful to combine the most commonly used concatenation fusion scheme with a global scene-aware mechanism to learn both human-related features and the correlation between visible and thermal feature maps; (2) the two-stream segmentation supervision without multimodal fusion provides the most effective scheme to infuse segmentation information as supervision for learning human-related features. Based on these studies, we present a unified multimodal fusion framework for the joint training of target detection and segmentation supervision, which achieves state-of-the-art multimodal pedestrian detection performance on the public KAIST benchmark dataset.
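    The first observation, combining concatenation fusion with a global scene-aware mechanism, can be sketched as a squeeze-and-excitation-style channel gate over the concatenated visible and thermal feature maps. The PyTorch module below is a hedged sketch with assumed layer names and sizes, not the chapter's exact architecture.

        import torch
        import torch.nn as nn

        class SceneAwareConcatFusion(nn.Module):
            """Concatenation fusion of visible/thermal maps with a global channel gate."""

            def __init__(self, channels, reduction=8):
                super().__init__()
                fused = 2 * channels
                self.gate = nn.Sequential(          # global scene context -> channel weights
                    nn.AdaptiveAvgPool2d(1),
                    nn.Conv2d(fused, fused // reduction, kernel_size=1),
                    nn.ReLU(inplace=True),
                    nn.Conv2d(fused // reduction, fused, kernel_size=1),
                    nn.Sigmoid(),
                )
                self.project = nn.Conv2d(fused, channels, kernel_size=1)

            def forward(self, visible, thermal):
                x = torch.cat([visible, thermal], dim=1)   # concatenation fusion
                x = x * self.gate(x)                       # scene-aware reweighting
                return self.project(x)

        fusion = SceneAwareConcatFusion(channels=256)
        out = fusion(torch.randn(1, 256, 64, 80), torch.randn(1, 256, 64, 80))
        print(out.shape)    # torch.Size([1, 256, 64, 80])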

    Adaptive distributed detection with applications to cellular CDMA

    Chair and Varshney have derived an optimal rule for fusing decisions based on the Bayesian criterion. To implement the rule, the probabilities of detection (PD) and false alarm (PF) for each detector must be known, which is not readily available in practice. This dissertation presents an adaptive fusion model which estimates PD and PF adaptively by a simple counting process. Since reference signals are not given, the decision of a local detector is arbitrated by the fused decision of all the other local detectors. Adaptive algorithms for both equally probable and unequally probable sources, and for independent and correlated observations, are developed and analyzed. The convergence and error analysis of the system are analytically proven and demonstrated by simulations. In addition, this dissertation analyzes the performance of four practical fusion rules in both independent and correlated Gaussian noise and compares them in terms of their Receiver Operating Characteristics (ROCs). Various factors that affect the fusion performance are considered in the analysis. By varying the local decision thresholds, the ROCs under the influence of the number of sensors, the signal-to-noise ratio (SNR), the deviation of local decision probabilities, and the correlation coefficient are computed and plotted. Several interesting and key observations on the performance of fusion rules are drawn from the analysis. As an application of the above theory, a decentralized or distributed scheme in which each fusion center is connected with three widely spaced base stations is proposed for digital cellular code-division multiple-access communications. Detected results at each base station are transmitted to the fusion center, where the final decision is made by optimal fusion. The theoretical analysis shows that this novel structure can achieve an error probability at the fusion center which is always less than or equal to the minimum of the error probabilities of the three individual base stations. The performance comparison for binary coherent signaling in Rayleigh fading and log-normal shadowing demonstrates that the decentralized detection provides a significantly increased system capacity over conventional macro selection diversity. This dissertation also analyzes the performance of the adaptive fusion method for macroscopic diversity combination in the wireless cellular environment when the error probability information from each base station detection is not available. The performance analysis includes the derivation of the minimum achievable error probability. An alternative, lower-complexity realization of the optimal fusion scheme using selection diversity is also proposed. The selection of the information bit in this realization is obtained either from the most reliable base station or through the majority rule over the participating base stations.
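    The Chair-Varshney rule referenced above has a standard closed form: each local decision contributes a log-likelihood weight built from its PD and PF, and the fused decision compares the weighted sum plus the log prior ratio to zero. The sketch below shows that textbook form together with a much-simplified counting estimate of PD and PF; the dissertation's adaptive arbitration scheme is more involved than this illustration.

        import numpy as np

        def chair_varshney(u, pd, pf, p1=0.5):
            # u: binary local decisions; pd, pf: per-detector detection / false-alarm rates.
            u, pd, pf = map(np.asarray, (u, pd, pf))
            llr = np.where(u == 1, np.log(pd / pf), np.log((1 - pd) / (1 - pf)))
            return int(llr.sum() + np.log(p1 / (1 - p1)) > 0)   # 1 -> decide H1

        def counting_estimates(decisions, fused_labels):
            # Crude counting estimate of one detector's PD and PF, using the fused
            # decisions of the other detectors as a stand-in for the true labels.
            decisions, fused_labels = np.asarray(decisions), np.asarray(fused_labels)
            pd = decisions[fused_labels == 1].mean()
            pf = decisions[fused_labels == 0].mean()
            return pd, pf

        print(chair_varshney(u=[1, 1, 0], pd=[0.9, 0.8, 0.7], pf=[0.1, 0.2, 0.3]))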

    An efficient adaptive fusion scheme for multifocus images in wavelet domain using statistical properties of neighborhood

    In this paper we present a novel fusion rule which can efficiently fuse multifocus images in the wavelet domain by taking a weighted average of pixels. The weights are decided adaptively using the statistical properties of the neighborhood. The main idea is that the eigenvalue of the unbiased estimate of the covariance matrix of an image block depends on the strength of edges in the block and thus makes a good choice for the weight given to a pixel, assigning more weight to pixels with sharper neighborhoods. The performance of the proposed method has been extensively tested on several pairs of multifocus images and also compared quantitatively with various existing methods with the help of well-known parameters, including the Petrovic and Xydeas image fusion metric. Experimental results show that performance evaluation based on entropy, gradient, contrast, or deviation, the criteria widely used for fusion analysis, may not be enough. This work demonstrates that in some cases these evaluation criteria are not consistent with the ground truth. It also demonstrates that the Petrovic and Xydeas image fusion metric is a more appropriate criterion, as it correlates with the ground truth as well as visual quality in all the tested fused images. The proposed fusion rule significantly improves contrast information while preserving edge information. The major achievement of the work is that it significantly increases the quality of the fused image, both visually and in terms of quantitative parameters, especially sharpness, with minimum fusion artifacts.
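    The weighting idea can be sketched as follows: for each coefficient, the weight is the largest eigenvalue of the unbiased covariance estimate of its neighbourhood, and the two source images' wavelet subbands are fused as a weighted average. The window size, wavelet, and helper names below are assumptions; the sketch uses PyWavelets and is not the paper's exact implementation.

        import numpy as np
        import pywt

        def eig_weight(img, i, j, half=3):
            # Largest eigenvalue of the unbiased covariance estimate of a local block.
            block = img[max(i - half, 0):i + half + 1, max(j - half, 0):j + half + 1]
            cov = np.atleast_2d(np.cov(block, bias=False))
            return float(np.linalg.eigvalsh(cov).max())

        def fuse_subband(a, b):
            # Weighted average of two subbands, weights from neighbourhood eigenvalues.
            fused = np.empty_like(a)
            for i in range(a.shape[0]):
                for j in range(a.shape[1]):
                    wa, wb = eig_weight(a, i, j), eig_weight(b, i, j)
                    s = wa + wb
                    fused[i, j] = a[i, j] if s == 0 else (wa * a[i, j] + wb * b[i, j]) / s
            return fused

        def fuse_multifocus(img1, img2, wavelet="db2"):
            cA1, details1 = pywt.dwt2(img1, wavelet)
            cA2, details2 = pywt.dwt2(img2, wavelet)
            cA = fuse_subband(cA1, cA2)
            details = tuple(fuse_subband(d1, d2) for d1, d2 in zip(details1, details2))
            return pywt.idwt2((cA, details), wavelet)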

    Multimodal Fusion With Reference: Searching for Joint Neuromarkers of Working Memory Deficits in Schizophrenia

    Multimodal fusion is an effective approach to take advantage of cross-information among multiple imaging datasets to better understand brain diseases. However, most current fusion approaches are blind, without adopting any prior information. To date, there is increasing interest in uncovering the neurocognitive mapping of specific behavioral measurements onto enriched brain imaging data; hence, a supervised, goal-directed model that enables a priori information to be used as a reference to guide multimodal data fusion is needed and is a natural option. Here we propose a fusion-with-reference model, called “multi-site canonical correlation analysis with reference plus joint independent component analysis” (MCCAR+jICA), which can precisely identify co-varying multimodal imaging patterns closely related to reference information, such as cognitive scores. In a 3-way fusion simulation, the proposed method was compared with its alternatives on the estimation accuracy of both target component decomposition and modality linkage detection; MCCAR+jICA outperforms the others with higher precision. In human imaging data, working memory performance was utilized as a reference to investigate the covarying functional and structural brain patterns among 3 modalities and how they are impaired in schizophrenia. Two independent cohorts (294 and 83 subjects, respectively) were used. Interestingly, similar brain maps were identified between the two cohorts, with substantial overlap in the executive control networks in fMRI, the salience network in sMRI, and major white matter tracts in dMRI. These regions have been linked with working memory deficits in schizophrenia in multiple reports, and MCCAR+jICA further verified them in a repeatable, joint manner, demonstrating the potential of such results to identify neuromarkers for mental disorders.
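    A much-simplified, reference-guided stand-in for the MCCAR+jICA idea (an assumption for illustration, not the authors' algorithm) is sketched below: features from the three modalities are concatenated per subject, decomposed with a joint ICA, and the components are ranked by how strongly their subject-wise expressions correlate with the reference score (here, working memory performance).

        import numpy as np
        from sklearn.decomposition import FastICA

        def reference_guided_jica(fmri, smri, dmri, reference, n_components=10, seed=0):
            # Each modality: (n_subjects, n_features); reference: (n_subjects,) scores.
            joint = np.hstack([fmri, smri, dmri])          # subjects x concatenated features
            ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
            subject_expr = ica.fit_transform(joint)        # (n_subjects, n_components)
            corrs = np.array([np.corrcoef(subject_expr[:, k], reference)[0, 1]
                              for k in range(n_components)])
            order = np.argsort(-np.abs(corrs))             # most reference-related first
            return order, corrs

        # Toy usage on random data with assumed sizes.
        rng = np.random.default_rng(0)
        n = 120
        order, corrs = reference_guided_jica(rng.normal(size=(n, 300)),
                                             rng.normal(size=(n, 300)),
                                             rng.normal(size=(n, 300)),
                                             reference=rng.normal(size=n))
        print(order[:3], corrs[order[:3]])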

    Isomer: Isomerous Transformer for Zero-shot Video Object Segmentation

    Recent leading zero-shot video object segmentation (ZVOS) works are devoted to integrating appearance and motion information by elaborately designing feature fusion modules and identically applying them at multiple feature stages. Our preliminary experiments show that, with the strong long-range dependency modeling capacity of the Transformer, simply concatenating the two modality features and feeding them to vanilla Transformers for feature fusion can distinctly benefit the performance, but at the cost of heavy computation. Through further empirical analysis, we find that attention dependencies learned in the Transformer at different stages exhibit completely different properties: global query-independent dependency in the low-level stages and semantic-specific dependency in the high-level stages. Motivated by these observations, we propose two Transformer variants: (i) the Context-Sharing Transformer (CST), which learns global-shared contextual information within image frames with lightweight computation; (ii) the Semantic Gathering-Scattering Transformer (SGST), which models the semantic correlation separately for the foreground and background and reduces the computation cost with a soft token merging mechanism. We apply CST and SGST for low-level and high-level feature fusion, respectively, formulating a level-isomerous Transformer framework for the ZVOS task. Compared with the baseline that uses vanilla Transformers for multi-stage fusion, ours significantly increases the speed by 13 times and achieves new state-of-the-art ZVOS performance. Code is available at https://github.com/DLUT-yyc/Isomer.
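    The vanilla-Transformer fusion baseline described in the abstract (not the proposed CST/SGST blocks) can be sketched in a few lines of PyTorch: appearance and motion tokens are concatenated and passed through a standard Transformer encoder, which is where the heavy quadratic attention cost comes from. Dimensions and token counts are illustrative assumptions.

        import torch
        import torch.nn as nn

        class VanillaFusionBaseline(nn.Module):
            """Concatenate appearance/motion tokens and fuse with a plain Transformer."""

            def __init__(self, dim=256, heads=8, layers=2):
                super().__init__()
                layer = nn.TransformerEncoderLayer(d_model=dim, nhead=heads,
                                                   batch_first=True)
                self.encoder = nn.TransformerEncoder(layer, num_layers=layers)

            def forward(self, appearance, motion):
                # appearance, motion: (batch, tokens, dim), e.g. flattened feature maps
                tokens = torch.cat([appearance, motion], dim=1)   # concatenate modalities
                fused = self.encoder(tokens)                      # quadratic in token count
                return fused[:, :appearance.size(1)]              # keep appearance positions

        fusion = VanillaFusionBaseline()
        out = fusion(torch.randn(2, 1024, 256), torch.randn(2, 1024, 256))
        print(out.shape)    # torch.Size([2, 1024, 256])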