
    Multiple Instance Choquet Integral for multiresolution sensor fusion

    Imagine you are traveling to Columbia, MO for the first time. On your flight to Columbia, the woman sitting next to you recommended a bakery by a large park with a big yellow umbrella outside. After you land, you need directions to the hotel from the airport. If you are driving a rental car, you will need to park it in a parking lot or a parking structure. After a good night's sleep in the hotel, you may decide to go for a run in the morning on the closest trail and stop by that recommended bakery under a big yellow umbrella. It would be helpful in the course of completing all these tasks to accurately distinguish the proper car route and walking trail, find a parking lot, and pinpoint the yellow umbrella. Satellite imagery and other geo-tagged data such as Open Street Maps provide effective information for this goal. Open Street Maps can provide road information and suggest bakeries within a five-mile radius. The yellow umbrella has a distinctive color and, perhaps, is made of a distinctive material that can be identified from a hyperspectral camera. Open Street Maps polygons are tagged with information such as "parking lot" and "sidewalk." All of this information can and should be fused to help identify and offer better guidance on the tasks you are completing. Supervised learning methods generally require precise labels for each training data point. It is hard (and probably costly) to manually go through and label every pixel in the training imagery. GPS coordinates cannot always be fully trusted, since a GPS device may only be accurate to within several pixels. In many cases, it is practically infeasible to obtain accurate pixel-level training labels to perform fusion for all the imagery and maps available. Moreover, the training data may come in a variety of data types, such as imagery or a 3D point cloud. The imagery may have different resolutions, scales, and even coordinate systems. Previous fusion methods are generally limited to data mapped to the same pixel grid and require accurate labels. Furthermore, most fusion methods are restricted to only two sources, even if certain methods, such as pan-sharpening, can deal with different geo-spatial types or data of different resolutions. It is therefore necessary and important to develop a way to perform fusion on multiple sources of imagery and map data, possibly with different resolutions and geo-spatial types, while accounting for uncertain labels. I propose a Multiple Instance Choquet Integral framework for multi-resolution, multi-sensor fusion with uncertain training labels. The Multiple Instance Choquet Integral (MICI) framework addresses uncertain training labels and performs both classification and regression. Three classifier fusion models, i.e., the noisy-or, min-max, and generalized-mean models, are derived under MICI. The Multi-Resolution Multiple Instance Choquet Integral (MR-MICI) framework is built upon the MICI framework and further addresses differing resolutions among the fusion sources in addition to the uncertainty in training labels. For both MICI and MR-MICI, a monotonic, normalized fuzzy measure is learned and used with the Choquet integral to perform two-class classifier fusion given bag-level training labels. An optimization scheme based on an evolutionary algorithm is used to train the proposed models. For regression problems where the desired prediction is real-valued, the primary instance assumption is adopted. 
The algorithms are applied to target detection, regression, and scene understanding applications. Experiments are conducted on the fusion of remote sensing data (hyperspectral and LiDAR) over the campus of the University of Southern Mississippi - Gulfpark. Cloth panel sub-pixel and super-pixel targets were placed on campus with varying levels of occlusion, and the proposed algorithms can successfully detect the targets in the scene. A semi-supervised approach is developed to automatically generate training labels based on data from Google Maps, Google Earth, and Open Street Map. Based on such training labels with uncertainty, the proposed algorithms can also identify materials on campus for scene understanding, such as roads, buildings, and sidewalks. In addition, the algorithms are used for weed detection and real-valued crop yield prediction experiments based on remote sensing data that can provide information for agricultural applications.
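    The aggregation step at the core of both MICI and MR-MICI is the discrete Choquet integral with respect to a monotonic, normalized fuzzy measure. As a minimal sketch of that aggregation step alone (the bag-level label handling and the evolutionary optimization described above are not shown), the code below computes the Choquet integral of a set of classifier confidences; the fuzzy measure values in the example are hypothetical, not values learned by the framework.

```python
import numpy as np

def choquet_integral(confidences, fm):
    """Discrete Choquet integral of classifier confidences w.r.t. a fuzzy measure.

    confidences : 1-D array, one confidence value per source.
    fm          : dict mapping frozenset of source indices -> measure value,
                  with fm[frozenset()] = 0 and fm[all sources] = 1
                  (normalized and monotonic).
    """
    order = np.argsort(confidences)[::-1]        # sources by descending confidence
    result, prev = 0.0, 0.0
    subset = frozenset()
    for idx in order:
        subset = subset | {idx}                  # grow the chain of "best-so-far" sources
        g = fm[subset]
        result += confidences[idx] * (g - prev)  # weight each confidence by the measure increment
        prev = g
    return result

# Hypothetical example: three sources and a hand-picked normalized fuzzy measure.
fm = {
    frozenset(): 0.0,
    frozenset({0}): 0.4, frozenset({1}): 0.3, frozenset({2}): 0.2,
    frozenset({0, 1}): 0.8, frozenset({0, 2}): 0.6, frozenset({1, 2}): 0.5,
    frozenset({0, 1, 2}): 1.0,
}
print(choquet_integral(np.array([0.9, 0.4, 0.6]), fm))   # -> 0.64
```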

    Efficient Data Driven Multi Source Fusion

    Data/information fusion is an integral component of many existing and emerging applications, e.g., remote sensing, smart cars, the Internet of Things (IoT), and Big Data, to name a few. While fusion aims to achieve better results than what any one individual input can provide, often the challenge is to determine the underlying mathematics for aggregation suitable for an application. In this dissertation, I focus on the following three aspects of aggregation: (i) efficient data-driven learning and optimization, (ii) extensions and new aggregation methods, and (iii) feature- and decision-level fusion for machine learning with applications to signal and image processing. The Choquet integral (ChI), a powerful nonlinear aggregation operator, is a parametric way (with respect to the fuzzy measure (FM)) to generate a wealth of aggregation operators. The FM has 2^N variables and N(2^(N-1)) constraints for N inputs. As a result, learning the ChI parameters from data quickly becomes impractical for most applications. Herein, I propose a scalable learning procedure (which is linear with respect to training sample size) for the ChI that identifies and optimizes only data-supported variables. As such, the computational complexity of the learning algorithm is proportional to the complexity of the solver used. This method also includes an imputation framework to obtain scalar values for data-unsupported (aka missing) variables and a compression algorithm (lossy or lossless) for the learned variables. I also propose a genetic algorithm (GA) to optimize the ChI for non-convex, multi-modal, and/or analytical objective functions. This algorithm introduces two operators that automatically preserve the constraints; therefore, there is no need to explicitly enforce the constraints, as is required by traditional GAs. In addition, this algorithm provides an efficient representation of the search space with the minimal set of vertices. Furthermore, I study different strategies for extending the fuzzy integral to missing data, and I propose a goal programming framework to aggregate inputs from heterogeneous sources for ChI learning. Last, my work in remote sensing involves visual-clustering-based band group selection and Lp-norm multiple kernel learning based feature-level fusion in hyperspectral image processing to enhance pixel-level classification.
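    One way to picture the "data-supported variables" mentioned above: each training sample only touches the chain of N nested subsets induced by its sort order, so across M samples at most M*N of the 2^N fuzzy-measure values ever enter the Choquet integral. The sketch below is a simplified reading of that idea rather than the dissertation's actual procedure; it only enumerates the touched subsets.

```python
import numpy as np

def data_supported_subsets(X):
    """Enumerate the fuzzy-measure variables actually touched by a data set.

    X : (M, N) array of M training samples over N sources. Each sample's
    descending sort order induces a chain of N nested subsets; only the
    measure values on those chains enter the Choquet integral for that sample.
    """
    touched = set()
    for sample in X:
        order = np.argsort(sample)[::-1]
        subset = frozenset()
        for idx in order:
            subset = subset | {idx}
            touched.add(subset)
    return touched

rng = np.random.default_rng(0)
X = rng.random((100, 8))                      # 100 samples, 8 sources
supported = data_supported_subsets(X)
print(f"{len(supported)} of {2**8 - 1} non-empty subsets are data-supported")
```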

    Bridging the semantic gap in content-based image retrieval.

    To manage large image databases, Content-Based Image Retrieval (CBIR) emerged as a new research subject. CBIR involves the development of automated methods to use visual features in searching and retrieving images. Unfortunately, the performance of most CBIR systems is inherently constrained by the low-level visual features because they cannot adequately express the user's high-level concepts. This is known as the semantic gap problem. This dissertation introduces a new approach to CBIR that attempts to bridge the semantic gap. Our approach includes four components. The first one learns a multi-modal thesaurus that associates low-level visual profiles with high-level keywords. This is accomplished through image segmentation, feature extraction, and clustering of image regions. The second component uses the thesaurus to annotate images in an unsupervised way. This is accomplished through fuzzy membership functions to label new regions based on their proximity to the profiles in the thesaurus. The third component consists of an efficient and effective method for fusing the retrieval results from the multi-modal features. Our method is based on learning and adapting fuzzy membership functions to the distribution of the features' distances and assigning a degree of worthiness to each feature. The fourth component provides the user with the option to perform hybrid querying and query expansion. This allows the enrichment of a visual query with textual data extracted from the automatically labeled images in the database. The four components are integrated into a complete CBIR system that can run in three different and complementary modes. The first mode allows the user to query using an example image. The second mode allows the user to specify positive and/or negative sample regions that should or should not be included in the retrieved images. The third mode uses a Graphical Text Interface to allow the user to browse the database interactively using a combination of low-level features and high-level concepts. The proposed system and all of its components and modes are implemented and validated using a large data collection for accuracy, performance, and improvement over traditional CBIR techniques.
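    As a toy illustration of the second component's idea (labeling a new region by its fuzzy proximity to the visual profiles in the thesaurus), the sketch below assigns membership degrees with a distance-based membership function. The profiles, the Gaussian form of the membership, and the bandwidth are assumptions for illustration, not the dissertation's exact formulation.

```python
import numpy as np

def fuzzy_label_memberships(region_feature, profiles, sigma=1.0):
    """Membership degree of a region in each keyword profile of the thesaurus.

    region_feature : 1-D visual feature vector of a segmented region.
    profiles       : dict keyword -> profile feature vector (cluster centroid).
    sigma          : bandwidth of the (assumed) Gaussian membership function.
    """
    memberships = {}
    for keyword, profile in profiles.items():
        d = np.linalg.norm(region_feature - profile)        # proximity to the profile
        memberships[keyword] = float(np.exp(-(d ** 2) / (2 * sigma ** 2)))
    total = sum(memberships.values())
    if total > 0:                                            # normalize so degrees sum to 1
        memberships = {k: v / total for k, v in memberships.items()}
    return memberships

# Hypothetical two-keyword thesaurus over a 2-D feature space.
profiles = {"sky": np.array([0.1, 0.8]), "grass": np.array([0.7, 0.3])}
print(fuzzy_label_memberships(np.array([0.2, 0.7]), profiles))
```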

    Beyond Value-at-Risk: GlueVaR Distortion Risk Measures

    We propose a new family of risk measures, called GlueVaR, within the class of distortion risk measures. Analytical closed-form expressions are shown for the most frequently used distribution functions in financial and insurance applications. The relationship between GlueVaR, Value-at-Risk (VaR) and Tail Value-at-Risk (TVaR) is explained. Tail-subadditivity is investigated and it is shown that some GlueVaR risk measures satisfy this property. An interpretation in terms of risk attitudes is provided, and the applicability to non-financial problems such as health, safety, environmental or catastrophic risk management is discussed.
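    The relationship with VaR and TVaR mentioned above can be made concrete: a GlueVaR measure can be expressed as a weighted combination of TVaR at two confidence levels and VaR at one. The sketch below estimates such a combination on a simulated loss sample; the confidence levels and weights are illustrative choices, not values prescribed by the paper, and the true weights follow from the distortion-function parameters.

```python
import numpy as np

def var(losses, level):
    """Empirical Value-at-Risk: the `level`-quantile of the loss sample."""
    return np.quantile(losses, level)

def tvar(losses, level):
    """Empirical Tail Value-at-Risk: mean loss at or beyond the VaR at `level`."""
    q = var(losses, level)
    return losses[losses >= q].mean()

def gluevar(losses, alpha, beta, w1, w2, w3):
    """GlueVaR as a weighted combination of TVaR_beta, TVaR_alpha and VaR_alpha.

    w1 + w2 + w3 should equal 1; the specific weights here are illustrative.
    """
    return w1 * tvar(losses, beta) + w2 * tvar(losses, alpha) + w3 * var(losses, alpha)

rng = np.random.default_rng(1)
losses = rng.lognormal(mean=0.0, sigma=1.0, size=100_000)   # simulated loss sample
print(gluevar(losses, alpha=0.95, beta=0.995, w1=0.3, w2=0.3, w3=0.4))
```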

    EXPLAINABLE FEATURE- AND DECISION-LEVEL FUSION

    Information fusion is the process of aggregating knowledge from multiple data sources to produce more consistent, accurate, and useful information than any one individual source can provide. In general, there are three primary sources of data/information: humans, algorithms, and sensors. Typically, objective data (e.g., measurements) arise from sensors. Using these data sources, applications such as computer vision and remote sensing have long been applying fusion at different levels (signal, feature, decision, etc.). Furthermore, daily advancements in engineering technologies like smart cars, which operate in complex and dynamic environments using multiple sensors, are raising both the demand for and the complexity of fusion. There is a great need to discover new theories to combine and analyze heterogeneous data arising from one or more sources. The work collected in this dissertation addresses the problem of feature- and decision-level fusion. Specifically, this work focuses on fuzzy Choquet integral (ChI)-based data fusion methods. Most mathematical approaches for data fusion have focused on combining inputs under the assumption of independence between them. However, often there are rich interactions (e.g., correlations) between inputs that should be exploited. The ChI is a powerful aggregation tool that is capable of modeling these interactions. Consider the fusion of m sources, where there are 2^m unique subsets (interactions); the ChI is capable of learning the worth of each of these possible source subsets. However, the complexity of fuzzy-integral-based methods grows quickly, as the number of trainable parameters for the fusion of m sources scales as 2^m. Hence, we require a large amount of training data to avoid the problem of over-fitting. This work addresses the over-fitting problem of ChI-based data fusion with novel regularization strategies. These regularization strategies alleviate the issue of over-fitting while training with limited data and also enable the user to consciously push the learned methods toward a predefined, or perhaps known, structure. Also, the existing methods for training the ChI for decision- and feature-level data fusion involve quadratic programming (QP). The QP-based approach for learning ChI-based data fusion solutions has a high space complexity. This has limited the practical application of ChI-based data fusion methods to six or fewer input sources. To address the space complexity issue, this work introduces an online training algorithm for learning the ChI. The online method is an iterative gradient descent approach that processes one observation at a time, enabling the application of ChI-based data fusion to higher-dimensional data sets. In many real-world data fusion applications, it is imperative to have an explanation or interpretation. This may include providing information on what was learned, what the worth of individual sources is, why a decision was reached, what evidence and process(es) were used, and what confidence the system has in its decision. However, most existing machine learning solutions for data fusion are black boxes, e.g., deep learning. In this work, we designed methods and metrics that help with answering these questions of interpretation, and we also developed visualization methods that help users better understand the machine learning solution and its behavior for different instances of data.
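    The online training idea described above (processing one observation at a time instead of solving a large QP) can be sketched as follows. This is a simplified stochastic-gradient illustration under a squared-error loss, not the dissertation's exact algorithm: each observation updates only the fuzzy-measure values on its sort-order chain, values are merely clipped to [0, 1], the cardinality-based initialization is an assumption, and the monotonicity enforcement and regularizers described above are omitted for brevity.

```python
import numpy as np

def chain(sample):
    """Nested subsets induced by the sample's descending sort order."""
    subsets, current = [], frozenset()
    for idx in np.argsort(sample)[::-1]:
        current = current | {idx}
        subsets.append(current)
    return subsets

def online_chi_fit(X, y, epochs=50, lr=0.05):
    """One-observation-at-a-time gradient descent on the ChI's fuzzy measure.

    For a fixed sort order the ChI is linear in the touched measure values,
    so the squared-error gradient only involves the values on the chain.
    """
    n = X.shape[1]
    fm = {frozenset(): 0.0}                        # measure values, filled lazily
    for _ in range(epochs):
        for sample, target in zip(X, y):
            subsets = chain(sample)
            sorted_vals = np.sort(sample)[::-1]
            # Telescoped ChI: sum_k coeffs[k] * g(A_k), with coeffs[k] = h_(k) - h_(k+1).
            coeffs = np.append(sorted_vals[:-1] - sorted_vals[1:], sorted_vals[-1])
            g = np.array([fm.setdefault(s, len(s) / n) for s in subsets])  # assumed init
            err = float(coeffs @ g - target)
            g = np.clip(g - lr * err * coeffs, 0.0, 1.0)   # gradient step + box clip
            for s, val in zip(subsets, g):
                fm[s] = float(val)
    return fm

# Toy usage: four sources whose target is their arithmetic mean.
rng = np.random.default_rng(2)
X = rng.random((200, 4))
y = X.mean(axis=1)
fm = online_chi_fit(X, y)
print(fm[frozenset(range(4))])        # measure of the full set, expected near 1
```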

    Quantitative risk assessment, aggregation functions and capital allocation problems

    This work is focused on the study of risk measures and solutions to capital allocation problems, their suitability for answering practical questions in the framework of insurance and financial institutions, and their connection with a family of functions known as aggregation operators. These operators are well known among researchers from the information sciences and the fuzzy sets and systems communities. The first contribution of this dissertation is the introduction of GlueVaR risk measures, a family belonging to the more general class of distortion risk measures. GlueVaR risk measures are simple to understand for risk managers in the financial and insurance sectors, because they are based on the most popular risk measures (VaR and TVaR) in both industries. For the same reason, they are almost as easy to compute as those common risk measures and, moreover, GlueVaR risk measures make it possible to capture more intricate managerial and regulatory attitudes towards risk. The definition of the tail-subadditivity property for a pair of risks may be considered the second contribution. A distortion risk measure which satisfies this property has the ability to be subadditive in extremely adverse scenarios. In order to decide whether a GlueVaR risk measure is a candidate to satisfy the tail-subadditivity property, conditions on its parameters are determined. It is shown that distortion risk measures and several ordered weighted averaging operators in the discrete finite case are mathematically linked by means of the Choquet integral. It is also shown that the overall aggregation preference of the expert may be measured by means of the local degree of orness of the distortion risk measure, a concept carried over from the information sciences community into the quantitative risk management community. New indicators for helping to characterize the discrete Choquet integral are also presented in this dissertation. The aim is to complement those already available in order to highlight particular features of this kind of aggregation function. In this spirit, the degree of balance, the divergence, the variance indicator, and Rényi entropies are introduced as indicators within the framework of the Choquet integral. A major contribution derived from the relationship between distortion risk measures and aggregation operators is the characterization of the risk attitude implicit in the choice of a distortion risk measure and a confidence or tolerance level. It is pointed out that the risk attitude implicit in a distortion risk measure is to some extent contained in its distortion function. In order to describe some relevant features of the distortion function, the degree of orness indicator and a quotient function are used. It is shown that these mathematical devices give insight into the implicit risk behavior involved in risk measures and lead to the definitions of overall, absolute, and specific risk attitudes. Regarding capital allocation problems, a list of key elements to delimit these problems is provided, and two main contributions are made. First, it is shown that GlueVaR risk measures are as useful as alternatives such as VaR or TVaR for solving capital allocation problems. The second contribution is understanding capital allocation principles as compositional data. 
This interpretation of capital allocation principles allows the connection between aggregation operators and capital allocation problems, with an immediate practical application: properly averaging several available solutions to the same capital allocation problem. This thesis contains some preliminary ideas on this connection, but it seems to be a promising research field.
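    The degree of orness mentioned above reduces, in the discrete equiprobable case, to Yager's classical orness of the OWA weights induced by the distortion function. A minimal sketch under that reading, assuming the weights are obtained as increments of the distortion function over a uniform grid and apply to outcomes ordered from largest to smallest (the ordering convention and any "local" refinement used in the thesis are not reproduced here):

```python
import numpy as np

def owa_weights_from_distortion(g, n):
    """OWA weights induced by a distortion function g on n equiprobable outcomes."""
    grid = np.arange(n + 1) / n
    return np.diff(g(grid))            # w_i = g(i/n) - g((i-1)/n)

def degree_of_orness(w):
    """Yager's degree of orness of an OWA weight vector (1 = max, 0 = min, 0.5 = mean)."""
    n = len(w)
    return float(sum((n - i) * w[i - 1] for i in range(1, n + 1)) / (n - 1))

# Example distortion: the TVaR distortion at confidence level 0.90, g(u) = min(u/0.1, 1).
tvar_distortion = lambda u: np.minimum(u / 0.1, 1.0)
w = owa_weights_from_distortion(tvar_distortion, n=20)
print(degree_of_orness(w))            # close to 1: a strongly tail-focused (or-like) attitude
```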

    A novel method based on extended uncertain 2-tuple linguistic Muirhead mean operators to MAGDM under uncertain 2-tuple linguistic environment

    The present work is focused on multi-attribute group decision-making (MAGDM) problems with uncertain 2-tuple linguistic information (ULI2-tuple), based on new aggregation operators that can capture interrelationships among attributes through a parameter vector P. To begin with, we present new uncertain 2-tuple linguistic Muirhead mean (MM) aggregation operators to handle MAGDM problems with ULI2-tuple, including the uncertain 2-tuple linguistic Muirhead mean (UL2-tuple-MM) operator and the uncertain 2-tuple linguistic weighted Muirhead mean (UL2-tuple-WMM) operator. In addition, we extend the UL2-tuple-WMM operator to a new aggregation operator, the extended uncertain 2-tuple linguistic weighted Muirhead mean (EUL2-tuple-WMM) operator, in order to handle decision-making problems in which the attribute values are expressed as ULI2-tuple and the attribute weights are also 2-tuple linguistic information. Furthermore, several properties of these new aggregation operators are established and some special cases are discussed. Moreover, we propose a new method to solve MAGDM problems with ULI2-tuple. Finally, a numerical example is given to show the validity of the proposed method, and its advantages are also analysed.
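    The parameter vector P mentioned above comes from the classical (crisp) Muirhead mean, which averages a product of powered inputs over all permutations and thereby captures interrelationships among the aggregated arguments; the paper's operators apply this template to uncertain 2-tuple linguistic values, which is not reproduced here. A sketch of the crisp form only:

```python
from itertools import permutations
from math import prod, factorial

def muirhead_mean(a, p):
    """Classical Muirhead mean:
    MM^P(a) = ((1/n!) * sum over permutations sigma of prod_j a[sigma(j)]**p[j]) ** (1/sum(p)).

    The vector p models interrelationships among arguments: p = (1, 0, ..., 0)
    recovers the arithmetic mean, p = (1/n, ..., 1/n) the geometric mean.
    """
    n = len(a)
    total = sum(prod(a[s] ** pj for s, pj in zip(sigma, p))
                for sigma in permutations(range(n)))
    return (total / factorial(n)) ** (1 / sum(p))

print(muirhead_mean([0.6, 0.8, 0.5], [1, 1, 1]))   # symmetric interaction of all arguments
print(muirhead_mean([0.6, 0.8, 0.5], [2, 1, 0]))   # pairwise interactions weighted unevenly
```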