
    Efficient Data Driven Multi Source Fusion

    Data/information fusion is an integral component of many existing and emerging applications, e.g., remote sensing, smart cars, the Internet of Things (IoT), and Big Data, to name a few. While fusion aims to achieve better results than any one individual input can provide, the challenge is often to determine the underlying mathematics of aggregation suitable for an application. In this dissertation, I focus on the following three aspects of aggregation: (i) efficient data-driven learning and optimization, (ii) extensions and new aggregation methods, and (iii) feature- and decision-level fusion for machine learning, with applications to signal and image processing. The Choquet integral (ChI), a powerful nonlinear aggregation operator, is a parametric way (with respect to the fuzzy measure (FM)) to generate a wealth of aggregation operators. The FM has 2^N variables and N(2^(N-1) - 1) monotonicity constraints for N inputs. As a result, learning the ChI parameters from data quickly becomes impractical for most applications. Herein, I propose a scalable learning procedure for the ChI (linear with respect to training sample size) that identifies and optimizes only data-supported variables; the computational complexity of the learning algorithm is therefore proportional to the complexity of the solver used. This method also includes an imputation framework to obtain scalar values for data-unsupported (aka missing) variables and a compression algorithm (lossy or lossless) for the learned variables. I also propose a genetic algorithm (GA) to optimize the ChI for non-convex, multi-modal, and/or analytical objective functions. This algorithm introduces two operators that automatically preserve the constraints, so there is no need to enforce them explicitly as traditional GAs require. In addition, this algorithm provides an efficient representation of the search space with the minimal set of vertices. Furthermore, I study different strategies for extending the fuzzy integral to missing data, and I propose a goal programming framework to aggregate inputs from heterogeneous sources for ChI learning. Lastly, my work in remote sensing involves visual-clustering-based band group selection and Lp-norm multiple kernel learning based feature-level fusion in hyperspectral image processing to enhance pixel-level classification.
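
    To make the ChI's mechanics concrete, the following minimal Python sketch computes the discrete Choquet integral of a set of inputs with respect to a fuzzy measure; the three-source measure and input values are illustrative assumptions, not taken from the dissertation.

        def choquet_integral(h, g):
            """Discrete Choquet integral of inputs h (dict: source -> value)
            with respect to a fuzzy measure g (dict: frozenset -> measure)."""
            # Sort sources so that h(x_(1)) >= h(x_(2)) >= ...
            order = sorted(h, key=h.get, reverse=True)
            total, prev_g = 0.0, 0.0
            for i, src in enumerate(order):
                A = frozenset(order[: i + 1])      # growing coalition A_i
                total += h[src] * (g[A] - prev_g)  # h(x_(i)) * (g(A_i) - g(A_(i-1)))
                prev_g = g[A]
            return total

        # Illustrative 3-input FM: monotone, g(empty set) = 0, g(X) = 1
        g = {
            frozenset({"a"}): 0.3, frozenset({"b"}): 0.4, frozenset({"c"}): 0.2,
            frozenset({"a", "b"}): 0.8, frozenset({"a", "c"}): 0.5,
            frozenset({"b", "c"}): 0.6, frozenset({"a", "b", "c"}): 1.0,
        }
        print(choquet_integral({"a": 0.7, "b": 0.2, "c": 0.9}, g))  # 0.49

    Note that only the variables of g lying along sort orders actually observed in the data are ever touched, which is the observation behind the data-supported learning procedure described above.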

    Contribution to the Fusion of Biometric Modalities by the Choquet Integral


    Feature and Decision Level Fusion Using Multiple Kernel Learning and Fuzzy Integrals

    The work collected in this dissertation addresses the problem of data fusion. In other words, this is the problem of making decisions (also known as the problem of classification in the machine learning and statistics communities) when data from multiple sources are available, or when decisions/confidence levels from a panel of decision-makers are accessible. This problem has become increasingly important in recent years, especially with the ever-increasing popularity of autonomous systems outfitted with suites of sensors and the dawn of the "age of big data." While data fusion is a very broad topic, the work in this dissertation considers two very specific techniques: feature-level fusion and decision-level fusion. In general, the fusion methods proposed throughout this dissertation rely on kernel methods and fuzzy integrals. Both are very powerful tools; however, they also come with challenges, some of which are summarized below and addressed in this dissertation. Kernel methods for classification are a well-studied area in which data are implicitly mapped from a lower-dimensional space to a higher-dimensional space to improve classification accuracy. However, for most kernel methods, one must still choose a kernel to use for the problem. Since there is, in general, no way of knowing which kernel is best, multiple kernel learning (MKL) is a technique used to learn the aggregation of a set of valid kernels into a single (ideally) superior kernel. The aggregation can be done using weighted sums of the pre-computed kernels, but determining the summation weights is not a trivial task. Furthermore, MKL does not work well with large datasets because of limited storage space and prediction speed. These challenges are tackled by the introduction of many new algorithms in the following chapters. I also address MKL's storage and speed drawbacks, allowing MKL-based techniques to be applied to big data efficiently. Some algorithms in this work are based on the Choquet fuzzy integral, a powerful nonlinear aggregation operator parameterized by the fuzzy measure (FM). These decision-level fusion algorithms learn a fuzzy measure by minimizing a sum of squared error (SSE) criterion on a set of training data. The flexibility of the Choquet integral comes with a cost, however: given a set of N decision makers, the size of the FM the algorithm must learn is 2^N. This means the training data must be diverse enough to include 2^N independent observations, which is rarely encountered in practice. I address this in the following chapters via many different regularization functions, a popular technique in machine learning and statistics used to prevent overfitting and increase model generalization. Finally, it is worth noting that the aggregation behavior of the Choquet integral is not intuitive. I tackle this by proposing a quantitative visualization strategy that allows the FM and Choquet integral behavior to be shown simultaneously.
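
    As an illustration of the weighted-sum formulation of MKL described above, the Python sketch below combines a bank of precomputed kernels with fixed convex weights and trains an SVM on the result; the kernel choices, weights, and toy data are illustrative assumptions, since MKL algorithms learn the weights from data.

        import numpy as np
        from sklearn.metrics.pairwise import polynomial_kernel, rbf_kernel
        from sklearn.svm import SVC

        # Toy two-class data (illustrative only)
        rng = np.random.default_rng(0)
        X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(3, 1, (50, 2))])
        y = np.array([0] * 50 + [1] * 50)

        # A bank of valid (positive semi-definite) base kernels
        kernels = [rbf_kernel(X, gamma=0.5),
                   rbf_kernel(X, gamma=0.05),
                   polynomial_kernel(X, degree=2)]

        # A convex combination of PSD kernels is itself a valid kernel;
        # here the weights are fixed, whereas MKL would learn them.
        w = [0.5, 0.3, 0.2]
        K = sum(wi * Ki for wi, Ki in zip(w, kernels))

        clf = SVC(kernel="precomputed").fit(K, y)
        print(clf.score(K, y))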

    Use of aggregation functions in decision making

    A key component of many decision-making processes is the aggregation step, whereby a set of numbers is summarised with a single representative value. This research showed that aggregation functions provide a mathematical formalism for dealing with issues such as vagueness and uncertainty, which arise naturally in many decision contexts.
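
    As a concrete instance of such an aggregation step, the sketch below implements the ordered weighted averaging (OWA) operator in Python; the scores and weight vectors are illustrative and show how a single aggregation family spans the range from minimum to maximum.

        import numpy as np

        def owa(values, weights):
            """Ordered weighted averaging: weights apply to the *sorted*
            values, so one weight vector can model min, max, or mean."""
            v = np.sort(values)[::-1]  # descending order
            return float(np.dot(v, weights))

        scores = [0.6, 0.9, 0.3, 0.7]
        print(owa(scores, [1.0, 0.0, 0.0, 0.0]))  # max -> 0.9
        print(owa(scores, [0.25] * 4))            # arithmetic mean -> 0.625
        print(owa(scores, [0.0, 0.0, 0.0, 1.0]))  # min -> 0.3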

    Literature Review of the Recent Trends and Applications in various Fuzzy Rule based systems

    A fuzzy rule-based system (FRBS) is a rule-based system that uses linguistic fuzzy variables as antecedents and consequents to represent human-understandable knowledge. FRBSs have been applied to many applications and areas throughout the soft computing literature. However, they suffer from several drawbacks, such as uncertainty representation, large numbers of rules, loss of interpretability, and high computational cost of learning. To overcome these issues, many extensions of FRBSs exist. This paper presents an overview and literature review of recent trends in the prominent types and areas of FRBSs, namely genetic fuzzy systems (GFS), hierarchical fuzzy systems (HFS), neuro-fuzzy systems (NFS), evolving fuzzy systems (eFS), FRBSs for big data, FRBSs for imbalanced data, interpretability in FRBSs, and FRBSs that use cluster centroids as fuzzy rules. The review covers the years 2010-2021. The paper also highlights important contributions, publication statistics, and current trends in the field, and identifies several open research areas that need further attention from the FRBSs research community.
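
    To ground the definition above, here is a minimal zero-order Sugeno-style FRBS sketch in Python with two fuzzy rules over triangular memberships; the fan-controller framing, rules, and membership parameters are illustrative assumptions, not drawn from the paper.

        def tri(x, a, b, c):
            """Triangular membership function with support [a, c] and peak b."""
            return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

        # Rule 1: IF temperature IS hot  THEN speed IS high   (centroid 100)
        # Rule 2: IF temperature IS warm THEN speed IS medium (centroid 50)
        def fan_speed(temp):
            fire = [tri(temp, 25, 35, 45), tri(temp, 15, 25, 35)]  # firing strengths
            out = [100.0, 50.0]
            total = sum(fire)
            if total == 0:
                return 0.0  # no rule fires
            # Defuzzify: firing-strength-weighted average of rule outputs
            return sum(f * o for f, o in zip(fire, out)) / total

        print(fan_speed(30))  # both rules fire at 0.5 -> 75.0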

    Quantitative risk assessment, aggregation functions and capital allocation problems

    This work focuses on the study of risk measures and of solutions to capital allocation problems, on their suitability for answering practical questions in the framework of insurance and financial institutions, and on their connection with a family of functions named aggregation operators. These operators are well known among researchers from the information sciences and the fuzzy sets and systems communities. The first contribution of this dissertation is the introduction of GlueVaR risk measures, a family belonging to the more general class of distortion risk measures. GlueVaR risk measures are simple to understand for risk managers in the financial and insurance sectors, because they are based on the most popular risk measures (VaR and TVaR) in both industries. For the same reason, they are almost as easy to compute as those common risk measures; moreover, GlueVaR risk measures can capture more intricate managerial and regulatory attitudes towards risk. The definition of the tail-subadditivity property for a pair of risks may be considered the second contribution. A distortion risk measure that satisfies this property has the ability to be subadditive in extremely adverse scenarios. In order to decide whether a GlueVaR risk measure is a candidate to satisfy the tail-subadditivity property, conditions on its parameters are determined. It is shown that distortion risk measures and several ordered weighted averaging operators in the discrete finite case are mathematically linked by means of the Choquet integral. It is also shown that the overall aggregation preference of the expert may be measured by the local degree of orness of the distortion risk measure, a concept carried over from the information sciences community into quantitative risk management. New indicators to help characterize the discrete Choquet integral are also presented in this dissertation, with the aim of complementing those already available and highlighting particular features of this kind of aggregation function. In this spirit, the degree of balance, the divergence, the variance indicator, and Rényi entropies are introduced as indicators within the framework of the Choquet integral. A major contribution derived from the relationship between distortion risk measures and aggregation operators is the characterization of the risk attitude implicit in the choice of a distortion risk measure and a confidence or tolerance level. It is pointed out that the risk attitude implicit in a distortion risk measure is to some extent contained in its distortion function. In order to describe some relevant features of the distortion function, the degree of orness indicator and a quotient function are used. It is shown that these mathematical devices give insight into the implicit risk behavior involved in risk measures and lead to the definitions of overall, absolute, and specific risk attitudes. Regarding capital allocation problems, a list of key elements that delimit these problems is provided and two main contributions are made. First, it is shown that GlueVaR risk measures are as useful as alternatives like VaR or TVaR for solving capital allocation problems. The second contribution is understanding capital allocation principles as compositional data. This interpretation connects aggregation operators with capital allocation problems and has an immediate practical application: properly averaging several available solutions to the same capital allocation problem. This thesis contains some preliminary ideas on this connection, which appears to be a promising research field.
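
    To illustrate how GlueVaR builds on VaR and TVaR, the Python sketch below uses the linear-combination representation of GlueVaR reported in this literature (a weighted sum of TVaR at two confidence levels and VaR at the lower one); the weights, confidence levels, and simulated loss sample are illustrative assumptions.

        import numpy as np

        def var(losses, alpha):
            """Empirical Value-at-Risk at confidence level alpha."""
            return float(np.quantile(losses, alpha))

        def tvar(losses, alpha):
            """Empirical Tail Value-at-Risk: mean loss at or beyond VaR_alpha."""
            v = var(losses, alpha)
            return float(losses[losses >= v].mean())

        def gluevar(losses, alpha, beta, w1, w2):
            """GlueVaR as w1 * TVaR_beta + w2 * TVaR_alpha
            + (1 - w1 - w2) * VaR_alpha, with beta > alpha."""
            return (w1 * tvar(losses, beta) + w2 * tvar(losses, alpha)
                    + (1 - w1 - w2) * var(losses, alpha))

        losses = np.random.default_rng(1).lognormal(0.0, 1.0, size=10_000)
        print(gluevar(losses, alpha=0.95, beta=0.995, w1=0.3, w2=0.3))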