    Using Non-Additive Measure for Optimization-Based Nonlinear Classification

    Over the past few decades, numerous optimization-based methods have been proposed for solving the classification problem in data mining. Classic optimization-based methods do not consider attribute interactions in classification. Thus, a novel learning machine is needed to provide a better understanding of the nature of classification when the interactions among the contributions of various attributes cannot be ignored. These interactions can be described by a non-additive measure, while the Choquet integral can serve as the mathematical tool for aggregating the attribute values with the corresponding values of the non-additive measure. As the main part of this research, a new nonlinear classification method with non-additive measures is proposed. Experimental results show that applying non-additive measures to the classic optimization-based models improves classification robustness and accuracy compared with some popular classification methods. In addition, motivated by the well-known Support Vector Machine approach, we transform the primal optimization-based nonlinear classification model with the signed non-additive measure into its dual form by applying Lagrangian optimization theory and Wolfe's dual programming theory. As a result, the $2^n - 1$ parameters of the signed non-additive measure can now be approximated with m (the number of records) Lagrangian multipliers by applying the necessary optimality conditions of the primal classification problem. This method of parameter approximation is a breakthrough for solving a non-additive measure in practice when a relatively small number of training cases is available. Furthermore, the kernel-based learning method enables the nonlinear classifiers to achieve better classification accuracy. The research produces practically deliverable nonlinear models with the non-additive measure for classification problems in data mining when interactions among attributes are considered.
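
To make the aggregation step concrete, here is a minimal Python sketch of the discrete Choquet integral for nonnegative attribute values. It is an illustration under stated assumptions, not the classifier proposed in the abstract: the measure is stored as a dictionary keyed by attribute subsets, and the helper names `choquet_integral` and `additive_measure` are hypothetical. With an additive measure the integral reduces to a weighted sum; changing the value of a single subset models an interaction.

```python
from itertools import combinations


def choquet_integral(x, mu):
    """Discrete Choquet integral of nonnegative values x.

    mu maps a frozenset of attribute indices to its measure value.
    """
    order = sorted(range(len(x)), key=lambda i: x[i])  # ascending by value
    total, prev = 0.0, 0.0
    for k, i in enumerate(order):
        upper = frozenset(order[k:])  # attributes whose value is >= x[i]
        total += (x[i] - prev) * mu[upper]
        prev = x[i]
    return total


def additive_measure(weights):
    """Build an additive measure: each subset is worth the sum of its weights."""
    n = len(weights)
    return {
        frozenset(s): sum(weights[i] for i in s)
        for r in range(1, n + 1)
        for s in combinations(range(n), r)
    }


weights = [0.2, 0.3, 0.5]
x = [0.4, 0.1, 0.7]

# Additive case: same result as the weighted sum 0.2*0.4 + 0.3*0.1 + 0.5*0.7.
mu_add = additive_measure(weights)
additive_result = choquet_integral(x, mu_add)

# Non-additive case: attributes 0 and 2 together are worth more than 0.2 + 0.5.
mu_int = dict(mu_add)
mu_int[frozenset({0, 2})] = 0.9
interactive_result = choquet_integral(x, mu_int)
```

The dictionary holds one value per non-empty subset, which is where the $2^n - 1$ parameter count comes from.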

    Learning nonlinear monotone classifiers using the Choquet Integral

    In the recent past, learning predictive models that guarantee a monotone relationship between input and output variables has received growing attention in machine learning. Guaranteeing monotonicity is a major implementation challenge, especially for flexible nonlinear models. This thesis uses the Choquet integral as the mathematical foundation for developing new models for nonlinear classification tasks. Beyond its established use as a flexible aggregation function in multi-criteria decision making, the formalism thereby becomes an important tool for machine learning models. In addition to reconciling monotonicity and flexibility in a mathematically elegant way, the Choquet integral offers means of quantifying interactions between groups of input attributes, which yields interpretable models. The thesis develops concrete methods for learning with the Choquet integral that follow two different approaches: maximum likelihood estimation and structural risk minimization. While the first approach leads to a generalization of logistic regression, the second is realized by means of support vector machines. In both cases, the learning problem essentially reduces to identifying the parameters of fuzzy measures for the Choquet integral. The exponential number of degrees of freedom needed to model all attribute subsets poses particular challenges with regard to runtime complexity and generalization performance. Against this background, both approaches are evaluated in practice and analyzed theoretically. Suitable methods for complexity reduction and model regularization are also proposed and investigated. The experimental results are very good compared with state-of-the-art methods, even on demanding benchmark problems, and highlight the usefulness of combining monotonicity and flexibility through the Choquet integral in various machine learning approaches.

    An empirical study of statistical properties of Choquet and Sugeno integrals

    This paper investigates the statistical properties of the Choquet and Sugeno integrals, used as multiattribute models. The investigation is done on an empirical basis and focuses on two topics: the distribution of the output of these integrals when the input is corrupted with noise, and the robustness of these models when they are identified from a set of learning data through some learning procedure.

    A multi-attribute decision making procedure using fuzzy numbers and hybrid aggregators

    The classical Analytical Hierarchy Process (AHP) has two limitations. First, it disregards the uncertainty that is usually embedded in data or information expressed by humans. Second, it ignores interdependencies among attributes during aggregation. Fuzzy numbers help address the former issue, whereas the Choquet integral operator helps deal with the latter. However, applying fuzzy numbers to multi-attribute decision making (MADM) demands additional steps and inputs from the decision maker(s). Similarly, identifying the monotone measure weights before employing the Choquet integral requires a huge number of computational steps and a large amount of input from decision makers, especially as the number of attributes increases. Therefore, this research proposes a MADM procedure that reduces the number of computational steps and the amount of information required from decision makers when dealing with these two aspects simultaneously. To attain this primary goal, five phases were executed. First, the concept of fuzzy set theory and its application in AHP were investigated. Second, an analysis of aggregation operators was conducted. Third, the investigation was narrowed to the Choquet integral and its associated monotone measure. Subsequently, the proposed procedure was developed by combining five major components, namely Factor Analysis, Fuzzy-Linguistic Estimator, Choquet Integral, Mikhailov's Fuzzy AHP, and Simple Weighted Average. Finally, the feasibility of the proposed procedure was verified by solving a real MADM problem in which the image of three stores located in Sabak Bernam, Selangor, Malaysia, was analysed from the homemakers' perspective. This research has the potential to motivate more decision makers to simultaneously account for uncertainty in human data and interdependencies among attributes when solving MADM problems.

    Efficient Data Driven Multi Source Fusion

    Data/information fusion is an integral component of many existing and emerging applications, e.g., remote sensing, smart cars, the Internet of Things (IoT), and Big Data, to name a few. While fusion aims to achieve better results than any one individual input can provide, the challenge is often to determine the underlying mathematics for aggregation suitable for an application. In this dissertation, I focus on the following three aspects of aggregation: (i) efficient data-driven learning and optimization, (ii) extensions and new aggregation methods, and (iii) feature- and decision-level fusion for machine learning with applications to signal and image processing. The Choquet integral (ChI), a powerful nonlinear aggregation operator, is a parametric way (with respect to the fuzzy measure (FM)) to generate a wealth of aggregation operators. The FM has $2^N$ variables and $N(2^{N-1})$ constraints for N inputs. As a result, learning the ChI parameters from data quickly becomes impractical for most applications. Herein, I propose a scalable learning procedure (linear with respect to training sample size) for the ChI that identifies and optimizes only data-supported variables. As such, the computational complexity of the learning algorithm is proportional to the complexity of the solver used. This method also includes an imputation framework to obtain scalar values for data-unsupported (i.e., missing) variables and a compression algorithm (lossy or lossless) for the learned variables. I also propose a genetic algorithm (GA) to optimize the ChI for non-convex, multi-modal, and/or analytical objective functions. This algorithm introduces two operators that automatically preserve the constraints, so there is no need to enforce the constraints explicitly, as traditional GAs require. In addition, this algorithm provides an efficient representation of the search space with the minimal set of vertices. Furthermore, I study different strategies for extending the fuzzy integral to missing data, and I propose a goal programming framework to aggregate inputs from heterogeneous sources for ChI learning. Last, my work in remote sensing involves visual-clustering-based band group selection and Lp-norm multiple-kernel-learning-based feature-level fusion in hyperspectral image processing to enhance pixel-level classification.
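
The gap between the full parameter count and what a training set actually touches can be illustrated with a short Python sketch. It assumes that a variable counts as "data-supported" when the sort order of at least one training sample visits its subset; this is one reading of the abstract, not necessarily the dissertation's exact definition. The helper names `fm_sizes` and `data_supported_subsets` are hypothetical.

```python
def fm_sizes(n):
    """Return (number of FM variables, number of monotonicity constraints).

    Variables: one per subset of the n inputs, i.e. 2**n.
    Constraints: one per pair (A, A plus one extra input), i.e. n * 2**(n - 1).
    """
    return 2 ** n, n * 2 ** (n - 1)


def data_supported_subsets(samples):
    """Collect the subsets that the Choquet integral touches on the given samples.

    For one sample, sorting the inputs in descending order yields a chain of
    nested subsets; only the measure values on that chain enter the integral.
    """
    supported = set()
    for x in samples:
        order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
        for k in range(1, len(order) + 1):
            supported.add(frozenset(order[:k]))
    return supported


samples = [
    [0.9, 0.5, 0.1],  # order 0, 1, 2
    [0.8, 0.6, 0.2],  # same order, so no new subsets
    [0.1, 0.5, 0.9],  # order 2, 1, 0
]
n_vars, n_constraints = fm_sizes(3)
supported = data_supported_subsets(samples)
```

Each sample contributes at most N subsets, so the number of supported variables grows at most linearly with the number of samples, while the full measure grows exponentially with N.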


    Information fusion is the process of aggregating knowledge from multiple data sources to produce more consistent, accurate, and useful information than any one individual source can provide. In general, there are three primary sources of data/information: humans, algorithms, and sensors. Typically, objective data (e.g., measurements) arise from sensors. Using these data sources, applications such as computer vision and remote sensing have long applied fusion at different levels (signal, feature, decision, etc.). Furthermore, daily advances in engineering technologies such as smart cars, which operate in complex and dynamic environments using multiple sensors, are raising both the demand for and the complexity of fusion. There is a great need to discover new theories to combine and analyze heterogeneous data arising from one or more sources. The work collected in this dissertation addresses the problem of feature- and decision-level fusion. Specifically, this work focuses on fuzzy Choquet integral (ChI)-based data fusion methods. Most mathematical approaches to data fusion have focused on combining inputs under the assumption that they are independent. However, there are often rich interactions (e.g., correlations) between inputs that should be exploited. The ChI is a powerful aggregation tool that is capable of modeling these interactions. Consider the fusion of m sources, where there are $2^m$ unique subsets (interactions); the ChI is capable of learning the worth of each of these possible source subsets. However, the complexity of fuzzy integral-based methods grows quickly, as the number of trainable parameters for the fusion of m sources scales as $2^m$. Hence, a large amount of training data is required to avoid over-fitting. This work addresses the over-fitting problem of ChI-based data fusion with novel regularization strategies. These regularization strategies alleviate over-fitting when training with limited data and also enable the user to consciously push the learned models toward a predefined, or perhaps known, structure. Also, the existing methods for training the ChI for decision- and feature-level data fusion involve quadratic programming (QP). The QP-based approach to learning ChI-based data fusion solutions has a high space complexity. This has limited the practical application of ChI-based data fusion methods to six or fewer input sources. To address the space complexity issue, this work introduces an online training algorithm for learning the ChI. The online method is an iterative gradient descent approach that processes one observation at a time, making ChI-based data fusion applicable to higher-dimensional data sets. In many real-world data fusion applications, it is imperative to have an explanation or interpretation. This may include information on what was learned, what the worth of individual sources is, why a decision was reached, what evidence process(es) were used, and what confidence the system has in its decision. However, most existing machine learning solutions for data fusion, e.g., deep learning, are black boxes. In this work, we designed methods and metrics that help answer these questions of interpretation, and we also developed visualization methods that help users better understand the machine learning solution and its behavior for different instances of data.
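
The idea of processing one observation at a time can be sketched in Python as a per-sample gradient step on the squared error. This is a simplified illustration, not the dissertation's algorithm: it assumes nonnegative inputs, stores the measure sparsely in a dictionary, and deliberately omits the monotonicity and boundary constraints that a real fuzzy measure must satisfy. The names `chain_terms`, `predict`, and `online_step` are hypothetical.

```python
def chain_terms(x):
    """Return (subset, coefficient) pairs with ChI(x) = sum(coef * mu[subset])."""
    order = sorted(range(len(x)), key=lambda i: x[i], reverse=True)
    terms = []
    for k, i in enumerate(order):
        nxt = x[order[k + 1]] if k + 1 < len(order) else 0.0
        terms.append((frozenset(order[: k + 1]), x[i] - nxt))
    return terms


def predict(x, mu):
    """Choquet integral of x; subsets not yet seen default to 0."""
    return sum(coef * mu.get(s, 0.0) for s, coef in chain_terms(x))


def online_step(x, y, mu, lr=0.5):
    """One gradient step on 0.5 * (ChI(x) - y)**2 for a single observation.

    The integral is linear in the measure values on the sample's chain, so the
    gradient with respect to mu[subset] is error * coefficient.
    NOTE: constraints on mu are NOT enforced in this sketch.
    """
    terms = chain_terms(x)
    err = sum(coef * mu.get(s, 0.0) for s, coef in terms) - y
    for s, coef in terms:
        mu[s] = mu.get(s, 0.0) - lr * err * coef
    return err


mu = {}                    # sparse measure: only touched subsets are stored
x, y = [0.2, 0.8], 0.5     # a single toy observation and its target
first_error = abs(online_step(x, y, mu))
for _ in range(200):
    online_step(x, y, mu)
final_error = abs(predict(x, mu) - y)
```

Because each step only reads and writes the subsets on one sample's chain, memory use depends on the subsets actually visited and not on all $2^m$ of them.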

    Genetic Algorithm Optimization for Determining Fuzzy Measures from Fuzzy Data

    Fuzzy measures and fuzzy integrals have been used successfully in many real applications. Determining the fuzzy measures is a very difficult problem in these applications. Although several methodologies exist for solving this problem, such as genetic algorithms, gradient descent algorithms, neural networks, and particle swarm algorithms, it is hard to say which one is more appropriate and more feasible. Each method has its advantages. Most existing work can only deal with data consisting of classical numbers, which may lead to limitations in practical applications. It is not reasonable to assume that all data are real-valued before we elicit them from practice. Sometimes fuzzy data exist, for example in pharmacological, financial, and sociological applications. Thus, we attempt to determine a more general type of fuzzy measure from fuzzy data by means of genetic algorithms and Choquet integrals. In this paper, we make the first effort to define the σ-λ rules. Furthermore, we define and characterize the Choquet integrals of interval-valued functions and fuzzy-number-valued functions based on the σ-λ rules. In addition, we design a special genetic algorithm to determine a type of general fuzzy measure from fuzzy data.

    Bridging the semantic gap in content-based image retrieval.

    To manage large image databases, Content-Based Image Retrieval (CBIR) emerged as a new research subject. CBIR involves the development of automated methods to use visual features in searching and retrieving. Unfortunately, the performance of most CBIR systems is inherently constrained by the low-level visual features because they cannot adequately express the user's high-level concepts. This is known as the semantic gap problem. This dissertation introduces a new approach to CBIR that attempts to bridge the semantic gap. Our approach includes four components. The first one learns a multi-modal thesaurus that associates low-level visual profiles with high-level keywords. This is accomplished through image segmentation, feature extraction, and clustering of image regions. The second component uses the thesaurus to annotate images in an unsupervised way. This is accomplished through fuzzy membership functions to label new regions based on their proximity to the profiles in the thesaurus. The third component consists of an efficient and effective method for fusing the retrieval results from the multi-modal features. Our method is based on learning and adapting fuzzy membership functions to the distribution of the features' distances and assigning a degree of worthiness to each feature. The fourth component provides the user with the option to perform hybrid querying and query expansion. This allows the enrichment of a visual query with textual data extracted from the automatically labeled images in the database. The four components are integrated into a complete CBIR system that can run in three different and complementary modes. The first mode allows the user to query using an example image. The second mode allows the user to specify positive and/or negative sample regions that should or should not be included in the retrieved images. The third mode uses a Graphical Text Interface to allow the user to browse the database interactively using a combination of low-level features and high-level concepts. The proposed system and all of its components and modes are implemented and validated using a large data collection for accuracy, performance, and improvement over traditional CBIR techniques.
