    Integration of Auxiliary Data Knowledge in Prototype Based Vector Quantization and Classification Models

    This thesis deals with the integration of auxiliary data knowledge into machine learning methods, especially prototype based classification models. Classification problems are diverse, and evaluating results by accuracy alone is inadequate in many applications. The classification tasks are therefore analyzed in more depth. Possibilities for extending prototype based methods to integrate extra knowledge about the data or the classification goal are presented, in order to obtain models adequate to the problem. One of the proposed extensions is a Generalized Learning Vector Quantization for the direct optimization of statistical measures other than classification accuracy. A modification of the metric adaptation of Generalized Learning Vector Quantization for functional data, i.e. data with lateral dependencies in the features, is also considered.

    Contents:
    Symbols and Abbreviations
    1 Introduction
        1.1 Motivation and Problem Description
        1.2 Utilized Data Sets
    2 Prototype Based Methods
        2.1 Unsupervised Vector Quantization
            2.1.1 C-means
            2.1.2 Self-Organizing Map
            2.1.3 Neural Gas
            2.1.4 Common Generalizations
        2.2 Supervised Vector Quantization
            2.2.1 The Family of Learning Vector Quantizers - LVQ
            2.2.2 Generalized Learning Vector Quantization
        2.3 Semi-Supervised Vector Quantization
            2.3.1 Learning Associations by Self-Organization
            2.3.2 Fuzzy Labeled Self-Organizing Map
            2.3.3 Fuzzy Labeled Neural Gas
        2.4 Dissimilarity Measures
            2.4.1 Differentiable Kernels in Generalized LVQ
            2.4.2 Dissimilarity Adaptation for Performance Improvement
    3 Deeper Insights into Classification Problems - From the Perspective of Generalized LVQ -
        3.1 Classification Models
        3.2 The Classification Task
        3.3 Evaluation of Classification Results
        3.4 The Classification Task as an Ill-Posed Problem
    4 Auxiliary Structure Information and Appropriate Dissimilarity Adaptation in Prototype Based Methods
        4.1 Supervised Vector Quantization for Functional Data
            4.1.1 Functional Relevance/Matrix LVQ
            4.1.2 Enhancement Generalized Relevance/Matrix LVQ
        4.2 Fuzzy Information About the Labels
            4.2.1 Fuzzy Semi-Supervised Self-Organizing Maps
            4.2.2 Fuzzy Semi-Supervised Neural Gas
    5 Variants of Classification Costs and Class Sensitive Learning
        5.1 Border Sensitive Learning in Generalized LVQ
            5.1.1 Border Sensitivity by Additive Penalty Function
            5.1.2 Border Sensitivity by Parameterized Transfer Function
        5.2 Optimizing Different Validation Measures by the Generalized LVQ
            5.2.1 Attention Based Learning Strategy
            5.2.2 Optimizing Statistical Validation Measurements for Binary Class Problems in the GLVQ
        5.3 Integration of Structural Knowledge about the Labeling in Fuzzy Supervised Neural Gas
    6 Conclusion and Future Work
    My Publications
    A Appendix
        A.1 Stochastic Gradient Descent (SGD)
        A.2 Support Vector Machine
        A.3 Fuzzy Supervised Neural Gas Algorithm Solved by SGD
    Bibliography
    Acknowledgements
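
    For readers unfamiliar with Generalized Learning Vector Quantization, the sketch below shows the classifier function as it is usually given in the literature: mu(x) = (d+ - d-) / (d+ + d-), where d+ is the squared distance from x to the closest prototype of the correct class and d- the squared distance to the closest prototype of any other class. This is the standard textbook form, not code taken from the thesis; the function names and the toy prototypes are assumptions made for illustration.

```python
def squared_euclidean(x, w):
    """Squared Euclidean distance between two equally long vectors."""
    return sum((xi - wi) ** 2 for xi, wi in zip(x, w))


def glvq_mu(x, label, prototypes, proto_labels):
    """Standard GLVQ classifier function; the value lies in [-1, 1].

    A negative value means x is closer to a prototype of its own class
    than to any prototype of another class, i.e. it is classified
    correctly. Assumes at least one prototype with the given label and
    at least one with a different label, and that x does not coincide
    with both closest prototypes (which would make the denominator zero).
    """
    d_plus = min(squared_euclidean(x, w)
                 for w, c in zip(prototypes, proto_labels) if c == label)
    d_minus = min(squared_euclidean(x, w)
                  for w, c in zip(prototypes, proto_labels) if c != label)
    return (d_plus - d_minus) / (d_plus + d_minus)


# Hypothetical toy setup: one prototype per class in two dimensions.
prototypes = [[0.0, 0.0], [2.0, 2.0]]
proto_labels = [0, 1]

mu_correct = glvq_mu([0.5, 0.0], 0, prototypes, proto_labels)  # true label 0
mu_wrong = glvq_mu([0.5, 0.0], 1, prototypes, proto_labels)    # true label 1
```

    In GLVQ training, a monotonically increasing transfer function of mu is summed over the training set and minimized by gradient descent. Roughly speaking, the extensions listed in the outline above change either this cost or the dissimilarity measure inside it.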

    A Review on the Application of Natural Computing in Environmental Informatics

    Natural computing offers new opportunities to understand, model and analyze the complexity of the physical and human-created environment. This paper examines the application of natural computing in environmental informatics by surveying related work in the field. It presents various nature-inspired techniques that have been employed to solve relevant problems, discusses their advantages and disadvantages, and analyzes how natural computing is generally used in environmental research. Comment: Proc. of EnviroInfo 201

    Batch and median neural gas

    Neural Gas (NG) is a very robust clustering algorithm for Euclidean data that suffers neither from the problem of local minima, as simple vector quantization does, nor from topological restrictions, as the self-organizing map does. Based on the cost function of NG, we introduce a batch variant of NG which converges much faster and which can be interpreted as an optimization of the cost function by the Newton method. This formulation has the additional benefit that, based on the notion of the generalized median in analogy to Median SOM, a variant for non-vectorial proximity data can be introduced. We prove convergence of the batch and median versions of NG, SOM, and k-means in a unified formulation, and we investigate the behavior of the algorithms in several experiments. Comment: In Special Issue after WSOM 05 Conference, 5-8 September, 2005, Pari
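
    The batch update described in this abstract can be sketched as follows. In each epoch, every prototype is ranked by its distance to each data point, and each prototype is then set to the mean of all data points weighted by exp(-rank / lambda). This is a minimal illustration of the commonly stated batch NG rule, not the authors' implementation; the function name, the exponential annealing schedule for lambda, and its start and end values are assumptions.

```python
import math


def batch_neural_gas(data, prototypes, epochs=50, lam_start=None, lam_end=0.01):
    """Batch Neural Gas on lists of equally long float vectors.

    Returns the adapted prototypes; the inputs are not modified.
    """
    n = len(prototypes)
    dim = len(data[0])
    protos = [list(w) for w in prototypes]
    if lam_start is None:
        lam_start = n / 2.0  # assumed initial neighborhood range

    for t in range(epochs):
        # Assumed schedule: exponential decay from lam_start to lam_end.
        frac = t / max(epochs - 1, 1)
        lam = lam_start * (lam_end / lam_start) ** frac

        num = [[0.0] * dim for _ in range(n)]  # weighted sums of data points
        den = [0.0] * n                        # sums of the weights

        for x in data:
            dists = [sum((xi - wi) ** 2 for xi, wi in zip(x, w)) for w in protos]
            # Rank 0 goes to the closest prototype, rank 1 to the next, ...
            order = sorted(range(n), key=dists.__getitem__)
            for rank, i in enumerate(order):
                h = math.exp(-rank / lam)
                den[i] += h
                for d in range(dim):
                    num[i][d] += h * x[d]

        # Weighted mean; keep the old prototype if all weights underflowed to 0.
        protos = [
            [num[i][d] / den[i] for d in range(dim)] if den[i] > 0.0 else protos[i]
            for i in range(n)
        ]
    return protos


# Hypothetical toy data: two well separated one-dimensional clusters.
data = [[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]]
result = sorted(batch_neural_gas(data, [[4.0], [5.0]]))
```

    As lambda shrinks, only the rank-0 weight remains significant and the update approaches the batch k-means step, so each prototype ends up near the mean of the points it wins.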

    A survey of machine learning techniques applied to self organizing cellular networks

    This paper surveys the literature of the past fifteen years on Machine Learning (ML) algorithms applied to self organizing cellular networks. For future networks to overcome the limitations of current cellular systems, more intelligence needs to be deployed so that a fully autonomous and flexible network can be enabled. The paper focuses on the learning perspective of Self Organizing Networks (SON) solutions. It gives an overview of the most common ML techniques encountered in cellular networks and classifies each paper by its learning solution, with examples. The authors also classify each paper by its self-organizing use-case and discuss how each proposed solution performed. In addition, the most commonly found ML algorithms are compared in terms of certain SON metrics, and general guidelines are proposed on when to choose each ML algorithm for each SON function. Lastly, the work outlines future research directions and new paradigms that more robust and intelligent algorithms, together with data gathered by operators, can bring to the cellular networks domain in order to fully enable the concept of SON in the near future.

    Relational visual cluster validity

    The assessment of cluster validity plays a very important role in cluster analysis. The most commonly used cluster validity methods are based on statistical hypothesis testing or on finding the best clustering scheme by computing a number of different cluster validity indices. A number of visual methods have been produced that display the validity of clusters directly by mapping data into two- or three-dimensional space. However, these methods may lose too much information to correctly estimate the results of clustering algorithms. Although the visual cluster validity (VCV) method of Hathaway and Bezdek can successfully solve this problem, it can only be applied to object data, i.e. feature measurements. Very few validity methods can be used for relational data, where only a similarity or dissimilarity relation exists. To tackle this problem, this paper presents a relational visual cluster validity (RVCV) method to assess the validity of clustering relational data. It combines the results of the non-Euclidean relational fuzzy c-means (NERFCM) algorithm with a modification of the VCV method to produce a visual representation of cluster validity. RVCV can handle complete and incomplete relational data and adds to the visual cluster validity theory. Numeric examples using synthetic and real data are presented.

    Fuzzy sets predict flexural strength and density of silicon nitride ceramics

    In this work, we use fuzzy set theory to evaluate and predict the flexural strength and density of NASA 6Y silicon nitride ceramic. The processing variables of milling time, sintering time, and sintering nitrogen pressure are the inputs to the fuzzy system; flexural strength and density are its outputs. Data from 273 Si3N4 modulus of rupture bars tested at room temperature and 135 bars tested at 1370 °C are used in this study. The generalized mean operator and the Hamming distance are used to build the fuzzy predictive model. The maximum test error does not exceed 3.3 percent for density and 7.1 percent for flexural strength, as compared with the errors of 1.72 percent and 11.34 percent, respectively, obtained by using neural networks. These results demonstrate that fuzzy set theory can be incorporated into the process of designing materials such as ceramics, especially for assessing more complex relationships between the processing variables and parameters like strength, which are governed by the randomness of manufacturing processes.
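
    The two operators named in this abstract have simple textbook definitions, sketched below. The abstract does not say how they are combined into the predictive model, so the code shows only the operators themselves. The function names and sample membership values are assumptions for illustration, and the Hamming distance is given in its unnormalized form.

```python
import math


def generalized_mean(values, p):
    """Generalized (power) mean M_p of positive values.

    p = 1 gives the arithmetic mean, p = -1 the harmonic mean, and
    p = 0 is taken as the limiting case, the geometric mean.
    Assumes all values are strictly positive.
    """
    n = len(values)
    if p == 0:
        return math.exp(sum(math.log(v) for v in values) / n)
    return (sum(v ** p for v in values) / n) ** (1.0 / p)


def hamming_distance(mu_a, mu_b):
    """Unnormalized Hamming distance between two fuzzy sets.

    Each fuzzy set is given as a list of membership grades over the
    same finite universe, in the same order.
    """
    return sum(abs(a - b) for a, b in zip(mu_a, mu_b))


# Hypothetical membership grades, not data from the study.
arithmetic = generalized_mean([0.2, 0.8], 1)
harmonic = generalized_mean([0.2, 0.8], -1)
geometric = generalized_mean([0.2, 0.8], 0)
distance = hamming_distance([0.2, 0.8, 1.0], [0.5, 0.8, 0.0])
```

    Varying p moves the generalized mean continuously between the minimum and the maximum of its arguments, which is why it is often used as a tunable aggregation operator in fuzzy systems.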