18,128 research outputs found

    Robust fuzzy clustering for multiple instance regression.

    Get PDF
    Multiple instance regression (MIR) operates on a collection of bags, where each bag contains multiple instances sharing an identical real-valued label. Only few instances, called primary instances, contribute to the bag labels. The remaining instances are noise and outliers observations. The goal in MIR is to identify the primary instances within each bag and learn a regression model that can predict the label of a previously unseen bag. In this thesis, we introduce an algorithm that uses robust fuzzy clustering with an appropriate distance to learn multiple linear models from a noisy feature space simultaneously. We show that fuzzy memberships are useful in allowing instances to belong to multiple models, while possibilistic memberships allow identification of the primary instances of each bag with respect to each model. We also use possibilistic memberships to identify and ignore noisy instances and determine the optimal number of regression models. We evaluate our approach on a series of synthetic data sets, remote sensing data to predict the yearly average yield of a crop and application to drug activity prediction. We show that our approach achieves higher accuracy than existing methods

    A systematic review of data quality issues in knowledge discovery tasks

    Get PDF
    Hay un gran crecimiento en el volumen de datos porque las organizaciones capturan permanentemente la cantidad colectiva de datos para lograr un mejor proceso de toma de decisiones. El desafío mas fundamental es la exploración de los grandes volúmenes de datos y la extracción de conocimiento útil para futuras acciones por medio de tareas para el descubrimiento del conocimiento; sin embargo, muchos datos presentan mala calidad. Presentamos una revisión sistemática de los asuntos de calidad de datos en las áreas del descubrimiento de conocimiento y un estudio de caso aplicado a la enfermedad agrícola conocida como la roya del café.Large volume of data is growing because the organizations are continuously capturing the collective amount of data for better decision-making process. The most fundamental challenge is to explore the large volumes of data and extract useful knowledge for future actions through knowledge discovery tasks, nevertheless many data has poor quality. We presented a systematic review of the data quality issues in knowledge discovery tasks and a case study applied to agricultural disease named coffee rust

    Thermal error modelling of machine tools based on ANFIS with fuzzy c-means clustering using a thermal imaging camera

    Get PDF
    Thermal errors are often quoted as being the largest contributor to CNC machine tool errors, but they can be effectively reduced using error compensation. The performance of a thermal error compensation system depends on the accuracy and robustness of the thermal error model and the quality of the inputs to the model. The location of temperature measurement must provide a representative measurement of the change in temperature that will affect the machine structure. The number of sensors and their locations are not always intuitive and the time required to identify the optimal locations is often prohibitive, resulting in compromise and poor results. In this paper, a new intelligent compensation system for reducing thermal errors of machine tools using data obtained from a thermal imaging camera is introduced. Different groups of key temperature points were identified from thermal images using a novel schema based on a Grey model GM (0, N) and Fuzzy c-means (FCM) clustering method. An Adaptive Neuro-Fuzzy Inference System with Fuzzy c-means clustering (FCM-ANFIS) was employed to design the thermal prediction model. In order to optimise the approach, a parametric study was carried out by changing the number of inputs and number of membership functions to the FCM-ANFIS model, and comparing the relative robustness of the designs. According to the results, the FCM-ANFIS model with four inputs and six membership functions achieves the best performance in terms of the accuracy of its predictive ability. The residual value of the model is smaller than ± 2 μm, which represents a 95% reduction in the thermally-induced error on the machine. Finally, the proposed method is shown to compare favourably against an Artificial Neural Network (ANN) model

    Model fusion using fuzzy aggregation: Special applications to metal properties

    Get PDF
    To improve the modelling performance, one should either propose a new modelling methodology or make the best of existing models. In this paper, the study is concentrated on the latter solution, where a structure-free modelling paradigm is proposed. It does not rely on a fixed structure and can combine various modelling techniques in ‘symbiosis’ using a ‘master fuzzy system’. This approach is shown to be able to include the advantages of different modelling techniques altogether by requiring less training and by minimising the efforts relating optimisation of the final structure. The proposed approach is then successfully applied to the industrial problems of predicting machining induced residual stresses for aerospace alloy components as well as modelling the mechanical properties of heat-treated alloy steels, both representing complex, non-linear and multi-dimensional environments

    A hierarchical Mamdani-type fuzzy modelling approach with new training data selection and multi-objective optimisation mechanisms: A special application for the prediction of mechanical properties of alloy steels

    Get PDF
    In this paper, a systematic data-driven fuzzy modelling methodology is proposed, which allows to construct Mamdani fuzzy models considering both accuracy (precision) and transparency (interpretability) of fuzzy systems. The new methodology employs a fast hierarchical clustering algorithm to generate an initial fuzzy model efficiently; a training data selection mechanism is developed to identify appropriate and efficient data as learning samples; a high-performance Particle Swarm Optimisation (PSO) based multi-objective optimisation mechanism is developed to further improve the fuzzy model in terms of both the structure and the parameters; and a new tolerance analysis method is proposed to derive the confidence bands relating to the final elicited models. This proposed modelling approach is evaluated using two benchmark problems and is shown to outperform other modelling approaches. Furthermore, the proposed approach is successfully applied to complex high-dimensional modelling problems for manufacturing of alloy steels, using ‘real’ industrial data. These problems concern the prediction of the mechanical properties of alloy steels by correlating them with the heat treatment process conditions as well as the weight percentages of the chemical compositions
    • …
    corecore