2,529 research outputs found
Data-driven Soft Sensors in the Process Industry
Over the last two decades, Soft Sensors have established themselves as a valuable alternative to traditional means of acquiring critical process variables, and as a tool for process monitoring and other tasks related to process control. This paper discusses the characteristics of process industry data that are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, such as the chemical, bioprocess and steel industries. The work focuses on data-driven Soft Sensors because of their growing popularity, demonstrated usefulness and large, though not yet fully realised, potential. Its main contributions are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of some open issues in Soft Sensor development and maintenance together with possible solutions.
Rough Fuzzy Subspace Clustering for Data with Missing Values
The paper presents a rough fuzzy subspace clustering algorithm and experimental clustering results. The algorithm uses three approaches to handling missing values: marginalisation, imputation and rough sets. It also assigns weights to the attributes in each cluster, which leads to subspace clustering. The cluster parameters are computed by an iterative procedure that minimises a criterion function. The crucial parameter of the proposed algorithm controls the sharpness of the resulting subspace clusters. Lower values of this parameter lead to selection of only the most important attribute; higher values create clusters in the global space rather than in subspaces. The paper includes clustering results for synthetic and real-life data sets.
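As a rough illustration of two of the missing-value strategies named in this abstract, the sketch below contrasts marginalisation (skipping missing attributes when computing a weighted distance to a cluster centre) with mean imputation. It is not the paper's algorithm: the function names, the use of `None` for missing values, the fixed attribute weights and the toy data are all assumptions made for this example, and the rough-set approach is not covered.

```python
def weighted_distance(x, centre, weights):
    """Weighted squared distance to a cluster centre.

    Attributes equal to None are treated as missing and skipped
    (marginalisation). The weights are hypothetical per-attribute
    weights of one cluster, fixed here rather than learned.
    """
    total = 0.0
    for xi, ci, wi in zip(x, centre, weights):
        if xi is None:  # marginalisation: ignore the missing attribute
            continue
        total += wi * (xi - ci) ** 2
    return total


def impute_means(data):
    """Replace each missing value with the mean of the observed values
    in the same column (assumes every column has at least one value)."""
    n_attrs = len(data[0])
    means = []
    for j in range(n_attrs):
        observed = [row[j] for row in data if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if v is None else v for j, v in enumerate(row)]
            for row in data]


# Toy data: the second item is missing its second attribute.
data = [[1.0, 2.0], [3.0, None], [5.0, 6.0]]
centre = [2.0, 5.0]
weights = [0.5, 0.5]

d_marg = weighted_distance(data[1], centre, weights)   # uses attribute 0 only
imputed = impute_means(data)
d_imp = weighted_distance(imputed[1], centre, weights)  # uses both attributes
print(d_marg, imputed[1], d_imp)
```

The two strategies give different distances for the same incomplete item, which is why the choice of strategy can change cluster memberships.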
Data mining in soft computing framework: a survey
The present article provides a survey of the available literature on data mining using soft computing. A categorization is provided based on the soft computing tools and hybridizations used, the data mining function implemented, and the preference criterion selected by the model. The utility of the different soft computing methodologies is highlighted. Generally, fuzzy sets are suitable for handling issues related to the understandability of patterns, incomplete or noisy data, mixed media information and human interaction, and can provide approximate solutions faster. Neural networks are nonparametric, robust, and exhibit good learning and generalization capabilities in data-rich environments. Genetic algorithms provide efficient search algorithms for selecting a model, from mixed media data, based on some preference criterion or objective function. Rough sets are suitable for handling different types of uncertainty in data. Some challenges to data mining and to the application of soft computing methodologies are indicated. An extensive bibliography is also included.
HIV analysis using computational intelligence
In this study, a new method to analyze HIV using a combination of autoencoder
networks and genetic algorithms is proposed. The proposed method is
tested on a set of demographic properties of individuals obtained from the
South African antenatal survey. The autoencoder model is then compared
with a conventional feedforward neural network model and yields a classification
accuracy of 92% compared to 84% obtained for the conventional feedforward
model. The autoencoder model is then used to propose a new method
of approximating missing entries in the HIV database using ant colony optimization.
This method is able to estimate missing input to an accuracy of
80%. The estimated missing input values are then used to analyze HIV. The
autoencoder network classifier model yields a classification accuracy of 81% in
the presence of missing input values. The feedforward neural network classifier
model yields a classification accuracy of 82% in the presence of missing input
values. A control mechanism is proposed to assess the effect of demographic
properties on the HIV status of individuals, based on inverse neural networks
and on autoencoder networks combined with genetic algorithms. This control mechanism
is aimed at understanding whether HIV susceptibility can be controlled
by modifying some of the demographic properties. The inverse neural network
control model has accuracies of 77% and 82%, while the genetic algorithm
model has accuracies of 77% and 92%, for the prediction of educational level
and gravidity, respectively. HIV modelling using neuro-fuzzy
models is then investigated, and rules are extracted, which provide more valuable
insight. The classification accuracy obtained by the neuro-fuzzy model
is 86%. A rough set approximation is then investigated for rule extraction,
and it is found that the rules present simplistic and understandable relationships
on how the demographic properties affect HIV risk. The study concludes
by investigating a model for automatic relevance determination, to determine
which of the demographic properties are important for HIV modelling. The HIV
classification obtained with the full input data set is compared with that obtained
using only the input parameters selected by this technique. Age of
the individual, gravidity, province, region, reported pregnancy and educational
level were amongst the input parameters selected as relevant for classification
of an individual’s HIV risk. This study thus proposes models that can be
used to understand HIV dynamics and can help policy-makers
understand more effectively the demographic influences driving HIV infection.