
    Data-driven Soft Sensors in the Process Industry

    Over the last two decades, Soft Sensors have established themselves as a valuable alternative to the traditional means of acquiring critical process variables, monitoring processes and performing other tasks related to process control. This paper discusses characteristics of process industry data that are critical for the development of data-driven Soft Sensors. These characteristics are common to a large number of process industry fields, such as the chemical industry, bioprocess industry and steel industry. The work focuses on data-driven Soft Sensors because of their growing popularity, already demonstrated usefulness and large, though not yet fully realised, potential. The main contributions of this work are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of some open issues in Soft Sensor development and maintenance together with their possible solutions.

    Soft computing applications in dynamic model identification of polymer extrusion process

    This paper proposes the application of soft computing to deal with the constraints of conventional modelling techniques for the dynamic extrusion process. The proposed technique makes more efficient use of the available information during model identification. The resultant model can be classified as a ‘grey-box model’, termed a ‘semi-physical model’ in this context. The extrusion process contains a number of parameters that are sensitive to the operating environment. A fuzzy rule-based system is introduced into the analytical model of the extrusion by means of sub-models that approximate those operation-sensitive parameters. To draw the optimal structure for the sub-models, a hybrid of a genetic algorithm with a fuzzy system (GA-Fuzzy) has been implemented. The sub-models obtained show advantages such as linguistic interpretability, a simpler rule base and fewer membership functions. The developed model is adaptive, learning through the steepest descent error back-propagation algorithm. This ability might help to minimise the deviation of the model prediction when the operation-sensitive parameters adapt to the changing operating environment in a real situation. The model is first evaluated through simulations for the consistency of its predictions with the theoretical analysis. Then, the effectiveness of the adaptive sub-models in approximating the operation-sensitive parameters during operation is further investigated.

    Futility Analysis in the Cross-Validation of Machine Learning Models

    Many machine learning models have important structural tuning parameters that cannot be directly estimated from the data. The common tactic for setting these parameters is to use resampling methods, such as cross-validation or the bootstrap, to evaluate a candidate set of values and choose the best based on some pre-defined criterion. Unfortunately, this process can be time consuming. However, the model tuning process can be streamlined by adaptively resampling candidate values so that settings that are clearly sub-optimal can be discarded. The notion of futility analysis is introduced in this context. An example is shown that illustrates how adaptive resampling can be used to reduce training time. Simulation studies are used to understand how the potential speed-up is affected by parallel processing techniques. Comment: 22 pages, 5 figures
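
    The idea of discarding clearly sub-optimal settings during resampling can be illustrated with a minimal sketch. This is not the futility analysis from the paper: it is a simple racing-style rule, assumed here for illustration, that drops a candidate once its paired score difference to the current leader exceeds a multiple of its standard error. The names `adaptive_resample` and `toy_score`, the `margin` threshold and the synthetic scoring function are all hypothetical.

```python
import math
import random
import statistics


def adaptive_resample(candidates, score_fn, n_resamples=20,
                      min_resamples=5, margin=2.0, seed=0):
    """Score candidates resample by resample, dropping clearly worse ones.

    Hypothetical pruning rule: after `min_resamples` rounds, a candidate is
    discarded when its mean paired deficit to the current leader is larger
    than `margin` standard errors of that deficit.
    """
    rng = random.Random(seed)
    active = list(candidates)
    scores = {c: [] for c in active}
    evaluations = 0

    for i in range(n_resamples):
        resample_id = rng.randrange(10**6)  # stands in for one CV split
        for c in active:
            scores[c].append(score_fn(c, resample_id))
            evaluations += 1

        if i + 1 < min_resamples or len(active) == 1:
            continue

        leader = max(active, key=lambda c: statistics.mean(scores[c]))
        survivors = [leader]
        for c in active:
            if c == leader:
                continue
            # Paired differences: leader and c were scored on the same resamples.
            diffs = [a - b for a, b in zip(scores[leader], scores[c])]
            std_err = statistics.stdev(diffs) / math.sqrt(len(diffs))
            if statistics.mean(diffs) <= margin * std_err:
                survivors.append(c)  # not yet clearly worse, keep evaluating
        active = survivors

    best = max(active, key=lambda c: statistics.mean(scores[c]))
    return best, evaluations, active


def toy_score(k, resample_id):
    """Synthetic 'accuracy' peaking at k = 7, plus small resample noise."""
    noise = random.Random(resample_id * 1000 + k).gauss(0, 0.02)
    return -((k - 7) ** 2) / 10 + noise


best, evaluations, survivors = adaptive_resample(range(1, 16), toy_score)
print(best, evaluations, survivors)
```

    In this toy setting, a full grid would cost 15 candidates times 20 resamples; the adaptive loop stops spending evaluations on a candidate as soon as it is pruned, which is where the saving in training time would come from.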

    Learning in Evolutionary Environments

    The purpose of this work is to present a short, selective guide to an enormous and diverse literature on learning processes in economics. We argue that learning is a ubiquitous characteristic of most economic and social systems but that it acquires even greater importance in explicitly evolutionary environments where: a) heterogeneous agents systematically display various forms of "bounded rationality"; b) there is a persistent appearance of novelties, both as exogenous shocks and as the result of technological, behavioural and organisational innovations by the agents themselves; c) markets (and other interaction arrangements) perform as selection mechanisms; d) aggregate regularities are primarily emergent properties stemming from out-of-equilibrium interactions. We present, by means of examples, the most important classes of learning models, trying to show their links and differences, and setting them against a sort of ideal framework of "what one would like to understand about learning...". We put a significant emphasis on learning models in their bare-bone formal structure, but we also refer to the (generally richer) non-formal theorising about the same objects. This allows us to provide an easier mapping of a wide and largely unexplored research agenda. Keywords: Learning, Evolutionary Environments, Economic Theory, Rationality

    Multi-tier framework for the inferential measurement and data-driven modeling

    A framework for inferential measurement and data-driven modeling has been proposed and assessed in several real-world application domains. The architecture of the framework is structured in multiple tiers to facilitate extensibility and the integration of new components. Each of the four proposed tiers has been assessed independently to verify its suitability. The first tier, dealing with exploratory data analysis, has been assessed through the characterization of the chemical space related to the biodegradation of organic chemicals. This analysis has established relationships between physicochemical variables and biodegradation rates that have been used for model development. At the preprocessing level, a novel method for feature selection based on dissimilarity measures between Self-Organizing Maps (SOM) has been developed and assessed. The proposed method selects more features than other methods published in the literature but leads to models with improved predictive power. Single and multiple data imputation techniques based on the SOM have also been used to recover missing data in a Waste Water Treatment Plant benchmark. A new dynamic method to adjust the centers and widths in Radial Basis Function networks has been proposed to predict water quality. The proposed method outperformed other neural networks. The proposed modeling components have also been assessed in the development of prediction and classification models for biodegradation rates in different media. The results obtained proved the suitability of this approach for developing data-driven models when the complex dynamics of the process prevent the formulation of mechanistic models. The use of rule generation algorithms and Bayesian dependency models has been preliminarily screened to provide the framework with interpretation capabilities.
Preliminary results obtained from the classification of Modes of Toxic Action (MOA) indicate that using MOAs as proxy indicators of the human health effects of chemicals could be a promising approach. Finally, the complete framework has been applied to three different modeling scenarios. A virtual sensor system, capable of inferring product quality indices from primary process variables, has been developed and assessed. The system was integrated with the control system in a real chemical plant, outperforming the multi-linear correlation models usually adopted by chemical manufacturers. A model to predict carcinogenicity from molecular structure for a set of aromatic compounds has been developed and tested. Applying the SOM-dissimilarity feature selection method yielded better results than models published in the literature. Finally, the framework has been used to facilitate a new approach for environmental modeling and risk management within geographical information systems (GIS). The SOM has been successfully used to characterize exposure scenarios and to provide estimations of missing data through geographic interpolation. The combination of SOM and Gaussian Mixture models facilitated the formulation of a new probabilistic risk assessment approach.