
    Data-driven Soft Sensors in the Process Industry

    In the last two decades, Soft Sensors have established themselves as a valuable alternative to traditional means of acquiring critical process variables, monitoring processes, and performing other tasks related to process control. This paper discusses the characteristics of process industry data that are critical for the development of data-driven Soft Sensors. These characteristics are common to many process industry fields, such as the chemical, bioprocess, and steel industries. The work focuses on data-driven Soft Sensors because of their growing popularity, demonstrated usefulness, and large, though not yet fully realised, potential. Its main contributions are a comprehensive selection of case studies covering the three most important Soft Sensor application fields, a general introduction to the most popular Soft Sensor modelling techniques, and a discussion of some open issues in Soft Sensor development and maintenance together with possible solutions.

    Sustainable approaches for stormwater quality improvements with experimental geothermal paving systems

    This article has been made available through the Brunel Open Access Publishing Fund. This research assesses the next generation of permeable pavement systems (PPS) incorporating ground source heat pumps (geothermal paving systems). Twelve experimental pilot-scale pavement systems were assessed for their stormwater treatability in Edinburgh, UK. The relatively high variability of temperatures during the heating and cooling cycle of a ground source heat pump system embedded in the pavement structure did not give rise to an ecological risk of pathogenic microbial expansion and survival. Carbon dioxide monitoring indicated relatively high microbial activity on a geotextile layer and within the pavement structure. Anaerobic degradation processes were concentrated around the geotextile zone, where carbon dioxide concentrations reached up to 2000 ppm. The overall water treatment potential was high, with up to 99% biochemical oxygen demand removal. The pervious pavement systems reduced the ecological risk of stormwater discharges and posed a low risk of pathogen growth.

    More "normal" than normal: scaling distributions and complex systems

    One feature of many naturally occurring or engineered complex systems is tremendous variability in event sizes. To account for it, the behavior of these systems is often described using power law relationships or scaling distributions, which tend to be viewed as "exotic" because of their unusual properties (e.g., infinite moments). An alternate view is based on mathematical, statistical, and data-analytic arguments and suggests that scaling distributions should be viewed as "more normal than normal". In support of this latter view, which has been advocated by Mandelbrot for the last 40 years, we review in this paper some relevant results from probability theory and illustrate a powerful statistical approach for deciding whether the variability associated with observed event sizes is consistent with an underlying Gaussian-type (finite variance) or scaling-type (infinite variance) distribution. We contrast this approach with traditional model fitting techniques and discuss its implications for future modeling of complex systems.

    Predictive intelligence to the edge through approximate collaborative context reasoning

    We focus on Internet of Things (IoT) environments where a network of sensing and computing devices is responsible for locally processing contextual data, reasoning, and collaboratively inferring the appearance of a specific phenomenon (event). Pushing processing and knowledge inference to the edge of the IoT network allows the complexity of the event reasoning process to be distributed into many manageable pieces and to be physically located at the source of the contextual information. This enables a huge amount of rich data streams to be processed in real time that would be prohibitively complex and costly to deliver on a traditional centralized Cloud system. We propose a lightweight, energy-efficient, distributed, adaptive, multiple-context perspective event reasoning model under uncertainty on each IoT device (sensor/actuator). Each device senses and processes context data and infers events based on different local context perspectives: (i) expert knowledge on event representation, (ii) outlier inference, and (iii) deviation from locally predicted context. Such a novel approximate reasoning paradigm is achieved through a contextualized, collaborative belief-driven clustering process, where clusters of devices are formed according to their belief in the presence of events. Our distributed and federated intelligence model efficiently identifies any localized abnormality in the contextual data in light of event reasoning by aggregating local degrees of belief, and it updates and adjusts its knowledge to contextual data outliers and novelty detection. We provide a comprehensive experimental and comparative assessment of our model against other localized and centralized event detection models over real contextual data, and show the benefits stemming from its adoption: up to three orders of magnitude less energy consumption and high quality of inference.

    Solution Path Clustering with Adaptive Concave Penalty

    Fast accumulation of large amounts of complex data has created a need for more sophisticated statistical methodologies to discover interesting patterns and better extract information from these data. The large scale of the data often results in challenging high-dimensional estimation problems where only a minority of the data shows specific grouping patterns. To address these emerging challenges, we develop a new clustering methodology that introduces the idea of a regularization path into unsupervised learning. A regularization path for a clustering problem is created by varying the degree of sparsity constraint that is imposed on the differences between objects via the minimax concave penalty with adaptive tuning parameters. Instead of providing a single solution represented by a cluster assignment for each object, the method produces a short sequence of solutions that determines not only the cluster assignment but also a corresponding number of clusters for each solution. The optimization of the penalized loss function is carried out through an MM algorithm with block coordinate descent. The advantages of this clustering algorithm compared to other existing methods are as follows: it does not require the input of the number of clusters; it is capable of simultaneously separating irrelevant or noisy observations that show no grouping pattern, which can greatly improve data interpretation; it is a general methodology that can be applied to many clustering problems. We test this method on various simulated datasets and on gene expression data, where it shows better or competitive performance compared against several clustering methods. Comment: 36 page
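
For orientation, the minimax concave penalty (MCP) named in this abstract is commonly written in the standard form below. The symbols $t$, $\lambda$ and $\gamma$ are notation assumed here, not taken from the abstract, and the paper's adaptive choice of tuning parameters is not shown. Per the abstract, the penalty would be applied to differences between objects rather than to individual coefficients.

```latex
\rho(t;\lambda,\gamma) =
\begin{cases}
\lambda\,|t| - \dfrac{t^{2}}{2\gamma}, & |t| \le \gamma\lambda, \\[6pt]
\dfrac{\gamma\lambda^{2}}{2}, & |t| > \gamma\lambda,
\end{cases}
\qquad \lambda \ge 0,\ \gamma > 0 .
```

The two branches agree at $|t| = \gamma\lambda$, so the penalty is continuous and becomes flat beyond that point. Large differences are therefore not shrunk further, while small ones are pushed toward zero, which is what merges objects into clusters.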

    Robust PCA as Bilinear Decomposition with Outlier-Sparsity Regularization

    Principal component analysis (PCA) is widely used for dimensionality reduction, with well-documented merits in various applications involving high-dimensional data, including computer vision, preference measurement, and bioinformatics. In this context, the fresh look advocated here permeates benefits from variable selection and compressive sampling, to robustify PCA against outliers. A least-trimmed squares estimator of a low-rank bilinear factor analysis model is shown to be closely related to that obtained from an $\ell_0$-(pseudo)norm-regularized criterion encouraging sparsity in a matrix explicitly modeling the outliers. This connection suggests robust PCA schemes based on convex relaxation, which lead naturally to a family of robust estimators encompassing Huber's optimal M-class as a special case. Outliers are identified by tuning a regularization parameter, which amounts to controlling sparsity of the outlier matrix along the whole robustification path of (group) least-absolute shrinkage and selection operator (Lasso) solutions. Beyond its neat ties to robust statistics, the developed outlier-aware PCA framework is versatile enough to accommodate novel and scalable algorithms to: i) track the low-rank signal subspace robustly, as new data are acquired in real time; and ii) determine principal components robustly in (possibly) infinite-dimensional feature spaces. Synthetic and real data tests corroborate the effectiveness of the proposed robust PCA schemes, when used to identify aberrant responses in personality assessment surveys, as well as unveil communities in social networks, and intruders from video surveillance data. Comment: 30 pages, submitted to IEEE Transactions on Signal Processin
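
The idea of modeling outliers explicitly in a sparse matrix can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes an entrywise $\ell_1$ relaxation (the abstract mentions group Lasso variants as well), a fixed known rank, and plain alternating minimization. The names `robust_pca_sketch`, `soft_threshold`, `rank` and `lam` are hypothetical.

```python
import numpy as np

def soft_threshold(x, lam):
    # Entrywise Lasso shrinkage: values within [-lam, lam] become exactly zero.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def robust_pca_sketch(Y, rank, lam, n_iter=50):
    """Alternately minimize 0.5*||Y - L - O||_F^2 + lam*||O||_1 with rank(L) <= rank.

    Illustrative assumption: entrywise l1 penalty on the outlier matrix O.
    """
    O = np.zeros_like(Y)
    history = []
    for _ in range(n_iter):
        # Low-rank step: best rank-`rank` fit to the outlier-compensated data.
        U, s, Vt = np.linalg.svd(Y - O, full_matrices=False)
        L = (U[:, :rank] * s[:rank]) @ Vt[:rank]
        # Outlier step: shrink the residual; surviving entries are flagged as outliers.
        O = soft_threshold(Y - L, lam)
        history.append(0.5 * np.sum((Y - L - O) ** 2) + lam * np.sum(np.abs(O)))
    return L, O, history

# Synthetic data (made up for this example): a rank-2 matrix with two gross outliers.
rng = np.random.default_rng(0)
Y = rng.normal(size=(30, 2)) @ rng.normal(size=(2, 20))
Y[3, 4] += 15.0
Y[10, 7] -= 15.0

L, O, history = robust_pca_sketch(Y, rank=2, lam=1.0)
print("nonzero outlier entries:", np.count_nonzero(O))
```

Raising `lam` makes `O` sparser, so sweeping it traces out a path of solutions in the spirit of the robustification path the abstract describes. Each step minimizes the objective exactly over one block, so the recorded objective values never increase.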

    Modelling activated sludge wastewater treatment plants using artificial intelligence techniques (fuzzy logic and neural networks)

    The activated sludge process (ASP) is the most commonly used biological wastewater treatment system. Mathematical modelling of this process is important for improving its treatment efficiency and thus the quality of the effluent released into the receiving water body. This is because the models can help the operator to predict the performance of the plant in order to take cost-effective and timely remedial actions that would ensure consistent treatment efficiency and compliance with discharge consents. However, due to the highly complex and non-linear characteristics of this biological system, traditional mathematical modelling of this treatment process has remained a challenge. This thesis presents applications of Artificial Intelligence (AI) techniques for modelling the ASP. These include the Kohonen Self Organising Map (KSOM), backpropagation artificial neural networks (BPANN), and the adaptive network based fuzzy inference system (ANFIS). These techniques were compared, and the possibility of hybrids between them was also investigated and tested. The study demonstrated that AI techniques offer a viable, flexible and effective modelling methodology for the activated sludge system. The KSOM was found to be an attractive tool for data preparation because it can easily accommodate missing data and outliers and because of its power in extracting salient features from raw data. As a consequence of the latter, the KSOM offers an excellent tool for the visualisation of high dimensional data. In addition, the KSOM was used to develop a software sensor to predict biological oxygen demand. This soft-sensor represents a significant advance in real-time BOD operational control by offering a very fast estimation of this important wastewater parameter when compared to the traditional 5-day bioassay BOD test procedure. Furthermore, hybrids of KSOM-ANN and KSOM-ANFIS were shown to yield much better model performance than the respective modelling paradigms used on their own. Damascus Universit

    Exploratory Cluster Analysis from Ubiquitous Data Streams using Self-Organizing Maps

    This thesis addresses the use of Self-Organizing Maps (SOM) for exploratory cluster analysis over ubiquitous data streams, where two complementary problems arise: first, to generate (local) SOM models over potentially unbounded multi-dimensional non-stationary data streams; second, to extrapolate these capabilities to ubiquitous environments. To address these problems, original contributions are made in terms of algorithms and methodologies. Two different methods are proposed for the first problem. By focusing on visual knowledge discovery, these methods fill an existing gap among current methods for cluster analysis over data streams. Moreover, the original SOM capability of clustering both observations and features is carried over to data streams, which makes these contributions versatile compared to existing methods, each of which targets an individual clustering problem. In addition, methodologies that tackle the ubiquitous aspect of data streams are proposed for the second problem, allowing distributed and collaborative learning strategies. Experimental evaluations attest to the effectiveness of the proposed methods, and real-world applications are exemplified, namely electric consumption data, air quality monitoring networks and financial data, motivating their practical use. This research study is the first to clearly address the use of the SOM for ubiquitous data streams and opens several other research opportunities in the future.
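
As background for the SOM-based entries in this list, a minimal online SOM can be sketched as follows. This is the textbook one-sample-at-a-time update, not the streaming or distributed methods proposed in the thesis. The function names, grid size, decay schedules and synthetic data are all assumptions made for illustration.

```python
import numpy as np

def quantization_error(weights, data):
    # Mean distance from each sample to its best-matching unit (BMU).
    flat = weights.reshape(-1, weights.shape[-1])
    dists = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    return dists.min(axis=1).mean()

def train_som(data, rows=4, cols=4, n_steps=3000, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM with one randomly drawn sample per step."""
    rng = np.random.default_rng(seed)
    weights = rng.uniform(size=(rows, cols, data.shape[1]))
    initial = weights.copy()
    # Grid coordinates of every unit, used by the neighborhood function.
    grid = np.stack(np.meshgrid(np.arange(rows), np.arange(cols), indexing="ij"), axis=-1)
    for t in range(n_steps):
        frac = t / n_steps
        lr = lr0 * (1.0 - frac)                # linearly decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.1    # shrinking neighborhood width
        x = data[rng.integers(len(data))]
        # The BMU is the unit whose weight vector is closest to the sample.
        dists = np.linalg.norm(weights - x, axis=2)
        bmu = np.unravel_index(np.argmin(dists), dists.shape)
        # A Gaussian neighborhood on the grid pulls the BMU and its neighbors toward x.
        grid_d2 = np.sum((grid - np.array(bmu)) ** 2, axis=2)
        h = np.exp(-grid_d2 / (2.0 * sigma ** 2))
        weights += lr * h[..., None] * (x - weights)
    return initial, weights

# Synthetic data (made up for this example): two well-separated 2-D clusters.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0.0, 0.1, size=(100, 2)),
                  rng.normal(5.0, 0.1, size=(100, 2))])

initial, trained = train_som(data)
qe_before = quantization_error(initial, data)
qe_after = quantization_error(trained, data)
print(f"quantization error: {qe_before:.3f} -> {qe_after:.3f}")
```

Because each update touches only one sample, the same loop can in principle consume an unbounded stream. Handling non-stationarity and distributed learners, as the thesis sets out to do, needs further machinery not shown here.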