An evaluation framework for input variable selection algorithms for environmental data-driven models
Abstract not available
Stefano Galelli, Greer B. Humphrey, Holger R. Maier, Andrea Castelletti, Graeme C. Dandy, Matthew S. Gibb
Recommended from our members
A novel improved model for building energy consumption prediction based on model integration
Building energy consumption prediction plays an irreplaceable role in energy planning, management, and conservation. Constantly improving the performance of prediction models is the key to ensuring the efficient operation of energy systems. Moreover, accuracy is no longer the only measure of model performance; it is more important to evaluate the model from multiple perspectives that reflect the characteristics of engineering applications. Based on the idea of model integration, this paper proposes a novel improved integration model (stacking model) that can be used to forecast building energy consumption. The stacking model combines the advantages of various base prediction algorithms and turns their outputs into “meta-features” so that the final model can observe datasets from different spatial and structural angles. Two cases are used to demonstrate practical engineering applications of the stacking model. A comparative analysis evaluates the prediction performance of the stacking model against existing well-known prediction models, including Random Forest, Gradient Boosted Decision Tree, Extreme Gradient Boosting, Support Vector Machine, and K-Nearest Neighbor. The results indicate that the stacking method achieves better performance than the other models in terms of accuracy (improvement of 9.5%–31.6% for Case A and 16.2%–49.4% for Case B), generalization (improvement of 6.7%–29.5% for Case A and 7.1%–34.6% for Case B), and robustness (improvement of 1.5%–34.1% for Case A and 1.8%–19.3% for Case B). The proposed model enriches the diversity of algorithm libraries of empirical models.
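The stacking idea summarized in this abstract can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: the two base learners (a simple linear regression and a k-nearest-neighbour regressor), the no-intercept least-squares meta-learner, the 5-fold split, and the synthetic data are all hypothetical stand-ins chosen to keep the example dependency-free. They are not the base models, meta-learner, or data used in the paper.

```python
import math
import random
from statistics import mean

def fit_linear(xs, ys):
    """Closed-form simple linear regression; returns a predict function."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=3):
    """k-nearest-neighbour regression on a single feature."""
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return mean(y for _, y in nearest)
    return predict

# Hypothetical base learners; the paper's actual base models are not given here.
BASE_FITTERS = (fit_linear, fit_knn)

def out_of_fold_predictions(xs, ys, n_folds=5):
    """Meta-features: base-model predictions for rows the model did not train on."""
    meta = [[0.0] * len(BASE_FITTERS) for _ in xs]
    for fold in range(n_folds):
        train = [i for i in range(len(xs)) if i % n_folds != fold]
        held_out = [i for i in range(len(xs)) if i % n_folds == fold]
        train_x = [xs[i] for i in train]
        train_y = [ys[i] for i in train]
        for j, fitter in enumerate(BASE_FITTERS):
            model = fitter(train_x, train_y)
            for i in held_out:
                meta[i][j] = model(xs[i])
    return meta

def fit_meta_weights(meta, ys):
    """Least-squares weights for two meta-features (2x2 normal equations)."""
    a11 = sum(m[0] * m[0] for m in meta)
    a12 = sum(m[0] * m[1] for m in meta)
    a22 = sum(m[1] * m[1] for m in meta)
    b1 = sum(m[0] * y for m, y in zip(meta, ys))
    b2 = sum(m[1] * y for m, y in zip(meta, ys))
    det = a11 * a22 - a12 * a12
    return (b1 * a22 - b2 * a12) / det, (a11 * b2 - a12 * b1) / det

def fit_stacking(xs, ys):
    """Fit the meta-learner on out-of-fold predictions, then refit bases on all data."""
    weights = fit_meta_weights(out_of_fold_predictions(xs, ys), ys)
    bases = [fitter(xs, ys) for fitter in BASE_FITTERS]
    return lambda x: sum(w * base(x) for w, base in zip(weights, bases))

def make_data(n, seed):
    """Synthetic target (assumption): a linear trend plus a periodic component."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [2 * x + 3 * math.sin(x) + rng.gauss(0, 0.3) for x in xs]
    return xs, ys

def mse(model, xs, ys):
    return mean((model(x) - y) ** 2 for x, y in zip(xs, ys))

train_x, train_y = make_data(200, seed=0)
test_x, test_y = make_data(100, seed=1)
stacked = fit_stacking(train_x, train_y)
linear_only = fit_linear(train_x, train_y)
print("stacked test MSE:", mse(stacked, test_x, test_y))
print("linear  test MSE:", mse(linear_only, test_x, test_y))
```

Training the meta-learner on out-of-fold predictions rather than in-sample predictions keeps it from simply rewarding whichever base model memorizes the training data best.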
Stream Learning in Energy IoT Systems: A Case Study in Combined Cycle Power Plants
The prediction of electrical power produced in combined cycle power plants is a key challenge in the electrical power and energy systems field. This power production can vary depending on environmental variables, such as temperature, pressure, and humidity. Thus, the business problem is how to predict the power production as a function of these environmental conditions in order to maximize profit. The research community has addressed this problem by applying Machine Learning techniques, and has managed to reduce the computational and time costs in comparison with traditional thermodynamical analysis. Until now, this challenge has been tackled from a batch learning perspective, in which data is assumed to be at rest and models do not continuously integrate new information once they have been constructed. We present an approach closer to the Big Data and Internet of Things paradigms, in which data arrive continuously and models learn incrementally, achieving significant improvements in data processing (time, memory, and computational costs) while obtaining competitive performance. This work compares the hourly electrical power predictions of several streaming regressors, and discusses the best technique, in terms of processing time and predictive performance, to apply in this streaming scenario.
This work has been partially supported by the EU project iDev40. This project has received funding from the ECSEL Joint Undertaking (JU) under grant agreement No 783163. The JU receives support from the European Union’s Horizon 2020 research and innovation programme and Austria, Germany, Belgium, Italy, Spain, Romania. It has also been supported by the Basque Government (Spain) through the project VIRTUAL (KK-2018/00096), and by Ministerio de Economía y Competitividad of Spain (Grant Ref. TIN2017-85887-C2-2-P).
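The contrast this abstract draws between batch and incremental learning can be illustrated with a minimal streaming regressor. This is a sketch under assumptions, not one of the regressors compared in the paper: the class `OnlineLinearRegressor`, its learning rate, the three features scaled to [0, 1] (standing in for temperature, pressure, and humidity), and the synthetic coefficients are all hypothetical. The evaluation follows the common test-then-train pattern, in which each sample is predicted before the model learns from it.

```python
import random

class OnlineLinearRegressor:
    """Linear model updated one sample at a time with stochastic gradient descent."""

    def __init__(self, n_features, learning_rate=0.05):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict_one(self, x):
        return self.bias + sum(w * xi for w, xi in zip(self.weights, x))

    def learn_one(self, x, y):
        # Gradient step on the squared error of this single sample.
        error = self.predict_one(x) - y
        self.bias -= self.learning_rate * error
        self.weights = [w - self.learning_rate * error * xi
                        for w, xi in zip(self.weights, x)]

def synthetic_stream(n, seed=0):
    """Hypothetical stream: three scaled environmental readings and a power output."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.random(), rng.random(), rng.random()]
        y = 1.0 - 0.8 * x[0] + 0.1 * x[1] - 0.2 * x[2] + rng.gauss(0, 0.01)
        yield x, y

def prequential_errors(model, stream):
    """Test-then-train: record the absolute error before learning from each sample."""
    errors = []
    for x, y in stream:
        errors.append(abs(model.predict_one(x) - y))
        model.learn_one(x, y)
    return errors

errors = prequential_errors(OnlineLinearRegressor(3), synthetic_stream(5000))
early_mae = sum(errors[:100]) / 100
late_mae = sum(errors[-100:]) / 100
print("MAE over first 100 samples:", early_mae)
print("MAE over last 100 samples: ", late_mae)
```

The model holds only its weights in memory and never revisits past samples, which is the property that makes the streaming setting cheaper in time and memory than refitting a batch model.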
Damage identification in structural health monitoring: a brief review from its implementation to the Use of data-driven applications
The damage identification process provides relevant information about the current state of a structure under inspection, and it can be approached from two different points of view. The first approach uses data-driven algorithms, which are usually associated with the collection of data using sensors. Data are subsequently processed and analyzed. The second approach uses models to analyze information about the structure. In the latter case, the overall performance of the approach depends on the accuracy of the model and the information that is used to define it. Although both approaches are widely used, data-driven algorithms are preferred in most cases because they can analyze data acquired from sensors and provide a real-time solution for decision making; however, these approaches require high-performance processors because of their high computational cost. As a contribution to researchers working with data-driven algorithms and applications, this work presents a brief review of data-driven algorithms for damage identification in structural health-monitoring applications. This review covers damage detection, localization, classification, extension, and prognosis, as well as the development of smart structures. The literature is systematically reviewed according to the natural steps of a structural health-monitoring system. This review also includes information on the types of sensors used as well as on the development of data-driven algorithms for damage identification.
Peer Reviewed. Postprint (published version
Uncertainty Analysis for Data-Driven Chance-Constrained Optimization
In this contribution, our framework for data-driven chance-constrained optimization is extended with an uncertainty analysis module. The module quantifies uncertainty in the output variables of rigorous simulations. It chooses the most accurate parametric continuous probability distribution model by minimizing the deviation between model and data. A constraint is added to favour less complex models that meet a minimum required quality of fit. The module builds on the more than 100 probability distribution models provided in the SciPy package in Python; a rigorous case study is conducted to select the four most relevant models for the application at hand. The applicability and precision of the uncertainty analyser module are investigated for an impact factor calculation in life cycle impact assessment to quantify the uncertainty in the results. Furthermore, the extended framework is verified with data from a first-principles process model of a chlor-alkali plant, demonstrating the increased precision of the uncertainty description of the output variables, which results in a 25% increase in accuracy in the chance-constraint calculation.
BMWi, 0350013A, ChemEFlex - Umsetzbarkeitsanalyse zur Lastflexibilisierung elektrochemischer Verfahren in der Industrie; Teilvorhaben: Modellierung der Chlor-Alkali-Elektrolyse sowie anderer Prozesse und deren Bewertung hinsichtlich Wirtschaftlichkeit und möglicher Hemmnisse
DFG, 414044773, Open Access Publizieren 2019 - 2020 / Technische Universität Berli
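The selection rule described in this abstract (fit several parametric distributions, measure each one's deviation from the data, and prefer simpler models that still fit well enough) can be sketched as follows. Everything specific here is an assumption made for illustration: the three hand-written candidates stand in for SciPy's much larger catalogue, the Kolmogorov-Smirnov statistic stands in for the paper's unspecified deviation measure, parameter count stands in for model complexity, and the threshold `max_deviation=0.05` is arbitrary.

```python
import math
import random
from statistics import NormalDist, mean, stdev

def fit_normal(data):
    """Return (fitted CDF, number of parameters) for a normal distribution."""
    dist = NormalDist(mean(data), stdev(data))
    return dist.cdf, 2

def fit_exponential(data):
    rate = 1.0 / mean(data)
    return (lambda x: 1.0 - math.exp(-rate * x) if x > 0 else 0.0), 1

def fit_uniform(data):
    lo, hi = min(data), max(data)
    return (lambda x: min(1.0, max(0.0, (x - lo) / (hi - lo)))), 2

# Hypothetical candidate set; a stand-in for a larger distribution library.
CANDIDATES = {
    "normal": fit_normal,
    "exponential": fit_exponential,
    "uniform": fit_uniform,
}

def ks_statistic(data, cdf):
    """Largest gap between the empirical CDF and the fitted CDF."""
    xs = sorted(data)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n)
               for i, x in enumerate(xs))

def select_distribution(data, max_deviation=0.05):
    """Pick the simplest acceptable fit; fall back to the smallest deviation."""
    fits = []
    for name, fitter in CANDIDATES.items():
        cdf, n_params = fitter(data)
        fits.append((name, n_params, ks_statistic(data, cdf)))
    acceptable = [fit for fit in fits if fit[2] <= max_deviation]
    if acceptable:
        # Fewest parameters first; ties are broken by the smaller deviation.
        return min(acceptable, key=lambda fit: (fit[1], fit[2]))
    return min(fits, key=lambda fit: fit[2])

rng = random.Random(42)
exp_sample = [rng.expovariate(0.5) for _ in range(2000)]
normal_sample = [rng.gauss(10, 2) for _ in range(2000)]
print("exponential sample ->", select_distribution(exp_sample)[0])
print("normal sample      ->", select_distribution(normal_sample)[0])
```

Treating the quality threshold as a filter and complexity as the ranking key is one way to express "favour less complex models with a minimal required quality"; a penalized score such as an information criterion would be another.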
Biologically informed ecological niche models for an example pelagic, highly mobile species
Background: Although pelagic seabirds are broadly recognised as indicators of the health of marine systems, numerous gaps exist in knowledge of their at-sea distributions at the species level. These gaps have profound negative impacts on the robustness of marine conservation policies. Correlative modelling techniques have provided some information, but few studies have explored model development for non-breeding pelagic seabirds. Here, I present a first phase in developing robust niche models for highly mobile species as a baseline for further development. Methodology: Using observational data from a 12-year time period, 217 unique model parameterisations across three correlative modelling algorithms (boosted regression trees, Maxent and minimum volume ellipsoids) were tested in a time-averaged approach for their ability to recreate the at-sea distribution of non-breeding Wandering Albatrosses (Diomedea exulans) to provide a baseline for further development. Principal Findings/Results: Overall, minimum volume ellipsoids outperformed both boosted regression trees and Maxent. However, whilst the latter two algorithms generally overfit the data, minimum volume ellipsoids tended to underfit the data. Conclusions: The results of this exercise suggest a necessary evolution in how correlative modelling for highly mobile species such as pelagic seabirds should be approached. These insights are crucial for understanding seabird–environment interactions at macroscales, which can facilitate the ability to address population declines and inform effective marine conservation policy in the wake of rapid global change.