59 research outputs found

    A Survey on Data Mining Techniques Applied to Energy Time Series Forecasting

    Data mining has become an essential tool during the last decade to analyze large sets of data. The variety of techniques it includes and the successful results obtained in many application fields make this family of approaches powerful and widely used. In particular, this work explores the application of these techniques to time series forecasting. Although classical statistical-based methods provide reasonably good results, the results obtained with data mining outperform those of the classical ones. Hence, this work faces two main challenges: (i) to provide a compact mathematical formulation of the most commonly used techniques; (ii) to review the latest works on time series forecasting and, as a case study, those related to electricity price and demand markets. Ministerio de Economía y Competitividad TIN2014-55894-C2-R; Junta de Andalucía P12-TIC-1728; Universidad Pablo de Olavide APPB81309

    Intelligent energy management system : techniques and methods

    ABSTRACT Our environment is an asset to be managed carefully and is not an expendable resource to be taken for granted. The main original contribution of this thesis is in formulating intelligent techniques and simulating case studies to demonstrate the significance of the present approach for achieving a low carbon economy. Energy boosts crop production, drives industry and increases employment. Wise energy use is the first step to ensuring sustainable energy for present and future generations. Energy services are essential for meeting internationally agreed development goals. The energy management system lies at the heart of all infrastructures, from communications and the economy to society's transportation. This has made the system more complex and more interdependent. The increasing number of disturbances occurring in the system has raised the priority of the energy management system infrastructure, which has been improved with the aid of technology and investment; suitable methods to optimize the system are presented in this thesis. Since the current system faces various problems, including increasing disturbances, operation at its limit, aging equipment and load changes, an improvement is essential to minimize these problems. To enhance the current system and resolve the issues it is facing, the smart grid has been proposed as a solution to power problems and a means of preventing future failures. This thesis argues that the smart grid combines computational intelligence and smart meters to improve the reliability, stability and security of power. In comparison with the current system, it is more intelligent, reliable, stable and secure, and will reduce the number of blackouts and other failures that occur on the power grid system. The thesis also reports that smart metering is a technically feasible way to improve energy efficiency. 
In the thesis, a new hybrid model based on wavelet transforms, a floating-point genetic algorithm and an artificial neural network has been developed for accurate short-term load forecasting. The new model is more accurate than a radial basis function network. Actual data has been used to test the proposed method, and it has been demonstrated that this integrated intelligent technique is very effective for load forecasting. Choosing the appropriate algorithm is important for implementing optimization in the daily operation of the power system. The potential for application of swarm intelligence to Optimal Reactive Power Dispatch (ORPD) has been shown in this thesis. After comparing the results derived from swarm intelligence, an improved genetic algorithm and a conventional gradient-based optimization method, it was concluded that swarm intelligence is better in terms of performance and precision in solving optimal reactive power dispatch problems. EThOS - Electronic Theses Online Service, GB, United Kingdo

    Impact of COVID-19 on Forecasting Stock Prices: An Integration of Stationary Wavelet Transform and Bidirectional Long Short-Term Memory

    COVID-19 is an infectious disease that mostly affects the respiratory system. At the time this research was performed, there were more than 1.4 million cases of COVID-19, and one of the biggest anxieties is not just our health, but our livelihoods, too. In this research, the authors investigate the impact of COVID-19 on the global economy, more specifically, the impact of COVID-19 on the price movements of Crude Oil and three U.S. stock indexes: DJI, S&P 500 and NASDAQ Composite. The proposed system for predicting commodity and stock prices integrates the Stationary Wavelet Transform (SWT) and Bidirectional Long Short-Term Memory (BDLSTM) networks. Firstly, SWT is used to decompose the data into approximation and detail coefficients. After decomposition, data of Crude Oil price and stock market indexes along with COVID-19 confirmed cases were used as input variables for future price movement forecasting. As a result, the proposed system BDLSTM+WT-ADA achieved satisfactory results in terms of five-day Crude Oil price forecast. Comment: 26 pages, 9 figure
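The decomposition step described in this abstract can be illustrated with a minimal sketch of a one-level undecimated (stationary) Haar transform. The abstract does not name the wavelet, the number of levels or the boundary handling, so the Haar filter, the single level, the periodic extension and the sample prices below are all assumptions made for illustration, not the authors' configuration.

```python
import math


def haar_swt_level1(x):
    """One-level undecimated Haar transform with periodic extension.

    Returns (approx, detail), each the same length as x because
    no downsampling is applied.
    """
    n = len(x)
    s = math.sqrt(2.0)
    approx = [(x[i] + x[(i + 1) % n]) / s for i in range(n)]
    detail = [(x[i] - x[(i + 1) % n]) / s for i in range(n)]
    return approx, detail


def haar_iswt_level1(approx, detail):
    """Invert haar_swt_level1: approx[i] + detail[i] == sqrt(2) * x[i]."""
    s = math.sqrt(2.0)
    return [(a + d) / s for a, d in zip(approx, detail)]


# Made-up daily prices, for illustration only.
prices = [20.0, 22.0, 21.0, 25.0, 24.0, 23.0]
approx, detail = haar_swt_level1(prices)
restored = haar_iswt_level1(approx, detail)
```

Because the coefficient series keep the length of the input, they can be aligned sample by sample with other input variables before being passed to a forecasting model.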

    Forecasting Mid-Term Electricity Market Clearing Price Using Support Vector Machines

    In a deregulated electricity market, offering the appropriate amount of electricity at the right time with the right bidding price is of paramount importance. The forecasting of electricity market clearing price (MCP) is a prediction of future electricity price based on given forecasts of electricity demand, temperature, sunshine, fuel cost, precipitation and other related factors. Currently, there are many techniques available for short-term electricity MCP forecasting, but very little has been done in the area of mid-term electricity MCP forecasting. Mid-term electricity MCP forecasting focuses on a time frame from one month to six months. Developing mid-term electricity MCP forecasting is essential for mid-term planning and decision making, such as generation plant expansion and maintenance scheduling, reallocation of resources, bilateral contracts and hedging strategies. Six mid-term electricity MCP forecasting models are proposed and compared in this thesis: 1) a single support vector machine (SVM) forecasting model, 2) a single least squares support vector machine (LSSVM) forecasting model, 3) a hybrid SVM and auto-regression moving average with external input (ARMAX) forecasting model, 4) a hybrid LSSVM and ARMAX forecasting model, 5) a multiple SVM forecasting model and 6) a multiple LSSVM forecasting model. PJM interconnection data are used to test the proposed models. Cross-validation was used to optimize the control parameters and the selection of training data for the six proposed mid-term electricity MCP forecasting models. Three evaluation metrics, mean absolute error (MAE), mean absolute percentage error (MAPE) and mean square root error (MSRE), are used to analyze the forecasting accuracy. According to the experimental results, the multiple SVM forecasting model worked the best among all six proposed forecasting models. 
The proposed multiple SVM based mid-term electricity MCP forecasting model contains a data classification module and a price forecasting module. The data classification module first pre-processes the input data into corresponding price zones, and the forecasting module then forecasts the electricity price using four SVMs designed in parallel. The proposed model improves the forecasting accuracy for both peak prices and the overall system more than the other five forecasting models proposed in this thesis
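Two of the evaluation metrics named in this abstract, MAE and MAPE, have standard definitions, sketched below. The third metric (MSRE) is left out because the abstract does not define it, and the price values are made up for illustration.

```python
def mean_absolute_error(actual, forecast):
    """MAE: average of |actual - forecast|."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def mean_absolute_percentage_error(actual, forecast):
    """MAPE in percent: average of |actual - forecast| / |actual| * 100.

    Assumes no actual value is zero.
    """
    total = sum(abs(a - f) / abs(a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Hypothetical clearing prices and forecasts, for illustration only.
actual_prices = [100.0, 200.0]
forecast_prices = [110.0, 180.0]

mae = mean_absolute_error(actual_prices, forecast_prices)
mape = mean_absolute_percentage_error(actual_prices, forecast_prices)
```

MAE is expressed in the units of the price itself, while MAPE is scale-free, which makes it easier to compare accuracy between peak and off-peak price zones.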

    Streamflow and soil moisture forecasting with hybrid data intelligent machine learning approaches: case studies in the Australian Murray-Darling basin

    For a drought-prone agricultural nation such as Australia, hydro-meteorological imbalances and increasing demand for water resources are immensely constraining terrestrial water reservoirs and regional-scale agricultural productivity. Two important components of the terrestrial water reservoir, i.e., streamflow water level (SWL) and soil moisture (SM), are imperative both for agricultural and hydrological applications. Forecasted SWL and SM can enable prudent and sustainable decision-making for agriculture and water resources management. To feasibly emulate SWL and SM, machine learning data-intelligent models are a promising tool in today’s rapidly advancing data science era. Yet, the naturally chaotic characteristics of hydro-meteorological variables, which can exhibit non-linear and non-stationary behaviors within the model dataset, are a key challenge for non-tuned machine learning models. Another important issue that could confound model accuracy or applicability is the selection of relevant features to emulate SWL and SM, since the use of too few inputs can lead to insufficient information to construct an accurate model, while the use of an excessive number of redundant model inputs could obscure the performance of the simulation algorithm. This research thesis focuses on the development of hybridized data-intelligent models in forecasting SWL and SM in the upper layer (surface to 0.2 m) and the lower layer (0.2–1.5 m depth) within the agricultural region of the Murray-Darling Basin, Australia. The SWL quantifies the availability of surface water resources, while the upper layer SM (or the surface SM) is important for surface runoff, evaporation, and energy exchange at the Earth-Atmospheric interface. The lower layer (or the root zone) SM is essential for groundwater recharge purposes, plant uptake and transpiration. 
This research study is constructed upon four primary objectives designed for the forecasting of SWL and SM with subsequent robust evaluations by means of statistical metrics, in tandem with the diagnostic plots of observed and modeled datasets. The first objective establishes the importance of feature selection (or optimization) in the forecasting of monthly SWL at three study sites within the Murray-Darling Basin. An artificial neural network (ANN) model optimized with the iterative input selection (IIS) algorithm, named IIS-ANN, is developed, whereby the IIS algorithm achieves feature optimization. The IIS-ANN model outperforms the standalone models, and a further hybridization is performed by integrating a non-decimated and advanced maximum overlap discrete wavelet transformation (MODWT) technique. The IIS-selected inputs are transformed into wavelet subseries via MODWT to unveil the embedded features, leading to the IIS-W-ANN model. The IIS-W-ANN outperforms the comparative IIS-W-M5 Model Tree, IIS-based and standalone models. In the second objective, improved self-adaptive multi-resolution analysis (MRA) techniques, ensemble empirical mode decomposition (EEMD) and complete ensemble empirical mode decomposition with adaptive noise (CEEMDAN), are utilized to address the non-stationarity issues in forecasting monthly upper and lower layer soil moisture at seven sites. The SM time-series are decomposed using EEMD/CEEMDAN into respective intrinsic mode functions (IMFs) and residual components. Then the significant lags identified by the partial autocorrelation function are utilized as inputs to the extreme learning machine (ELM) and random forest (RF) models. The hybrid EEMD-ELM yielded better results in comparison to the CEEMDAN-ELM, EEMD-RF, CEEMDAN-RF and the classical ELM and RF models. 
Since SM is contingent upon many influential meteorological, hydrological and atmospheric parameters, for the third objective sixty predictor inputs are collated in forecasting upper and lower layer soil moisture at four sites. An ANN-based ensemble committee of models (ANN-CoM) is developed integrating a two-phase feature optimization via Neighborhood Component Analysis based feature selection algorithm for regression (fsrnca) and a basic ELM. The ANN-CoM shows better predictive performance in comparison to the standalone second order Volterra, M5 Model Tree, RF, and ELM models. In the fourth objective, a new multivariate sequential EEMD based modelling is developed. The establishment of multivariate sequential EEMD is an advancement of the classical single input EEMD approach, achieving a further methodological improvement. This multivariate approach is developed to allow for the utilization of multiple inputs in forecasting SM. The multivariate sequential EEMD optimized with cross-correlation function and Boruta feature selection algorithm is integrated with the ELM model in emulating weekly SM at four sites. The resulting hybrid multivariate sequential EEMD-Boruta-ELM attained a better performance in comparison with the multivariate adaptive regression splines (MARS) counterpart (EEMD-Boruta-MARS) and standalone ELM and MARS models. The research study ascertains the applicability of feature selection algorithms integrated with appropriate MRA for improved hydrological forecasting. Forecasting at shorter and near-real-time horizons (i.e., weekly) would help reinforce scientific tenets in designing knowledge-based systems for precision agriculture and climate change adaptation policy formulations
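The use of lagged values as model inputs, mentioned in the second objective of this abstract, can be sketched as follows. This is only an illustration of how a series is turned into a supervised learning table once a set of lags has been chosen. The lags and the series are hypothetical, and the selection of significant lags (done with the partial autocorrelation function in the thesis) is not reproduced here.

```python
def build_lagged_dataset(series, lags):
    """Build (features, targets) from a series for the given positive lags.

    Row t holds [series[t - lag] for lag in lags]; its target is series[t].
    Rows start at t = max(lags) so that every lagged value exists.
    """
    start = max(lags)
    features = [[series[t - lag] for lag in lags]
                for t in range(start, len(series))]
    targets = [series[t] for t in range(start, len(series))]
    return features, targets


# Hypothetical monthly component values and hypothetical lags.
component = [1.0, 2.0, 3.0, 4.0, 5.0]
chosen_lags = [1, 2]

X, y = build_lagged_dataset(component, chosen_lags)
```

In a decomposition-based hybrid model, a table like this would be built separately for each decomposed component before it is passed to the learner.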

    A Statistical Perspective of the Empirical Mode Decomposition

    This research focuses on non-stationary basis decomposition methods in time-frequency analysis. Classical methodologies in this field, such as Fourier Analysis and Wavelet Transforms, rely on strong assumptions about the underlying moment generating process, which may not be valid in real data scenarios or modern applications of machine learning. The literature on non-stationary methods is still in its infancy, and the research contained in this thesis aims to address challenges arising in this area. Among several alternatives, this work is based on the method known as the Empirical Mode Decomposition (EMD). The EMD is a non-parametric time-series decomposition technique that produces a set of time-series functions denoted as Intrinsic Mode Functions (IMFs), which carry specific statistical properties. The main focus is providing a general and flexible family of basis extraction methods with minimal requirements compared to those within the Fourier or Wavelet techniques. This is highly important for two main reasons: first, more universal applications can be taken into account; secondly, the EMD requires very little a priori knowledge of the process to be applied, and as such, it can have greater generalisation properties in statistical applications across a wide array of data types. The contributions of this work deal with several aspects of the decomposition. The first set regards the construction of an IMF from several perspectives: (1) achieving a semi-parametric representation of each basis; (2) extracting such semi-parametric functional forms in a computationally efficient and statistically robust framework. The EMD belongs to the class of path-based decompositions and, therefore, it is often not treated as a stochastic representation. (3) A major contribution involves the embedding of the deterministic pathwise decomposition framework into a formal stochastic process setting. 
One of the assumptions inherent in the EMD construction is that the function to be decomposed is continuous. In general, this may not be the case in many applications. (4) Various multi-kernel Gaussian Process formulations of the EMD will be proposed through the introduced stochastic embedding. In particular, two different models will be proposed: one modelling the temporal mode of oscillations of the EMD and the other capturing the location of instantaneous frequencies in specific frequency regions or bandwidths. (5) The construction of the second stochastic embedding will be achieved with an optimisation method called the cross-entropy method. Two formulations will be provided and explored in this regard. Applications to speech time-series are explored to study such methodological extensions, given that they are non-stationary

    Fast Characterization of Power Quality Events Based on Discrete Signal Processing and Data Mining

    The extensive use of solid-state power electronics technology in industrial, commercial and residential equipment degrades the quality of electric power by deteriorating the supply voltage. The disturbances result in reduced efficiency, shortened equipment life span, increased losses, electromagnetic interference, equipment malfunctions and other harmful fallout. Generally, power quality is a measure of how close the power supply is to ideal. Moreover, power quality is the continuity and characteristics of the supply voltage in terms of frequency, magnitude and symmetry. The mitigation of power quality (PQ) disturbances requires detection of the source and causes of disturbances. The MODWT is a suitable method for forecasting the further occurrence of disturbances. However, proper and quick detection and localization of the disturbances plays a crucial role in the power quality environment. Hence, in this thesis, a fast detection technique has been proposed along with the MODWT in order to provide a time-scale representation of the signals while removing the drawbacks of traditional methods like DWT and ST. Comparative analysis shows that SGWT is a better technique for localization and detection of distortions than the conventional methods. During the course of the research, it was found that suitable algorithms are required for the characterization of the disturbances for smooth mitigation of the distortions. So, a data mining based classifier has been proposed for discrimination of both single and multiple disturbances. Further, suitable features are needed for efficient characterization of the disturbances. Hence, suitable features are extracted in order to reduce the amount of raw data. Data normalization also plays a crucial role in efficient classification. These classification techniques are fast and able to analyze a large number of disturbances. 
In this thesis, large numbers of signals are synthesized in both noisy and noise-free environments. In the real-time environment, these techniques have performed satisfactorily. This leads to an increase in the overall efficiency of the combined detection and classification method. In recent times, the advancement of renewable sources has required better quality of power. An important issue in today’s distributed generation based interconnected power system is islanding detection. The non-detection zone is a good and reliable measurement of islanding. However, failure to detect an islanding situation sometimes leads to a number of serious problems for both the utility and the customers. Hence, this thesis also provides a comparative analysis of the benefits and drawbacks of the aforementioned detection methods as applied in the power quality environment. The voltage signal at the PCC of the renewable distributed generation embedded with the IEEE-14 bus system is captured and given as input to the analysis methods in order to extract features from the output of the analysis. The proposed SGWT properly discriminates power quality disturbances from islanding events by introducing threshold selection. The data mining classifiers are implemented for classification of power quality as well as islanding events captured from the IEEE bus system. Similar to the previous cases, signals of the same length are given to all the detection methods in order to compare the time of operation of each of these methods. Moreover, the proposed techniques have been applied in noise-free and noisy environments, a bus system embedded with a renewable source, a real-time environment, etc. The overall findings of the thesis could be useful for industrial and domestic applications. Since the detection methods are simple and fast, they could be useful for the power industry and other applications such as medical science. 
Similarly, the classification can be used for applications such as stock exchange, medical science etc