Real-time extraction of the Madden-Julian oscillation using empirical mode decomposition and statistical forecasting with a VARMA model
A simple guide to the new technique of empirical mode decomposition (EMD) in a meteorological-climate forecasting context is presented. A single application of EMD to a time series essentially acts as a local high-pass filter. Hence, successive applications can be used to produce a bandpass filter that is highly efficient at extracting a broadband signal such as the Madden-Julian Oscillation (MJO). The basic EMD method is adapted to minimize end effects, so that it is suitable for use in real time. The EMD process is then used to extract the MJO signal from gridded time series of outgoing longwave radiation (OLR) data. A range of statistical models from the general class of vector autoregressive moving average (VARMA) models was then tested for suitability in forecasting the MJO signal, as isolated by the EMD. A VARMA (5, 1) model was selected and its parameters determined by a maximum likelihood method using 17 yr of OLR data from 1980 to 1996. Forecasts were then made on the remaining independent data from 1998 to 2004. These were made in real time, in that only data up to the date the forecast was made were used. The median forecast skill remained accurate (defined as an anomaly correlation above 0.6) at lead times of up to 25 days.
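The skill criterion quoted in the abstract above (an anomaly correlation above 0.6) can be illustrated with a minimal sketch. The abstract does not say how the anomaly correlation is computed, so the centred form below is an assumption, and the names `anomaly_correlation` and `is_skilful` are hypothetical.

```python
import math


def anomaly_correlation(forecast, observed, climatology):
    """Centred correlation between forecast and observed anomalies.

    Assumption: anomalies are departures from a supplied climatology and
    the centred (mean-removed) form is used; the abstract does not specify.
    """
    f_anom = [f - c for f, c in zip(forecast, climatology)]
    o_anom = [o - c for o, c in zip(observed, climatology)]
    f_mean = sum(f_anom) / len(f_anom)
    o_mean = sum(o_anom) / len(o_anom)
    cov = sum((f - f_mean) * (o - o_mean) for f, o in zip(f_anom, o_anom))
    f_var = sum((f - f_mean) ** 2 for f in f_anom)
    o_var = sum((o - o_mean) ** 2 for o in o_anom)
    return cov / math.sqrt(f_var * o_var)


def is_skilful(acc, threshold=0.6):
    """Apply the abstract's threshold: skilful if strictly above 0.6."""
    return acc > threshold
```

In practice such a score would be computed separately for each lead time, and the largest lead time whose median score stays above the threshold gives the useful forecast range.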
Ensemble Sales Forecasting Study in Semiconductor Industry
Sales forecasting plays a prominent role in business planning and business
strategy. The value and importance of advance information is a cornerstone of
planning activity, and a well-set forecast goal can guide the sales force more
efficiently. In this paper, CPU sales forecasting at Intel Corporation, a
multinational semiconductor company, was considered. Past sales, future
bookings, exchange rates, gross domestic product (GDP) forecasts, seasonality
and other indicators were incorporated into the quantitative modeling.
Benefiting from recent advances in computation power and software
development, millions of models built upon multiple regression, time series
analysis, random forests and boosted trees were executed in parallel. The models
with smaller validation errors were selected to form the ensemble model. To
better capture the distinct characteristics, forecasting models were
implemented at the lead-time and line-of-business level. The moving-window
validation process automatically selected the models that most closely represent
current market conditions. The weekly forecasting cadence allowed the
model to respond effectively to market fluctuations. A generic variable
importance analysis was also developed to increase model interpretability.
Rather than assuming a fixed distribution, this non-parametric permutation
variable importance analysis provided a general framework across methods to
evaluate variable importance. The framework can be extended to
classification problems by replacing the mean absolute percentage error
(MAPE) with the misclassification error. Demo code is available at:
https://github.com/qx0731/ensemble_forecast_methods
Comment: 14 pages, Industrial Conference on Data Mining 2017 (ICDM 2017)
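The permutation variable importance described in the abstract above can be sketched in a model-agnostic way. This is a minimal illustration and not the authors' demo code: the function names, the number of repeats and the use of the mean increase in MAPE as the importance score are assumptions.

```python
import random


def mape(actual, predicted):
    """Mean absolute percentage error in percent; actual values must be non-zero."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def permutation_importance(predict, rows, actual, n_repeats=20, seed=0):
    """Mean increase in MAPE when one feature column is shuffled.

    `predict` is any callable mapping a feature row (a list) to a number,
    so the same routine works across model families (assumed interface).
    """
    rng = random.Random(seed)
    baseline = mape(actual, [predict(r) for r in rows])
    importances = []
    for j in range(len(rows[0])):
        increases = []
        for _ in range(n_repeats):
            column = [r[j] for r in rows]
            rng.shuffle(column)  # break the link between feature j and the target
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            score = mape(actual, [predict(r) for r in shuffled])
            increases.append(score - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances
```

Swapping `mape` for a misclassification rate would give the classification variant the abstract mentions, since nothing else in the routine depends on the error measure.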
Air Quality Prediction in Smart Cities Using Machine Learning Technologies Based on Sensor Data: A Review
The influence of machine learning technologies is rapidly increasing and penetrating almost every field, and air pollution prediction is no exception. This paper reviews studies on air pollution prediction using machine learning algorithms based on sensor data in the context of smart cities. The most relevant papers were selected by searching the most popular databases and applying the corresponding filters. After a thorough review of those papers, their main features were extracted and used as a basis for linking and comparing them. As a result, we can conclude that: (1) instead of simple machine learning techniques, authors now apply advanced and sophisticated techniques, (2) China was the leading country in terms of case studies, (3) particulate matter with a diameter equal to 2.5 micrometers was the main prediction target, (4) in 41% of the publications the authors carried out the prediction for the next day, (5) 66% of the studies used data with an hourly rate, (6) 49% of the papers used open data, a share that has tended to increase since 2016, and (7) for efficient air quality prediction it is important to consider external factors such as weather conditions, spatial characteristics, and temporal features.
A novel improved model for building energy consumption prediction based on model integration
Building energy consumption prediction plays an irreplaceable role in energy planning, management, and conservation. Constantly improving the performance of prediction models is the key to ensuring the efficient operation of energy systems. Moreover, accuracy is no longer the only factor in revealing model performance; it is more important to evaluate the model from multiple perspectives, considering the characteristics of engineering applications. Based on the idea of model integration, this paper proposes a novel improved integration model (stacking model) that can be used to forecast building energy consumption. The stacking model combines the advantages of various base prediction algorithms and forms their outputs into “meta-features” to ensure that the final model can observe datasets from different spatial and structural angles. Two cases are used to demonstrate practical engineering applications of the stacking model. A comparative analysis is performed to evaluate the prediction performance of the stacking model in contrast with existing well-known prediction models including Random Forest, Gradient Boosted Decision Tree, Extreme Gradient Boosting, Support Vector Machine, and K-Nearest Neighbor. The results indicate that the stacking method achieves better performance than the other models with regard to accuracy (improvement of 9.5%–31.6% for Case A and 16.2%–49.4% for Case B), generalization (improvement of 6.7%–29.5% for Case A and 7.1%–34.6% for Case B), and robustness (improvement of 1.5%–34.1% for Case A and 1.8%–19.3% for Case B). The proposed model enriches the diversity of algorithm libraries of empirical models.
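The idea of turning base-model predictions into meta-features, as described in the abstract above, can be sketched as follows. This is a toy illustration under stated assumptions, not the paper's model: the two base learners (a univariate linear fit and a 1-nearest-neighbour rule), the out-of-fold scheme, the intercept-free least-squares meta-learner and all function names are hypothetical stand-ins.

```python
def fit_linear(xs, ys):
    """Base learner 1: ordinary least squares on a single input."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return lambda x: my + slope * (x - mx)


def fit_nearest(xs, ys):
    """Base learner 2: 1-nearest neighbour (ties go to the earliest sample)."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def out_of_fold(fit, xs, ys, k=3):
    """Meta-feature: each sample is predicted by a model that never saw it."""
    preds = [0.0] * len(xs)
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        for i in range(fold, len(xs), k):
            preds[i] = model(xs[i])
    return preds


def fit_stack(xs, ys, k=3):
    """Fit meta-weights on out-of-fold predictions, then refit the bases."""
    p1 = out_of_fold(fit_linear, xs, ys, k)
    p2 = out_of_fold(fit_nearest, xs, ys, k)
    # Normal equations for min ||w1*p1 + w2*p2 - y||^2 (no intercept).
    a11 = sum(a * a for a in p1)
    a12 = sum(a * b for a, b in zip(p1, p2))
    a22 = sum(b * b for b in p2)
    b1 = sum(a * y for a, y in zip(p1, ys))
    b2 = sum(b * y for b, y in zip(p2, ys))
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("meta-features are collinear")
    w1 = (b1 * a22 - b2 * a12) / det
    w2 = (a11 * b2 - a12 * b1) / det
    base1, base2 = fit_linear(xs, ys), fit_nearest(xs, ys)
    return (lambda x: w1 * base1(x) + w2 * base2(x)), (w1, w2)
```

Fitting the meta-weights on out-of-fold predictions rather than in-sample ones keeps a base learner that memorises the training data, such as the nearest-neighbour rule here, from dominating the combination.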