4,678 research outputs found

    Ensemble Sales Forecasting Study in Semiconductor Industry

    Sales forecasting plays a prominent role in business planning and business strategy. The value of advance information is a cornerstone of planning activity, and a well-set forecast goal can guide the sales force more efficiently. This paper considers CPU sales forecasting at Intel Corporation, a multinational semiconductor company. Past sales, future bookings, exchange rates, gross domestic product (GDP) forecasts, seasonality, and other indicators were incorporated into the quantitative modeling. Benefiting from recent advances in computational power and software development, millions of models built upon multiple regression, time series analysis, random forests, and boosted trees were executed in parallel. The models with smaller validation errors were selected to form the ensemble model. To better capture distinct characteristics, forecasting models were implemented at the lead-time and line-of-business level. The moving-window validation process automatically selected the models that most closely represent current market conditions. The weekly forecasting cadence allowed the model to respond effectively to market fluctuations. A generic variable importance analysis was also developed to increase model interpretability. Rather than assuming a fixed distribution, this non-parametric permutation variable importance analysis provides a general framework for evaluating variable importance across methods. The framework can be extended to classification problems by replacing the mean absolute percentage error (MAPE) with the misclassification error. Demo code is available at: https://github.com/qx0731/ensemble_forecast_methods
    Comment: 14 pages, Industrial Conference on Data Mining 2017 (ICDM 2017)
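The permutation variable importance idea described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`mape`, `permutation_importance`, `toy_predict`), the toy data, and the choice of 20 repeats are all hypothetical. Importance is measured as the average increase in MAPE when one feature column is shuffled, and any callable that maps feature rows to predictions can be plugged in.

```python
import random


def mape(actual, predicted):
    """Mean absolute percentage error in percent (assumes no actual value is zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def permutation_importance(predict, rows, actual, n_repeats=20, seed=0):
    """Average increase in MAPE when each feature column is shuffled.

    predict: any callable mapping a list of feature rows to a list of predictions,
             so the same procedure works across model types.
    rows:    list of feature lists, one list per observation.
    """
    rng = random.Random(seed)
    baseline = mape(actual, predict(rows))
    importances = []
    for j in range(len(rows[0])):
        total = 0.0
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched.
            column = [row[j] for row in rows]
            rng.shuffle(column)
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
            total += mape(actual, predict(shuffled)) - baseline
        importances.append(total / n_repeats)
    return importances


# Hypothetical toy model: depends on feature 0 only and ignores feature 1.
def toy_predict(rows):
    return [10.0 + 2.0 * row[0] for row in rows]


rows = [[float(i), float((i * 7) % 5)] for i in range(1, 21)]
actual = toy_predict(rows)
scores = permutation_importance(toy_predict, rows, actual)
print(scores)  # feature 0 gets a positive score; feature 1 scores exactly 0.0
```

Swapping `mape` for a misclassification rate would give the classification variant mentioned in the abstract.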

    Power System Parameters Forecasting Using Hilbert-Huang Transform and Machine Learning

    A novel hybrid data-driven approach is developed for forecasting power system parameters, with the goal of increasing the efficiency of short-term forecasting studies for non-stationary time series. The proposed approach is based on mode decomposition and a feature analysis of the initial retrospective data using the Hilbert-Huang transform and machine learning algorithms. Random forests and gradient boosted trees were examined as learning techniques. These decision tree techniques were used to rank the importance of the variables employed in the forecasting models, with the Mean Decrease Gini index employed as the impurity function. The resulting hybrid forecasting models employ a radial basis function neural network and support vector regression. Apart from the introduction and references, the paper is organized as follows. Section 2 presents the background and reviews several approaches to short-term forecasting of power system parameters. In the third section, a hybrid machine learning-based algorithm using the Hilbert-Huang transform is developed for short-term forecasting of power system parameters. The fourth section describes the decision tree learning algorithms used to assess variable importance. Finally, section six presents experimental results for the following electric power problems: active power flow forecasting, electricity price forecasting, and wind speed and direction forecasting.

    An investigation into machine learning approaches for forecasting spatio-temporal demand in ride-hailing service

    In this paper, we present machine learning approaches for characterizing and forecasting the short-term demand for on-demand ride-hailing services. We propose a spatio-temporal estimation of demand as a function of variable effects related to traffic, pricing, and weather conditions. With respect to the methodology, a single decision tree, bootstrap-aggregated (bagged) decision trees, random forest, boosted decision trees, and an artificial neural network for regression have been adapted and systematically compared using various statistics, e.g. R-square, Root Mean Square Error (RMSE), and slope. To better assess the quality of the models, they have been tested on a real case study using data from DiDi Chuxing, the main on-demand ride-hailing service provider in China. In the current study, 199,584 time slots describing the spatio-temporal ride-hailing demand have been extracted with an aggregated time interval of 10 min. All the methods are trained and validated on two independent samples from this dataset. The results reveal that boosted decision trees provide the best prediction accuracy (RMSE=16.41) while avoiding the risk of over-fitting, followed by the artificial neural network (20.09), random forest (23.50), bagged decision trees (24.29), and the single decision tree (33.55).
    Comment: Currently under review for journal publication
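The three comparison statistics named in this abstract can be computed as in the sketch below. The function names and the four-point toy data are hypothetical, and the abstract does not define "slope"; here it is assumed to be the least-squares slope of predicted values regressed on observed values, where a value near 1 indicates little systematic scaling bias.

```python
import math


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual sum of squares / total sum of squares."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def slope(actual, predicted):
    """Least-squares slope of predicted regressed on actual (assumed definition)."""
    mean_a = sum(actual) / len(actual)
    mean_p = sum(predicted) / len(predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var = sum((a - mean_a) ** 2 for a in actual)
    return cov / var


# Hypothetical toy data, not taken from the paper.
observed = [10.0, 20.0, 30.0, 40.0]
forecast = [12.0, 18.0, 33.0, 37.0]
print(rmse(observed, forecast), r_squared(observed, forecast), slope(observed, forecast))
```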

    Forecasting Workforce Requirement for State Transportation Agencies: A Machine Learning Approach

    A decline in the number of construction engineers and inspectors available at State Transportation Agencies (STAs) to manage the ever-increasing lane miles has emphasized the importance of workforce planning in this sector. One of the crucial aspects of workforce planning involves forecasting the required workforce for any industry or agency. This thesis developed machine learning models to estimate the person-hour requirements of STAs at the agency and project levels. The Arkansas Department of Transportation (ARDOT) was used as a case study, using its employee data between 2012 and 2021. At the project level, machine learning regressors were developed, ranging from linear models and tree ensembles to kernel-based and neural network-based models. At the agency level, a classic time series modeling approach and neural network-based models were developed to forecast the monthly person-hour requirements of the agency. Parametric and non-parametric tests were employed to compare the models at both levels. At the project level, the results indicated high performance from the random forest regressor, a tree ensemble with bagging, which recorded an average R-squared value of 0.91. The one-dimensional convolutional neural network was the most effective model for forecasting the monthly person-hour requirements at the agency level. It recorded an average RMSE of 4,500 person-hours monthly over short-range forecasting and 5,000 person-hours monthly over long-range forecasting. These findings underscore the capability of machine learning models to provide more accurate workforce demand forecasts for STAs and the construction industry. This enhanced accuracy in workforce planning will contribute to improved resource allocation and management.