
    Using multiple classifiers for predicting the risk of endovascular aortic aneurysm repair re-intervention through hybrid feature selection.

    Feature selection is essential in the medical domain; however, the process is complicated by censoring, the defining characteristic of survival analysis. Most survival feature selection methods are based on Cox's proportional hazards model, even though machine learning classifiers would often be preferred; classifiers are seldom employed in survival analysis because censoring prevents them from being applied directly to survival data. Among the few works that do employ machine learning classifiers, the partial logistic artificial neural network with automatic relevance determination is a well-known method that handles censoring and performs feature selection for survival data. However, it depends on data replication to handle censoring, which leads to unbalanced and biased prediction results, especially in highly censored data, and other methods cannot deal with high censoring at all. Therefore, in this article, a new hybrid feature selection method is proposed that presents a solution to high levels of censoring. It combines support vector machine, neural network, and K-nearest neighbor classifiers using simple majority voting and a new weighted majority voting method based on a survival metric to construct a multiple classifier system. The hybrid feature selection process uses the multiple classifier system as a wrapper method and merges it with an iterated feature ranking filter method to further reduce the feature set. Two endovascular aortic repair datasets containing 91% censored patients, collected from two centers, were used to construct a multicenter study evaluating the proposed approach. The results showed that the proposed technique outperformed the individual classifiers and Cox-model-based variable selection methods, such as the Akaike and Bayesian information criteria and the least absolute shrinkage and selection operator, in terms of log-rank test p values, sensitivity, and concordance index. This indicates that the proposed classifier is more powerful at correctly predicting the risk of re-intervention, helping clinicians select patients' future follow-up plans.
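    The core ensemble idea, combining SVM, neural network, and K-nearest neighbor votes with per-classifier weights, can be illustrated with a minimal sketch. The sketch below is not the authors' code: it uses scikit-learn on synthetic, non-survival data and weights each classifier by plain validation accuracy as a stand-in for the survival metric (e.g., concordance index) used in the paper.

```python
# Minimal sketch of a weighted majority voting ensemble (illustrative only).
# The paper weights votes by a survival metric; validation accuracy is used
# here as a stand-in weight on synthetic, non-survival data.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=400, n_features=20, random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)

classifiers = [SVC(probability=True),
               MLPClassifier(max_iter=2000, random_state=0),
               KNeighborsClassifier()]
weights = []
for clf in classifiers:
    clf.fit(X_tr, y_tr)
    weights.append(clf.score(X_val, y_val))  # stand-in for a survival-based weight

def weighted_vote(X_new):
    """Combine class probabilities, weighting each classifier's vote."""
    probs = sum(w * clf.predict_proba(X_new) for w, clf in zip(weights, classifiers))
    return probs.argmax(axis=1)

print(weighted_vote(X_val[:5]))
```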

    Groundwater Management Optimization and Saltwater Intrusion Mitigation under Uncertainty

    Groundwater is a valuable source of fresh water for the public, industry, agriculture, and other uses. However, excessive pumping has caused groundwater storage depletion, water quality deterioration, and saltwater intrusion. Reliable groundwater flow and solute transport modeling is needed for sustainable groundwater management and aquifer remediation design, but challenges exist because of highly complex subsurface environments, computationally intensive groundwater models, and inevitable uncertainties. The first research goal is to explore conjunctive use of feasible hydraulic control approaches for groundwater management and aquifer remediation. Water budget analysis is conducted to understand how groundwater withdrawals affect water levels. A mixed integer multi-objective optimization model is constructed to derive optimal freshwater pumping strategies and to investigate how to improve optimality by regulating pumping locations. A solute transport model for the Baton Rouge multi-aquifer system is developed to assess saltwater encroachment under current conditions, and a saltwater scavenging approach is proposed to mitigate the salinization issue in the Baton Rouge area. The second research goal is to develop robust surrogate-assisted simulation-optimization modeling methods for saltwater intrusion mitigation. Machine learning based surrogate models (a response surface regression model, an artificial neural network, and a support vector machine) were developed to replace a complex high-fidelity solute transport model for predicting saltwater intrusion. Two different methods, Bayesian model averaging and Bayesian set pair analysis, are used to construct ensemble surrogates and quantify model prediction uncertainties. In addition, different optimization models that incorporate multiple ensemble surrogates are formulated to obtain optimal saltwater scavenging strategies, and chance-constrained programming is used to account for model selection uncertainty in probabilistic nonlinear concentration constraints. The results show that conjunctive use of hydraulic control approaches would be effective in mitigating saltwater intrusion but would take decades. Machine learning based ensemble surrogates can provide accurate models with high computational efficiency and hence greatly reduce the effort required in groundwater remediation design. Including model selection uncertainty through multimodel inference and model averaging yields more reliable remediation strategies than the single-surrogate-assisted approach.
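    The ensemble-surrogate step, training several cheap learners on simulator input-output pairs and combining their predictions with data-driven weights, can be sketched as follows. This is not the study's code: it uses a synthetic stand-in for the solute transport simulator and simple inverse-error weights as a simplified substitute for the Bayesian model averaging and set pair analysis actually used.

```python
# Minimal sketch of ensemble surrogate modeling (illustrative only).
# Three surrogates emulate an "expensive" simulator; predictions are combined
# with inverse-validation-error weights as a simplified stand-in for BMA.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(300, 4))                       # hypothetical pumping rates
y = np.sin(X @ np.array([3.0, 1.0, 2.0, 0.5])) \
    + 0.05 * rng.normal(size=300)                          # "simulated" concentration

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
surrogates = [
    make_pipeline(PolynomialFeatures(2), LinearRegression()),      # response surface
    make_pipeline(StandardScaler(), MLPRegressor(max_iter=2000, random_state=0)),  # ANN
    make_pipeline(StandardScaler(), SVR()),                        # SVM
]
errors = []
for s in surrogates:
    s.fit(X_tr, y_tr)
    errors.append(mean_squared_error(y_val, s.predict(X_val)))

weights = 1.0 / np.array(errors)
weights /= weights.sum()                                    # normalize ensemble weights
ensemble_pred = sum(w * s.predict(X_val) for w, s in zip(weights, surrogates))
print("ensemble MSE:", mean_squared_error(y_val, ensemble_pred))
```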

    Multilayer perceptron network optimization for chaotic time series modeling

    Chaotic time series are widely encountered in practice, but their characteristics, such as internal randomness, nonlinearity, and long-term unpredictability, make high-precision medium- or long-term prediction difficult. Multi-layer perceptron (MLP) networks are an effective tool for chaotic time series modeling. Focusing on chaotic time series modeling, this paper presents a generalized degrees-of-freedom approximation method for MLP networks. We then obtain the corresponding Akaike information criterion, which is used as the loss function for training, and thereby develop an overall framework for chaotic time series analysis, including phase space reconstruction, model training, and model selection. To verify the effectiveness of the proposed method, it is applied to two artificial chaotic time series and two real-world chaotic time series. The numerical results show that the proposed optimization method is effective in selecting the best model from a group of candidates, and the optimized models perform very well in multi-step prediction tasks. This research was funded in part by the NSFC grant numbers 61972174 and 62272192, the Science-Technology Development Plan Project of Jilin Province grant number 20210201080GX, the Jilin Province Development and Reform Commission grant number 2021C044-1, the Guangdong Universities' Innovation Team grant number 2021KCXTD015, and Key Disciplines Projects grant number 2021ZDJS138.
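    Two steps of the framework, time-delay phase space reconstruction and model selection via an information criterion, are illustrated below. This sketch is not the authors' implementation: it uses the logistic map as the chaotic series and the raw MLP weight count in place of the paper's generalized degrees-of-freedom approximation.

```python
# Minimal sketch of phase-space reconstruction and AIC-based MLP selection
# (illustrative only; raw parameter count stands in for generalized DoF).
import numpy as np
from sklearn.neural_network import MLPRegressor

# Chaotic series from the logistic map
x = np.empty(1200); x[0] = 0.3
for t in range(1199):
    x[t + 1] = 3.9 * x[t] * (1 - x[t])

def embed(series, dim, tau=1):
    """Time-delay embedding: rows [x_t, x_{t+tau}, ..., x_{t+(dim-1)tau}] -> x_{t+dim*tau}."""
    n = len(series) - dim * tau
    X = np.column_stack([series[i * tau:i * tau + n] for i in range(dim)])
    y = series[dim * tau:dim * tau + n]
    return X, y

X, y = embed(x, dim=3)
best = None
for hidden in (4, 8, 16):
    mlp = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=3000,
                       random_state=0).fit(X, y)
    rss = np.sum((y - mlp.predict(X)) ** 2)
    k = sum(w.size for w in mlp.coefs_) + sum(b.size for b in mlp.intercepts_)
    aic = len(y) * np.log(rss / len(y)) + 2 * k      # Gaussian-error AIC
    if best is None or aic < best[0]:
        best = (aic, hidden)
print("selected hidden units:", best[1])
```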

    Ensemble Sales Forecasting Study in Semiconductor Industry

    Sales forecasting plays a prominent role in business planning and business strategy. The value and importance of advance information is a cornerstone of planning activity, and a well-set forecast goal can guide the sales force more efficiently. In this paper, CPU sales forecasting for Intel Corporation, a multinational semiconductor company, is considered. Past sales, future bookings, exchange rates, gross domestic product (GDP) forecasts, seasonality, and other indicators were incorporated into the quantitative modeling. Benefiting from recent advances in computing power and software development, millions of models built upon multiple regression, time series analysis, random forests, and boosted trees were executed in parallel. The models with smaller validation errors were selected to form the ensemble model. To better capture distinct characteristics, forecasting models were implemented at the lead-time and line-of-business level. The moving-window validation process automatically selected the models that most closely represent the current market condition, and the weekly forecasting cadence allowed the model to respond effectively to market fluctuations. A generic variable importance analysis was also developed to increase model interpretability. Rather than assuming a fixed distribution, this non-parametric permutation variable importance analysis provides a general framework for evaluating variable importance across methods; it can be extended to classification problems by replacing the mean absolute percentage error (MAPE) with the misclassification error. The demo code is available at https://github.com/qx0731/ensemble_forecast_methods. Comment: 14 pages, Industrial Conference on Data Mining 2017 (ICDM 2017).
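    The permutation variable importance idea, measuring how much a forecast error metric such as MAPE degrades when one input is shuffled, can be sketched as follows. This is not the paper's code (see the linked repository for that); it uses a synthetic regression dataset and a random forest purely for illustration.

```python
# Minimal sketch of permutation variable importance based on MAPE
# (illustrative only; synthetic data, not the paper's pipeline).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))                         # hypothetical sales drivers
y = 100 + 10 * X[:, 0] + 5 * X[:, 1] + rng.normal(scale=2, size=500)

X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_tr, y_tr)

def mape(y_true, y_pred):
    return np.mean(np.abs((y_true - y_pred) / y_true)) * 100

base = mape(y_val, model.predict(X_val))
for j in range(X_val.shape[1]):
    X_perm = X_val.copy()
    X_perm[:, j] = rng.permutation(X_perm[:, j])      # break the feature-target link
    delta = mape(y_val, model.predict(X_perm)) - base
    print(f"feature {j}: importance = {delta:.2f} percentage points of MAPE")
```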

    Hybrid statistical and mechanistic mathematical model guides mobile health intervention for chronic pain

    Nearly a quarter of visits to the Emergency Department are for conditions that could have been managed via outpatient treatment; improvements that allow patients to quickly recognize and receive appropriate treatment are therefore crucial. The growing popularity of mobile technology creates new opportunities for real-time adaptive medical intervention, and the simultaneous growth of big data sources allows for the preparation of personalized recommendations. Here we focus on the reduction of chronic suffering in the sickle cell disease community. Sickle cell disease is a chronic blood disorder in which pain is the most frequent complication. There is currently no standard algorithm or analytical method for real-time adaptive treatment recommendations for pain, and current state-of-the-art methods have difficulty handling continuous-time decision optimization using big data. Facing these challenges, in this study we aim to develop new mathematical tools for incorporating mobile technology into personalized treatment plans for pain. We present a new hybrid model for the dynamics of subjective pain that consists of a dynamical systems approach using differential equations to predict future pain levels, together with a statistical approach tying the system parameters to patient data (both personal characteristics and medication response history). Pilot testing of our approach suggests that it has significant potential to predict pain dynamics given patients' reported pain levels and medication usage. With more abundant data, our hybrid approach should allow physicians to make personalized, data-driven recommendations for treating chronic pain. Comment: 13 pages, 15 figures, 5 tables.
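    The hybrid structure, a differential equation governing pain dynamics whose parameters are fitted statistically to a patient's reported pain and medication history, can be sketched as follows. The model form below is entirely hypothetical (a simple decay-toward-baseline equation with a dose effect), chosen only to illustrate the fitting workflow, and is not the authors' equation.

```python
# Minimal sketch of a hybrid dynamical/statistical model (hypothetical form,
# not the authors' equations): pain decays toward a baseline and is reduced by
# medication doses; parameters are fitted to reported pain scores.
import numpy as np
from scipy.integrate import odeint
from scipy.optimize import curve_fit

t_obs = np.arange(0, 24, 1.0)           # hours since start of observation
dose_times = [2.0, 10.0, 18.0]          # hypothetical medication times (hours)

def pain_model(t, k, baseline, dose_effect):
    """Integrate dp/dt = -k (p - baseline) - dose_effect * drug(t) * p."""
    def dpdt(p, tt):
        drug = sum(np.exp(-(tt - td)) for td in dose_times if tt >= td)
        return -k * (p - baseline) - dose_effect * drug * p
    return odeint(dpdt, 8.0, t)[:, 0]   # initial pain score of 8 on a 0-10 scale

# Hypothetical "reported" pain scores generated from the model plus noise
reported = pain_model(t_obs, 0.15, 3.0, 0.4) \
    + np.random.default_rng(0).normal(scale=0.3, size=t_obs.size)

params, _ = curve_fit(pain_model, t_obs, reported, p0=[0.1, 2.0, 0.2])
print("fitted decay rate, baseline, dose effect:", np.round(params, 3))
```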

    Prediction of Energy Consumption of an Administrative Building using Machine Learning and Statistical Methods

    Energy management is now essential in light of current energy issues, particularly in the building sector, which accounts for a sizable share of global energy use. Predicting energy consumption is therefore of great interest for developing an effective energy management strategy. This study aims to show that machine learning models outperform SARIMA models in predicting heating energy use in an administrative building in Chefchaouen City, Morocco, while also highlighting the effectiveness of SARIMA models when the training data are limited. The prediction is carried out using machine learning methods (artificial neural networks, bagging trees, boosting trees, and support vector machines) and statistical methods (14 SARIMA models). To build the models, the external temperature, internal temperature, solar radiation, and the factor of time are selected as model inputs. Building energy simulation is conducted in the TRNSYS environment to generate a database for training and validating the models. The models' performances are compared using three statistical indicators: the normalized root mean square error (nRMSE), the mean absolute error (MAE), and the correlation coefficient (R). The results show that all studied models have good accuracy, with a correlation coefficient of 0.90 < R < 0.97. The artificial neural network outperforms all other models (R = 0.97, nRMSE = 12.60%, MAE = 0.19 kWh). Although the machine learning methods generally outperform the statistical methods, the SARIMA models reach good prediction accuracy without requiring much data in the training phase. DOI: 10.28991/CEJ-2023-09-05-01
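    The three comparison indicators can be computed directly; the short sketch below applies them to hypothetical predicted versus measured heating loads. The nRMSE normalization used here (by the range of the measured values) is an assumption, since the abstract does not restate its exact definition.

```python
# Minimal sketch of the three comparison indicators (nRMSE, MAE, R) applied to
# hypothetical predicted vs. measured heating loads (illustrative values only).
import numpy as np

measured  = np.array([1.2, 0.8, 1.5, 2.0, 1.1])   # kWh, hypothetical
predicted = np.array([1.1, 0.9, 1.4, 2.2, 1.0])   # kWh, hypothetical

mae   = np.mean(np.abs(measured - predicted))
rmse  = np.sqrt(np.mean((measured - predicted) ** 2))
nrmse = rmse / (measured.max() - measured.min()) * 100   # normalized by range, in %
r     = np.corrcoef(measured, predicted)[0, 1]           # Pearson correlation

print(f"MAE = {mae:.3f} kWh, nRMSE = {nrmse:.2f}%, R = {r:.3f}")
```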