252 research outputs found

    Advanced Data Analytics Methodologies for Anomaly Detection in Multivariate Time Series Vehicle Operating Data

    Early detection of faults in vehicle operating systems is a research domain of high significance for sustaining full control of those systems, since anomalous behaviors usually cause performance loss for a long time before they are detected as critical failures. In other words, operating systems exhibit degradation when failure begins to occur. Indeed, repeated failures in system performance are not only signals of anomalous behavior but also show that maintenance actions to preserve system performance are vital. Keeping the systems at nominal performance over their lifetime at the lowest maintenance cost is extremely challenging, and it is important to be aware of an imminent failure before it arises and to implement the best countermeasures to avoid extra losses. In this context, timely anomaly detection in the performance of the operating system is worthy of investigation. Early detection of imminent anomalous behaviors of the operating system is difficult without appropriate modeling, prediction, and analysis of the time series records of the system. Data-based technologies have laid a strong foundation for developing advanced methods for modeling and predicting time series data streams. In this research, we propose novel methodologies to predict the patterns of multivariate time series operational data of the vehicle and recognize second-wise unhealthy states. These approaches help with the early detection of abnormalities in the behavior of the vehicle based on multiple data channels, with second-wise records for different functional working groups in the operating systems of the vehicle. Furthermore, a real case study data set is used to validate the accuracy of the proposed prediction and anomaly detection methodologies.

    CRISIS TRANSMITTING EFFECTS DETECTION AND EARLY WARNING SYSTEMS DEVELOPMENT FOR CHINA’S FINANCIAL MARKETS

    With China’s economic development model attracting worldwide attention, domestic and international academics are increasingly studying risk transmission patterns and crisis forecasting mechanisms for China’s financial markets. Progress, however, is held back by two research gaps: 1) few studies construct comparative contagion models and integrated crisis forecasting systems for China’s financial markets, and 2) the effectiveness for China of the econometric models currently used to detect risk spreading effects and forecast financial crises has yet to be investigated conclusively. To fill these gaps, this research proposes two hybrid contagion models (CM) and prototypes early warning systems (EWS), with the aims of first analyzing the crisis linkages and transmission channels across domestic markets in hierarchical frameworks, and then predicting market turbulence by integrating crisis identification techniques with time-dependent deep learning neural networks. To accomplish these aims, the project proceeds in phases, addressing four technical challenges that reflect the two literature gaps: A) crisis identification on the basis of price volatility state distinction, B) decomposition of multivariate correlated patterns to infer the interdependence structure and risk spillover dynamics respectively, C) real-time warning signal generation, comparing traditional and stylized predictive models, and D) fusion of contagion information in the EWS frameworks to distinguish, using statistical validation metrics, the leading indicators among internal macroeconomic factors and external risk transmitters.
The research mainly contributes to the comparative analysis of financial contagion detection and market turbulence prediction through hybrid model innovations for CM and EWS development, and it has practical significance for improving risk management in investing activities and supporting crisis prevention in policy-making. In addition, the experimental results corroborate a China-specific mode of risk transmission and crisis warning: 1) the stock and real estate markets are verified to play the central role among risk transmitters, while the managed floating foreign exchange rate and the non-fully liberalized bond market are peripheral during the crisis; and 2) the all-round opening-up policy increases the possibility of domestic securities markets being exposed to external risk factors, especially those relating to cash flows, energy commodities and precious metals.

    A Comprehensive Survey on Rare Event Prediction

    Rare event prediction involves identifying and forecasting events with a low probability using machine learning and data analysis. Due to the imbalanced data distributions, where the frequency of common events vastly outweighs that of rare events, it requires using specialized methods within each step of the machine learning pipeline, i.e., from data processing to algorithms to evaluation protocols. Predicting the occurrences of rare events is important for real-world applications, such as Industry 4.0, and is an active research area in statistical and machine learning. This paper comprehensively reviews the current approaches for rare event prediction along four dimensions: rare event data, data processing, algorithmic approaches, and evaluation approaches. Specifically, we consider 73 datasets from different modalities (i.e., numerical, image, text, and audio), four major categories of data processing, five major algorithmic groupings, and two broader evaluation approaches. This paper aims to identify gaps in the current literature and highlight the challenges of predicting rare events. It also suggests potential research directions, which can help guide practitioners and researchers. Comment: 44 pages
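To illustrate why imbalanced data calls for specialized evaluation, the following minimal Python sketch compares plain accuracy with precision, recall and F1 on the rare class. It is not taken from the survey: the function name `rare_event_report`, the label convention (1 marks the rare event) and the choice of metrics are assumptions made for illustration.

```python
def rare_event_report(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall and F1 for the rare (positive) class.

    Hypothetical helper for illustration; assumes equal-length label sequences.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("need two non-empty sequences of equal length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    tn = len(y_true) - tp - fp - fn
    # Ratios fall back to 0.0 when their denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

On a toy set with 5 rare events among 100 samples, a predictor that never flags an event reaches 95% accuracy yet has zero recall, which is why accuracy alone is a poor guide in this setting.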

    A review of data mining applications in semiconductor manufacturing

    The authors acknowledge Fundacao para a Ciencia e a Tecnologia (FCT-MCTES) for its financial support via the project UIDB/00667/2020 (UNIDEMI). For decades, industrial companies have been collecting and storing large amounts of data with the aim of better controlling and managing their processes. However, this vast amount of information and the hidden knowledge implicit in the data could be utilized more efficiently. With the help of data mining techniques, unknown relationships can be systematically discovered. The production of semiconductors is a highly complex process, which entails several subprocesses that employ a diverse array of equipment. The size of semiconductors means that a high number of units can be produced, which requires huge amounts of data to control and improve the semiconductor manufacturing process. Therefore, this paper presents a structured review of a sample of 137 published papers on data mining applications in semiconductor manufacturing. A detailed bibliometric analysis is also made. All data mining applications are classified according to their application area. The results are then analyzed and conclusions are drawn.

    Empirical mode decomposition with least square support vector machine model for river flow forecasting

    Accurate information on future river flow is fundamental for water resources planning and management. Traditionally, single models have been introduced to predict the future value of river flow. However, single models may not be suitable to capture the nonlinear and non-stationary nature of the data. In this study, a three-step prediction method based on Empirical Mode Decomposition (EMD), Kernel Principal Component Analysis (KPCA) and the Least Square Support Vector Machine (LSSVM) model, referred to as EMD-KPCA-LSSVM, is introduced. EMD is used to decompose the river flow data into several Intrinsic Mode Functions (IMFs) and a residue. Then, KPCA is used to reduce the dimensionality of the dataset, which is then input into LSSVM for forecasting purposes. This study also presents a comparison of the proposed EMD-KPCA-LSSVM model with EMD-PCA-LSSVM, EMD-LSSVM, the benchmark EMD-LSSVM model proposed by previous researchers and a few other benchmark models such as the single LSSVM and Support Vector Machine (SVM) models, EMD-SVM, PCA-LSSVM, and PCA-SVM. These models are ranked based on five statistical measures, namely Mean Absolute Error (MAE), Root Mean Square Error (RMSE), Correlation Coefficient (r), Correlation of Efficiency (CE) and Mean Absolute Percentage Error (MAPE). Then, the best ranked model is measured using Mean of Forecasting Error (MFE) to determine its under- and over-predicted forecast rate. The results show that EMD-KPCA-LSSVM ranked first based on five measures for Muda, Selangor and Tualang Rivers. This model also indicates a small percentage of under-predicted values compared to the observed river flow values of 1.36%, 0.66%, 4.8% and 2.32% for Muda, Bernam, Selangor and Tualang Rivers, respectively.
The study concludes by recommending the application of an EMD-based combined model, particularly with a kernel-based dimension reduction approach, for river flow forecasting due to better prediction results and stability than those achieved from single models.
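The ranking measures named in this abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `forecast_metrics` is hypothetical, CE is assumed here to be the Nash-Sutcliffe coefficient of efficiency, and MFE is assumed to be the mean of observed minus predicted values, so a positive MFE indicates under-prediction on average.

```python
import math


def forecast_metrics(observed, predicted):
    """Return MAE, RMSE, r, CE, MAPE and MFE for two equal-length sequences.

    Assumes observed values are non-zero (needed for MAPE) and that neither
    series is constant (needed for r and CE).
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(observed)
    # Error convention (an assumption): observed minus predicted.
    errors = [o - p for o, p in zip(observed, predicted)]
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    sq_err = sum(e * e for e in errors)
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    return {
        "MAE": sum(abs(e) for e in errors) / n,
        "RMSE": math.sqrt(sq_err / n),
        "r": cov / math.sqrt(var_o * var_p),   # Pearson correlation
        "CE": 1.0 - sq_err / var_o,            # Nash-Sutcliffe form (assumed)
        "MAPE": 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n,
        "MFE": sum(errors) / n,                # > 0 means under-prediction
    }
```

Lower MAE, RMSE and MAPE and higher r and CE indicate a better fit, so candidate models can be ranked per measure and the ranks compared across measures.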

    Forecasting Stock Market Volatility: A Forecast Combination Approach

    Recently, with the development of financial markets and due to the importance of these markets and their close relationship with other macroeconomic variables, using advanced mathematical models with complicated structures for forecasting these markets has become very popular. Besides, neural network models have gained a special position compared to other advanced models due to their high accuracy in forecasting different variables. Therefore, the main purpose of this study was to forecast the volatilities of the TSE index by regressive models with a long memory feature, a feed forward neural network and hybrid models (based on a forecast combination approach) using daily data. The results indicated that, based on the forecasting error criteria MSE and RMSE, although the forecasting errors of the feed forward neural network model were lower than those of the ARFIMA-FIGARCH model, the accuracy of the hybrid model of the neural network and the best GARCH model was higher than that of either model.
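The abstract does not state which combination scheme the hybrid model uses. As one common possibility, the following minimal Python sketch weights each model's forecast by the inverse of its MSE on a validation window. The function names and the inverse-MSE scheme are assumptions for illustration only, not the study's method.

```python
def mse(actual, forecast):
    """Mean squared error between two equal-length sequences."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)


def inverse_mse_weights(actual, forecasts):
    """Weight each model by 1 / MSE, normalized so the weights sum to 1.

    `forecasts` maps a model name to its forecasts over the same window as
    `actual`. Assumes no model has an MSE of exactly zero.
    """
    inverse = {name: 1.0 / mse(actual, f) for name, f in forecasts.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}


def combine(forecasts, weights):
    """Return the weighted average of the model forecasts at each time step."""
    names = list(forecasts)
    horizon = len(forecasts[names[0]])
    return [sum(weights[n] * forecasts[n][t] for n in names)
            for t in range(horizon)]
```

In practice the weights would be estimated on a held-out window and then applied to later forecasts, so that the combined forecast is not evaluated on the same data that set its weights.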

    Hybrid Advanced Optimization Methods with Evolutionary Computation Techniques in Energy Forecasting

    More accurate and precise energy demand forecasts are required when energy decisions are made in a competitive environment. Particularly in the Big Data era, forecasting models are always based on a complex function combination, and energy data are always complicated, exhibiting, for example, seasonality, cyclicity, fluctuation, and dynamic nonlinearity. Without the ability to determine data characteristics and patterns, these forecasting models have resulted in an over-reliance on informal judgment and in higher expenses. The hybridization of optimization methods and superior evolutionary algorithms can provide important improvements via good parameter determinations in the optimization process, which is of great assistance to actions taken by energy decision-makers. This book aimed to attract researchers with an interest in the research areas described above. Specifically, it sought contributions to the development of any hybrid optimization methods (e.g., quadratic programming techniques, chaotic mapping, fuzzy inference theory, quantum computing, etc.) with advanced algorithms (e.g., genetic algorithms, ant colony optimization, particle swarm optimization algorithm, etc.) that have superior capabilities over the traditional optimization approaches to overcome some embedded drawbacks, and the application of these advanced hybrid approaches to significantly improve forecasting accuracy.