19 research outputs found

    Rolling bearing fault diagnosis based on health baseline method

    To uncover the relationships among different features of the vibration signal, and to provide more useful information for rolling bearing fault diagnosis, this paper develops a new fault diagnosis method, the health baseline method, and describes its procedure in detail. In a case study, a health baseline based on two kinds of linear models was constructed. In testing, the method distinguished the normal state of the rolling bearing from outer ring faults and rolling element faults, which indicates that it is feasible and effective for rolling bearing fault diagnosis

    Statistical Comparisons of the Top 10 Algorithms in Data Mining for Classification Task

    This work builds on the study of the top 10 data mining algorithms identified by the IEEE International Conference on Data Mining (ICDM) community in December 2006. We revisit that study, applying statistical tests to establish a more appropriate and better-justified ranking of classifiers for classification tasks. Current studies and practice on the theoretical and empirical comparison of several methods advocate more appropriate tests; in particular, recent studies recommend a set of simple and robust non-parametric tests for the statistical comparison of classifiers. In this paper, we propose to perform non-parametric statistical testing using the Friedman test, with the corresponding post-hoc tests, to compare several classifiers over multiple data sets. The tests provide a better basis for judging the relevance of these algorithms
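
    As an illustration of the kind of test this abstract refers to, the following is a minimal sketch of the Friedman statistic computed from per-data-set ranks. It is not taken from the paper: the function names and the accuracy table are hypothetical, and the statistic is the basic chi-square form without a tie correction.

```python
def average_ranks(scores):
    """Rank one data set's scores (highest score gets rank 1), averaging ties."""
    order = sorted(range(len(scores)), key=lambda j: -scores[j])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of classifiers tied with the one at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def friedman_statistic(table):
    """table[i][j] is the score of classifier j on data set i.

    Returns (mean rank per classifier, Friedman chi-square statistic).
    """
    n = len(table)       # number of data sets
    k = len(table[0])    # number of classifiers
    rank_rows = [average_ranks(row) for row in table]
    mean_ranks = [sum(r[j] for r in rank_rows) / n for j in range(k)]
    chi2 = 12 * n / (k * (k + 1)) * (
        sum(r * r for r in mean_ranks) - k * (k + 1) ** 2 / 4
    )
    return mean_ranks, chi2


# Hypothetical accuracies: 4 data sets (rows) x 3 classifiers (columns).
accuracies = [
    [0.90, 0.85, 0.80],
    [0.88, 0.84, 0.79],
    [0.93, 0.91, 0.70],
    [0.75, 0.74, 0.60],
]
mean_ranks, chi2 = friedman_statistic(accuracies)
```

    Under the null hypothesis the statistic is compared against a chi-square distribution with k - 1 degrees of freedom (critical value 5.991 for k = 3 at the 0.05 level). A post-hoc test is only meaningful once this omnibus test rejects.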

    An Enhanced Random Linear Oracle Ensemble Method using Feature Selection Approach based on Naïve Bayes Classifier

    The Random Linear Oracle (RLO) ensemble replaces each classifier with two mini-ensembles, allowing base classifiers to be trained on different data sets and improving the variety of trained classifiers. The Naïve Bayes (NB) classifier was chosen as the base classifier for this research because of its simplicity and low computational cost. Different feature selection algorithms are applied to the RLO ensemble to investigate the effect of differently sized data on its performance. Experiments were carried out using 30 data sets from the UCI repository and 6 learning algorithms: the NB classifier, the RLO ensemble, the RLO ensemble trained with Genetic Algorithm (GA) feature selection using the accuracy of the NB classifier as the fitness function, the RLO ensemble trained with GA feature selection using the accuracy of the RLO ensemble as the fitness function, the RLO ensemble trained with t-test feature selection, and the RLO ensemble trained with Kruskal-Wallis test feature selection. The results showed that the RLO ensemble could significantly improve the diversity of the NB classifier in dealing with distinctively selected feature sets through its fusion-selection paradigm. Consequently, feature selection algorithms could greatly benefit the RLO ensemble: with a properly selected number of features from the filter approach, or GA natural selection from the wrapper approach, it achieved a large improvement in classification accuracy as well as growth in diversity

    An Evaluation of Selection Strategies for Active Learning with Regression

    While active learning for classification problems has received considerable attention in recent years, studies on problems of regression are rare. This paper provides a systematic review of the most commonly used selection strategies for active learning within the context of linear regression. The recently developed Exploration Guided Active Learning (EGAL) algorithm, previously deployed within a classification context, is explored as a selection strategy for regression problems. Active learning is demonstrated to significantly improve the learning rate of linear regression models. Experimental results show that a purely diversity-based approach t

    Anomaly detection and classification in traffic flow data from fluctuations in the flow-density relationship

    We describe and validate a novel data-driven approach to the real time detection and classification of traffic anomalies based on the identification of atypical fluctuations in the relationship between density and flow. For aggregated data under stationary conditions, flow and density are related by the fundamental diagram. However, high resolution data obtained from modern sensor networks are generally non-stationary and disaggregated. Such data consequently show significant statistical fluctuations. These fluctuations are best described using a bivariate probability distribution in the density-flow plane. By applying kernel density estimation to high-volume data from the UK National Traffic Information Service (NTIS), we empirically construct these distributions for London's M25 motorway. Curves in the density-flow plane are then constructed, analogous to quantiles of univariate distributions. These curves quantitatively separate atypical fluctuations from typical traffic states. Although the algorithm identifies anomalies in general rather than specific events, we find that fluctuations outside the 95% probability curve correlate strongly with the spikes in travel time associated with significant congestion events. Moreover, the size of an excursion from the typical region provides a simple, real-time measure of the severity of detected anomalies. We validate the algorithm by benchmarking its ability to identify labelled events in historical NTIS data against some commonly used methods from the literature. Detection rate, time-to-detect and false alarm rate are used as metrics and found to be generally comparable except in situations when the speed distribution is bi-modal. In such situations, the new algorithm achieves a much lower false alarm rate without suffering significant degradation on the other metrics. This method has the additional advantage of being self-calibrating.
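
    The idea of a probability curve in the density-flow plane can be sketched as follows. This is an illustrative reading, not the authors' implementation: it assumes a Gaussian product kernel with hand-picked bandwidths, approximates the 95% curve by the density level that 95% of historical points exceed, and uses synthetic data in place of NTIS measurements. All names are hypothetical.

```python
import math
import random


def gaussian_kde_2d(points, bandwidth):
    """Return a function estimating the joint density of (density, flow) pairs."""
    hx, hy = bandwidth
    norm = 1.0 / (2 * math.pi * hx * hy * len(points))

    def density(x, y):
        total = 0.0
        for px, py in points:
            total += math.exp(-0.5 * (((x - px) / hx) ** 2 + ((y - py) / hy) ** 2))
        return norm * total

    return density


def density_threshold(points, density, coverage=0.95):
    """Density level exceeded by roughly `coverage` of the historical points."""
    values = sorted(density(x, y) for x, y in points)
    index = int((1.0 - coverage) * len(values))
    return values[index]


random.seed(0)
# Synthetic stand-in for historical (vehicle density, flow) observations.
history = [(random.gauss(30, 5), random.gauss(1500, 200)) for _ in range(300)]
density = gaussian_kde_2d(history, bandwidth=(2.0, 80.0))  # assumed bandwidths
threshold = density_threshold(history, density)


def is_atypical(x, y):
    """True when (x, y) lies outside the estimated 95% region."""
    return density(x, y) < threshold
```

    A point near the bulk of the data, such as `is_atypical(30, 1500)`, is classed as typical, while a point far from it, such as `is_atypical(90, 200)`, is flagged. How far the estimated density falls below the threshold could serve as a rough severity score, in the spirit of the excursion size mentioned in the abstract.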

    A reduced-uncertainty hybrid evolutionary algorithm for solving dynamic shortest-path routing problem

    Effective packet transmission in high-performance wireless networks requires finding shortest network paths efficiently and quickly. This paper presents a Reduced Uncertainty Based Hybrid Evolutionary Algorithm (RUBHEA) to solve the Dynamic Shortest Path Routing Problem (DSPRP) effectively and rapidly. Genetic Algorithm (GA) and Particle Swarm Optimization (PSO) are integrated as a hybrid algorithm to find the best solution within the search space of dynamically changing networks. GA and PSO share the context of individuals to reduce uncertainty in RUBHEA, which explores and learns various regions of the search space. By employing a modified priority encoding method, each individual in both GA and PSO is represented as a potential solution for DSPRP. A complete statistical analysis has been performed to compare the performance of RUBHEA with various state-of-the-art algorithms. It shows that RUBHEA is considerably superior to similar approaches (reducing the failure rate by up to 50%) as the number of nodes in the network increases

    Forecasting Workforce Requirement for State Transportation Agencies: A Machine Learning Approach

    A decline in the number of construction engineers and inspectors available at State Transportation Agencies (STAs) to manage the ever-increasing lane miles has emphasized the importance of workforce planning in this sector. One of the crucial aspects of workforce planning involves forecasting the required workforce for any industry or agency. This thesis developed machine learning models to estimate the person-hour requirements of STAs at the agency and project levels. The Arkansas Department of Transportation (ARDOT) was used as a case study, using its employee data between 2012 and 2021. At the project level, machine learning regressors ranging from linear models and tree ensembles to kernel-based and neural network-based models were developed. At the agency level, a classic time series modeling approach, as well as neural network-based models, were developed to forecast the monthly person-hour requirements of the agency. Parametric and non-parametric tests were employed in comparing the models across both levels. The results indicated a high performance from the random forest regressor, a tree ensemble with bagging, which recorded an average R-squared value of 0.91. The one-dimensional convolutional neural network model was the most effective model for forecasting the monthly person-hour requirements at the agency level. It recorded an average RMSE of 4,500 person-hours monthly over short-range forecasting and an average of 5,000 person-hours monthly over long-range forecasting. These findings underscore the capability of machine learning models to provide more accurate workforce demand forecasts for STAs and the construction industry. This enhanced accuracy in workforce planning will contribute to improved resource allocation and management
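
    For readers unfamiliar with the two metrics quoted in this abstract, the sketch below shows how R-squared and RMSE are conventionally computed. The person-hour figures are invented for illustration and do not come from the thesis; the function names are hypothetical.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error, in the same units as the target."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(actual))


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - residual variation / total variation."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Invented person-hour values for four projects.
actual_hours = [100, 200, 300, 400]
predicted_hours = [110, 190, 310, 390]
error = rmse(actual_hours, predicted_hours)
fit = r_squared(actual_hours, predicted_hours)
```

    RMSE is expressed in person-hours, so it is directly comparable to the monthly totals being forecast, whereas R-squared is unitless and describes the share of variation the model explains.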

    Research design for comparing machine learning algorithms applied to cryptocurrency price prediction, using statistical hypothesis tests and post hoc tests, to select those with the best performance.

    Compares machine learning algorithms applied to cryptocurrency price prediction, using statistical hypothesis tests and post hoc tests, to select those with the best performance