8,423 research outputs found

    An update on statistical boosting in biomedicine

    Statistical boosting algorithms have triggered a lot of research during the last decade. They combine a powerful machine-learning approach with classical statistical modelling, offering various practical advantages like automated variable selection and implicit regularization of effect estimates. They are extremely flexible, as the underlying base-learners (regression functions defining the type of effect for the explanatory variables) can be combined with any kind of loss function (target function to be optimized, defining the type of regression setting). In this review article, we highlight the most recent methodological developments in statistical boosting regarding variable selection, functional regression and advanced time-to-event modelling. Additionally, we provide a short overview of relevant applications of statistical boosting in biomedicine.

    Flight Data of Airplane for Wind Forecasting

    This research focuses on understanding and predicting weather behavior, which is one of the important factors that affect airplanes in flight. Future weather information is used to inform pilots about changing flight conditions. In this paper, we present a new approach to forecasting one component of weather information, wind speed, from data captured by airplanes in flight. We compare NASA’s ACT-America project against NOAA’s Wind Aloft program for prediction suitability. A collinearity analysis between these datasets reveals better model performance and smaller test error with NASA’s dataset. We then apply machine learning and a genetic algorithm to process the data further and arrive at a competitive error rate. A sliding window approach is used to find the best window size, and we then create a forecasting model that predicts wind speed at high altitudes 10 minutes ahead of time. Finally, a stacking-based framework is used to outperform the individual learning algorithms, giving a root mean square error (RMSE) of 0.674 for the best combination, which is 98.4% better than the state-of-the-art approach.
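
    The sliding-window step described in this abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names (`make_windows`, `rmse`, `best_window`), the 80/20 chronological split, and the plain least-squares model, which stands in for the paper's stacking framework and is not the authors' method.

```python
import numpy as np

def make_windows(series, window, horizon=1):
    """Return (X, y): each row of X holds `window` consecutive values,
    and y is the value `horizon` steps after the end of that window."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window - horizon + 1
    if n <= 0:
        raise ValueError("series too short for this window and horizon")
    X = np.stack([series[i:i + window] for i in range(n)])
    start = window + horizon - 1
    y = series[start:start + n]
    return X, y

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))

def best_window(series, candidates, horizon=1, train_frac=0.8):
    """Score each candidate window size by held-out RMSE.

    Hypothetical stand-in model: ordinary least squares with an intercept,
    fitted on the first `train_frac` of the windows (chronological split).
    """
    scores = {}
    for w in candidates:
        X, y = make_windows(series, w, horizon)
        split = int(len(X) * train_frac)
        A_train = np.c_[X[:split], np.ones(split)]
        coef, *_ = np.linalg.lstsq(A_train, y[:split], rcond=None)
        A_test = np.c_[X[split:], np.ones(len(X) - split)]
        scores[w] = rmse(y[split:], A_test @ coef)
    return min(scores, key=scores.get), scores
```

    With one sample per minute, `horizon=10` would correspond to the 10-minute-ahead setting mentioned above; the sampling rate is an assumption, since the abstract does not state it.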

    An Overview of Carbon Footprint Mitigation Strategies. Machine Learning for Societal Improvement, Modernization, and Progress

    Among the most pressing issues in the world today is the impact of globalization and energy consumption on the environment. Despite the growing regulatory framework to prevent ecological degradation, sustainability continues to be a problem. Machine learning can help with the transition toward a net-zero carbon society. Substantial work has been done in this direction. Changing electrical systems, transportation, buildings, industry, and land use are all necessary to reduce greenhouse gas emissions. Considering the carbon footprint aspect of sustainability, this chapter provides a detailed overview of how machine learning can be applied to forge a path to ecological sustainability in each of these areas. The chapter highlights how various machine learning algorithms are used to increase the use of renewable energy, efficient transportation, and waste management systems to reduce the carbon footprint. The authors summarize the findings from the current research literature and conclude by providing a few future directions.

    A novel framework for medium-term wind power prediction based on temporal attention mechanisms

    Wind energy is a widely distributed, recyclable and environmentally friendly energy source that plays an important role in mitigating global warming and energy shortages. Wind energy's uncertainty and fluctuating nature make grid integration of large-scale wind energy systems challenging. Medium-term wind power forecasts can provide an essential basis for energy dispatch, so accurate wind power forecasts are essential. Much research has yielded excellent results in recent years. However, many of these methods require additional experimentation and analysis when applied to other data. In this paper, we propose a novel short-term forecasting framework based on the tree-structured Parzen estimator (TPE) and decomposition algorithms. This framework defines the TPE-VMD-TFT method for 24-h and 48-h ahead wind power forecasting based on variational mode decomposition (VMD) and the temporal fusion transformer (TFT). On the Engie wind dataset from the electricity company in France, the results show that the proposed method significantly improves the prediction accuracy. In addition, the proposed framework can be applied to other decomposition algorithms and requires little manual work in model training.

    VAT tax gap prediction: a 2-steps Gradient Boosting approach

    Tax evasion is the illegal evasion of taxes by individuals, corporations, and trusts. The revenue loss from tax avoidance can undermine the effectiveness and equity of government policies. A standard measure of tax evasion is the tax gap, which can be estimated as the difference between the total amount of tax theoretically collectable and the total amount of tax actually collected in a given period. This paper presents an original contribution to the bottom-up approach, based on results from fiscal audits, through the use of Machine Learning. The major disadvantage of bottom-up approaches is selection bias when audited taxpayers are not randomly selected, as in the case of audits performed by the Italian Revenue Agency. Our proposal, based on a 2-steps Gradient Boosting model, produces a robust tax gap estimate and embeds a solution to correct for the selection bias which does not require any assumptions on the underlying data distribution. The 2-steps Gradient Boosting approach is used to estimate the Italian Value-added tax (VAT) gap on individual firms on the basis of fiscal and administrative data income tax returns gathered from the Tax Administration Data Base, for the fiscal year 2011. The proposed method significantly boosts predictive performance with respect to the classical parametric approaches.
    Comment: 27 pages, 4 figures, 8 tables. Presented at NTTS 2019 conference. Under review at another peer-reviewed journal.
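
    The tax gap definition given in this abstract can be written compactly. The symbols are assumptions introduced here for illustration and do not come from the paper: $T^{\text{theo}}_t$ is the total tax theoretically collectable in period $t$, and $T^{\text{coll}}_t$ is the total tax actually collected in that period.

```latex
\[
  \text{Gap}_t \;=\; T^{\text{theo}}_t \;-\; T^{\text{coll}}_t
\]
```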

    Spatio-temporal traffic anomaly detection for urban networks

    Urban road networks are often affected by disruptions such as accidents and roadworks, giving rise to congestion and delays, which can, in turn, create a wide range of negative impacts on the economy, environment, safety and security. Accurate detection of the onset of traffic anomalies, specifically Recurrent Congestion (RC) and Nonrecurrent Congestion (NRC) in the traffic networks, is an important ITS function to facilitate proactive intervention measures to reduce the level of severity of congestion. A substantial body of literature is dedicated to models with varying levels of complexity that attempt to identify such anomalies. Given the complexity of the problem, however, very little effort has been dedicated to the development of methods that attempt to detect traffic anomalies using spatio-temporal features. Driven both by the recent advances in deep learning techniques and the development of Traffic Incident Management Systems (TIMS), the aim of this research is to develop novel traffic anomaly detection models that can incorporate both spatial and temporal traffic information to detect traffic anomalies at a network level. This thesis first reviews the state of the art in traffic anomaly detection techniques, including the existing methods and emerging machine learning and deep learning methods, before identifying the gaps in the current understanding of traffic anomalies and their detection. One of the problems in adapting deep learning models to traffic anomaly detection is the translation of time series traffic data from multiple locations into the format necessary for the deep learning model to learn the spatial and temporal features effectively.
To address this challenging problem and build a systematic traffic anomaly detection method at a network level, this thesis proposes a methodological framework consisting of (a) the translation layer (which is designed to translate the time series traffic data from multiple locations over the road network into a desired format with spatial and temporal features), (b) detection methods and (c) localisation. This methodological framework is subsequently tested for early RC detection and NRC detection. Three translation layers including connectivity matrix, geographical grid translation and spatial temporal translation are presented and evaluated for both RC and NRC detection. The early RC detection approach is a deep learning based method that combines Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). The NRC detection, on the other hand, involves only the application of the CNN. The performance of the proposed approach is compared against other conventional congestion detection methods, using a comprehensive evaluation framework that includes metrics such as detection rates and false positive rates, and the sensitivity analysis of time windows as well as prediction horizons. The conventional congestion detection methods used for the comparison include Multilayer Perceptron, Random Forest and Gradient Boost Classifier, all of which are commonly used in the literature. Real-world traffic data from the City of Bath are used for the comparative analysis of RC, while traffic data in conjunction with incident data extracted from Central London are used for NRC detection. The results show that while the connectivity matrix may be capable of extracting features of a small network, the increased sparsity in the matrix in a large network reduces its effectiveness in feature learning compared to geographical grid translation. 
The results also indicate that the proposed deep learning method demonstrates superior detection accuracy compared to alternative methods and that it can detect recurrent congestion as early as one hour ahead with acceptable accuracy. The proposed method is capable of being implemented within a real-world ITS system making use of traffic sensor data, thereby providing a practically useful tool for road network managers to manage traffic proactively. In addition, the results demonstrate that a deep learning-based approach may improve the accuracy of incident detection and locate traffic anomalies precisely, especially in a large urban network. Finally, the framework is further tested for robustness in terms of network topology, sensor faults and missing data. The robustness analysis demonstrates that the proposed traffic anomaly detection approaches are transferable to different sizes of road networks, and that they are robust in the presence of sensor faults and missing data. Open Access.

    Gesture recognition by learning local motion signatures using smartphones

    In recent years, gesture or activity recognition has become an important area of research for the modern health care system. An activity is recognized by learning from human body postures and signatures. Presently all smartphones are equipped with accelerometer and gyroscope sensors, and the readings of these sensors can be used as input to a classifier to predict human activity. Although human activity recognition has gained notable scientific interest in recent years, accuracy, scalability and robustness still need significant improvement before it can serve as a solution to most real-world problems. This paper aims to fill the identified research gap and proposes a grid-search-based Logistic Regression and Gradient Boosting Decision Tree multistage prediction model. The UCI-HAR dataset has been used to perform gesture recognition by learning local motion signatures. The proposed approach exhibits improved accuracy over preexisting techniques for human activity recognition.