
    The impact of psychopathology, social adversity and stress-relevant DNA methylation on prospective risk for post-traumatic stress: A machine learning approach

    Background: A range of factors have been identified that contribute to greater incidence, severity, and a more prolonged course of post-traumatic stress disorder (PTSD), including: comorbid and/or prior psychopathology; social adversity such as low socioeconomic position, perceived discrimination, and isolation; and biological factors such as genomic variation at glucocorticoid receptor regulatory network (GRRN) genes. This complex etiology and clinical course make it challenging to identify people at higher risk of PTSD. Here we leverage machine learning (ML) approaches to identify a core set of factors that may together predispose persons to PTSD. Methods: We used multiple ML approaches to assess the relationship among DNA methylation (DNAm) at GRRN genes, prior psychopathology, social adversity, and prospective risk for post-traumatic stress severity (PTSS). Results: ML models predicted prospective risk of PTSS with high accuracy. The gradient boosting approach was the top-performing model, with a mean absolute error of 0.135, mean squared error of 0.047, root mean squared error of 0.217, and R2 of 95.29%. Prior PTSS ranked highest in predicting prospective PTSS risk, accounting for >88% of the prediction. The top-ranked GRRN CpG site was cg05616442, in AKT1, and the top-ranked social adversity feature was loneliness. Conclusion: Multiple factors, including prior PTSS, social adversity, and DNAm, play a role in predicting prospective risk of PTSS. ML models identified factors accounting for increased PTSS risk with high accuracy, which may help target risk factors that reduce the likelihood or severity of PTSD and may point to approaches for early intervention. Limitations: A limitation of this study is its small sample size.
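
The error metrics reported in this abstract (MAE, MSE, RMSE, R2) are standard regression diagnostics. As a minimal illustrative sketch of how such scores are computed when evaluating a model like the gradient-boosting one above (plain Python, not the study's code):

```python
import math

def regression_metrics(y_true, y_pred):
    """Compute the four metrics reported in the abstract:
    MAE, MSE, RMSE and R^2 (coefficient of determination)."""
    n = len(y_true)
    errors = [yt - yp for yt, yp in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mean_y = sum(y_true) / n
    ss_tot = sum((yt - mean_y) ** 2 for yt in y_true)  # total variance
    ss_res = sum(e * e for e in errors)                # residual variance
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}
```

An R2 of 95.29%, as reported, means the model's residual variance is under 5% of the outcome's total variance.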

    A new approach to calibrating functional complexity weight in software development effort estimation

    Function point analysis is a widely used metric in the software industry for development effort estimation. It was proposed in the 1970s and later standardized by the International Function Point Users Group, and it has been accepted by many organizations worldwide. While the software industry has grown rapidly, the weight values specified for standard function point counting have remained the same since its inception. Another problem is that software development differs across industry sectors, yet the same basic counting rules apply to all. These issues raise important questions about the validity of the weight values in practical applications. In this study, we propose an algorithm for calibrating the standardized functional complexity weights, aiming to estimate a more accurate software size that fits specific software applications, reflects software industry trends, and improves the effort estimation of software projects. The results show that the proposed algorithm improves effort estimation accuracy over the baseline method. Funding: Faculty of Applied Informatics, Tomas Bata University in Zlin [RVO/FAI/2021/002].
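
For context, unadjusted function points are a weighted sum of counted components, and calibration adjusts those weights against historical project data. A toy sketch under stated assumptions: the IFPUG average-complexity weights below are the standard ones, but the one-parameter least-squares calibration is a hypothetical stand-in, not the paper's algorithm.

```python
# Standard IFPUG average-complexity weights for the five function types.
STD_WEIGHTS = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}

def unadjusted_fp(counts, weights=STD_WEIGHTS):
    """Unadjusted function points: sum of component counts times weights."""
    return sum(counts[t] * weights[t] for t in weights)

def calibrate_scale(projects):
    """Hypothetical one-parameter calibration: find a single multiplier k
    on the standard weights that minimises the squared error between
    k * UFP and each project's observed size (least-squares through the
    origin: k = sum(x*y) / sum(x^2))."""
    num = sum(unadjusted_fp(c) * size for c, size in projects)
    den = sum(unadjusted_fp(c) ** 2 for c, _ in projects)
    return num / den
```

A per-type or per-complexity-level calibration, as the paper's algorithm targets, would fit one factor per weight instead of a single global multiplier.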

    Machine Learning based Models for Fresh Produce Yield and Price Forecasting for Strawberry Fruit

    Building market price forecasting models for Fresh Produce (FP) is crucial to protect retailers and consumers from highly priced FP. However, forecasting FP prices is highly complex due to the very short shelf life of FP, the inability to store it long term, and external factors such as weather and climate change. This forecasting problem has traditionally been modelled as a time series problem. Models for grain yield forecasting and for forecasting non-agricultural prices are common; however, forecasting of FP prices is recent and has not been fully explored. In this thesis, the forecasting models built to fill this void are solely machine learning based, which is itself a novelty. The growth and success of deep learning, a type of machine learning algorithm, has largely been attributed to the availability of big data and high-end computational power. In this thesis, several machine learning models (both conventional and deep learning based) are built to predict future yield and prices of FP (price forecasts for strawberries are said to be more difficult than for other FP, hence strawberries are used here as the main product). The data used in building these prediction models comprises California weather data, California strawberry yield, California strawberry farm-gate prices, and retailer purchase price data. The various prediction models are compared using a new aggregated error measure (AGM) proposed in this thesis, which combines mean absolute error, mean squared error, and the R^2 coefficient of determination. The best two models are found to be an Attention CNN-LSTM (AC-LSTM) and an Attention ConvLSTM (ACV-LSTM). Different stacking ensemble techniques, such as a voting regressor and stacking with Support Vector Regression (SVR), are then utilized to come up with the best prediction.
    The experimental results show that, across the various examined applications, the proposed model, a stacking ensemble of the AC-LSTM and ACV-LSTM using a linear SVR, performs best according to the proposed aggregated error measure. To show the robustness of the proposed model, it was also tested on predicting WTI and Brent crude oil prices, and the results proved consistent with those of the FP price prediction.
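
The thesis's exact AGM formula is not reproduced in this abstract; one plausible, purely illustrative way to combine the three metrics into a single lower-is-better score is:

```python
def aggregated_error(mae, mse, r2):
    """Hypothetical aggregation (the thesis's actual AGM definition may
    differ): average the two error terms with the R^2 shortfall (1 - R^2),
    so that a lower value is better for every component."""
    return (mae + mse + (1.0 - r2)) / 3.0
```

Candidate models such as the AC-LSTM and ACV-LSTM above could then be ranked by sorting on this single score rather than comparing three metrics separately.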

    Data Analytics for Automated Near Real Time Detection of Blockages in Smart Wastewater Systems

    Blockage events account for a substantial portion of the reported failures in the wastewater network, causing flooding, loss of service, environmental pollution and significant clean-up costs. Increasing telemetry in Combined Sewer Overflows (CSOs) provides the opportunity for near real-time data-driven modelling of the sewer network. The research work presented in this thesis describes the development and testing of a novel system designed for the automatic detection of blockages and other unusual events in near real-time. The methodology utilises an Evolutionary Artificial Neural Network (EANN) model for short-term CSO level predictions and Statistical Process Control (SPC) techniques to analyse unusual CSO level behaviour. The system is designed to mimic the work of a trained, experienced human technician in determining whether a blockage event has occurred. The detection system has been applied to real blockage events from a UK wastewater network. The results obtained illustrate that the methodology can identify different types of blockage events in a reliable and timely manner, with a low number of false alarms. In addition, a model has been developed for the prediction of water levels in a CSO chamber and the generation of alerts for upcoming spill events. The model consists of a bi-model committee evolutionary artificial neural network (CEANN), composed of two EANN models optimised for wet and dry weather, respectively. The models are combined using a non-linear weighted averaging approach to overcome bias arising from imbalanced data. Both methodologies are designed to be generic and self-learning, so they can be applied to any CSO location without requiring input from a human operator. It is envisioned that the technology will allow utilities to respond proactively to developing blockage events, thus reducing potential harm to the sewer network and the surrounding environment.
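
The SPC step can be pictured as a control chart over prediction residuals. A minimal Shewhart-style sketch (an assumed form; the thesis's actual SPC rules, and the `spc_blockage_alerts` name and its parameters, are illustrative):

```python
from statistics import mean, stdev

def spc_blockage_alerts(residuals, window=20, k=3.0):
    """Residual = observed CSO level minus model-predicted level.
    Flag time steps where the residual leaves the mean +/- k*sigma band
    estimated from the preceding `window` residuals, as a simple proxy
    for 'the level is behaving unlike the model expects'."""
    alerts = []
    for t in range(window, len(residuals)):
        ref = residuals[t - window:t]      # recent in-control history
        mu, sigma = mean(ref), stdev(ref)  # rolling control-chart limits
        if abs(residuals[t] - mu) > k * sigma:
            alerts.append(t)
    return alerts
```

A sustained run of flagged steps, rather than a single excursion, would be the rolling-window analogue of a technician judging that a blockage is developing rather than reacting to one noisy reading.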