Property valuation with interpretable machine learning
Property valuation is an important task for various stakeholders, including banks, local authorities, property developers, and brokers. As a result of the characteristics of the real estate market, such as the infrequency of trades, limited supply, negotiated prices, and small submarkets with unique traits, there is no clear market value for properties. Traditionally property valuations are done by expert appraisers. Property valuation can also be done accurately with machine learning methods, but the lack of interpretability with accurate machine learning methods can limit the adoption of those methods. Interpretable machine learning methods could be a solution to this issue, but there are concerns related to the accuracy of these methods.
This thesis aims to evaluate the feasibility of interpretable machine learning methods in property valuation by comparing a promising interpretable method to a more complex machine learning method that has previously shown good results in property valuation. Both methods are chosen based on previous literature. The two chosen methods, Extreme Gradient Boosting (XGB) and Explainable Boosting Machine (EBM), are compared in terms of prediction accuracy for properties in six large municipalities in Denmark. In addition to the accuracy comparison, the interpretability of the EBM is highlighted. The XGB method is more accurate overall, although the differences between the two methods within individual municipalities are small. The interpretability of the EBM is good: it is possible to understand both how the model makes predictions in general and how individual predictions are made.
Ensemble Machine Learning Model Generalizability and its Application to Indirect Tool Condition Monitoring
A practical, accurate, robust, and generalizable system for monitoring tool condition during a machining process would enable advancements in manufacturing process automation, cost reduction, and efficiency improvement. Previously proposed systems using various individual machine learning (ML) models and other analysis techniques have struggled with low generalizability to new machining and environmental conditions, as well as a common reliance on expensive or intrusive sensory equipment which hinders their industry adoption. While ensemble ML techniques offer significant advantages over individual models in terms of performance, overfitting reduction, and generalizability improvement, they have only begun to see limited applications within the field of tool condition monitoring (TCM).
To address the research gaps which currently surround TCM system generalizability and optimal ensemble model configuration for this application, nine ML model types, including five heterogeneous and homogeneous ensemble models, are employed for tool wear classification. Sound, spindle power, and axial load signals are utilized through the sensor fusion of practical external and internal machine sensors. This original experimental process data is collected through tool wear experiments using a variety of machining conditions. Four feature selection methods and multiple tool wear classification resolution values are compared for this application, and the performance of the ML models is compared across metrics including k-fold cross validation and leave-one-group-out cross validation. The generalizability of the models to data from unseen experiments and machining conditions is evaluated, and a method of improving the generalizability levels using noisy training data is examined. T-tests are used to measure the significance of model performance differences. The extra-trees ensemble ML method, which had never before been applied to signal-based TCM, shows the best performance of the nine models.
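The leave-one-group-out cross validation mentioned above can be illustrated with a minimal sketch. It assumes each sample is tagged with the experiment it came from; the function name `leave_one_group_out` and the experiment labels are hypothetical and are not taken from the thesis.

```python
from collections import defaultdict


def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) per distinct group.

    Each group (for example, one machining experiment) is held out once,
    so the model is always scored on conditions it never saw in training.
    """
    by_group = defaultdict(list)
    for idx, group in enumerate(groups):
        by_group[group].append(idx)
    for held_out in sorted(by_group):
        test_idx = by_group[held_out]
        train_idx = sorted(
            i for group, idxs in by_group.items() if group != held_out for i in idxs
        )
        yield held_out, train_idx, test_idx


# Hypothetical labels: the experiment that produced each sample.
experiment_ids = ["exp_a", "exp_a", "exp_b", "exp_c", "exp_b", "exp_c"]
splits = list(leave_one_group_out(experiment_ids))
```

Unlike plain k-fold splitting, no experiment contributes samples to both the training and the test side of a split, which is what makes the score a measure of generalizability to unseen experiments.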
A fresh engineering approach for the forecast of financial index volatility and hedging strategies
This thesis attempts to shed new light on a problem of importance in financial engineering. Volatility is a commonly accepted measure of risk in the investment field. Daily volatility is the determining factor in evaluating option prices and in conducting different hedging strategies. Volatility estimation and forecasting are still far from mature enough for industry acceptance, judging by their forecasting accuracy, which is generally below 50%.
By judiciously coordinating the current engineering theory and analytical techniques such as wavelet transform, evolutionary algorithms in a Time Series Data Mining framework, and the Markov chain based discrete stochastic optimization methods, this work formulates a systematic strategy to characterize and forecast crucial as well as critical financial time series. Typical forecast features have been extracted from different index volatility data sets which exhibit abrupt drops, jumps and other embedded nonlinear characteristics so that accuracy of forecasting can be markedly improved in comparison with those of the currently prevalent methods adopted in the industry.
The key aspect of the presented approach is "transformation and sequential deployment": i) transform the data from being non-observable to observable, i.e., from variance into integrated volatility; ii) conduct the wavelet transform to determine the optimal forecasting horizon; iii) transform the wavelet coefficients into 4-lag recursive data sets, which can also be viewed as a Markov chain; iv) apply certain genetic algorithms to extract a group of rules that characterize different patterns embedded or hidden in the data and attempt to forecast the directions/ranges of the one-step-ahead events; and v) apply genetic programming to forecast the values of the one-step-ahead events. By following such a step-by-step approach, complicated problems of time series forecasting become less complex and readily resolvable for industry application.
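Step iii) can be pictured with a minimal sketch. The abstract does not spell out how the 4-lag data sets are built, so this is only one plausible reading: each row pairs the four most recent coefficients with the value that follows them. The function name `make_lagged_dataset` and the sample coefficients are hypothetical.

```python
def make_lagged_dataset(series, n_lags=4):
    """Pair each window of n_lags consecutive values with the next value.

    Assumption: a "4-lag recursive data set" is read here as a sliding
    window, so the next value depends only on the previous four, which is
    what allows the series to be treated like a Markov chain over windows.
    """
    rows = []
    for t in range(n_lags, len(series)):
        history = tuple(series[t - n_lags:t])
        rows.append((history, series[t]))
    return rows


# Hypothetical wavelet coefficients, for illustration only.
coeffs = [0.1, 0.4, 0.3, 0.7, 0.2, 0.5]
dataset = make_lagged_dataset(coeffs)
```

Each resulting row is a candidate input for the rule extraction in step iv): the four-value history is the condition and the following value is the event to be forecast.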
To implement such an approach, one-year, two-year and five-year S&P 100 historical data are used as training sets to derive a group of 100 rules that best describe their respective signal characteristics. These rules are then used to forecast the subsequent out-of-sample time series data. This set of tests produces an average correct forecasting rate of over 75%, which surpasses any other publicly available forecast results on any type of financial index. Genetic programming was then applied to the out-of-sample data set to forecast the actual value of the one-step-ahead event.
The forecasting accuracy reaches an average of 70%, which is a marked improvement over other current forecasts. To validate the proposed approach, S&P 500 as well as S&P 100 index data are tested with the discrete stochastic optimization method, which is based on Markov chain theory and involves genetic algorithms. Results are further validated by the bootstrapping operation. All these trials showed good reliability of the methodology proposed in this research work. Finally, the established methodology has been shown to have broad applications in option pricing, hedging, risk management, VaR determination, etc.
Advances in machine learning algorithms for financial risk management
In this thesis, three novel machine learning techniques are introduced to address distinct yet interrelated challenges involved in financial risk management tasks. These approaches collectively offer a comprehensive strategy, beginning with the precise classification of credit risks, advancing through the nuanced forecasting of financial asset volatility, and ending with the strategic optimisation of financial asset portfolios.
Firstly, a Hybrid Dual-Resampling and Cost-Sensitive technique has been proposed to combat the prevalent issue of class imbalance in financial datasets, particularly in credit risk assessment. The key process involves the creation of heuristically balanced datasets to effectively address the problem. It uses a resampling technique based on Gaussian mixture modelling to generate a synthetic minority class from the minority class data and concurrently uses k-means clustering on the majority class. Feature selection is then performed using the Extra Tree Ensemble technique. Subsequently, a cost-sensitive logistic regression model is applied to predict the probability of default using the heuristically balanced datasets. The results underscore the effectiveness of our proposed technique, with superior performance observed in comparison to other imbalanced preprocessing approaches. This advancement in credit risk classification lays a solid foundation for understanding individual financial behaviours, a crucial first step in the broader context of financial risk management.
Building on this foundation, the thesis then explores the forecasting of financial asset volatility, a critical aspect of understanding market dynamics. A novel model that combines a Triple Discriminator Generative Adversarial Network with a continuous wavelet transform is proposed. The proposed model can decompose volatility time series into signal-like and noise-like frequency components, allowing the separate detection and monitoring of non-stationary volatility data. The network comprises a wavelet transform component consisting of continuous wavelet transforms and inverse wavelet transform components, an auto-encoder component made up of encoder and decoder networks, and a Generative Adversarial Network consisting of triple Discriminator and Generator networks. The proposed Generative Adversarial Network combines three losses in its framework: an unsupervised loss derived from the Generative Adversarial Network component during training, a supervised loss, and a reconstruction loss. Data from nine financial assets are employed to demonstrate the effectiveness of the proposed model. This approach not only enhances our understanding of market fluctuations but also bridges the gap between individual credit risk assessment and macro-level market analysis.
Finally, the thesis proposes a novel technique for portfolio optimisation. This involves the use of a model-free reinforcement learning strategy for portfolio optimisation using historical Low, High, and Close prices of assets as input with weights of assets as output. A deep Capsules Network is employed to simulate the investment strategy, which involves the reallocation of the different assets to maximise the expected return on investment based on deep reinforcement learning. To provide more learning stability in an online training process, a Markov Differential Sharpe Ratio reward function has been proposed as the reinforcement learning objective function. Additionally, a Multi-Memory Weight Reservoir has been introduced to facilitate the learning process and optimisation of computed asset weights, helping to sequentially re-balance the portfolio throughout a specified trading period. Incorporating the insights gained from volatility forecasting into this strategy shows the interconnected nature of the financial markets. Comparative experiments with other models demonstrated that our proposed technique is capable of achieving superior results based on risk-adjusted reward performance measures.
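As background for the reward function mentioned above, the following minimal sketch computes the standard differential Sharpe ratio, a per-step reward commonly attributed to Moody and Saffell. It is not the Markov Differential Sharpe Ratio proposed in the thesis, whose definition is not given in this abstract. The function name `differential_sharpe`, the adaptation rate `eta` and the sample returns are assumptions made for illustration.

```python
import math


def differential_sharpe(returns, eta=0.01):
    """Return one differential Sharpe ratio reward per portfolio return.

    a and b are exponential moving averages of the return and the squared
    return. The reward at step t uses their values from step t-1:
        D_t = (b * dA - 0.5 * a * dB) / (b - a**2) ** 1.5
    with dA = r_t - a and dB = r_t**2 - b.
    """
    a = 0.0  # moving average of returns
    b = 0.0  # moving average of squared returns
    rewards = []
    for r in returns:
        delta_a = r - a
        delta_b = r * r - b
        variance = b - a * a
        if variance > 1e-12:
            reward = (b * delta_a - 0.5 * a * delta_b) / variance ** 1.5
        else:
            reward = 0.0  # no variance estimate yet, so no signal
        rewards.append(reward)
        a += eta * delta_a
        b += eta * delta_b
    return rewards


# Hypothetical per-period portfolio returns.
sample_returns = [0.01, -0.02, 0.015, 0.005, -0.01]
rewards = differential_sharpe(sample_returns)
```

Because the reward is available after every trading period rather than only at the end of an episode, it suits the online training setting described above.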
In a nutshell, this thesis not only addresses individual challenges in financial risk management but also incorporates them into a comprehensive framework: from enhancing the accuracy of credit risk classification, through improving the understanding of market volatility, to optimising investment strategies. These methodologies collectively show the potential of machine learning to improve financial risk management.
Advances and Applications of DSmT for Information Fusion. Collected Works, Volume 5
This fifth volume on Advances and Applications of DSmT for Information Fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics, and is available in open-access. The collected contributions of this volume have either been published or presented after disseminating the fourth volume in 2015 in international conferences, seminars, workshops and journals, or they are new. The contributions of each part of this volume are chronologically ordered.
The first part of this book presents some theoretical advances on DSmT, dealing mainly with modified Proportional Conflict Redistribution Rules (PCR) of combination with degree of intersection, coarsening techniques, interval calculus for PCR thanks to set inversion via interval analysis (SIVIA), rough set classifiers, canonical decomposition of dichotomous belief functions, fast PCR fusion, fast inter-criteria analysis with PCR, and improved PCR5 and PCR6 rules preserving the (quasi-)neutrality of (quasi-)vacuous belief assignment in the fusion of sources of evidence with their Matlab codes.
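For readers unfamiliar with PCR, here is a minimal sketch of the classical two-source PCR5 rule, assuming exclusive elements so that an empty intersection counts as conflict. It is not the improved PCR5/PCR6 variants or the Matlab codes referred to above; the function name `pcr5` and the example masses are hypothetical.

```python
from collections import defaultdict


def pcr5(m1, m2):
    """Fuse two basic belief assignments with the two-source PCR5 rule.

    Masses are dicts mapping frozenset focal elements to values summing to 1.
    Agreeing products go to the intersection. Each conflicting product
    m1(X) * m2(Y) is split back to X and Y in proportion to m1(X) and m2(Y).
    """
    fused = defaultdict(float)
    for x, mx in m1.items():
        for y, my in m2.items():
            common = x & y
            if common:
                fused[common] += mx * my
            elif mx + my > 0:
                fused[x] += mx * mx * my / (mx + my)
                fused[y] += my * my * mx / (mx + my)
    return dict(fused)


A, B = frozenset({"a"}), frozenset({"b"})
AB = A | B  # total ignorance on the frame {a, b}
fused = pcr5({A: 0.6, AB: 0.4}, {B: 0.3, AB: 0.7})
```

In this example the conflicting mass 0.6 × 0.3 = 0.18 is returned as 0.12 to A and 0.06 to B, giving roughly 0.54 for A, 0.18 for B and 0.28 for the ignorance A ∪ B.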
Because more applications of DSmT have emerged in the years since the publication of the fourth book on DSmT in 2015, the second part of this volume is about selected applications of DSmT mainly in building change detection, object recognition, quality of data association in tracking, perception in robotics, risk assessment for torrent protection and multi-criteria decision-making, multi-modal image fusion, coarsening techniques, recommender system, levee characterization and assessment, human heading perception, trust assessment, robotics, biometrics, failure detection, GPS systems, inter-criteria analysis, group decision, human activity recognition, storm prediction, data association for autonomous vehicles, identification of maritime vessels, fusion of support vector machines (SVM), Silx-Furtif RUST code library for information fusion including PCR rules, and network for ship classification.
Finally, the third part presents interesting contributions related to belief functions in general, published or presented over the years since 2015. These contributions relate to decision-making under uncertainty, belief approximations, probability transformations, new distances between belief functions, non-classical multi-criteria decision-making problems with belief functions, generalization of Bayes theorem, image processing, data association, entropy and cross-entropy measures, fuzzy evidence numbers, negator of belief mass, human activity recognition, information fusion for breast cancer therapy, imbalanced data classification, and hybrid techniques mixing deep learning with belief functions as well.
Social work with airport passengers
Social work at the airport consists of offering social services to passengers. The main methodological position is that people are under stress, which is characterized by a particular set of features in appearance and behavior. In such circumstances, a passenger attracts some attention through his actions. Only a person whom he trusts can help him with documents or psychologically.
Advances and Applications of Dezert-Smarandache Theory (DSmT) for Information Fusion (Collected Works), Vol. 4
The fourth volume on Advances and Applications of Dezert-Smarandache Theory (DSmT) for information fusion collects theoretical and applied contributions of researchers working in different fields of applications and in mathematics. The contributions (see List of Articles published in this book, at the end of the volume) have been published or presented after disseminating the third volume (2009, http://fs.unm.edu/DSmT-book3.pdf) in international conferences, seminars, workshops and journals.
The first part of this book presents the theoretical advancement of DSmT, dealing with Belief functions, conditioning and deconditioning, Analytic Hierarchy Process, Decision Making, Multi-Criteria, evidence theory, combination rule, evidence distance, conflicting belief, sources of evidence with different importance and reliabilities, importance of sources, pignistic probability transformation, Qualitative reasoning under uncertainty, Imprecise belief structures, 2-Tuple linguistic label, Electre Tri Method, hierarchical proportional redistribution, basic belief assignment, subjective probability measure, Smarandache codification, neutrosophic logic, Evidence theory, outranking methods, Dempster-Shafer Theory, Bayes fusion rule, frequentist probability, mean square error, controlling factor, optimal assignment solution, data association, Transferable Belief Model, and others.
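Of the topics listed above, the pignistic probability transformation is simple enough to show in a minimal sketch. It assumes a normalized basic belief assignment with no mass on the empty set; the function name `pignistic` and the example masses are hypothetical.

```python
def pignistic(masses):
    """Map a basic belief assignment to a pignistic probability.

    masses maps frozenset focal elements to values summing to 1.
    Each focal element shares its mass equally among its elements:
        BetP(x) = sum over focal sets A containing x of m(A) / |A|
    """
    betp = {}
    for focal, mass in masses.items():
        share = mass / len(focal)
        for element in focal:
            betp[element] = betp.get(element, 0.0) + share
    return betp


bba = {
    frozenset({"a"}): 0.5,
    frozenset({"a", "b"}): 0.3,
    frozenset({"a", "b", "c"}): 0.2,
}
betp = pignistic(bba)
```

The result is an ordinary probability distribution over single elements, which is what makes the transformation useful as a final decision step after fusion.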
More applications of DSmT have emerged in the years since the publication of the third book on DSmT in 2009. Accordingly, the second part of this volume is about applications of DSmT in correlation with Electronic Support Measures, belief function, sensor networks, Ground Moving Target and Multiple target tracking, Vehicle-Borne Improvised Explosive Device, Belief Interacting Multiple Model filter, seismic and acoustic sensor, Support Vector Machines, Alarm classification, ability of human visual system, Uncertainty Representation and Reasoning Evaluation Framework, Threat Assessment, Handwritten Signature Verification, Automatic Aircraft Recognition, Dynamic Data-Driven Application System, adjustment of secure communication trust analysis, and so on.
Finally, the third part presents a chronologically ordered List of References related to DSmT, published or presented over the years since its inception in 2004.