
    Understanding and improving the applicability of randomised controlled trials: subgroup reporting and the statistical calibration of trials to real-world populations

    Context and objective Randomised controlled trials (hereafter, trials) are widely regarded as the gold standard for evaluating treatment efficacy in medical interventions. They employ strict study designs, rigorous eligibility criteria, standardised protocols, and close participant monitoring under controlled conditions, contributing to high internal validity. However, these stringent criteria and procedures may limit the generalisability of trial findings to real-world settings, which often involve diverse patient populations such as patients with multimorbidity or frailty. Consequently, there is growing interest in the applicability of trials to real-world clinical practice. In this thesis I will 1) evaluate how well major trials report on variation in treatment effects and 2) examine the use of trial calibration methods to test trial applicability.
    Methods 1) A comprehensive and consistent description of subgroup reporting was produced, supporting the exploration of subgroup effects and heterogeneity of treatment effects to inform decision-making for tailored subgroup populations in routine practice. The study evaluated 2,235 trials from ClinicalTrials.gov involving multiple chronic medical conditions, assessing the presence of subgroup reporting in the corresponding publications and extracting subgroup terms. These terms were then standardised and summarised using Medical Subject Headings and WHO Anatomical Therapeutic Chemical codes. Logistic and Poisson regression models were employed to identify independent predictors of subgroup reporting patterns. 2) Two calibration models, a regression-based model and inverse odds of sampling weights (IOSW), were implemented. These models were used to apply the findings from two influential heart failure (HF) trials - COMET and DIG - to a real-world HF registry in Scotland consisting of 8,012 HF patients, mainly with reduced ejection fraction, using individual participant data (IPD) from both datasets. Additionally, calibration was conducted within subgroups (the lowest- and highest-risk groups) of the Scottish HF registry for exploratory analyses. The study compared baseline characteristics and calibrated and uncalibrated results between the trials and the registry, and assessed the impact of calibration on the results, focusing on overall effects and precision.
    Results The subgroup reporting study showed that among 2,235 eligible trials, 48% (1,082 trials) reported overall results and 23% (524 trials) reported subgroups. Age (51%), gender (45%), racial group (28%) and geographical location (17%) were the most frequently reported subgroups among these 524 trials. Characteristics related to the index condition (severity, duration, type, etc.) were somewhat commonly reported, whereas reporting on metrics of comorbidity, frailty or mental health was rare. Follow-up time, enrolment size, trial start year and specific index conditions (e.g., hypercholesterolemia, hypertension) were significant predictors of any subgroup reporting after adjusting for enrolment size and index conditions, while funding source and number of arms were not associated with subgroup reporting. The trial calibration study showed that registry patients were, on average, older, had poorer renal function and received higher doses of loop diuretics than trial participants. The key findings from the two HF trials remained consistent after calibration to the registry, with a tolerable decrease in precision (larger confidence intervals) for the effect estimates. Treatment-effect estimates were also similar when the trials were calibrated to high-risk and low-risk registry patients, albeit with a greater reduction in precision.
    Conclusion Variation in subgroup reporting across trials limits the feasibility of evaluating subgroup effects and examining heterogeneity of treatment effects. If IPD, or alternative summarised data, is available from both the trials and the registry, trial applicability can be assessed by performing calibration.
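    The IOSW approach described above can be illustrated with a short sketch: a membership model is fitted to the stacked trial and registry data, trial participants are weighted by their odds of belonging to the registry, and the treatment effect is re-estimated in the weighted trial sample. The code below is a minimal illustration under assumed column names (not the thesis code); in practice a robust (sandwich) variance would be preferred for the confidence interval.

```python
# Illustrative sketch of inverse odds of sampling weights (IOSW) calibration.
# Column names and the binary-outcome model are assumptions, not the thesis code.
import pandas as pd
from sklearn.linear_model import LogisticRegression
import statsmodels.api as sm

def iosw_calibrate(trial: pd.DataFrame, registry: pd.DataFrame, covariates,
                   outcome="event", treatment="treated"):
    """Re-weight trial participants so their covariate mix matches the registry,
    then re-estimate the treatment effect in the weighted trial sample."""
    # 1) Stack the two samples and model membership (registry = 1, trial = 0).
    stacked = pd.concat([trial[covariates].assign(in_registry=0),
                         registry[covariates].assign(in_registry=1)])
    member = LogisticRegression(max_iter=1000).fit(stacked[covariates],
                                                   stacked["in_registry"])

    # 2) IOSW weight for each trial participant: odds of belonging to the registry.
    p = member.predict_proba(trial[covariates])[:, 1]
    weights = p / (1.0 - p)

    # 3) Weighted outcome model in the trial: log-odds ratio for treatment.
    #    (A robust/sandwich variance would normally be used for inference.)
    X = sm.add_constant(trial[[treatment]])
    fit = sm.GLM(trial[outcome], X, family=sm.families.Binomial(),
                 freq_weights=weights).fit()
    return fit.params[treatment], fit.conf_int().loc[treatment].tolist()
```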

    Geodetic monitoring of complex shaped infrastructures using Ground-Based InSAR

    In the context of climate change, alternatives to fossil energies need to be used as much as possible to produce electricity. Hydroelectric power generation using dams is one of the most effective of these alternatives. Various monitoring sensors, differing in spatial resolution, temporal resolution and accuracy, can be installed to assess the safe operation of dams. Among the available techniques, ground-based synthetic aperture radar (GB-SAR) has not yet been widely adopted for this purpose: despite its good balance between the aforementioned attributes, its sensitivity to atmospheric disturbances, its specific acquisition geometry and the need for phase unwrapping constrain its usage. Several processing strategies are developed in this thesis to exploit the full potential of GB-SAR systems, such as continuous, flexible and autonomous observation combined with high resolution and accuracy.
    The first challenge is to accurately localise the GB-SAR and estimate its azimuth in order to improve the geocoding of the image in the subsequent step. A ray-tracing algorithm and tomographic techniques are used to recover these external parameters of the sensor. The introduction of corner reflectors for validation confirms a significant error reduction. However, for the subsequent geocoding, challenges persist in scenarios involving vertical structures due to foreshortening and layover, which notably compromise the geocoding quality of the observed points. These issues arise when multiple points at varying elevations fall within a single resolution cell, making it difficult to pinpoint the precise location of the scattering point responsible for the signal return. To overcome these hurdles, a Bayesian approach grounded in intensity models is formulated, offering a tool to enhance the accuracy of the geocoding process. The approach is validated on a dam in the Black Forest in Germany, characterised by a very specific structure.
    The second part of this thesis focuses on the feasibility of using GB-SAR systems for long-term geodetic monitoring of large structures. A first assessment is made by testing large temporal baselines between acquisitions for epoch-wise monitoring. Due to large displacements, phase unwrapping cannot recover all the information; an improvement is made by adapting the geometry of the signal processing using principal component analysis. The main case study consists of several campaigns from different stations at Enguri Dam in Georgia. The consistency of the estimated displacement map is assessed by comparing it to a numerical model calibrated on plumb-line data. The two results show strong agreement, supporting the use of GB-SAR for epoch-wise monitoring, as it can measure several thousand points on the dam; the comparison also demonstrates the possibility of detecting local anomalies in the numerical model. Finally, the instrument has been installed for continuous monitoring for over two years at Enguri Dam. A dedicated processing workflow is developed to eliminate the drift occurring with classical interferometric algorithms and to achieve the accuracy required for geodetic monitoring. The analysis of the obtained time series is highly plausible when compared with classical parametric models of dam deformation. Moreover, the results of this processing strategy are also compared with the numerical model and show high consistency. A final reassuring result is the comparison of the GB-SAR time series with the output of four GNSS stations installed on the dam crest.
    The developed algorithms and methods increase the capabilities of GB-SAR for dam monitoring in different configurations. It can be a valuable supplement to other classical sensors for long-term geodetic observation as well as short-term monitoring during particular dam operations.
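    As background to the interferometric processing referred to above, the sketch below shows the standard conversion from unwrapped GB-SAR phase to line-of-sight displacement, with a crude atmospheric correction based on a stable reference point. The wavelength (a typical Ku-band value) and the sign convention are assumptions, not values taken from the thesis.

```python
# Minimal sketch: converting unwrapped GB-SAR interferometric phase to
# line-of-sight displacement. Wavelength and sign convention are assumptions.
import numpy as np

WAVELENGTH_M = 0.0174  # ~17.4 mm, typical for a Ku-band GB-SAR

def phase_to_los_displacement(unwrapped_phase, reference_phase=0.0):
    """Two-way propagation: one full phase cycle corresponds to lambda/2 of
    line-of-sight motion, so d = -lambda * phi / (4 * pi)."""
    corrected = unwrapped_phase - reference_phase  # remove common atmospheric delay
    return -WAVELENGTH_M * corrected / (4.0 * np.pi)

# Example: a phase change of pi radians -> roughly 4.35 mm towards the radar
phase_map = np.full((3, 3), np.pi)
print(phase_to_los_displacement(phase_map) * 1e3)  # millimetres
```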

    Advances in machine learning algorithms for financial risk management

    In this thesis, three novel machine learning techniques are introduced to address distinct yet interrelated challenges in financial risk management. Together they offer a comprehensive strategy, beginning with the precise classification of credit risks, advancing through the nuanced forecasting of financial asset volatility, and ending with the strategic optimisation of financial asset portfolios.
    Firstly, a Hybrid Dual-Resampling and Cost-Sensitive technique is proposed to combat the prevalent issue of class imbalance in financial datasets, particularly in credit risk assessment. The key step is the creation of heuristically balanced datasets: a resampling technique based on Gaussian mixture modelling generates a synthetic minority class from the minority-class data, while k-means clustering is applied concurrently to the majority class. Feature selection is then performed using the Extra Tree Ensemble technique, and a cost-sensitive logistic regression model is applied to predict the probability of default on the heuristically balanced datasets. The results underscore the effectiveness of the proposed technique, which outperforms other preprocessing approaches for imbalanced data. This advancement in credit risk classification lays a solid foundation for understanding individual financial behaviours, a crucial first step in the broader context of financial risk management.
    Building on this foundation, the thesis then explores the forecasting of financial asset volatility, a critical aspect of understanding market dynamics. A novel model is proposed that combines a Triple Discriminator Generative Adversarial Network with a continuous wavelet transform. The model decomposes volatility time series into signal-like and noise-like frequency components, allowing the separate detection and monitoring of non-stationary volatility data. The network comprises a wavelet transform component consisting of continuous and inverse wavelet transforms, an auto-encoder component made up of encoder and decoder networks, and a Generative Adversarial Network consisting of triple Discriminator and Generator networks. During training, the model employs an ensemble of losses: an unsupervised loss derived from the adversarial component, a supervised loss and a reconstruction loss. Data from nine financial assets are employed to demonstrate the effectiveness of the proposed model. This approach not only enhances understanding of market fluctuations but also bridges the gap between individual credit risk assessment and macro-level market analysis.
    Finally, the thesis proposes a novel technique for portfolio optimisation. It uses a model-free reinforcement learning strategy with historical Low, High and Close prices of assets as input and asset weights as output. A deep Capsules Network is employed to simulate the investment strategy, reallocating the different assets to maximise the expected return on investment through deep reinforcement learning. To provide more learning stability in an online training process, a Markov Differential Sharpe Ratio reward function is proposed as the reinforcement learning objective. Additionally, a Multi-Memory Weight Reservoir is introduced to facilitate the learning and optimisation of the computed asset weights, helping to sequentially rebalance the portfolio throughout a specified trading period. The use of insights gained from volatility forecasting in this strategy reflects the interconnected nature of financial markets. Comparative experiments with other models demonstrate that the proposed technique achieves superior results on risk-adjusted reward performance measures.
    In a nutshell, this thesis not only addresses individual challenges in financial risk management but also incorporates them into a comprehensive framework: from enhancing the accuracy of credit risk classification, through improving the understanding of market volatility, to optimising investment strategies. These methodologies collectively demonstrate the potential of machine learning to improve financial risk management.
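    A rough sketch of the dual-resampling idea described above: synthetic minority samples are drawn from a Gaussian mixture model, the majority class is reduced to k-means centroids, and a cost-sensitive logistic regression is fitted on the balanced data. Component counts, the cost ratio and the class coding are illustrative assumptions, and the feature-selection step is omitted.

```python
# Sketch of a dual-resampling + cost-sensitive pipeline (not the thesis code).
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

def hybrid_resample_fit(X, y, n_components=5, cost_ratio=3.0, random_state=0):
    X_min, X_maj = X[y == 1], X[y == 0]          # 1 = default (minority), 0 = non-default
    target = (len(X_min) + len(X_maj)) // 2      # meet both classes in the middle

    # Oversample: draw synthetic minority samples from a fitted Gaussian mixture.
    gmm = GaussianMixture(n_components=n_components, random_state=random_state).fit(X_min)
    X_synth, _ = gmm.sample(max(target - len(X_min), 1))

    # Undersample: represent the majority class by its k-means centroids.
    km = KMeans(n_clusters=min(target, len(X_maj)), n_init=10,
                random_state=random_state).fit(X_maj)

    X_bal = np.vstack([X_min, X_synth, km.cluster_centers_])
    y_bal = np.concatenate([np.ones(len(X_min) + len(X_synth)),
                            np.zeros(len(km.cluster_centers_))])

    # Cost-sensitive step: penalise missed defaults more heavily.
    clf = LogisticRegression(max_iter=1000, class_weight={0: 1.0, 1: cost_ratio})
    return clf.fit(X_bal, y_bal)
```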

    Effects of municipal smoke-free ordinances on secondhand smoke exposure in the Republic of Korea

    Objective To reduce premature deaths due to secondhand smoke (SHS) exposure among non-smokers, the Republic of Korea (ROK) adopted changes to the National Health Promotion Act, which allowed local governments to enact municipal ordinances strengthening their authority to designate smoke-free areas and levy penalty fines. In this study, we examined national trends in SHS exposure after the introduction of these municipal ordinances at the city level in 2010.
    Methods We used interrupted time series analysis to assess whether the trends of SHS exposure in the workplace and at home, and the primary cigarette smoking rate, changed following the policy adjustment in the national legislation in the ROK. Population-standardized data for selected variables were retrieved from a nationally representative survey dataset and used to study the effectiveness of the policy action.
    Results Following the change in the legislation, SHS exposure in the workplace reversed course from an increasing trend (18% per year) before the introduction of these smoke-free ordinances to a decreasing trend (−10% per year) after their adoption and enforcement (β2 = 0.18, p-value = 0.07; β3 = −0.10, p-value = 0.02). SHS exposure at home (β2 = 0.10, p-value = 0.09; β3 = −0.03, p-value = 0.14) and the primary cigarette smoking rate (β2 = 0.03, p-value = 0.10; β3 = 0.008, p-value = 0.15) showed no significant changes over the sampled period. Although analyses stratified by sex showed that the allowance of municipal ordinances reduced SHS exposure in the workplace for both males and females, it did not affect the primary cigarette smoking rate as much, especially among females.
    Conclusion Strengthening the role of local governments by giving them the authority to enact and enforce penalties for SHS exposure violations helped the ROK reduce SHS exposure in the workplace. However, smoking behaviors and related activities seemed to shift to less restrictive areas such as streets and apartment hallways, negating some of the effects of these ordinances. Future studies should investigate how smoke-free policies beyond public places can further reduce SHS exposure in the ROK.
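    The β coefficients reported above come from an interrupted time series (segmented regression) model. The sketch below shows one common parameterisation of such a model, with a pre-intervention trend, a level change and a trend change at the intervention year; the variable names and the exact coefficient coding used in the study are assumptions.

```python
# Minimal segmented-regression sketch of an interrupted time series analysis.
# Column names and the intervention year are illustrative.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

def fit_its(df: pd.DataFrame, intervention_year: int = 2010):
    d = df.copy()
    d["time"] = d["year"] - d["year"].min()                          # pre-existing trend
    d["post"] = (d["year"] >= intervention_year).astype(int)         # level change at intervention
    d["time_since"] = np.maximum(d["year"] - intervention_year, 0)   # change in trend after intervention
    return smf.ols("shs_exposure ~ time + post + time_since", data=d).fit()

# Usage with a hypothetical yearly series of workplace SHS exposure rates:
# print(fit_its(survey_df).summary())
```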

    Vibration-based damage localisation: Impulse response identification and model updating methods

    Structural health monitoring has gained increasing interest over recent decades. As the technology has matured and monitoring systems are employed commercially, the development of more powerful and precise methods is the logical next step in this field. Vibration sensor networks with few measurement points, combined with the use of ambient vibration sources, are especially attractive for practical applications, as this approach promises to be cost-effective while requiring minimal modification of the monitored structures. Since efficient methods for damage detection have already been developed for such sensor networks, the research focus shifts towards extracting more information from the measurement data, in particular the localisation and quantification of damage. Two main concepts have produced promising results for damage localisation. The first approach involves a mechanical model of the structure, which is used in a model updating scheme to find the damaged areas of the structure. The second is a purely data-driven approach, which relies on residuals of vibration estimations to find regions where damage is probable. While much research has been conducted following these two concepts, different approaches are rarely compared directly on the same data sets.
    Therefore, this thesis presents advanced methods for vibration-based damage localisation using model updating as well as a data-driven method, and provides a direct comparison using the same vibration measurement data. The model updating approach presented in this thesis relies on multiobjective optimisation. Hence, the applied numerical optimisation algorithms are presented first. On this basis, the model updating parameterisation and objective function formulation are developed. The data-driven approach employs residuals from vibration estimations obtained using multiple-input finite impulse response filters. Both approaches are then verified using a simulated cantilever beam considering multiple damage scenarios. Finally, experimentally obtained data from an outdoor girder mast structure are used to validate the approaches. In summary, this thesis provides an assessment of model updating and residual-based damage localisation by means of verification and validation cases. It is found that the residual-based method exhibits numerical performance sufficient for real-time applications while providing high sensitivity to damage; however, localisation accuracy is superior with the model updating method.
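    The data-driven approach mentioned above can be sketched as follows: a multiple-input finite impulse response filter is identified by least squares on healthy-state data to predict one sensor from the others, and the prediction residual is then used as a damage indicator on later measurements. The filter order and array shapes below are illustrative assumptions, not the thesis implementation.

```python
# Sketch of residual-based damage indication with a multiple-input FIR filter.
import numpy as np

def build_regressors(inputs: np.ndarray, order: int) -> np.ndarray:
    """Stack lagged copies of each input channel into a regression matrix.
    inputs: (n_samples, n_channels) -> (n_samples - order, n_channels * (order + 1))."""
    n, _ = inputs.shape
    cols = [inputs[order - k: n - k, :] for k in range(order + 1)]
    return np.hstack(cols)

def fir_residual(inputs, target, order=32):
    """Fit FIR coefficients on healthy-state data and return a residual function."""
    Phi = build_regressors(inputs, order)
    coeffs, *_ = np.linalg.lstsq(Phi, target[order:], rcond=None)

    def residual(test_inputs, test_target):
        r = test_target[order:] - build_regressors(test_inputs, order) @ coeffs
        return np.sqrt(np.mean(r ** 2))  # RMS residual as a damage indicator
    return residual
```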

    Data- and expert-driven variable selection for predictive models in healthcare: towards increased interpretability in underdetermined machine learning problems

    Modern data acquisition techniques in healthcare generate large collections of data from multiple sources, such as novel diagnosis and treatment methodologies. Concrete examples are electronic healthcare record systems, genomics and medical images. This leads to often unstructured, high-dimensional, heterogeneous patient cohort data for which classical statistical methods may not be sufficient for optimal utilization of the data and informed decision-making. Instead, investigating such data structures with modern machine learning techniques promises to improve the understanding of patient health issues and may provide a better platform for informed decision-making by clinicians. Key requirements for this purpose include (a) sufficiently accurate predictions and (b) model interpretability. Achieving both aspects in parallel is difficult, particularly for datasets with few patients, which are common in the healthcare domain. In such cases, machine learning models encounter mathematically underdetermined systems and may easily overfit the training data. An important approach to overcome this issue is feature selection, i.e., determining a subset of features that are informative with respect to the target variable. While potentially raising predictive performance, feature selection fosters model interpretability by identifying a small number of relevant model parameters, helping to better understand the underlying biological processes that lead to health issues. Interpretability requires that feature selection is stable, i.e., small changes in the dataset do not lead to changes in the selected feature set. A concept to address instability is ensemble feature selection, i.e., repeating the feature selection multiple times on subsets of samples of the original dataset and aggregating the results in a meta-model.
    This thesis presents two approaches for ensemble feature selection tailored towards high-dimensional data in healthcare: the Repeated Elastic Net Technique for feature selection (RENT) and the User-Guided Bayesian Framework for feature selection (UBayFS). While RENT is purely data-driven and builds upon elastic net regularized models, UBayFS is a general framework for ensembles with the capability to include expert knowledge in the feature selection process via prior weights and side constraints. A case study modeling the overall survival of cancer patients compares these novel feature selectors and demonstrates their potential in clinical practice. Beyond the selection of single features, UBayFS also allows for selecting whole feature groups (feature blocks) acquired from multiple data sources, such as those mentioned above. Quantifying the importance of such feature blocks plays a key role in tracing information about the target variable back to the acquisition modalities. Such information on feature block importance may improve the use of human, technical and financial resources if systematically integrated into the planning of patient treatment, by excluding the acquisition of non-informative features. Since generalizing feature importance measures to block importance is not trivial, this thesis also investigates and compares approaches for feature block importance rankings. This thesis demonstrates that high-dimensional datasets from multiple data sources in the medical domain can be successfully tackled by the presented approaches for feature selection.
    Experimental evaluations demonstrate favorable predictive performance, stability and interpretability of the results, which carries high potential for better data-driven decision support in clinical practice.
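    The ensemble idea behind RENT can be sketched as follows: an elastic-net regularised model is fitted on many subsamples of the data, and features whose coefficients are non-zero in a sufficiently large fraction of runs are kept. The snippet below is a minimal illustration with arbitrary thresholds and settings; it is not the released RENT or UBayFS implementation.

```python
# Minimal sketch of ensemble feature selection with elastic-net models.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.utils import resample

def ensemble_feature_selection(X, y, n_runs=100, subsample=0.8, tau=0.6, seed=0):
    rng = np.random.RandomState(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_runs):
        Xs, ys = resample(X, y, n_samples=int(subsample * n), random_state=rng)
        model = LogisticRegression(penalty="elasticnet", solver="saga",
                                   l1_ratio=0.5, C=1.0, max_iter=5000).fit(Xs, ys)
        counts += (np.abs(model.coef_[0]) > 1e-8)   # which features survived this run
    frequency = counts / n_runs
    return np.where(frequency >= tau)[0], frequency  # selected indices + stability profile
```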

    On the Utility of Representation Learning Algorithms for Myoelectric Interfacing

    Electrical activity produced by muscles during voluntary movement is a reflection of the firing patterns of relevant motor neurons and, by extension, the latent motor intent driving the movement. Once transduced via electromyography (EMG) and converted into digital form, this activity can be processed to provide an estimate of the original motor intent and is as such a feasible basis for non-invasive efferent neural interfacing. EMG-based motor intent decoding has so far received the most attention in the field of upper-limb prosthetics, where alternative means of interfacing are scarce and the utility of better control is apparent. Whereas myoelectric prostheses have been available since the 1960s, available EMG control interfaces still lag behind the mechanical capabilities of the artificial limbs they are intended to steer, a gap at least partially due to limitations in current methods for translating EMG into appropriate motion commands. As the relationship between EMG signals and concurrent effector kinematics is highly non-linear and apparently stochastic, finding ways to accurately extract and combine relevant information from across electrode sites is still an active area of inquiry.
    This dissertation comprises an introduction and eight papers that explore issues afflicting the status quo of myoelectric decoding and possible solutions, all related through their use of learning algorithms and deep Artificial Neural Network (ANN) models. Paper I presents a Convolutional Neural Network (CNN) for multi-label movement decoding of high-density surface EMG (HD-sEMG) signals. Inspired by the successful use of CNNs in Paper I and the work of others, Paper II presents a method for automatic design of CNN architectures for use in myocontrol. Paper III introduces an ANN architecture with an accompanying training framework from which simultaneous and proportional control emerges. Paper IV introduces a dataset of HD-sEMG signals for use with learning algorithms. Paper V applies a Recurrent Neural Network (RNN) model to decode finger forces from intramuscular EMG. Paper VI introduces a Transformer model for myoelectric interfacing that does not need additional training data to function with previously unseen users. Paper VII compares the performance of a Long Short-Term Memory (LSTM) network to that of classical pattern recognition algorithms. Lastly, Paper VIII describes a framework for synthesizing EMG from multi-articulate gestures, intended to reduce training burden.
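    The multi-label decoding setting of Paper I can be illustrated in outline: a window of HD-sEMG (electrodes by time samples) is mapped by a small convolutional network to one logit per movement and trained with a per-class sigmoid loss. The layer sizes, window length and the use of PyTorch below are assumptions and do not reproduce the architecture of Paper I.

```python
# Rough sketch of multi-label movement decoding from HD-sEMG windows.
import torch
import torch.nn as nn

class MultiLabelEmgCnn(nn.Module):
    def __init__(self, n_electrodes=64, n_movements=7):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv1d(n_electrodes, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),            # collapse the time axis
        )
        self.head = nn.Linear(64, n_movements)  # one logit per movement

    def forward(self, x):                        # x: (batch, electrodes, samples)
        z = self.features(x).squeeze(-1)
        return self.head(z)                      # train with BCEWithLogitsLoss

# Example: a batch of 8 windows, 64 electrodes, 200 samples each
logits = MultiLabelEmgCnn()(torch.randn(8, 64, 200))
print(logits.shape)  # torch.Size([8, 7])
```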

    Numerical simulations of ice accretion on wind turbine blades: are performance losses due to ice shape or surface roughness?

    Ice accretion on wind turbine blades causes both a change in the shape of the blade sections and an increase in surface roughness. These lead to degraded aerodynamic performance and lower power output. Here, a high-fidelity multi-step method is presented and applied to simulate a 3 h rime icing event on the National Renewable Energy Laboratory 5 MW wind turbine blade. Five sections belonging to the outer half of the blade were considered. Independent time steps were applied to each blade section to obtain detailed ice shapes. The roughness effect on airfoil performance was included in the computational fluid dynamics simulations using an equivalent sand-grain approach. The aerodynamic coefficients of the iced sections were computed considering two different roughness heights and extents along the blade surface. The power curve before and after the icing event was computed according to Design Load Case 1.1 of the International Electrotechnical Commission. In the icing event under analysis, the decrease in power output depended strongly on wind speed and, consequently, on tip speed ratio. For the different roughness heights and extents along the blade, power losses were qualitatively similar but differed significantly in magnitude, despite the well-developed ice shapes. It was found that extended roughness regions in the chordwise direction of the blade can become as detrimental as the ice shape itself.
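    The link between degraded aerodynamic coefficients and power loss can be sketched with the standard relation P = 0.5 * rho * A * Cp * V^3, clipped at rated power. The Cp values below are placeholders (the thesis derives the aerodynamic coefficients from CFD on the iced sections), and the simple clipping stands in for the pitch controller of the real turbine.

```python
# Minimal sketch of a clean vs. iced power curve and the relative power loss.
import numpy as np

RHO = 1.225          # air density, kg/m^3
ROTOR_RADIUS = 63.0  # m, NREL 5 MW reference turbine
RATED_POWER = 5.0e6  # W

def power(wind_speed, cp):
    area = np.pi * ROTOR_RADIUS ** 2
    return np.minimum(0.5 * RHO * area * cp * wind_speed ** 3, RATED_POWER)

wind = np.arange(4.0, 25.0, 1.0)
cp_clean = 0.47 * np.ones_like(wind)      # placeholder clean-rotor power coefficient
cp_iced = cp_clean * (1.0 - 0.12)         # placeholder 12% Cp penalty from icing
loss = 1.0 - power(wind, cp_iced) / power(wind, cp_clean)
print(loss)  # relative loss per wind speed; shrinks to zero once both curves reach rated power
```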

    Acoustic Propagation Variation with Temperature Profile in Water Filled Steel Pipes at Pressure

    Conventional pressure leak testing of buried pipelines compares measurements of pressure with pipe wall temperature. An alternative proposed method uses acoustic velocity measurements to replace the pipe wall temperature measurements. Early experiments using this method identified anomalous rising acoustic velocities, thought to be caused by air passing into solution. This research investigated the anomalous acoustic velocity measurements by evaluating the variation of acoustic velocity with pressure, temperature and dissolved air. Quiescent air solution rate experiments were carried out in water-filled pipes. Computer modelling of the variation of air bubble shape with pipe diameter was found to agree with bubble and drop experiments over the pipe diameter range from 100 mm to 1000 mm. Bubbles were found to maintain a constant width over a large volume range, as confirmed by experiments and modelling.
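    The sensitivity of acoustic velocity to undissolved air can be illustrated with Wood's equation for a bubbly mixture, in which the mixture sound speed follows from volume-averaged density and compressibility. The sketch below uses nominal property values and is not the model developed in this research; it simply shows why even tiny amounts of free air depress the velocity, so that air passing into solution raises it again.

```python
# Illustrative sketch: Wood's equation for sound speed in a water-air mixture.
import numpy as np

def wood_speed(void_fraction, c_water=1480.0, rho_water=1000.0,
               c_air=343.0, rho_air=1.2):
    """Mixture sound speed from volume-averaged density and compressibility."""
    k_water = 1.0 / (rho_water * c_water ** 2)   # compressibility of water
    k_air = 1.0 / (rho_air * c_air ** 2)         # compressibility of air
    k_mix = void_fraction * k_air + (1 - void_fraction) * k_water
    rho_mix = void_fraction * rho_air + (1 - void_fraction) * rho_water
    return 1.0 / np.sqrt(rho_mix * k_mix)

for phi in (0.0, 1e-5, 1e-4, 1e-3):
    print(f"void fraction {phi:.0e}: c = {wood_speed(phi):7.1f} m/s")
```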

    Reinforcement learning in large state action spaces

    Reinforcement learning (RL) is a promising framework for training intelligent agents which learn to optimize long-term utility by directly interacting with the environment. Creating RL methods which scale to large state-action spaces is a critical problem for the real-world deployment of RL systems. However, several challenges limit the applicability of RL to large-scale settings. These include difficulties with exploration, low sample efficiency, computational intractability, task constraints like decentralization, and a lack of guarantees about important properties like performance, generalization and robustness in potentially unseen scenarios. This thesis is motivated by the aim of bridging these gaps. We propose several principled algorithms and frameworks for studying and addressing the above challenges in RL. The proposed methods cover a wide range of RL settings (single-agent and multi-agent systems (MAS) with all the variations of the latter, prediction and control, model-based and model-free methods, value-based and policy-based methods). In this work we present first results on several different problems: e.g. tensorization of the Bellman equation, which allows exponential gains in sample efficiency (Chapter 4); provable suboptimality arising from structural constraints in MAS (Chapter 3); combinatorial generalization results in cooperative MAS (Chapter 5); generalization results under observation shifts (Chapter 7); and learning deterministic policies in a probabilistic RL framework (Chapter 6). Our algorithms exhibit provably enhanced performance and sample efficiency along with better scalability. Additionally, we shed light on generalization aspects of the agents under different frameworks. These properties have been driven by the use of several advanced tools (e.g. statistical machine learning, state abstraction, variational inference, tensor theory). In summary, the contributions in this thesis significantly advance progress towards making RL agents ready for large-scale, real-world applications.
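    The tensorised Bellman results mentioned above build on the standard Bellman optimality recursion, Q(s, a) = r(s, a) + γ Σ_s' P(s' | s, a) max_a' Q(s', a'). The sketch below is plain tabular value iteration on a toy MDP, included only as background; it does not implement the tensorised method from Chapter 4.

```python
# Tabular value-iteration sketch of the Bellman optimality recursion.
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    """P: (S, A, S) transition tensor, R: (S, A) rewards -> optimal Q of shape (S, A)."""
    S, A, _ = P.shape
    Q = np.zeros((S, A))
    while True:
        V = Q.max(axis=1)              # greedy state values
        Q_new = R + gamma * P @ V      # batched expectation over next states
        if np.max(np.abs(Q_new - Q)) < tol:
            return Q_new
        Q = Q_new

# Toy 2-state, 2-action MDP
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.1, 0.9]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
print(value_iteration(P, R))
```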