Computation Approaches for Continuous Reinforcement Learning Problems
Optimisation theory is at the heart of any control process, where we seek to control the behaviour of a system through a set of actions. Linear control problems have been extensively studied, and optimal control laws have been identified. But the world around us is highly non-linear and unpredictable. For these dynamic systems, which don’t possess the nice mathematical properties of their linear counterparts, classic control theory breaks down and other methods have to be employed. Yet nature thrives by optimising non-linear and over-complicated systems. Evolutionary Computing (EC) methods exploit nature’s way by imitating the evolution process and avoid solving the control problem analytically.
Reinforcement Learning (RL), on the other hand, regards the optimal control problem as a sequential one. In every discrete time step an action is applied. The transition of the system to a new state is accompanied by a single numerical value, the “reward”, that designates the quality of the control action. Even though the feedback information is limited to a single real number, the introduction of the Temporal Difference method made it possible to obtain accurate predictions of the value functions. This paved the way to optimising complex structures, such as Neural Networks, which are used to approximate the value functions.
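As an illustration of how a single scalar reward per step can yield value predictions, the following is a minimal sketch of tabular TD(0). It is not the thesis's method: the five-state random walk, the function name `td0_random_walk`, and all parameter values are assumptions chosen only to keep the example small.

```python
import random


def td0_random_walk(episodes=5000, alpha=0.01, gamma=1.0, seed=0):
    """Tabular TD(0) prediction on a toy 5-state random walk (illustrative only).

    States 1..5 are non-terminal; states 0 and 6 are terminal.
    Reaching state 6 gives reward 1, every other transition gives reward 0.
    """
    rng = random.Random(seed)
    n_states = 5
    values = [0.0] * (n_states + 2)  # terminal entries are never updated and stay 0
    for _ in range(episodes):
        state = 3  # every episode starts in the middle state
        while 0 < state < n_states + 1:
            next_state = state + rng.choice((-1, 1))
            reward = 1.0 if next_state == n_states + 1 else 0.0
            # Temporal-difference error: one-step target minus current estimate.
            td_error = reward + gamma * values[next_state] - values[state]
            values[state] += alpha * td_error
            state = next_state
    return values[1:-1]


if __name__ == "__main__":
    print([round(v, 2) for v in td0_random_walk()])
```

For this toy problem the true values are 1/6, 2/6, ..., 5/6, so the printed estimates should lie close to those fractions; with a constant step size they keep fluctuating slightly around them.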
In this thesis we investigate the solution of continuous Reinforcement Learning control problems by EC methodologies. The reward accumulated throughout an episode of such a problem suffices as information to formulate the required measure, fitness, in order to optimise a population of candidate solutions. In particular, we explore the limits of applicability of a specific branch of EC, that of Genetic Programming (GP). The evolving population in the GP case is composed of individuals that translate directly into mathematical functions, which can serve as control laws.
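The idea of scoring a GP individual by its accumulated episodic reward can be sketched as follows. Everything here is a hypothetical illustration rather than the thesis's setup: the nested-tuple tree encoding, the toy system x' = x + dt * u, the reward -x^2, and the names `evaluate` and `fitness` are all assumptions.

```python
import operator

# Hypothetical function set for the expression trees.
OPS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}


def evaluate(tree, x):
    """Evaluate a tree: a number, the variable "x", or a tuple (op, left, right)."""
    if tree == "x":
        return x
    if isinstance(tree, (int, float)):
        return float(tree)
    op, left, right = tree
    return OPS[op](evaluate(left, x), evaluate(right, x))


def fitness(tree, x0=1.0, steps=50, dt=0.1):
    """Accumulated reward of one episode when the tree is used as the control law."""
    x, total = x0, 0.0
    for _ in range(steps):
        u = max(-10.0, min(10.0, evaluate(tree, x)))  # clip the action
        x = x + dt * u          # toy dynamics (assumed)
        total += -x * x         # reward penalises distance from the origin
    return total


if __name__ == "__main__":
    stabilising = ("sub", 0.0, "x")   # u = -x
    destabilising = "x"               # u = x
    print(fitness(stabilising), fitness(destabilising))
```

In this sketch the law u = -x drives the state towards zero and therefore receives a higher fitness than u = x, which is the ordering an evolutionary search would exploit when selecting individuals.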
The major contribution of this thesis is the proposed unification of these disparate Artificial Intelligence paradigms. The information provided by the system is exploited on a step-by-step basis by the RL part of the proposed scheme and on an episodic basis by GP. This makes it possible to augment the function set of the GP scheme with adaptable Neural Networks. In the quest to achieve stable behaviour of the RL part of the system, a modification of the Actor-Critic algorithm has been implemented.
Finally, we successfully apply the GP method to multi-action control problems, extending the spectrum of problems that this method has been shown to solve. We also investigate the capability of GP on problems from the food industry. These types of problems also exhibit non-linearity, and there is no definite model describing their behaviour.
Assessing the Effectiveness of Automated Emotion Recognition in Adults and Children for Clinical Investigation
Recent success stories in automated object or face recognition, partly fuelled by deep learning artificial neural network (ANN) architectures, have led to the advancement of biometric research platforms and, to some extent, the resurrection of Artificial Intelligence (AI). In line with this general trend, inter-disciplinary approaches have been taken to automate the recognition of emotions in adults or children for the benefit of various applications, such as the identification of children's emotions prior to a clinical investigation. Within this context, it turns out that automating emotion recognition is far from straightforward, with several challenges arising for both science (e.g., methodology underpinned by psychology) and technology (e.g., the iMotions biometric research platform). In this paper, we present a methodology, an experiment and interesting findings, which raise the following research questions for the recognition of emotions and attention in humans: a) the adequacy of well-established techniques such as the International Affective Picture System (IAPS), b) the adequacy of state-of-the-art biometric research platforms, c) the extent to which emotional responses may differ between children and adults. Our findings and first attempts to answer some of these research questions are all based on a mixed sample of adults and children who took part in the experiment, resulting in a statistical analysis of numerous variables. These relate to the responses of participants to a sample of IAPS pictures, captured both automatically and interactively.
Essays in multivariate duration models
Duration analysis, which is also known as survival analysis, is a core
subject of applied statistics and econometrics. Application of duration
analysis techniques can be found in actuarial science, demography,
economics, finance, marketing, and many other scientific fields. In the
univariate case, the tools of duration analysis are used for the study of
the distribution of a certain duration variable which is possibly associated
with a set of explanatory covariates. This variable measures the time to the
occurrence of an event of interest such as transition from unemployment to
employment, retirement time, onset of a disease, or purchase of a product. The
main difference between duration analysis and standard regression analysis
is that sometimes the duration variable is right-censored, namely, the only
available information we have is that its realization exceeds a certain
value.
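To make the role of right-censoring concrete, here is a minimal sketch of the standard Kaplan-Meier estimator of the survival function, which uses a censored observation only through the set of units still at risk. The function name `kaplan_meier` and the toy data are assumptions for illustration and are not taken from the dissertation.

```python
def kaplan_meier(durations, observed):
    """Kaplan-Meier survival estimate at each distinct event time.

    durations[i] is the recorded time for unit i; observed[i] is True if the
    event was seen at that time and False if the unit was right-censored there.
    Returns a list of (time, survival probability) pairs.
    """
    data = sorted(zip(durations, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        events = 0
        leaving = 0
        # Group all units that share the same recorded time.
        while i < len(data) and data[i][0] == t:
            events += int(data[i][1])
            leaving += 1
            i += 1
        if events:
            survival *= 1.0 - events / at_risk
            curve.append((t, survival))
        # Censored units leave the risk set without changing the estimate.
        at_risk -= leaving
    return curve


if __name__ == "__main__":
    print(kaplan_meier([2, 3, 3, 5, 8], [True, True, False, True, False]))
```

With these five toy observations the estimate steps down to roughly 0.8, 0.6 and 0.3 at times 2, 3 and 5; the unit censored at time 3 and the one censored at time 8 contribute only by being counted as at risk.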
Multivariate duration analysis is the natural extension of the univariate
analysis. In this setup, multiple duration variables, which specify the
time to the occurrence of multiple events, are considered and their joint
distribution is analyzed for describing the association among them. These
variables can be either parallel or sequential. Parallel duration variables
refer to cases in which the multiple duration variables are measured by
using the same reference point of time. On the other hand, sequential
duration variables refer to cases in which the measurement of each duration
variable starts after the realization of some other duration variable. Death
times of twins or the corresponding time of onset of several diseases are
multivariate examples with parallel duration variables. On the other hand,
unemployment duration and the subsequent employment duration are an example
with sequential durations.
The current PhD dissertation deals with multivariate duration models. In
particular, it consists of three independent essays on multivariate duration
models. In the next three paragraphs, a synopsis of each essay is given.
The first essay, written jointly with Gerard J. van den Berg, considers bivariate frailty models in which the frailty
terms act multiplicatively on the corresponding hazard rates. The frailty
terms capture unobserved or nonmeasurable characteristics that affect the
duration outcomes. We assume that the joint distribution of the frailty
terms is characterized by gamma marginals. In particular, the gamma
distribution is widely used in empirical analysis for modelling the
distribution of the unobserved heterogeneity terms. Both analytical and
graphical arguments have been developed in the past which rationalize this
specific choice. First, the focus of the paper is on the concepts of
negative quadrant dependence and positive quadrant dependence between the
duration variables. Second, two measures of association between the duration
variables are considered: Pearson's correlation coefficient and Kendall's
tau. In particular, (sharp) bounds for these measures are derived, and the
conditions under which the bounds are closely approached are discussed.
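The model class described in this essay can be written out as follows. This is a sketch in standard textbook notation, which is an assumption and may differ from the notation used in the essay itself: $v_j$ denotes the frailty term, $\lambda_j$ the baseline hazard and $\Lambda_j$ its integral.

```latex
% Multiplicative frailty on each hazard rate, j = 1, 2
\theta_j(t \mid v_j) = v_j \, \lambda_j(t),
\qquad
\Lambda_j(t) = \int_0^t \lambda_j(s)\, ds .

% Joint survival function obtained by integrating out the frailties
S(t_1, t_2) = \Pr(T_1 > t_1,\, T_2 > t_2)
            = \mathbb{E}\!\left[ \exp\!\bigl( -v_1 \Lambda_1(t_1) - v_2 \Lambda_2(t_2) \bigr) \right].

% Positive quadrant dependence (negative: reverse the inequality)
S(t_1, t_2) \;\ge\; \Pr(T_1 > t_1)\, \Pr(T_2 > t_2)
\quad \text{for all } t_1, t_2 .
```

Under this formulation, any dependence between $T_1$ and $T_2$ arises only through the joint distribution of $(v_1, v_2)$, which is why restrictions on the frailty distribution translate into bounds on association measures of the durations.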
The second essay, written jointly with Carlos Hernandez Mireles and Gerard Tellis, is concerned with a new trivariate hazard rate model which
can be applied to study the relationship among the timing of the
corresponding events. The suggested model allows for three types of
dependence among the timing of the underlying events: dependence due to
unobserved heterogeneity, lagged dependence, and dependence due to causality.
As shown in the paper, this model can be nonparametrically identified and, as
a consequence, the three different types of dependence are disentangled. The
new model is applied to study the endogenous relationship between the timing
of three
important events in the sales and prices of new products. Specifically, we
investigate the causal relationship between the sales crash, price crash,
and sales recovery. A sales crash is a significant and permanent cut in the
sales of a new product. On the other hand, the sales recovery is a sales
peak which is realized after the crash. Finally, the price crash is a deep
and permanent reduction in the price of a new product.
The last essay deals with competing risks models, which are very popular in
the scientific field of duration analysis. Such models deal with cases in
which we observe only the minimum among several durations
for each individual unit under study. The goal of this paper is to develop
the statistical properties of an estimator of the cumulative incidence function.
This function, which is common in empirical practice, specifies the
probability that a particular duration variable will be realized by a
certain point of time and before the other duration variables. The proposed
estimator is nonparametric, that is, no parametric assumptions are made
regarding the data generating process. In addition, the estimator allows for
Missing At Random observations. More precisely, for some observations we
have information about the value of the minimum duration variable, but no
information about which duration variable attained this minimum.
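For reference, the cumulative incidence function described in this essay can be stated as below. The notation is an assumption made for illustration: $T_1, \dots, T_K$ are the latent durations, $T$ is the observed minimum and $J$ is the index of the duration that attains it.

```latex
T = \min(T_1, \dots, T_K), \qquad J = \arg\min_{k} T_k ,

% Cumulative incidence function of cause j
F_j(t) = \Pr(T \le t,\; J = j), \qquad j = 1, \dots, K ,

% The cause-specific functions add up to the distribution of the minimum
\sum_{j=1}^{K} F_j(t) = \Pr(T \le t) .
```

In the missing-cause setting mentioned above, $T$ is recorded for such an observation while $J$ is not.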
Dependence Measures in Bivariate Gamma Frailty Models
Bivariate duration data frequently arise in economics, biostatistics and other areas. In bivariate frailty models, dependence between the frailties (i.e., unobserved determinants) induces dependence between the durations. Using notions of quadrant dependence, we study restrictions that this imposes on the implied dependence of the durations, if the frailty terms act multiplicatively on the corresponding hazard rates. Marginal frailty distributions are often taken to be gamma distributions. For such cases we calculate general bounds for two association measures, Pearson's correlation coefficient and Kendall's tau. The results are employed to compare the flexibility of specific families of bivariate gamma frailty distributions.
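A small simulation can illustrate how dependence between frailties induces dependence between durations. The sketch below covers only the special case of a single shared gamma frailty with unit exponential baseline hazards, which is an assumption made for simplicity and not the general bivariate setting of the paper; the function names are hypothetical. In this special case Kendall's tau is known to equal sigma^2 / (sigma^2 + 2), where sigma^2 is the frailty variance.

```python
import random


def simulate_shared_gamma_frailty(n, frailty_variance, rng):
    """Draw n duration pairs that share one gamma frailty with mean 1."""
    shape = 1.0 / frailty_variance
    pairs = []
    for _ in range(n):
        # gammavariate(shape, scale): mean = shape * scale = 1,
        # variance = shape * scale**2 = frailty_variance.
        v = rng.gammavariate(shape, frailty_variance)
        # Given v, each duration is exponential with hazard rate v.
        pairs.append((rng.expovariate(v), rng.expovariate(v)))
    return pairs


def kendall_tau(pairs):
    """Kendall's tau from concordant and discordant pairs (O(n^2), no ties assumed)."""
    concordant = discordant = 0
    n = len(pairs)
    for i in range(n):
        x_i, y_i = pairs[i]
        for j in range(i + 1, n):
            x_j, y_j = pairs[j]
            s = (x_i - x_j) * (y_i - y_j)
            if s > 0:
                concordant += 1
            elif s < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)


if __name__ == "__main__":
    rng = random.Random(0)
    sample = simulate_shared_gamma_frailty(1000, 1.0, rng)
    print(kendall_tau(sample))  # theoretical value for variance 1 is 1/3
```

The sample estimate should fall near the theoretical value of 1/3, up to simulation noise. Replacing the shared frailty with two imperfectly correlated gamma frailties would weaken the dependence, which is the kind of flexibility the bounds in the abstract are meant to quantify.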
Aberrant levels of hematopoietic/neuronal growth and differentiation factors in euthyroid women at risk for autoimmune thyroid disease
Background Subjects at risk for major mood disorders have a higher risk of developing autoimmune thyroid disease (AITD) and vice versa, implying a shared pathogenesis. In mood disorder patients, an abnormal profile of hematopoietic/neuronal growth factors is observed, suggesting that growth/differentiation abnormalities of these cell lineages may predispose to mood disorders. The first objective of our study was to investigate whether an aberrant profile of these hematopoietic/neuronal growth factors is also detectable in subjects at risk for AITD. A second objective was to study the interrelationship of these factors with previously determined and published growth factors/cytokines in the same subjects. Methods We studied 64 TPO-Ab-negative females with at least 1 first- or second-degree relative with AITD, 32 of whom did and 32 of whom did not seroconvert to TPO-Ab positivity in a 5-year follow-up. Subjects were compared with 32 healthy controls (HCs). We measured serum levels of brain-derived neurotrophic factor (BDNF), Stem Cell Factor (SCF), Insulin-like Growth Factor-Binding Protein 2 (IGFBP-2), Epidermal Growth Factor (EGF) and IL-7 at baseline. Results BDNF was significantly lower (8.2 vs 18.9 ng/ml, P<0.001), while EGF (506.9 vs 307.6 pg/ml, P = 0.003) and IGFBP-2 (388.3 vs 188.5 ng/ml, P = 0.028) were significantly higher in relatives than in HCs. Relatives who seroconverted in the next 5 years had significantly higher levels of SCF than non-seroconverters (26.5 vs 16.7 pg/ml, P = 0.017). In a cluster analysis with the previously published growth factors/cytokines SCF clustered together with IL-1β, IL-6 and CCL-3, of which high levels also prec
Selenium supplementation modulates apoptotic processes in thyroid follicular cells
Selenium (Se) is an essential micronutrient modulating several physiopathological processes in the human body. The aim of the study is to characterize the molecular effects determined by Se-supplementation in thyroid follicular cells, using as a model the well-differentiated rat thyroid follicular cell line FRTL5. Experiments have been performed to evaluate the effects of Se on cell growth, mortality and proliferation and on modulation of pro- and antiapoptotic pathways. The results indicate that Se-supplementation improves FRTL5 growth rate. Furthermore, Se reduces the proportion of cell death and modulates both proapoptotic (p53 and Bim) and antiapoptotic (NF-kB and Bcl2) mRNA levels. In addition, incubation with high doses of Na-Se might prevent the ER-stress apoptosis induced by tunicamycin, as assessed by membrane integrity maintenance, reduction in caspase 3/7 activities, and reduction in Casp-3 and PARP cleavage. Taken together, these results provide molecular evidence indicating the role of Se supplementation in cell death and apoptosis modulation in thyroid follicular cells. These observations may be useful to understand the effects of this micronutrient on the physiopathology of the thyroid gland. © 2017 The Authors BioFactors published by Wiley Periodicals, Inc. on behalf of International Union of Biochemistry and Molecular Biology, 2017