49 research outputs found
Recommended from our members
Essays in information relaxations and scenario analysis for partially observable settings
This dissertation consists of three main essays in which we study important problems in engineering and finance.
In the first part of this dissertation, we study the use of Information Relaxations to obtain dual bounds in the context of Partially Observable Markov Decision Processes (POMDPs). POMDPs are in general intractable problems and the best we can do is obtain suboptimal policies. To evaluate these policies, we investigate and extend the information relaxation approach developed originally for Markov Decision Processes. The use of information relaxation duality for POMDPs presents important challenges, and we show how change-of-measure arguments can be used to overcome them. As a second contribution, we show that many value function approximations for POMDPs are supersolutions. By constructing penalties from supersolutions we are able to achieve significant variance reduction when estimating the duality gap directly, and the resulting dual bounds are guaranteed to provide tighter bounds than those provided by the supersolutions themselves. Applications in robotic navigation and telecommunications are given in Chapter 2. A further application of this approach is provided in Chapter 5 in the context of personalized medicine.
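To make the idea of a dual bound from an information relaxation concrete, here is a minimal sketch on a toy problem. It is not the dissertation's method. It assumes a fully observable two-period stopping problem (roll a fair die, either keep the first roll or take the second) and uses the simplest relaxation, perfect information with a zero penalty. The function names are hypothetical. The relaxed decision maker sees both rolls in advance, so its expected reward is an upper bound on what any non-anticipative policy can achieve.

```python
from fractions import Fraction
from itertools import product

FACES = range(1, 7)  # a fair six-sided die (toy assumption)


def optimal_value():
    """Exact value of the best non-anticipative policy (backward induction)."""
    continuation = Fraction(sum(FACES), 6)  # expected value of the second roll
    # Keep the first roll only when it beats the expected second roll.
    return sum(max(Fraction(x), continuation) for x in FACES) / 6


def perfect_information_bound():
    """Zero-penalty dual bound: both rolls are revealed before deciding."""
    total = sum(max(x1, x2) for x1, x2 in product(FACES, repeat=2))
    return Fraction(total, 36)


primal = optimal_value()
dual = perfect_information_bound()
print(primal, dual, dual - primal)  # prints: 17/4 161/36 2/9
```

The difference between the two numbers is the duality gap for this toy problem. A well-chosen penalty, such as one built from a supersolution as described in the abstract, is what would tighten that gap.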
In the second part of this dissertation, we discuss a number of weaknesses inherent in traditional scenario analysis. For instance, the standard approach to scenario analysis aims to compute the P&L of a portfolio resulting from joint stresses to underlying risk factors, leaving all unstressed risk factors set to zero. This approach thereby ignores the conditional distribution of the unstressed risk factors given the stressed risk factors. We address these weaknesses by embedding the scenario analysis within a dynamic factor model for the underlying risk factors. We resort to multivariate state-space models that allow the modeling of real-world behavior of financial markets, such as volatility clustering. Additionally, these models are sufficiently tractable to permit the computation of (or simulation from) the conditional distribution of unstressed risk factors. Our approach permits the use of observable and unobservable risk factors. We provide applications to fixed income and options portfolios, where we show the degree to which the two scenario analysis approaches can lead to dramatically different results.
In the third part, we propose a framework to study a human-machine interaction system within the context of financial robo-advising. In this setting, based on risk-sensitive dynamic games, the robo-advisor adaptively learns the preferences of the investor as the investor makes decisions that optimize her risk-sensitive criterion. The objectives of the investor and the machine are aligned, but the presence of asymmetric information makes this joint optimization process a game with strategic interactions. By considering an investor with mean-variance risk preferences we are able to reduce the game to a POMDP. The human-machine interaction protocol features a trade-off between allowing the robo-advisor to learn the investor's preferences through costly communications and optimizing the investor's objective relying on outdated information.
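The difference between the two scenario analysis approaches can be sketched with a small example. It rests on strong simplifying assumptions that are not the dissertation's model: a static, zero-mean, jointly Gaussian set of factor changes instead of a dynamic state-space model, a single stressed factor, and a portfolio whose P&L is linear in the factor changes. All numbers and function names are hypothetical. Under these assumptions the conditional mean of each unstressed factor is cov(u, s) / var(s) times the stress.

```python
def conditional_unstressed_mean(cov_us, var_s, stress):
    """E[u | s = stress] for zero-mean jointly Gaussian factors, one stressed factor."""
    return [c / var_s * stress for c in cov_us]


def linear_pnl(exposures, factor_moves):
    """P&L of a portfolio that is linear in the factor changes (an assumption)."""
    return sum(e * f for e, f in zip(exposures, factor_moves))


# Hypothetical inputs, for illustration only.
var_s = 0.04              # variance of the stressed factor
cov_us = [0.02, -0.01]    # covariances of two unstressed factors with it
stress = -0.10            # the scenario: stressed factor falls by 10%
exposure_s = 100.0
exposures_u = [50.0, 80.0]

# Standard approach: unstressed factors are left at zero.
standard_pnl = linear_pnl([exposure_s], [stress]) + linear_pnl(exposures_u, [0.0, 0.0])

# Conditional approach: unstressed factors move to their conditional means.
cond_moves = conditional_unstressed_mean(cov_us, var_s, stress)
conditional_pnl = linear_pnl([exposure_s], [stress]) + linear_pnl(exposures_u, cond_moves)

print(standard_pnl, conditional_pnl)
```

Even in this linear toy case the two P&L figures differ, because the conditional approach lets correlated factors move with the stress.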
Decision making in an uncertain world
Campus Scene; Undated
https://egrove.olemiss.edu/phay_laf/1508/thumbnail.jp
Advances in Modeling Natural Resource Management under Uncertainty : Forest Mortality, Policy Design, and the Value of Information
Advancing the understanding of natural resource management is an important step in mitigating the effects of human activity on the environment, and ensuring efficient outcomes for many sectors of the economy. As humanity’s role in the natural world becomes better understood, the importance of interdisciplinary modeling has grown by leaps and bounds. This is evidenced by the rise of fields such as bioeconomics, the economics of climate change, and the increasing influence of “societal dimensions” departments in universities around the country. It is becoming evident that a holistic understanding of feedbacks between the natural and economic realms is crucial for developing the research agenda of tomorrow. In addition, advances in computing resources have made research questions previously restricted by their computational complexity viable for analysis. Both of these developments bode well for interdisciplinary modeling; however, much of this potential remains unrealized in the literature. For instance, the continued utilization of large scale earth system models such as the Community Earth System Model (CESM) for impact studies (e.g. Law et al., 2018) has highlighted the importance of representing the social systems accurately within the model. Despite this, natural resource models that are consistent with economic theory are nowhere to be found amongst the many modules of CESM, or other similar models. Instead, economic models are used to inform the input datasets of these models, which is rigorous but unsatisfying once one realizes that this approach completely fails to capture the feedback between the natural and social systems that intuition tells us is there. The lack of such modeling also precludes running sophisticated policy experiments within CESM and its sister models.
These policy experiments, with their robust representations of physical processes, would be better positioned than existing approaches to examine the effect of these policies on a variety of outcomes, both environmental and economic. This is in addition to the fact that there are still many aspects of policy design that are unexplored in natural resource management. The details of the design of environmental policies, especially those targeting the private provision of ecosystem benefits, must be fine tuned to achieve an optimal outcome. One particular aspect of policy design that is understudied in the literature is that of the duration of contracts for ecosystem service programs. Many policies currently in practice base the duration of the contract on environmental goals of the policy. However, economic incentives could change the impacts of the policy should the duration be changed. The efficient design of policies depends on the feedbacks between social and natural systems. Though a model such as CESM can address uncertainties about future effects of climate change and disturbance, it is a deterministic model of natural resources. In reality, natural resources effectively behave in a stochastic manner. This results in management strategies that require substantial investments in monitoring and learning, as good information is crucial for optimal management. This has led to many studies examining adaptive management of natural resources, and learning in systems such as fisheries (Kling et al. 2017), livestock management (MacLachlan et al., 2017), and regulatory enforcement (White, 2005). Nevertheless, a substantial gap remains in what the literature addresses. Previous studies ignore the role of price stochasticity, as well as stochasticity in other observable variables, in determining the optimal learning strategy of natural resource owners.
This is a more generalized description of natural resource management that has implications far outside of private natural resource management. This dissertation advances the design and application of modeling techniques in natural resource management, as well as the theory behind these models. In what follows, we analyze the feedback between natural and social systems in forestry. We show that the forest sector adapts to disturbance events such as wildfire or pine beetle outbreaks by shifting harvests to different areas. This model has the potential to improve the representation of social systems within large scale earth system models, and to allow for economic policy experiments on a larger scale than what has been previously observed in the literature. We explore the economics of contract duration within a forest-based carbon offset program, which is the first time such a question has been addressed through modeling. It also contributes to current discussions of implementing forest-based carbon offsets in Oregon’s carbon abatement plan. This dissertation advances the economics of information in partially observable resource systems by solving a model of forest management where the volume of timber is observed imperfectly, and observations are costly and noisy. In Chapter 1, I introduce the common themes of the dissertation, and provide an overview of what is to follow. The natural resource system this work addresses is primarily forestry. In particular, it focuses on the issues surrounding ecosystem service provision and management within private forestry. In Chapter 2, I construct a partial equilibrium (PE) model of the forest sector in the western United States. The model is spatially explicit, and overcomes issues involving its solve time by utilizing a novel algorithm that simulates an auction between agents in the model.
Furthermore, the model can be coupled to CESM in order to obtain a more realistic representation of biological processes and climate change relative to what is available to forest sector models currently. The realism of the model is aided by the incorporation of numerous datasets such as land ownership and transportation costs. The model is unique in its scale, and is solvable over a larger range and with a higher resolution than other forest sector models. It also has a realistic depiction of the ecology of forestry through its ability to couple to CESM. This model is particularly useful for modeling the feedback between the natural system of the forest and economic system of the forest sector. Specifically, it’s beneficial for understanding the impact of forest disturbances on the economy, and how that shapes future disturbance patterns. The results suggest that in the short run, the spatial distribution of harvests changes substantially, with the difference in overall harvests growing over time due to the effects the disturbances have on mill capacity and profitability. We also utilize our model for understanding the impacts of policies specifically addressing disturbance vulnerability, as well as the impacts of state-level policies and how those may affect the surrounding region. In Chapter 3, I utilize a regional forest sector model of western Oregon in order to analyze the effects of changing the duration of forest-based carbon offset contracts. The model is a spatially explicit model that tracks both sawtimber and pulp production, as well as price levels and mill capacities. It keeps track of the amount of timber being exported as well, and average management decisions such as rotation lengths. The model is applied to scenarios that vary in the duration of the contract as well as the price of the carbon, which is fixed during the model run. 
Whereas previous studies have examined the effects of these contracts on the Oregon forest sector (Latta et al., 2011), no study has yet addressed the role of contract duration on enrollment and program performance. We find that market forces stabilize the amount of carbon being removed from the landscape every time step. This analysis is useful in serving as a critique of current approaches to contracting for forest-based carbon offset programs such as the one in California by showing that alternative contract lengths are capable of higher levels of sequestration over given time periods. In Chapter 4, I construct a model of forest management under state uncertainty that optimizes both the timing of harvest as well as measurement of the forest resource, known as “inventory”. Forest resources, along with practically every other natural resource, exhibit state uncertainty – uncertainty about the present state of the resource. Oftentimes natural resources are only observed when investments are made in measurement of the resource. Furthermore, a perfect measurement of the resource is oftentimes infeasible, either for reasons having to do with the biology of the resource or because it is cost prohibitive. In this chapter I solve the forest manager’s problem under state uncertainty as a continuous-state Mixed Observability Markov Decision Process (MOMDP). I find that the optimal timing of learning is influenced not just by price level, but surprisingly by price stochasticity as well. Chapter 4’s innovation is that it presents the first continuous state model of natural resource management under state uncertainty that includes price stochasticity. For a majority of natural resource management problems, price stochasticity plays an important role, and the results from this project allow us to understand how it influences not just harvest timing, but the optimal investments in measurement and learning. We find that learning is valuable.
Using an empirical model of forest growth that captures its natural stochasticity, we are able to calculate the costs associated with state uncertainty when inventory is not an option. We find that conducting costly yet accurate inventories in an optimal way greatly reduces the burden of state uncertainty, and increases the value of the stand through improved management. This chapter also presents the first model of forest inventory that is grounded in microeconomic theory. The expansion of interdisciplinary research as well as the availability of new computational techniques in the field of economics have resulted in opportunities for researchers looking to address difficult problems in natural resource economics. My dissertation is a combination of methodological advances, as well as inquiries into potential policy applications. I hope that what follows from here will aid both future researchers interested in similar topics, as well as policymakers with questions about the design of schemes targeting private forest landowners. The extensions and limitations of all of these studies will be discussed as they are presented. Because of the methodological nature of much of this dissertation’s content, the possibility exists to greatly expand on what has been done here in future studies.
Improved Intention Discovery with Classified Emotions in A Modified POMDP
Emotions are one of the most actively debated topics in psychology, a source of vigorous discussion and disagreement from the earliest philosophers and other thinkers to the present day. Human emotion classification using different machine learning techniques has been an active area of research over the last decade. This investigation discusses a new approach for virtual agents to better understand and interact with the user. Our research focuses on deducing the belief state of a user who interacts with a single agent, using emotions recognized from text- or speech-based input. We built a customized decision tree that recognizes six primary emotional states from different sets of inputs. The belief state at each time slice is inferred by drawing a belief network over the different sets of emotions and calculating the belief state using a POMDP (Partially Observable Markov Decision Process) based solver. Hence the existing POMDP model is customized to incorporate emotions as observations for finding the possible user intentions. This helps to overcome the limitations of present methods in recognizing the belief state. The new approach also allows us to analyze human emotional behaviour in uncertain environments and helps to generate an effective interaction between the human and the computer.
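The role of emotions as observations can be illustrated with the standard discrete POMDP belief update, b'(s') ∝ O(o | s') Σ_s T(s' | s) b(s). The sketch below is not the authors' customized model. The two intentions, the two emotion labels, and all probabilities are hypothetical values chosen only for illustration.

```python
def belief_update(belief, transition, observation_prob, obs):
    """One Bayes-filter step: predict with T, then correct with O(obs | s')."""
    states = list(belief)
    # Prediction: push the current belief through the transition model.
    predicted = {
        s2: sum(transition[s][s2] * belief[s] for s in states) for s2 in states
    }
    # Correction: weight each state by the likelihood of the observed emotion.
    unnorm = {s2: observation_prob[s2][obs] * predicted[s2] for s2 in states}
    total = sum(unnorm.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the model")
    return {s2: p / total for s2, p in unnorm.items()}


# Hypothetical user intentions (hidden states) and emotion observations.
belief = {"wants_help": 0.5, "wants_to_leave": 0.5}
transition = {
    "wants_help": {"wants_help": 0.9, "wants_to_leave": 0.1},
    "wants_to_leave": {"wants_help": 0.1, "wants_to_leave": 0.9},
}
observation_prob = {
    "wants_help": {"happy": 0.7, "angry": 0.3},
    "wants_to_leave": {"happy": 0.2, "angry": 0.8},
}

posterior = belief_update(belief, transition, observation_prob, "angry")
print(posterior)
```

After an "angry" observation the belief shifts towards the hypothetical "wants_to_leave" intention, which is the kind of inference the abstract describes.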
Discrete Event Simulations
Discrete event simulation (DES), considered by many authors to be a technique for modelling stochastic, dynamic and discretely evolving systems, has gained widespread acceptance among practitioners who want to represent and improve complex systems. Since DES is applied in very different areas, this book reflects many different points of view about it: each author describes how DES is understood and applied within their own context of work, which together provide an extensive understanding of what DES is. The name of the book itself reflects the plurality that these points of view represent. The book embraces a number of topics covering theory, methods and applications to a wide range of sectors and problem areas, categorised into five groups. Beyond this variety of points of view, one further feature of the book is worth noting: its richness in actual data and analysis based on actual data. While most academic areas lack application cases, roughly half of the chapters included in this book deal with actual problems or are at least based on actual data. Thus, the editor firmly believes that this book will be interesting for both beginners and practitioners in the area of DES.
Many-agent Reinforcement Learning
Multi-agent reinforcement learning (RL) solves the problem of how each agent should behave optimally in a stochastic environment in which multiple agents are learning simultaneously. It is an interdisciplinary domain with a long history that lies at the intersection of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGO series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made in developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges of multi-agent RL techniques is scalability; it is still non-trivial to design efficient learning algorithms that can solve tasks involving far more than two agents, which I call many-agent reinforcement learning (MARL) problems. (I use the word "MARL" to denote multi-agent reinforcement learning with a particular focus on the cases of many agents; otherwise, it is denoted as "Multi-Agent RL" by default.) In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretical perspective. This overview fills the research gap that most of the existing work either fails to cover the recent advances since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone to solving many-agent learning problems. Secondly, I develop a tractable policy evaluation algorithm -- α^α-Rank -- in many-agent systems. The critical advantage of α^α-Rank is that it can compute the solution concept of α-Rank tractably in multi-player general-sum games with no need to store the entire pay-off matrix. This is in contrast to classic solution concepts such as Nash equilibrium, which is known to be computationally hard even in two-player cases.
α^α-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm -- mean-field MARL -- in many-agent systems. The mean-field MARL method takes advantage of the mean-field approximation from physics, and it is the first provably convergent algorithm that tries to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first result of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in the policy space). Specifically, I focus on modelling the behavioural diversity in meta-games, and developing algorithms that are guaranteed to enlarge diversity during training. The proposed metric based on determinantal point processes serves as the first mathematically rigorous definition for diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques. Specifically, I demonstrate the great potential of applying MARL to study the emergent population dynamics in nature, and to model diverse and realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could achieve huge impacts in the real physical world, beyond video games.
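The scalability argument behind a mean-field approximation can be sketched briefly. The abstract does not spell out the construction, so the following is an assumption: each agent summarises its neighbours by the average of their one-hot encoded discrete actions, so that a value function conditions on the agent's own action and this mean action instead of the full joint action. The function name is hypothetical.

```python
from collections import Counter


def mean_action(neighbour_actions, n_actions):
    """Average of one-hot neighbour actions, i.e. their empirical distribution."""
    if not neighbour_actions:
        raise ValueError("at least one neighbour action is required")
    counts = Counter(neighbour_actions)
    n = len(neighbour_actions)
    return [counts.get(a, 0) / n for a in range(n_actions)]


# Four neighbours choosing among three discrete actions (hypothetical data).
few = mean_action([0, 1, 1, 2], n_actions=3)
# A thousand neighbours still produce a summary of the same size.
many = mean_action([i % 3 for i in range(1000)], n_actions=3)

print(few)        # prints: [0.25, 0.5, 0.25]
print(len(many))  # prints: 3
```

The summary has a fixed length equal to the number of actions, however many neighbours there are. This is why such an approximation avoids the exponential growth of the joint action space.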
Combining evolutionary algorithms and agent-based simulation for the development of urbanisation policies
Urban-planning authorities continually face the problem of optimising the allocation of green space over time in developing urban environments. To help in these decision-making processes, this thesis provides an empirical study of using evolutionary approaches to solve sequential decision making problems under uncertainty in stochastic environments. To achieve this goal, this work develops a theoretical framework based on the economic model of Alonso and the associated methodology for modelling spatial and temporal urban growth, in order to better understand the complexity inherent in this kind of system and to generate and improve relevant knowledge for the urban planning community. The model was hybridised with cellular automata and an agent-based model and extended to encompass green space planning based on urban cost and satisfaction. Monte Carlo sampling techniques and the use of the urban model as a surrogate tool were the two main elements investigated and applied to overcome the noise and uncertainty derived from dealing with future trends and expectations. Once the evolutionary algorithms were equipped with these mechanisms, the problem under consideration was defined and characterised as a type of adaptive submodular problem. Afterwards, the performance of a non-adaptive evolutionary approach was compared with that of a random search and a very smart greedy algorithm, and the way in which the complexity linked to the configuration of the problem modifies the performance of these algorithms was analysed. Later on, the application of two very distinct frameworks incorporating evolutionary algorithm approaches for this problem was explored: (i) an ‘offline’ approach, in which a candidate solution encodes a complete set of decisions, which is then evaluated by full simulation, and (ii) an ‘online’ approach which involves a sequential series of optimizations, each making only a single decision, and starting its simulations from the endpoint of the previous run.
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Humans are social beings; we pursue social goals in our daily interactions,
which is a crucial aspect of social intelligence. Yet, AI systems' abilities in
this realm remain elusive. We present SOTOPIA, an open-ended environment to
simulate complex social interactions between artificial agents and evaluate
their social intelligence. In our environment, agents role-play and interact
under a wide variety of scenarios; they coordinate, collaborate, exchange, and
compete with each other to achieve complex social goals. We simulate the
role-play interaction between LLM-based agents and humans within this task
space and evaluate their performance with a holistic evaluation framework
called SOTOPIA-Eval. With SOTOPIA, we find significant differences between
these models in terms of their social intelligence, and we identify a subset of
SOTOPIA scenarios, SOTOPIA-hard, that is generally challenging for all models.
We find that on this subset, GPT-4 achieves a significantly lower goal
completion rate than humans and struggles to exhibit social commonsense
reasoning and strategic communication skills. These findings demonstrate
SOTOPIA's promise as a general platform for research on evaluating and
improving social intelligence in artificial agents.
Comment: Preprint, 43 pages. The first two authors contribute equall