381 research outputs found
Controller Synthesis for Autonomous Systems Interacting With Human Operators
We propose an approach to synthesize control protocols for autonomous systems that account for uncertainties and imperfections in interactions with human operators. As an illustrative example, we consider a scenario involving road network surveillance by an unmanned aerial vehicle (UAV) that is controlled remotely by a human operator but also has a certain degree of autonomy. Depending on the type (i.e., probabilistic and/or nondeterministic) of knowledge about the uncertainties and imperfections in the operator-autonomy interactions, we use abstractions based on Markov decision processes and augment these models to stochastic two-player games. Our approach enables the synthesis of operator-dependent optimal mission plans for the UAV, highlighting the effects of operator characteristics (e.g., workload, proficiency, and fatigue) on UAV mission performance; it can also provide informative feedback (e.g., Pareto curves showing the trade-offs between multiple mission objectives), potentially assisting the operator in decision-making.
"How May I Help You?": Modeling Twitter Customer Service Conversations Using Fine-Grained Dialogue Acts
Given the increasing popularity of customer service dialogue on Twitter,
analysis of conversation data is essential to understand trends in customer and
agent behavior for the purpose of automating customer service interactions. In
this work, we develop a novel taxonomy of fine-grained "dialogue acts"
frequently observed in customer service, showcasing acts that are more suited
to the domain than the more generic existing taxonomies. Using a sequential
SVM-HMM model, we model conversation flow, predicting the dialogue act of a
given turn in real-time. We characterize differences between customer and agent
behavior in Twitter customer service conversations, and investigate the effect
of testing our system on different customer service industries. Finally, we use
a data-driven approach to predict important conversation outcomes: customer
satisfaction, customer frustration, and overall problem resolution. We show
that the type and location of certain dialogue acts in a conversation have a
significant effect on the probability of desirable and undesirable outcomes,
and present actionable rules based on our findings. The patterns and rules we
derive can be used as guidelines for outcome-driven automated customer service
platforms. Comment: 13 pages, 6 figures, IUI 201
Synchronization and Control in Intrinsic and Designed Computation: An Information-Theoretic Analysis of Competing Models of Stochastic Computation
We adapt tools from information theory to analyze how an observer comes to
synchronize with the hidden states of a finitary, stationary stochastic
process. We show that synchronization is determined by both the process's
internal organization and by an observer's model of it. We analyze these
components using the convergence of state-block and block-state entropies,
comparing them to the previously known convergence properties of the Shannon
block entropy. Along the way, we introduce a hierarchy of information
quantifiers as derivatives and integrals of these entropies, which parallels a
similar hierarchy introduced for block entropy. We also draw out the duality
between synchronization properties and a process's controllability. The tools
lead to a new classification of a process's alternative representations in
terms of minimality, synchronizability, and unifilarity. Comment: 25 pages, 13 figures, 1 table
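The Shannon block entropy that the abstract above uses as its point of comparison can be illustrated with a minimal sketch. The function name `block_entropy` and the period-two sample sequence are assumptions chosen for illustration; the estimate is a plain empirical frequency count over a finite sample, not the state-block or block-state entropies analyzed in the paper.

```python
from collections import Counter
from math import log2


def block_entropy(sequence: str, length: int) -> float:
    """Empirical Shannon entropy (in bits) of the length-`length` blocks of `sequence`."""
    if length == 0:
        return 0.0
    # Slide a window of the given length over the sequence and count each block.
    blocks = [sequence[i:i + length] for i in range(len(sequence) - length + 1)]
    counts = Counter(blocks)
    total = len(blocks)
    return -sum((c / total) * log2(c / total) for c in counts.values())


# Hypothetical sample: a period-two process, which an observer can synchronize to
# after seeing a single symbol.
period_two = "01" * 500
h1 = block_entropy(period_two, 1)
h2 = block_entropy(period_two, 2)
# The difference H(2) - H(1) is a finite-sample estimate of the entropy rate.
entropy_rate_estimate = h2 - h1
```

For this periodic sample the single-symbol entropy is one bit while the estimated entropy rate is close to zero, which is the kind of convergence behavior in block length that the abstract refers to.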
Actor-Critic Policy Learning in Cooperative Planning
In this paper, we introduce a method for learning and adapting cooperative control strategies in real-time stochastic domains. Our framework is an instance of the intelligent cooperative control architecture (iCCA). The agent starts by following the "safe" plan calculated by the planning module and then incrementally adapts its policy to maximize the cumulative rewards. Actor-critic and the consensus-based bundle algorithm (CBBA) were employed as the building blocks of the iCCA framework. We demonstrate the performance of our approach by simulating limited-fuel unmanned aerial vehicles aiming for stochastic targets. In one experiment where the optimal solution can be calculated, the integrated framework boosted the optimality of the solution by an average of 10% compared to running each of the modules individually, while keeping the computational load within the requirements for real-time implementation. Boeing Scientific Research Laboratories; United States. Air Force Office of Scientific Research (Grant FA9550-08-1-0086)
Chronic psychosocial and financial burden accelerates 5-year telomere shortening: findings from the Coronary Artery Risk Development in Young Adults Study.
Leukocyte telomere length, a marker of immune system function, is sensitive to exposures such as psychosocial stressors and health-maintaining behaviors. Past research has determined that stress experienced in adulthood is associated with shorter telomere length, but is limited to mostly cross-sectional reports. We test whether repeated reports of chronic psychosocial and financial burden are associated with telomere length change over a 5-year period (years 15 and 20) among 969 participants in the Coronary Artery Risk Development in Young Adults (CARDIA) Study, a longitudinal, population-based cohort, ages 18-30 at time of recruitment in 1985. We further examine whether multisystem resiliency, comprised of social connections, health-maintaining behaviors, and psychological resources, mitigates the effects of repeated burden on telomere attrition over 5 years. Our results indicate that adults with high chronic burden do not show decreased telomere length over the 5-year period. However, these effects do vary by level of resiliency, as regression results revealed a significant interaction between chronic burden and multisystem resiliency. For individuals with high repeated chronic burden and low multisystem resiliency (1 SD below the mean), there was a significant 5-year shortening in telomere length, whereas no significant relationships between chronic burden and attrition were evident for those at moderate and higher levels of resiliency. These effects apply similarly across the three components of resiliency. Results imply that interventions should focus on establishing strong social connections, psychological resources, and health-maintaining behaviors when attempting to ameliorate stress-related decline in telomere length among at-risk individuals.
Measurement-Adaptive Cellular Random Access Protocols
This work considers a single-cell random access channel (RACH) in cellular
wireless networks. Communications over RACH take place when users try to
connect to a base station during a handover or when establishing a new
connection. Within the framework of Self-Organizing Networks (SONs), the system
should self-adapt to dynamically changing environments (channel fading,
mobility, etc.) without human intervention. For the performance improvement of
the RACH procedure, we aim here at maximizing throughput or alternatively
minimizing the user dropping rate. In the context of SON, we propose protocols
which exploit information from measurements and user reports in order to
estimate current values of the system unknowns and broadcast global
action-related values to all users. The protocols suggest an optimal pair of
user actions (transmission power and back-off probability) found by minimizing
the drift of a certain function. Numerical results illustrate considerable
improvements in the dropping rate, at a very low or even zero cost in power
expenditure and delay, as well as the fast adaptability of the protocols to
environment changes. Although the proposed protocol is designed primarily to
minimize the number of dropped users per cell, our framework allows for
other variations (power or delay minimization) as well. Comment: 31 pages, 13 figures, 3 tables. Springer Wireless Networks 201
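To give a feel for why the back-off probability matters in a random access channel, here is a minimal sketch using the textbook slotted-ALOHA simplification. This is a swapped-in toy model, not the drift-minimization protocol of the paper: it ignores transmission power, fading, and user reports, and the names `success_probability` and `best_transmit_probability` are hypothetical.

```python
def success_probability(num_users: int, transmit_prob: float) -> float:
    """Probability that exactly one of `num_users` transmits in a slot (toy model)."""
    return num_users * transmit_prob * (1.0 - transmit_prob) ** (num_users - 1)


def best_transmit_probability(num_users: int, grid_size: int = 1000) -> float:
    """Grid search for the per-user transmit probability that maximizes slot success."""
    candidates = [k / grid_size for k in range(1, grid_size + 1)]
    return max(candidates, key=lambda p: success_probability(num_users, p))


# Assumed scenario: ten contending users in a single cell.
best = best_transmit_probability(10)
```

In this simplified model the best transmit probability for n users is 1/n, so a base station that can estimate the number of contending users could broadcast a single value to all of them, which is loosely analogous to the broadcast of global action-related values described above.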
The Complexity of Graph-Based Reductions for Reachability in Markov Decision Processes
We study the never-worse relation (NWR) for Markov decision processes with an
infinite-horizon reachability objective. A state q is never worse than a state
p if the maximal probability of reaching the target set of states from p is at
most the same value from q, regardless of the probabilities labelling the
transitions. Extremal-probability states, end components, and essential states
are all special cases of the equivalence relation induced by the NWR. Using the
NWR, states in the same equivalence class can be collapsed. Then, actions
leading to sub-optimal states can be removed. We show the natural decision
problem associated with computing the NWR is coNP-complete. Finally, we extend
a previously known incomplete polynomial-time iterative algorithm to
under-approximate the NWR.
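The quantity compared by the never-worse relation, the maximal reachability probability, can be computed for one fixed probability labelling by value iteration. The sketch below is only an illustration under that assumption: the toy MDP, the state names, and the function `max_reach_probability` are hypothetical, and a single labelling cannot establish the NWR, which quantifies over all labellings.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical encoding: state -> action -> list of (successor, probability).
MDP = Dict[str, Dict[str, List[Tuple[str, float]]]]


def max_reach_probability(mdp: MDP, targets: Set[str], iterations: int = 1000) -> Dict[str, float]:
    """Approximate the maximal probability of reaching `targets` from every state."""
    values = {s: (1.0 if s in targets else 0.0) for s in mdp}
    for _ in range(iterations):
        new_values = {}
        for state, actions in mdp.items():
            if state in targets:
                new_values[state] = 1.0
            elif not actions:
                # A state without actions can never reach the target.
                new_values[state] = 0.0
            else:
                # Bellman update: take the best action under the current estimates.
                new_values[state] = max(
                    sum(prob * values[succ] for succ, prob in successors)
                    for successors in actions.values()
                )
        values = new_values
    return values


# Toy example: q has every action that p has, plus a better one.
toy: MDP = {
    "p": {"a": [("goal", 0.5), ("sink", 0.5)]},
    "q": {"a": [("goal", 0.5), ("sink", 0.5)], "b": [("goal", 0.8), ("sink", 0.2)]},
    "goal": {},
    "sink": {},
}
values = max_reach_probability(toy, {"goal"})
```

Under this labelling the value at q is at least the value at p, which is consistent with q being never worse than p; the graph-based reductions in the abstract aim to certify such facts without fixing the probabilities.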
Maximizing the Conditional Expected Reward for Reaching the Goal
The paper addresses the problem of computing maximal conditional expected
accumulated rewards until reaching a target state (briefly called maximal
conditional expectations) in finite-state Markov decision processes where the
condition is given as a reachability constraint. Conditional expectations of
this type can, e.g., stand for the maximal expected termination time of
probabilistic programs with non-determinism, under the condition that the
program eventually terminates, or for the worst-case expected penalty to be
paid, assuming that at least three deadlines are missed. The main results of
the paper are (i) a polynomial-time algorithm to check the finiteness of
maximal conditional expectations, (ii) PSPACE-completeness for the threshold
problem in acyclic Markov decision processes where the task is to check whether
the maximal conditional expectation exceeds a given threshold, (iii) a
pseudo-polynomial-time algorithm for the threshold problem in the general
(cyclic) case, and (iv) an exponential-time algorithm for computing the maximal
conditional expectation and an optimal scheduler. Comment: 103 pages, extended version with appendices of a paper accepted at
TACAS 201
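A conditional expected accumulated reward can be made concrete on a small acyclic Markov chain, where it is the expected reward collected on goal-reaching paths divided by the probability of reaching the goal. The sketch below is a simplification under stated assumptions: it has no nondeterminism (so nothing is maximized), it assumes an acyclic chain with rewards on states, and the names `conditional_expected_reward`, `chain`, and `rewards` are hypothetical.

```python
from typing import Dict, List, Tuple


def conditional_expected_reward(
    chain: Dict[str, List[Tuple[str, float]]],
    rewards: Dict[str, float],
    start: str,
    goal: str,
) -> float:
    """Expected reward accumulated before reaching `goal`, given that `goal` is reached."""

    def visit(state: str) -> Tuple[float, float]:
        # Returns (P(reach goal), E[accumulated reward * indicator(reach goal)]).
        if state == goal:
            return 1.0, 0.0
        prob, weighted = 0.0, 0.0
        for succ, p in chain.get(state, []):
            succ_prob, succ_weighted = visit(succ)
            prob += p * succ_prob
            # The reward of `state` is only counted on paths that reach the goal.
            weighted += p * (succ_weighted + rewards.get(state, 0.0) * succ_prob)
        return prob, weighted

    prob, weighted = visit(start)
    if prob == 0.0:
        raise ValueError("goal is unreachable, so the condition has probability zero")
    return weighted / prob


# Toy chain: from s the run fails with probability 0.5; otherwise it reaches the
# goal after collecting a reward of 2 or 3 with equal probability.
chain = {
    "s": [("a", 0.5), ("fail", 0.5)],
    "a": [("goal", 0.5), ("b", 0.5)],
    "b": [("goal", 1.0)],
}
rewards = {"s": 1.0, "a": 1.0, "b": 1.0}
result = conditional_expected_reward(chain, rewards, "s", "goal")
```

In a Markov decision process the numerator and denominator both depend on the scheduler, so the ratio cannot be optimized by maximizing each part separately; this is one reason the problem in the abstract is harder than ordinary expected-reward maximization.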
Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms
Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events
that can be non-exponentially distributed. Within parametric ACTMCs, the
parameters of alarm-event distributions are not given explicitly and can be
subject to parameter synthesis. An algorithm solving the ε-optimal
parameter synthesis problem for parametric ACTMCs with long-run average
optimization objectives is presented. Our approach is based on reduction of the
problem to finding long-run average optimal strategies in semi-Markov decision
processes (semi-MDPs) and sufficient discretization of parameter (i.e., action)
space. Since the set of actions in the discretized semi-MDP can be very large,
a straightforward approach based on explicit action-space construction fails to
solve even simple instances of the problem. The presented algorithm uses an
enhanced policy iteration on symbolic representations of the action space. The
soundness of the algorithm is established for parametric ACTMCs with
alarm-event distributions satisfying four mild assumptions that are shown to
hold for uniform, Dirac and Weibull distributions in particular, but are
satisfied for many other distributions as well. An experimental implementation
shows that the symbolic technique substantially improves the efficiency of the
synthesis algorithm and allows solving instances of realistic size. Comment: This article is a full version of a paper accepted to the Conference
on Quantitative Evaluation of SysTems (QEST) 201
The Impatient May Use Limited Optimism to Minimize Regret
Discounted-sum games provide a formal model for the study of reinforcement
learning, where the agent is enticed to get rewards early since later rewards
are discounted. When the agent interacts with the environment, she may regret
her actions, realizing that a previous choice was suboptimal given the behavior
of the environment. The main contribution of this paper is a PSPACE algorithm
for computing the minimum possible regret of a given game. To this end, several
results of independent interest are shown. (1) We identify a class of
regret-minimizing and admissible strategies that first assume that the
environment is collaborating, then assume it is adversarial; the precise
timing of the switch is key here. (2) Disregarding the computational cost of
numerical analysis, we provide an NP algorithm that checks that the regret
entailed by a given time-switching strategy exceeds a given value. (3) We show
that determining whether a strategy minimizes regret is decidable in PSPACE
- …