9,625 research outputs found

Trading Performance for Stability in Markov Decision Processes

Author: Brázdil Tomáš
Chatterjee Krishnendu
Forejt Vojtěch
Kučera Antonín
Publication date: 01/01/2013

We study the complexity of central controller synthesis problems for finite-state Markov decision processes, where the objective is to optimize both the expected mean-payoff performance of the system and its stability. We argue that the basic theoretical notion of expressing the stability in terms of the variance of the mean-payoff (called global variance in our paper) is not always sufficient, since it ignores possible instabilities on respective runs. For this reason we propose alternative definitions of stability, which we call local and hybrid variance, and which express how rewards on each run deviate from the run's own mean-payoff and from the expected mean-payoff, respectively. We show that a strategy ensuring both the expected mean-payoff and the variance below given bounds requires randomization and memory, under all the above semantics of variance. We then look at the problem of determining whether there is such a strategy. For the global variance, we show that the problem is in PSPACE, and that the answer can be approximated in pseudo-polynomial time. For the hybrid variance, the analogous decision problem is in NP, and a polynomial-time approximating algorithm also exists. For local variance, we show that the decision problem is in NP. Since the overall performance can be traded for stability (and vice versa), we also present algorithms for approximating the associated Pareto curve in all three cases. Finally, we study a special case of the decision problems, where we require a given expected mean-payoff together with zero variance. Here we show that the problems can all be solved in polynomial time. Comment: Extended version of a paper presented at LICS 201
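
The three notions of variance named in the abstract can be written out roughly as follows. This is only a sketch based on the abstract's wording: the symbols r_i, mp, V_g, V_l and V_h, the use of lim sup, and taking the expectation over runs under a fixed strategy are assumptions made here, not notation confirmed by the listing.

```latex
% Assumed notation: r_i(\omega) is the reward of the i-th step of run \omega,
% and \mathbb{E} is the expectation over runs under a fixed strategy.
\begin{align*}
  mp(\omega) &= \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} r_i(\omega)
    && \text{(mean-payoff of a run)} \\
  V_g &= \mathbb{E}\!\left[ \bigl( mp - \mathbb{E}[mp] \bigr)^2 \right]
    && \text{(global variance)} \\
  V_l &= \mathbb{E}\!\left[ \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1}
          \bigl( r_i - mp \bigr)^2 \right]
    && \text{(local variance)} \\
  V_h &= \mathbb{E}\!\left[ \limsup_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1}
          \bigl( r_i - \mathbb{E}[mp] \bigr)^2 \right]
    && \text{(hybrid variance)}
\end{align*}
```

Read this way, the global variance only measures how the long-run averages of different runs spread around their expectation, while the local and hybrid variants also penalize step-by-step fluctuation of rewards within a run.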

arXiv.org e-Print Archive

Crossref

Oxford University Research Archive

IST Austria: PubRep (Institute of Science and Technology)

Recommended from our members

Structural balance emerges and explains performance in risky decision-making.

Author: Askarisichani Omid
Bullo Francesco
Friedkin Noah E
Lane Jacqueline Ng
Singh Ambuj K
Uzzi Brian
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

Polarization affects many forms of social organization. A key issue focuses on which affective relationships are prone to change and how their change relates to performance. In this study, we analyze a financial institution over a two-year period that employed 66 day traders, focusing on links between changes in affective relations and trading performance. Traders' affective relations were inferred from their IMs (>2 million messages) and trading performance was measured from profit and loss statements (>1 million trades). Here, we find that triads of relationships, the building blocks of larger social structures, have a propensity towards affective balance, but one unbalanced configuration resists change. Further, balance is positively related to performance. Traders with balanced networks have the "hot hand", showing streaks of high performance. Research implications focus on how changes in polarization relate to performance and how polarized states can depolarize.

eScholarship - University of California

Real-time Tactical and Strategic Sales Management for Intelligent Agents Guided By Economic Regimes

Author: Collins J.
Gini M.
Gupta A.
Ketter W.
Schrater P.

Many enterprises that participate in dynamic markets need to make product pricing and inventory resource utilization decisions in real time. We describe a family of statistical models that address these needs by combining characterization of the economic environment with the ability to predict future economic conditions to make tactical (short-term) decisions, such as product pricing, and strategic (long-term) decisions, such as level of finished goods inventories. Our models characterize economic conditions, called economic regimes, in the form of recurrent statistical patterns that have clear qualitative interpretations. We show how these models can be used to predict prices, price trends, and the probability of receiving a customer order at a given price. These “regime” models are developed using statistical analysis of historical data, and are used in real time to characterize observed market conditions and predict the evolution of market conditions over multiple time scales. We evaluate our models using a testbed derived from the Trading Agent Competition for Supply Chain Management (TAC SCM), a supply chain environment characterized by competitive procurement and sales markets, and dynamic pricing. We show how regime models can be used to inform both short-term pricing decisions and long-term resource allocation decisions. Results show that our method outperforms more traditional short- and long-term predictive modeling approaches.
Keywords: dynamic pricing; trading agent competition; agent-mediated electronic commerce; dynamic markets; economic regimes; enabling technologies; price forecasting; supply-chain

Research Papers in Economics

Expectations or Guarantees? I Want It All! A crossroad between games and MDPs

Author: Bruyère Véronique
Filiot Emmanuel
Randour Mickael
Raskin Jean-François
Publication venue: 'Open Publishing Association'
Publication date: 01/04/2014

When reasoning about the strategic capabilities of an agent, it is important to consider the nature of its adversaries. In the particular context of controller synthesis for quantitative specifications, the usual problem is to devise a strategy for a reactive system which yields some desired performance, taking into account the possible impact of the environment of the system. There are at least two ways to look at this environment. In the classical analysis of two-player quantitative games, the environment is purely antagonistic and the problem is to provide strict performance guarantees. In Markov decision processes, the environment is seen as purely stochastic: the aim is then to optimize the expected payoff, with no guarantee on individual outcomes. In this expository work, we report on recent results introducing the beyond worst-case synthesis problem, which is to construct strategies that guarantee some quantitative requirement in the worst case while providing a higher expected value against a particular stochastic model of the environment given as input. This problem is relevant to produce system controllers that provide nice expected performance in the everyday situation while ensuring a strict (but relaxed) performance threshold even in the event of very bad (while unlikely) circumstances. It has been studied for both the mean-payoff and the shortest path quantitative measures. Comment: In Proceedings SR 2014, arXiv:1404.041

arXiv.org e-Print Archive

Directory of Open Access Journals

DI-fusion

Fractality of profit landscapes and validation of time series models for stock prices

Author: Kim Beom Jun
Oh Gabjin
Yi Il Gu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 05/08/2013

We apply a simple trading strategy for various time series of real and artificial stock prices to understand the origin of fractality observed in the resulting profit landscapes. The strategy contains only two parameters $p$ and $q$, and the sell (buy) decision is made when the log return is larger (smaller) than $p$ ($-q$). We discretize the unit square $(p, q) \in [0, 1] \times [0, 1]$ into the $N \times N$ square grid and the profit $\Pi(p, q)$ is calculated at the center of each cell. We confirm the previous finding that local maxima in profit landscapes are scattered in a fractal-like fashion: The number $M$ of local maxima follows the power-law form $M \sim N^{a}$, but the scaling exponent $a$ is found to differ for different time series. From comparisons of real and artificial stock prices, we find that the fat-tailed return distribution is closely related to the exponent $a \approx 1.6$ observed for real stock markets. We suggest that the fractality of profit landscape characterized by $a \approx 1.6$ can be a useful measure to validate time series models for stock prices. Comment: 10pages, 6figure
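
The procedure in this abstract (threshold strategy, N × N grid over (p, q), counting local maxima) could be sketched as below. The abstract does not say how profit is booked or what counts as a local maximum, so the all-in/all-out bookkeeping, the mark-to-market final value, the 4-neighbour test, and every function name here are illustrative assumptions rather than the authors' method.

```python
import math


def profit(prices, p, q):
    """Relative profit of the threshold strategy on one price series.

    Assumption: start with 1 unit of cash, trade the whole position on each
    signal, and value any remaining shares at the last price.
    """
    cash, shares = 1.0, 0.0
    for prev, price in zip(prices, prices[1:]):
        r = math.log(price / prev)       # log return of this step
        if r > p and shares > 0.0:       # sell signal: log return larger than p
            cash, shares = shares * price, 0.0
        elif r < -q and cash > 0.0:      # buy signal: log return smaller than -q
            cash, shares = 0.0, cash / price
    return cash + shares * prices[-1] - 1.0


def profit_landscape(prices, n):
    """Profit evaluated at the center of each cell of an n x n grid on [0,1]^2."""
    return [
        [profit(prices, (i + 0.5) / n, (j + 0.5) / n) for j in range(n)]
        for i in range(n)
    ]


def count_local_maxima(grid):
    """Count cells strictly greater than all their 4-neighbours (assumed criterion)."""
    n = len(grid)
    count = 0
    for i in range(n):
        for j in range(n):
            neighbours = [
                grid[a][b]
                for a, b in ((i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1))
                if 0 <= a < n and 0 <= b < n
            ]
            if all(grid[i][j] > v for v in neighbours):
                count += 1
    return count
```

Calling `count_local_maxima(profit_landscape(prices, N))` for increasing N gives the values M whose growth the abstract summarizes as M ~ N^a; the exponent could then be estimated from the slope of log M against log N.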

arXiv.org e-Print Archive

EDP Sciences OAI-PMH repository (1.2.0)