159 research outputs found

    Structured Possibilistic Planning Using Decision Diagrams

    Qualitative Possibilistic Mixed-Observable MDPs (π-MOMDPs), which generalize π-MDPs and π-POMDPs, are well suited to planning under uncertainty with mixed observability when the transition, observation and reward functions are not precisely known and can only be described qualitatively. The functions defining the model, as well as all intermediate calculations, take values in a finite possibilistic scale L, which induces a finite belief state space under partial observability, contrary to the probabilistic counterpart. In this paper, we propose the first study of factored π-MOMDP models, in order to solve large structured planning problems under qualitative uncertainty, or problems treated as qualitative approximations of probabilistic ones. Building upon the SPUDD algorithm for solving factored (probabilistic) MDPs, we designed a symbolic algorithm named PPUDD for solving factored π-MOMDPs. Whereas the number of leaves in SPUDD's decision diagrams may grow as large as the state space, since their values are real numbers aggregated through additions and multiplications, the leaves of PPUDD's diagrams always remain in the finite scale L, since only min and max operations are used. Our experiments show that PPUDD's computation time is much lower than that of SPUDD, Symbolic-HSVI and APPL on possibilistic and probabilistic versions of the same benchmarks, under either total or mixed observability, while still providing high-quality policies.
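To make the min/max argument concrete, here is a minimal sketch of optimistic qualitative value iteration on a flat (non-factored) π-MDP. It is not PPUDD, which works symbolically on decision diagrams; the scale `SCALE`, the function name, and the toy model are all assumptions made for illustration. The point it shows is that every value computed stays inside the finite scale, because only `min` and `max` are applied.

```python
# Hypothetical finite scale L: 0 = impossible / worthless, 3 = fully possible / fully satisfactory.
SCALE = (0, 1, 2, 3)

def possibilistic_value_iteration(states, actions, trans, goal):
    """Optimistic qualitative value iteration (illustrative, flat state space).

    trans[(s, a)] maps successor states to possibility degrees in SCALE;
    goal[s] is the qualitative utility of ending in state s.
    """
    value = dict(goal)
    while True:
        new_value = {}
        for s in states:
            best = value[s]  # never decrease: acts like an implicit "stay" option
            for a in actions:
                # max over successors of min(possibility, successor value)
                q = max(min(trans[(s, a)].get(s2, 0), value[s2]) for s2 in states)
                best = max(best, q)
            new_value[s] = best
        if new_value == value:  # values are monotone in a finite scale, so this is reached
            return value
        value = new_value

# Toy model (assumed): a chain 0 -> 1 -> 2, where state 2 is the goal.
states = [0, 1, 2]
actions = ["go"]
trans = {
    (0, "go"): {1: 3, 0: 1},
    (1, "go"): {2: 2, 1: 3},
    (2, "go"): {2: 3},
}
goal = {0: 0, 1: 0, 2: 3}

values = possibilistic_value_iteration(states, actions, trans, goal)
```

Because the update only selects among existing scale elements, the set of distinct values (and hence of diagram leaves in a symbolic version) is bounded by the size of the scale.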

    Lexicographic refinements in possibilistic sequential decision-making models

    This work contributes to possibilistic decision theory, and more specifically to sequential decision-making under possibilistic uncertainty, at both the theoretical and practical levels. Although appealing for its ability to handle qualitative decision problems, possibilistic decision theory suffers from an important drawback: qualitative possibilistic utility criteria compare acts through min and max operators, which leads to a drowning effect. To overcome this lack of decision power, several refinements have been proposed in the literature. Lexicographic refinements are particularly appealing since they make it possible to benefit from the expected utility background while remaining "qualitative". However, these refinements are defined for non-sequential decision problems only. In this thesis, we present results on the extension of lexicographic preference relations to sequential decision problems, in particular to possibilistic decision trees and Markov decision processes. This leads to new planning algorithms that are more "decisive" than their original possibilistic counterparts. We first present optimistic and pessimistic lexicographic preference relations between policies, with and without intermediate utilities, that refine the optimistic and pessimistic qualitative utilities respectively. We prove that the proposed criteria satisfy the principle of Pareto efficiency as well as the property of strict monotonicity. The latter guarantees that a dynamic programming algorithm can be used to compute lexicographically optimal policies. Considering the problem of policy optimization in possibilistic decision trees and finite-horizon Markov decision processes, we provide adaptations of the dynamic programming algorithm that compute a lexicographically optimal policy in polynomial time. These algorithms are based on the lexicographic comparison of the matrices of trajectories associated with the sub-policies. This algorithmic work is complemented by an experimental study that shows the feasibility and the interest of the proposed approach. We then prove that the lexicographic criteria still benefit from an expected utility grounding, and can be represented by infinitesimal expected utilities. The last part of our work is devoted to policy optimization in (possibly infinite) stationary Markov decision processes. We propose a value iteration algorithm for the computation of lexicographically optimal policies, and we extend these results to the infinite-horizon case. Since the size of the matrices increases exponentially (which is especially problematic in the infinite-horizon case), we propose an approximation algorithm that keeps only the most interesting part of each matrix of trajectories, namely its first rows and columns. Finally, we report experimental results that show the effectiveness of the algorithms based on the truncation of the matrices.

    Lexicographic refinements in possibilistic decision trees and finite-horizon Markov decision processes

    Possibilistic decision theory was proposed twenty years ago and has had several extensions since then. Although appealing for its ability to handle qualitative decision problems, possibilistic decision theory suffers from an important drawback: qualitative possibilistic utility criteria compare acts through min and max operators, which leads to a drowning effect. To overcome this lack of decision power, several refinements have been proposed. Lexicographic refinements are particularly appealing since they make it possible to benefit from the expected utility background while remaining qualitative. This article aims at extending lexicographic refinements to sequential decision problems, i.e., to possibilistic decision trees and possibilistic Markov decision processes, when the horizon is finite. We present two criteria that refine qualitative possibilistic utilities and provide dynamic programming algorithms for calculating lexicographically optimal policies.
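The drowning effect mentioned in this abstract can be illustrated with a small sketch. It is a simplified, assumed illustration rather than the criteria defined in the article: it refines the optimistic utility max_s min(π(s), u(s)) by a plain leximax comparison of the per-state values min(π(s), u(s)). The possibility distribution and the two acts are made-up numbers.

```python
def optimistic_utility(pi, u):
    """Optimistic qualitative utility: max over states of min(possibility, utility)."""
    return max(min(p, v) for p, v in zip(pi, u))

def leximax_key(pi, u):
    """Per-state values sorted in decreasing order, to be compared lexicographically."""
    return sorted((min(p, v) for p, v in zip(pi, u)), reverse=True)

# Assumed toy data: one possibility distribution over three states, two acts.
pi = [1.0, 0.6, 0.3]
act_a = [0.8, 0.5, 0.2]  # at least as good as act_b in every state
act_b = [0.8, 0.1, 0.0]

# Drowning effect: the max/min criterion sees only the single best term, so the acts tie.
tie = optimistic_utility(pi, act_a) == optimistic_utility(pi, act_b)

# The leximax comparison looks past the tied first term and prefers act_a.
a_preferred = leximax_key(pi, act_a) > leximax_key(pi, act_b)
```

Here `act_a` Pareto-dominates `act_b`, yet the qualitative utility cannot separate them; the lexicographic comparison does, which is the kind of added decisiveness the refinements aim for.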

    Operation and planning of distribution networks with integration of renewable distributed generators considering uncertainties: a review

    Distributed generators (DGs) are a reliable solution for supplying economical and reliable electricity to customers. A DG operates at the last stage of electric power delivery and can be defined as an electric power source connected directly to the distribution network or located on the customer site. DGs must be allocated optimally (in size, placement and type) to obtain commercial, technical, environmental and regulatory advantages for power systems. In this context, a comprehensive literature review is presented of the uncertainty modeling methods used for uncertain parameters related to renewable DGs, as well as of the methodologies used for the planning and operation of DG integration into the distribution network. This work was supported in part by the SITARA project funded by the British Council and the Department for Business, Innovation and Skills, UK, and in part by the University of Bradford, UK, under the CCIP grant 66052/000000.

    Operations research models and methods for safety stock determination: A review

    In supply chain inventory management it is generally accepted that safety stocks are a suitable strategy for dealing with demand and supply uncertainty and preventing inventory stock-outs. Safety stocks have been the subject of intensive research, typically covering the problems of dimensioning, positioning, managing and placement. Here, we narrow the scope of the discussion to the safety stock dimensioning problem, which consists of determining the proper safety stock level for each product. This paper reports the results of a recent in-depth systematic literature review (SLR) of operations research (OR) models and methods for dimensioning safety stocks. To the best of our knowledge, this is the first systematic review of the application of OR-based approaches to this problem. A set of 95 papers published from 1977 to 2019 has been reviewed to identify the type of model employed, as well as the modeling techniques and main performance criteria used. Finally, we highlight current literature gaps and discuss potential research directions and trends that may help guide researchers and practitioners interested in developing new OR-based approaches for safety stock determination. This work has been supported by FCT – Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020, and by the European Structural and Investment Funds in the FEDER component, through the Operational Competitiveness and Internationalization Program (COMPETE 2020) [Project no. 39479, Funding reference: POCI-01-0247-FEDER-39479].

    Operational Decision Making under Uncertainty: Inferential, Sequential, and Adversarial Approaches

    Modern security threats are characterized by a stochastic, dynamic, partially observable, and ambiguous operational environment. This dissertation addresses such complex security threats using operations research techniques for decision making under uncertainty in operations planning, analysis, and assessment. First, this research develops a new method for robust queue inference with partially observable, stochastic arrival and departure times, motivated by cybersecurity and terrorism applications. In the dynamic setting, this work develops a new variant of Markov decision processes and an algorithm for robust information collection in dynamic, partially observable and ambiguous environments, with an application to a cybersecurity detection problem. In the adversarial setting, this work presents a new application of counterfactual regret minimization and robust optimization to a multi-domain cyber and air defense problem in a partially observable environment

    Sparse Randomized Shortest Paths Routing with Tsallis Divergence Regularization

    This work elaborates on the important problem of (1) designing optimal randomized routing policies for reaching a target node t from a source node s on a weighted directed graph G and (2) defining distance measures between nodes that interpolate between the least cost (based on optimal movements) and the commute-cost (based on a random walk on G), depending on a temperature parameter T. To this end, the randomized shortest path formalism (RSP, [2,99,124]) is rephrased in terms of Tsallis divergence regularization instead of Kullback-Leibler divergence. The main consequence of this change is that the resulting routing policy (local transition probabilities) becomes sparser as T decreases, therefore inducing a sparse random walk on G that converges to the least-cost directed acyclic graph when T tends to 0. Experimental comparisons on node clustering and semi-supervised classification tasks show that the derived dissimilarity measures based on expected routing costs provide state-of-the-art results. The sparse RSP is therefore a promising model of movements on a graph, balancing sparse exploitation and exploration in an optimal way.

    Quantum inspired approach for early classification of time series

    Is it possible to apply some fundamental principles of quantum computing to time series classification algorithms? This is the initial spark that became the research question I decided to pursue at the very beginning of my PhD studies. The idea came accidentally after reading a note on the ability of entanglement to express the correlation between two particles, even far away from each other. The test problem was also at hand, because I was investigating possible algorithms for real-time bot detection, a challenging problem at the present day, by means of statistical approaches for sequential classification. The quantum-inspired algorithm presented in this thesis evolved from the statistical method mentioned above: it is a novel approach to binary and multinomial classification of an incoming data stream, inspired by the principles of quantum computing, that aims to ensure the shortest decision time with high accuracy. The proposed approach exploits the analogy between the intrinsic correlation of two or more particles and the dependence of each item in a data stream on the preceding ones. Starting from the a posteriori probability of each item belonging to a particular class, we can assign a qubit state representing a combination of these probabilities for all available observations of the time series. By leveraging superposition and entanglement on subsequences of growing length, it is possible to devise a measure of membership of each class, thus enabling the system to take a reliable decision when a sufficient level of confidence is met. In order to provide an extensive and thorough analysis of the problem, a well-fitting approach for bot detection was replicated on our dataset and later compared with the statistical algorithm to determine the best option. The winner was subsequently examined against the new quantum-inspired proposal, showing the superior capability of the latter in both binary and multinomial classification of data streams. The validation of the quantum-inspired approach in a synthetically generated use case completes the research framework and opens new perspectives in on-the-fly time series classification, which we have just started to explore. To name a few, the algorithm is currently being tested with encouraging results in predictive maintenance and prognostics for automotive, in collaboration with the University of Bradford (UK), and in action recognition from video streams.