97,070 research outputs found

    Discovering Utility-driven Interval Rules

    For artificial intelligence, high-utility sequential rule mining (HUSRM) is a knowledge discovery method that can reveal the associations between events in sequences. Recently, many methods have been proposed to discover high-utility sequential rules. However, the existing methods all deal with point-based sequences, whereas interval events that persist for some time are common. Traditional interval-event sequence knowledge discovery tasks mainly focus on pattern discovery, but patterns cannot reveal the correlation between interval events well. Moreover, the existing HUSRM algorithms cannot be directly applied to interval-event sequences, since the relations in interval-event sequences are much more intricate than those in point-based sequences. In this work, we propose a utility-driven interval rule mining (UIRMiner) algorithm that can extract all utility-driven interval rules (UIRs) from an interval-event sequence database to solve the problem. In UIRMiner, we first introduce a numeric encoding relation representation, which saves time on relation computation and storage on relation representation. Furthermore, to shrink the search space, we also propose a complement pruning strategy, which incorporates the utility upper bound with the relation. Finally, extensive experiments on both real-world and synthetic datasets verify that UIRMiner is an effective and efficient algorithm. Comment: Preprint. 11 figures, 5 tables

    A Projected Upper Bound for Mining High Utility Patterns from Interval-Based Event Sequences

    High utility pattern mining is an interesting yet challenging problem. The intrinsic computational cost of the problem imposes further challenges if efficiency, in addition to the efficacy of a solution, is sought. Recently, this problem was studied on interval-based event sequences with a constraint on the length and size of the patterns. However, the proposed solution lacks adequate efficiency. To address this issue, we propose a projected upper bound on the utility of the patterns discovered from sequences of interval-based events. To show its effectiveness, the upper bound is utilized by a pruning strategy employed by the HUIPMiner algorithm. Experimental results show that the new upper bound improves HUIPMiner performance in terms of both execution time and memory usage.

    Quasi-regular sequences and optimal schedules for security games

    We study security games in which a defender commits to a mixed strategy for protecting a finite set of targets of different values. An attacker, knowing the defender's strategy, chooses which target to attack and for how long. If the attacker spends time $t$ at a target $i$ of value $\alpha_i$, and if he leaves before the defender visits the target, his utility is $t \cdot \alpha_i$; if the defender visits before he leaves, his utility is 0. The defender's goal is to minimize the attacker's utility. The defender's strategy consists of a schedule for visiting the targets; it takes her unit time to switch between targets. Such games are a simplified model of a number of real-world scenarios such as protecting computer networks from intruders, crops from thieves, etc. We show that optimal defender play for these continuous-time security games reduces to the solution of a combinatorial question regarding the existence of infinite sequences over a finite alphabet, with the following properties for each symbol $i$: (1) $i$ constitutes a prescribed fraction $p_i$ of the sequence. (2) The occurrences of $i$ are spread apart close to evenly, in that the ratio of the longest to shortest interval between consecutive occurrences is bounded by a parameter $K$. We call such sequences $K$-quasi-regular. We show that, surprisingly, $2$-quasi-regular sequences suffice for optimal defender play. What is more, even randomized $2$-quasi-regular sequences suffice for optimality. We show that such sequences always exist, and can be calculated efficiently. The question of the least $K$ for which deterministic $K$-quasi-regular sequences exist is fascinating. Using an ergodic theoretical approach, we show that deterministic $3$-quasi-regular sequences always exist. For $2 \leq K < 3$ we do not know whether deterministic $K$-quasi-regular sequences always exist. Comment: to appear in Proc. of SODA 201
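
    The two properties in the abstract above can be checked mechanically for a simple special case. The sketch below is only an illustration, not the paper's construction: it assumes the infinite sequence is periodic (one period repeated forever), and the function names gap_ratio, is_quasi_regular and frequencies are hypothetical.

```python
from collections import Counter
from fractions import Fraction


def gap_ratio(period, symbol):
    """Longest/shortest gap between consecutive occurrences of `symbol`
    in the infinite sequence obtained by repeating `period` forever."""
    positions = [i for i, s in enumerate(period) if s == symbol]
    if not positions:
        raise ValueError(f"symbol {symbol!r} does not occur in the period")
    n = len(period)
    gaps = []
    for k, pos in enumerate(positions):
        nxt = positions[(k + 1) % len(positions)]
        # Cyclic distance to the next occurrence; a lone symbol has gap n.
        gaps.append((nxt - pos - 1) % n + 1)
    return Fraction(max(gaps), min(gaps))


def is_quasi_regular(period, K):
    """Property (2): every symbol's gap ratio is bounded by K."""
    return all(gap_ratio(period, s) <= K for s in set(period))


def frequencies(period):
    """Property (1): the fraction p_i of the sequence taken by each symbol."""
    counts = Counter(period)
    return {s: Fraction(c, len(period)) for s, c in counts.items()}


if __name__ == "__main__":
    for period in ("abac", "aab", "aabbb"):
        print(period, frequencies(period), is_quasi_regular(period, 2))
```

    Exact fractions are used so that the comparison with K is not affected by floating-point rounding.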

    On the use of a Modified Latin Hypercube Sampling (MLHS) approach in the estimation of a Mixed Logit model for vehicle choice

    Quasi-random number sequences have been used extensively for many years in the simulation of integrals that do not have a closed-form expression, such as Mixed Logit and Multinomial Probit choice probabilities. Halton sequences are one example of such quasi-random number sequences, and various types of Halton sequences, including standard, scrambled, and shuffled versions, have been proposed and tested in the context of travel demand modeling. In this paper, we propose an alternative to Halton sequences, based on an adapted version of Latin Hypercube Sampling. These alternative sequences, like scrambled and shuffled Halton sequences, avoid the undesirable correlation patterns that arise in standard Halton sequences. However, they are easier to create than scrambled or shuffled Halton sequences. They also provide more uniform coverage in each dimension than any of the Halton sequences. A detailed analysis, using a 16-dimensional Mixed Logit model for choice between alternative-fuelled vehicles in California, was conducted to compare the performance of the different types of draws. The analysis shows that, in this application, the Modified Latin Hypercube Sampling (MLHS) outperforms each type of Halton sequence. This greater accuracy, combined with the greater simplicity, makes the MLHS method an appealing approach for simulation of travel demand models and simulation-based models in general.
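
    The abstract does not spell out how the MLHS draws are built. The sketch below assumes one commonly described form: in each dimension, N evenly spaced points are shifted by a single uniform offset in [0, 1/N) and then shuffled independently of the other dimensions. The function name mlhs_draws and its parameters are hypothetical, and this should be read as an illustration under that assumption rather than the authors' exact procedure.

```python
import random


def mlhs_draws(n_draws, n_dims, rng=None):
    """Return n_draws points in [0, 1)^n_dims (assumed MLHS-style scheme)."""
    rng = rng or random.Random()
    columns = []
    for _ in range(n_dims):
        # One random offset per dimension keeps the points evenly spaced.
        shift = rng.random() / n_draws
        column = [j / n_draws + shift for j in range(n_draws)]
        # Independent shuffles break the correlation across dimensions.
        rng.shuffle(column)
        columns.append(column)
    # Transpose the per-dimension columns into one tuple per draw.
    return [tuple(col[i] for col in columns) for i in range(n_draws)]


if __name__ == "__main__":
    for point in mlhs_draws(5, 2, random.Random(42)):
        print(point)
```

    Because each dimension places exactly one point in each of the N equal-width strata, coverage per dimension is uniform by construction; for Mixed Logit simulation the uniform draws would then typically be mapped through an inverse distribution function.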

    Q-learning: flexible learning about useful utilities

    Dynamic treatment regimes are fast becoming an important part of medicine, with the corresponding change in emphasis from treatment of the disease to treatment of the individual patient. Because of the limited number of trials to evaluate personally tailored treatment sequences, inferring optimal treatment regimes from observational data has increased importance. Q-learning is a popular method for estimating the optimal treatment regime, originally in randomized trials but more recently also in observational data. Previous applications of Q-learning have largely been restricted to continuous utility end-points with linear relationships. This paper is the first attempt at both extending the framework to discrete utilities and moving the modelling of covariates from linear models to more flexible modelling using the generalized additive model (GAM) framework. Simulated data results show that the GAM-adapted Q-learning typically outperforms Q-learning with linear models and other frequently used methods based on propensity scores in terms of coverage and bias/MSE. This represents a promising step toward a more fully general Q-learning approach to estimating optimal dynamic treatment regimes.