Discovering Utility-driven Interval Rules
In artificial intelligence, high-utility sequential rule mining (HUSRM) is a knowledge discovery method that can reveal associations between events in sequences. Recently, many methods have been proposed to discover high-utility sequential rules, but all of them are restricted to point-based sequences. Interval events, which persist for some period of time, are common in practice. Traditional knowledge discovery tasks on interval-event sequences mainly focus on pattern discovery, yet patterns cannot adequately reveal the correlations between interval events. Moreover, existing HUSRM algorithms cannot be directly applied to interval-event sequences, since the relations in interval-event sequences are far more intricate than those in point-based sequences. In this work, we propose UIRMiner, a utility-driven interval rule mining algorithm that extracts all utility-driven interval rules (UIRs) from an interval-event sequence database. In UIRMiner, we first introduce a numeric encoding relation representation, which saves considerable time in relation computation and space in relation storage. Furthermore, to shrink the search space, we propose a complement pruning strategy that incorporates the utility upper bound with the relations. Finally, extensive experiments on both real-world and synthetic datasets verify that UIRMiner is an effective and efficient algorithm.
Comment: Preprint. 11 figures, 5 tables
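The abstract does not spell out UIRMiner's numeric encoding, but the general idea of representing interval relations numerically can be sketched as follows: comparing the four endpoint pairs of two intervals yields a small integer code per relation, so relations can be stored and compared as plain integers rather than symbolic labels. This is a hypothetical scheme for illustration only; the function name `relation_code` and the base-3 packing are my own assumptions, not the paper's actual encoding.

```python
# Hypothetical sketch of a numeric encoding for the temporal relation
# between two interval events (illustrative; not UIRMiner's actual scheme).
# Packing the four endpoint-comparison signs in base 3 gives a distinct
# integer code for each of Allen's 13 interval relations.

def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def relation_code(a_start, a_end, b_start, b_end):
    """Encode the temporal relation between intervals A and B as one int."""
    comparisons = (
        sign(a_start - b_start),  # who starts first
        sign(a_end - b_end),      # who ends first
        sign(a_end - b_start),    # does A end before B starts
        sign(a_start - b_end),    # does A start after B ends
    )
    code = 0
    for s in comparisons:
        code = code * 3 + (s + 1)  # map -1/0/+1 to base-3 digit 0/1/2
    return code

# "A before B": A ends strictly before B starts
before = relation_code(0, 2, 5, 8)
# "A overlaps B": A starts first and ends inside B
overlaps = relation_code(0, 6, 5, 8)
# "A equals B": identical endpoints
equal = relation_code(1, 4, 1, 4)
```

Because the code depends only on endpoint comparisons, it is invariant under time translation, and two relations can be tested for equality with a single integer comparison instead of recomputing endpoint logic.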
A Projected Upper Bound for Mining High Utility Patterns from Interval-Based Event Sequences
High-utility pattern mining is an interesting yet challenging problem. Its intrinsic computational cost imposes further challenges when efficiency, in addition to efficacy, is sought. Recently, this problem was studied on interval-based event sequences with a constraint on the length and size of the patterns. However, the proposed solution lacks adequate efficiency. To address this issue, we propose a projected upper bound on the utility of the patterns discovered from sequences of interval-based events. To show its effectiveness, the upper bound is utilized in a pruning strategy employed by the HUIPMiner algorithm. Experimental results show that the new upper bound improves HUIPMiner's performance in terms of both execution time and memory usage.
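The projected upper bound itself is not specified in the abstract, but the pruning principle it serves can be sketched generically: during a depth-first search over patterns, a branch is abandoned as soon as an optimistic utility estimate falls below the threshold, so a tighter bound directly means fewer branches explored. The `mine` function and its simple remaining-utility bound below are illustrative assumptions over itemset-style data, not HUIPMiner's actual bound or data model.

```python
# Generic sketch of upper-bound pruning in high-utility pattern search
# (illustrates the principle only; not HUIPMiner's projected bound).

def mine(transactions, min_util):
    """transactions: list of dicts mapping item -> utility in that row."""
    items = sorted({i for t in transactions for i in t})
    results = {}

    def dfs(prefix, start):
        for k in range(start, len(items)):
            item = items[k]
            supporting = [t for t in transactions
                          if item in t and all(p in t for p in prefix)]
            if not supporting:
                continue
            pattern = prefix + (item,)
            util = sum(sum(t[p] for p in pattern) for t in supporting)
            if util >= min_util:
                results[pattern] = util
            # Optimistic bound: current utility plus everything that
            # later items could still contribute in supporting rows.
            bound = util + sum(t[j] for t in supporting
                               for j in items[k + 1:] if j in t)
            if bound >= min_util:   # otherwise prune the whole subtree
                dfs(pattern, k + 1)

    dfs((), 0)
    return results
```

On a toy database such as `[{'a': 5, 'b': 2}, {'a': 3, 'c': 4}, {'b': 1, 'c': 6}]` with threshold 8, the search returns the high-utility patterns while skipping extensions whose bound cannot reach the threshold.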
Quasi-regular sequences and optimal schedules for security games
We study security games in which a defender commits to a mixed strategy for
protecting a finite set of targets of different values. An attacker, knowing
the defender's strategy, chooses which target to attack and for how long. If
the attacker spends time $t$ at a target of value $\alpha$, and if he leaves before the defender visits the target, his utility is $t \cdot \alpha$; if the defender visits before he leaves, his utility is 0. The defender's
goal is to minimize the attacker's utility. The defender's strategy consists of
a schedule for visiting the targets; it takes her unit time to switch between
targets. Such games are a simplified model of a number of real-world scenarios
such as protecting computer networks from intruders, crops from thieves, etc.
We show that optimal defender play for these continuous-time security games reduces to the solution of a combinatorial question regarding the existence of infinite sequences over a finite alphabet, with the following properties for each symbol $b$: (1) $b$ constitutes a prescribed fraction $p_b$ of the sequence. (2) The occurrences of $b$ are spread apart close to evenly, in that the ratio of the longest to shortest interval between consecutive occurrences is bounded by a parameter $K$. We call such sequences $K$-quasi-regular. We show that, surprisingly, $2$-quasi-regular sequences suffice for optimal defender play. What is more, even randomized $2$-quasi-regular sequences suffice for optimality. We show that such sequences always exist, and can be calculated efficiently.
The question of the least $K$ for which deterministic $K$-quasi-regular sequences exist is fascinating. Using an ergodic theoretical approach, we show that deterministic $3$-quasi-regular sequences always exist. For $K < 3$ we do not know whether deterministic $K$-quasi-regular sequences always exist.
Comment: to appear in Proc. of SODA 201
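The defining gap-ratio condition of quasi-regularity can be sketched as a finite check. The paper concerns infinite sequences; this windowed version, the function name `is_quasi_regular`, and the treatment of symbols with too few occurrences are illustrative choices only.

```python
# Sketch: check the K-quasi-regularity gap condition on a finite window.
# For every symbol, the ratio of the longest to the shortest interval
# between consecutive occurrences must be at most K.

def is_quasi_regular(seq, K):
    for symbol in set(seq):
        positions = [i for i, s in enumerate(seq) if s == symbol]
        if len(positions) < 3:
            continue  # fewer than two gaps: no ratio to measure
        gaps = [b - a for a, b in zip(positions, positions[1:])]
        if max(gaps) > K * min(gaps):
            return False
    return True
```

A perfectly periodic round-robin such as `"ababab"` has every gap equal, so it passes even with $K = 1$; a sequence with gaps of 1 and 2 for the same symbol needs $K \geq 2$.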
On the use of a Modified Latin Hypercube Sampling (MLHS) approach in the estimation of a Mixed Logit model for vehicle choice
Quasi-random number sequences have been used extensively for many years in the simulation of integrals that do not have a closed-form expression, such as Mixed Logit and Multinomial Probit choice probabilities. Halton sequences are one example of such quasi-random number sequences, and various types of Halton sequences, including standard, scrambled, and shuffled versions, have been proposed and tested in the context of travel demand modeling. In this paper, we propose an alternative to Halton sequences, based on an adapted version of Latin Hypercube Sampling. These alternative sequences, like scrambled and shuffled Halton sequences, avoid the undesirable correlation patterns that arise in standard Halton sequences. However, they are easier to create than scrambled or shuffled Halton sequences. They also provide more uniform coverage in each dimension than any of the Halton sequences. A detailed analysis, using a 16-dimensional Mixed Logit model for choice between alternative-fuelled vehicles in California, was conducted to compare the performance of the different types of draws. The analysis shows that, in this application, the Modified Latin Hypercube Sampling (MLHS) outperforms each type of Halton sequence. This greater accuracy combined with the greater simplicity makes the MLHS method an appealing approach for simulation of travel demand models and simulation-based models in general.
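A minimal sketch of how such MLHS draws are commonly constructed, assuming the usual recipe of one uniform shift per dimension followed by an independent shuffle per dimension (the details here are illustrative, not necessarily this paper's exact implementation):

```python
# Sketch of Modified Latin Hypercube Sampling: in each dimension, N points
# are spread evenly over (0, 1) by a single uniform shift, then shuffled
# independently per dimension to break correlation across dimensions.

import random

def mlhs(n_draws, n_dims, seed=0):
    """Return n_draws points in (0, 1)^n_dims as tuples of uniforms."""
    rng = random.Random(seed)
    columns = []
    for _ in range(n_dims):
        shift = rng.random()  # one shift spreads the whole column evenly
        column = [(i + shift) / n_draws for i in range(n_draws)]
        rng.shuffle(column)   # independent shuffle decorrelates dimensions
        columns.append(column)
    return list(zip(*columns))  # transpose: one tuple per draw
```

By construction every interval of width $1/N$ in each dimension contains exactly one point, which is the uniform one-dimensional coverage property the abstract highlights.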
Q-learning: flexible learning about useful utilities
Dynamic treatment regimes are fast becoming an important part of medicine, with the corresponding change in emphasis from treatment of the disease to treatment of the individual patient. Because of the limited number of trials to evaluate personally tailored treatment sequences, inferring optimal treatment regimes from observational data has increased importance. Q-learning is a popular method for estimating the optimal treatment regime, originally in randomized trials but more recently also in observational data. Previous applications of Q-learning have largely been restricted to continuous utility end-points with linear relationships. This paper is the first attempt at both extending the framework to discrete utilities and at moving the modelling of covariates from linear models to more flexible modelling using the generalized additive model (GAM) framework. Simulated data results show that the GAM-adapted Q-learning typically outperforms Q-learning with linear models and other frequently-used methods based on propensity scores in terms of coverage and bias/MSE. This represents a promising step toward a more fully general Q-learning approach to estimating optimal dynamic treatment regimes.
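The two-stage Q-learning recursion underlying this approach can be sketched with ordinary linear Q-functions on simulated data: fit the stage-2 Q-function, form a pseudo-outcome by replacing the observed stage-2 treatment with the best one, then fit the stage-1 Q-function against that pseudo-outcome. The data-generating process and all names below are illustrative assumptions; the paper's contribution is precisely to replace these linear fits with GAMs.

```python
# Sketch: two-stage Q-learning with linear Q-functions on simulated data
# (illustrative only; the paper substitutes GAMs for the linear fits).

import numpy as np

rng = np.random.default_rng(0)
n = 2000
x1 = rng.normal(size=n)                    # baseline covariate
a1 = rng.integers(0, 2, size=n)            # stage-1 treatment (0/1)
x2 = x1 + rng.normal(scale=0.5, size=n)    # intermediate covariate
a2 = rng.integers(0, 2, size=n)            # stage-2 treatment (0/1)
# Utility: stage-2 treatment helps when x2 > 0, stage-1 when x1 < 0.
y = a2 * x2 - a1 * x1 + rng.normal(scale=0.2, size=n)

def design(x, a):
    """Linear Q-model with treatment-covariate interaction."""
    return np.column_stack([np.ones_like(x), x, a, a * x])

def fit(X, target):
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta

# Stage 2: regress y on (x2, a2).
b2 = fit(design(x2, a2), y)

# Pseudo-outcome: observed y, with the stage-2 action swapped for the best one.
q_best = np.maximum(design(x2, np.zeros(n)) @ b2,
                    design(x2, np.ones(n)) @ b2)
pseudo = y + q_best - design(x2, a2) @ b2

# Stage 1: regress the pseudo-outcome on (x1, a1).
b1 = fit(design(x1, a1), pseudo)

def best_a2(x2_val):
    """Estimated optimal stage-2 treatment at a given x2."""
    x = np.array([float(x2_val)])
    gain = (design(x, np.ones(1)) - design(x, np.zeros(1))) @ b2
    return int(gain.item() > 0)
```

Since the simulated stage-2 effect is `a2 * x2`, the fitted rule should treat when `x2` is positive, and the negative fitted `a1 * x1` interaction recovers the stage-1 rule of treating when `x1` is negative.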