26,200 research outputs found
Bayesian multitask inverse reinforcement learning
We generalise the problem of inverse reinforcement learning to multiple
tasks, from multiple demonstrations. Each one may represent one expert trying
to solve a different task, or as different experts trying to solve the same
task. Our main contribution is to formalise the problem as statistical
preference elicitation, via a number of structured priors, whose form captures
our biases about the relatedness of different tasks or expert policies. In
doing so, we introduce a prior on policy optimality, which is more natural to
specify. We show that our framework allows us not only to learn to efficiently
from multiple experts but to also effectively differentiate between the goals
of each. Possible applications include analysing the intrinsic motivations of
subjects in behavioural experiments and learning from multiple teachers.Comment: Corrected version. 13 pages, 8 figure
Game Networks
We introduce Game networks (G nets), a novel representation for multi-agent
decision problems. Compared to other game-theoretic representations, such as
strategic or extensive forms, G nets are more structured and more compact; more
fundamentally, G nets constitute a computationally advantageous framework for
strategic inference, as both probability and utility independencies are
captured in the structure of the network and can be exploited in order to
simplify the inference process. An important aspect of multi-agent reasoning is
the identification of some or all of the strategic equilibria in a game; we
present original convergence methods for strategic equilibrium which can take
advantage of strategic separabilities in the G net structure in order to
simplify the computations. Specifically, we describe a method which identifies
a unique equilibrium as a function of the game payoffs, and one which
identifies all equilibria.Comment: Appears in Proceedings of the Sixteenth Conference on Uncertainty in
Artificial Intelligence (UAI2000
A cooperative cellular and broadcast conditional access system for Pay-TV systems
This is the author's accepted manuscript. The final published article is available from the link below. Copyright @ 2009 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other users, including reprinting/ republishing this material for advertising or promotional purposes, creating new collective works for resale or redistribution to servers or lists, or reuse of any copyrighted components of this work in other works.The lack of interoperability between Pay-TV service providers and a horizontally integrated business transaction model have compromised the competition in the Pay-TV market. In addition, the lack of interactivity with customers has resulted in high churn rate and improper security measures have contributed into considerable business loss. These issues are the main cause of high operational costs and subscription fees in the Pay-TV systems. As a result, this paper presents the Mobile Conditional Access System (MICAS) as an end-to-end access control solution for Pay-TV systems. It incorporates the mobile and broadcasting systems and provides a platform whereby service providers can effectively interact with their customers, personalize their services and adopt appropriate security measurements. This would result in the decrease of operating expenses and increase of customers' satisfaction in the system. The paper provides an overview of state-of-the-art conditional access solutions followed by detailed description of design, reference model implementation and analysis of possible MICAS security architectures.Strategy & Technology (S&T) Lt
Every Smile is Unique: Landmark-Guided Diverse Smile Generation
Each smile is unique: one person surely smiles in different ways (e.g.,
closing/opening the eyes or mouth). Given one input image of a neutral face,
can we generate multiple smile videos with distinctive characteristics? To
tackle this one-to-many video generation problem, we propose a novel deep
learning architecture named Conditional Multi-Mode Network (CMM-Net). To better
encode the dynamics of facial expressions, CMM-Net explicitly exploits facial
landmarks for generating smile sequences. Specifically, a variational
auto-encoder is used to learn a facial landmark embedding. This single
embedding is then exploited by a conditional recurrent network which generates
a landmark embedding sequence conditioned on a specific expression (e.g.,
spontaneous smile). Next, the generated landmark embeddings are fed into a
multi-mode recurrent landmark generator, producing a set of landmark sequences
still associated to the given smile class but clearly distinct from each other.
Finally, these landmark sequences are translated into face videos. Our
experimental results demonstrate the effectiveness of our CMM-Net in generating
realistic videos of multiple smile expressions.Comment: Accepted as a poster in Conference on Computer Vision and Pattern
Recognition (CVPR), 201
On Estimating Multi-Attribute Choice Preferences using Private Signals and Matrix Factorization
Revealed preference theory studies the possibility of modeling an agent's
revealed preferences and the construction of a consistent utility function.
However, modeling agent's choices over preference orderings is not always
practical and demands strong assumptions on human rationality and
data-acquisition abilities. Therefore, we propose a simple generative choice
model where agents are assumed to generate the choice probabilities based on
latent factor matrices that capture their choice evaluation across multiple
attributes. Since the multi-attribute evaluation is typically hidden within the
agent's psyche, we consider a signaling mechanism where agents are provided
with choice information through private signals, so that the agent's choices
provide more insight about his/her latent evaluation across multiple
attributes. We estimate the choice model via a novel multi-stage matrix
factorization algorithm that minimizes the average deviation of the factor
estimates from choice data. Simulation results are presented to validate the
estimation performance of our proposed algorithm.Comment: 6 pages, 2 figures, to be presented at CISS conferenc
An efficient and versatile approach to trust and reputation using hierarchical Bayesian modelling
In many dynamic open systems, autonomous agents must interact with one another to achieve their goals. Such agents may be self-interested and, when trusted to perform an action, may betray that trust by not performing the action as required. Due to the scale and dynamism of these systems, agents will often need to interact with other agents with which they have little or no past experience. Each agent must therefore be capable of assessing and identifying reliable interaction partners, even if it has no personal experience with them. To this end, we present HABIT, a Hierarchical And Bayesian Inferred Trust model for assessing how much an agent should trust its peers based on direct and third party information. This model is robust in environments in which third party information is malicious, noisy, or otherwise inaccurate. Although existing approaches claim to achieve this, most rely on heuristics with little theoretical foundation. In contrast, HABIT is based exclusively on principled statistical techniques: it can cope with multiple discrete or continuous aspects of trustee behaviour; it does not restrict agents to using a single shared representation of behaviour; it can improve assessment by using any observed correlation between the behaviour of similar trustees or information sources; and it provides a pragmatic solution to the whitewasher problem (in which unreliable agents assume a new identity to avoid bad reputation). In this paper, we describe the theoretical aspects of HABIT, and present experimental results that demonstrate its ability to predict agent behaviour in both a simulated environment, and one based on data from a real-world webserver domain. In particular, these experiments show that HABIT can predict trustee performance based on multiple representations of behaviour, and is up to twice as accurate as BLADE, an existing state-of-the-art trust model that is both statistically principled and has been previously shown to outperform a number of other probabilistic trust models
The fundamental problem of command : plan and compliance in a partially centralised economy
When a principal gives an order to an agent and advances resources for its implementation, the temptations for the agent to shirk or steal from the principal rather than comply constitute the fundamental problem of command. Historically, partially centralised command economies enforced compliance in various ways, assisted by nesting the fundamental problem of exchange within that of command. The Soviet economy provides some relevant data. The Soviet command system combined several enforcement mechanisms in an equilibrium that shifted as agents learned and each mechanism's comparative costs and benefits changed. When the conditions for an equilibrium disappeared, the system collapsed.Comparative Economic Studies (2005) 47, 296ā314. doi:10.1057/palgrave.ces.810011
- ā¦