155 research outputs found
Video Storytelling: Textual Summaries for Events
Bridging vision and natural language is a longstanding goal in computer
vision and multimedia research. While earlier works focus on generating a
single-sentence description for visual content, recent works have studied
paragraph generation. In this work, we introduce the problem of video
storytelling, which aims at generating coherent and succinct stories for long
videos. Video storytelling introduces new challenges, mainly due to the
diversity of the story and the length and complexity of the video. We propose
novel methods to address the challenges. First, we propose a context-aware
framework for multimodal embedding learning, where we design a Residual
Bidirectional Recurrent Neural Network to leverage contextual information from
past and future. Second, we propose a Narrator model to discover the underlying
storyline. The Narrator is formulated as a reinforcement learning agent which
is trained by directly optimizing the textual metric of the generated story. We
evaluate our method on the Video Story dataset, a new dataset that we have
collected to enable the study. We compare our method with multiple
state-of-the-art baselines, and show that our method achieves better
performance, in terms of quantitative measures and user study.Comment: Published in IEEE Transactions on Multimedi
Multi-keyword multi-click advertisement option contracts for sponsored search
In sponsored search, advertisement (abbreviated ad) slots are usually sold by
a search engine to an advertiser through an auction mechanism in which
advertisers bid on keywords. In theory, auction mechanisms have many desirable
economic properties. However, keyword auctions have a number of limitations
including: the uncertainty in payment prices for advertisers; the volatility in
the search engine's revenue; and the weak loyalty between advertiser and search
engine. In this paper we propose a special ad option that alleviates these
problems. In our proposal, an advertiser can purchase an option from a search
engine in advance by paying an upfront fee, known as the option price. He then
has the right, but no obligation, to purchase among the pre-specified set of
keywords at the fixed cost-per-clicks (CPCs) for a specified number of clicks
in a specified period of time. The proposed option is closely related to a
special exotic option in finance that contains multiple underlying assets
(multi-keyword) and is also multi-exercisable (multi-click). This novel
structure has many benefits: advertisers can have reduced uncertainty in
advertising; the search engine can improve the advertisers' loyalty as well as
obtain a stable and increased expected revenue over time. Since the proposed ad
option can be implemented in conjunction with the existing keyword auctions,
the option price and corresponding fixed CPCs must be set such that there is no
arbitrage between the two markets. Option pricing methods are discussed and our
experimental results validate the development. Compared to keyword auctions, a
search engine can have an increased expected revenue by selling an ad option.Comment: Chen, Bowei and Wang, Jun and Cox, Ingemar J. and Kankanhalli, Mohan
S. (2015) Multi-keyword multi-click advertisement option contracts for
sponsored search. ACM Transactions on Intelligent Systems and Technology, 7
(1). pp. 1-29. ISSN: 2157-690
Fast Yet Effective Machine Unlearning
Unlearning the data observed during the training of a machine learning (ML)
model is an important task that can play a pivotal role in fortifying the
privacy and security of ML-based applications. This paper raises the following
questions: (i) can we unlearn a single or multiple classes of data from an ML
model without looking at the full training data even once? (ii) can we make the
process of unlearning fast and scalable to large datasets, and generalize it to
different deep networks? We introduce a novel machine unlearning framework with
error-maximizing noise generation and impair-repair based weight manipulation
that offers an efficient solution to the above questions. An error-maximizing
noise matrix is learned for the class to be unlearned using the original model.
The noise matrix is used to manipulate the model weights to unlearn the
targeted class of data. We introduce impair and repair steps for a controlled
manipulation of the network weights. In the impair step, the noise matrix along
with a very high learning rate is used to induce sharp unlearning in the model.
Thereafter, the repair step is used to regain the overall performance. With
very few update steps, we show excellent unlearning while substantially
retaining the overall model accuracy. Unlearning multiple classes requires a
similar number of update steps as for the single class, making our approach
scalable to large problems. Our method is quite efficient in comparison to the
existing methods, works for multi-class unlearning, doesn't put any constraints
on the original optimization mechanism or network design, and works well in
both small and large-scale vision tasks. This work is an important step towards
fast and easy implementation of unlearning in deep networks. We will make the
source code publicly available
- …