Manipulating Attributes of Natural Scenes via Hallucination
In this study, we explore building a two-stage framework for enabling users
to directly manipulate high-level attributes of a natural scene. The key to our
approach is a deep generative network which can hallucinate images of a scene
as if they were taken in a different season (e.g. winter), weather
condition (e.g. on a cloudy day), or time of day (e.g. at sunset). Once the
scene is hallucinated with the given attributes, the corresponding look is then
transferred to the input image while preserving the semantic details intact,
giving a photo-realistic manipulation result. Because the proposed framework
hallucinates what the scene will look like, it does not require a reference
style image, as most appearance and style transfer approaches do. Moreover, it
can simultaneously manipulate a given scene according to a diverse set of
transient attributes within a single model, eliminating the need to train a
separate network for each translation task.
Our comprehensive set of qualitative and quantitative results demonstrates the
effectiveness of our approach against competing methods.
Comment: Accepted for publication in ACM Transactions on Graphics.
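The two-stage pipeline described above (hallucinate the target look, then transfer it onto the input while preserving detail) can be sketched with toy stand-ins. The sketch below is illustrative only: `hallucinate` replaces the paper's deep generative network with a simple attribute-conditioned tint, and `transfer_look` replaces the paper's transfer step with classic per-channel color-statistics matching; neither reflects the actual method.

```python
import numpy as np

def hallucinate(scene, attribute):
    # Toy stand-in for the generative network: tint the scene
    # according to the requested transient attribute.
    tints = {"winter": np.array([0.90, 0.95, 1.10]),
             "sunset": np.array([1.20, 0.90, 0.70])}
    return np.clip(scene * tints[attribute], 0.0, 1.0)

def transfer_look(content, hallucinated):
    # Toy transfer step: match the per-channel mean/std of the
    # hallucinated image while keeping the content's spatial detail.
    c_mu, c_sd = content.mean((0, 1)), content.std((0, 1)) + 1e-8
    h_mu, h_sd = hallucinated.mean((0, 1)), hallucinated.std((0, 1))
    return np.clip((content - c_mu) / c_sd * h_sd + h_mu, 0.0, 1.0)

def manipulate(scene, attribute):
    # Stage 1: hallucinate the attribute; stage 2: transfer its look.
    return transfer_look(scene, hallucinate(scene, attribute))

rng = np.random.default_rng(0)
scene = rng.random((32, 32, 3))   # dummy RGB image in [0, 1]
out = manipulate(scene, "sunset")
```

Note how the content image only contributes structure in stage 2, while the hallucinated image only contributes its global look, mirroring the division of labor in the abstract.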
EVREAL: Towards a Comprehensive Benchmark and Analysis Suite for Event-based Video Reconstruction
Event cameras are a new type of vision sensor that incorporates asynchronous
and independent pixels, offering advantages over traditional frame-based
cameras such as high dynamic range and minimal motion blur. However, their
output is not easily understandable by humans, making the reconstruction of
intensity images from event streams a fundamental task in event-based vision.
While recent deep learning-based methods have shown promise in video
reconstruction from events, this problem is not completely solved yet. To
facilitate comparison between different approaches, standardized evaluation
protocols and diverse test datasets are essential. This paper proposes a
unified evaluation methodology and introduces an open-source framework called
EVREAL to comprehensively benchmark and analyze various event-based video
reconstruction methods from the literature. Using EVREAL, we give a detailed
analysis of the state-of-the-art methods for event-based video reconstruction,
and provide valuable insights into the performance of these methods under
varying settings, challenging scenarios, and downstream tasks.
Comment: 19 pages, 9 figures. Has been accepted for publication at the IEEE
Conference on Computer Vision and Pattern Recognition Workshops (CVPRW),
Vancouver, 2023. The project page can be found at
https://ercanburak.github.io/evreal.htm
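The core of a unified evaluation protocol like the one described above is running every reconstruction method on the same sequences and scoring them with the same metrics. The sketch below is a minimal illustration under assumed names (`evaluate`, `psnr`, and the dummy methods are all hypothetical), not EVREAL's actual API; it uses PSNR only, whereas a real benchmark would include further metrics and scenarios.

```python
import numpy as np

def psnr(pred, gt, peak=1.0):
    # Peak signal-to-noise ratio between a reconstructed frame
    # and its ground-truth counterpart (higher is better).
    mse = np.mean((pred - gt) ** 2)
    return float("inf") if mse == 0 else 10 * np.log10(peak ** 2 / mse)

def evaluate(methods, sequences):
    # Run every method on every (events, ground truth) pair under the
    # same protocol and average the per-frame scores.
    scores = {}
    for name, reconstruct in methods.items():
        per_frame = [psnr(reconstruct(events), gt)
                     for events, gt in sequences]
        scores[name] = sum(per_frame) / len(per_frame)
    return scores

rng = np.random.default_rng(1)
gt = rng.random((16, 16))
sequences = [(None, gt)]  # dummy event stream paired with ground truth
methods = {
    "perfect": lambda events: gt,                      # ideal reconstruction
    "biased": lambda events: np.clip(gt + 0.1, 0, 1),  # systematically off
}
scores = evaluate(methods, sequences)
```

Fixing the protocol in one place like this is what makes scores comparable across methods, which is the abstract's stated motivation.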
Detecting Euphemisms with Literal Descriptions and Visual Imagery
This paper describes our two-stage system for the Euphemism Detection shared
task hosted by the 3rd Workshop on Figurative Language Processing in
conjunction with EMNLP 2022. Euphemisms tone down expressions about sensitive
or unpleasant issues like addiction and death. The ambiguous nature of
euphemistic words or expressions makes it challenging to detect their actual
meaning within a context. In the first stage, we seek to mitigate this
ambiguity by incorporating literal descriptions into input text prompts to our
baseline model. It turns out that this kind of direct supervision yields
remarkable performance improvement. In the second stage, we integrate visual
supervision into our system using visual imagery: two sets of images generated
by a text-to-image model from the terms and their literal descriptions.
Our experiments demonstrate that visual supervision also gives a statistically
significant performance boost. Our system achieved second place with an F1
score of 87.2%, only about 0.9% below the best submission.
Comment: 7 pages, 1 table, 1 figure. Accepted to the 3rd Workshop on
Figurative Language Processing at EMNLP 2022.
https://github.com/ilkerkesen/euphemis
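The first-stage idea above, resolving a euphemism's ambiguity by putting its literal description directly into the model's input text, can be illustrated with a small prompt-construction sketch. The function name, separator token, and template here are hypothetical stand-ins, not the authors' actual implementation.

```python
def build_prompt(sentence, term, descriptions):
    # Hypothetical prompt construction: append the potentially
    # euphemistic term's literal description to the input sentence,
    # giving the classifier direct supervision about its meaning.
    literal = descriptions.get(term, term)
    return f"{sentence} [SEP] {term} means {literal}."

# Toy dictionary mapping euphemisms to literal descriptions.
descriptions = {"passed away": "died"}
prompt = build_prompt("Her grandfather passed away last spring.",
                      "passed away", descriptions)
```

The same term/description pairs could then feed a text-to-image model to produce the two image sets used for the second-stage visual supervision.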