Beyond Transmitting Bits: Context, Semantics, and Task-Oriented Communications
Communication systems to date primarily aim at reliably communicating bit
sequences. Such an approach provides efficient engineering designs that are
agnostic to the meanings of the messages or to the goal that the message
exchange aims to achieve. Next generation systems, however, can be potentially
enriched by folding message semantics and goals of communication into their
design. Further, these systems can be made cognizant of the context in which
communication exchange takes place, providing avenues for novel design
insights. This tutorial summarizes the efforts to date, starting from its early
adaptations, semantic-aware and task-oriented communications, covering the
foundations, algorithms and potential implementations. The focus is on
approaches that utilize information theory to provide the foundations, as well
as the significant role of learning in semantics and task-aware communications.
Comment: 28 pages, 14 figure
Robust Sequential DeepFake Detection
Since photorealistic faces can be readily generated by facial manipulation
technologies nowadays, potential malicious abuse of these technologies has
drawn great concern. Numerous deepfake detection methods have thus been proposed.
However, existing methods only focus on detecting one-step facial manipulation.
With the emergence of easily accessible facial editing applications, people can
easily manipulate facial components using multi-step operations in a sequential
manner. This new threat requires us to detect a sequence of facial
manipulations, which is vital for both detecting deepfake media and recovering
original faces afterwards. Motivated by this observation, we emphasize the need
and propose a novel research problem called Detecting Sequential DeepFake
Manipulation (Seq-DeepFake). Unlike the existing deepfake detection task, which
demands only a binary label prediction, detecting Seq-DeepFake manipulation
requires correctly predicting a sequential vector of facial manipulation
operations. To support a large-scale investigation, we construct the first
Seq-DeepFake dataset, where face images are manipulated sequentially with
corresponding annotations of sequential facial manipulation vectors. Based on
this new dataset, we cast detecting Seq-DeepFake manipulation as a specific
image-to-sequence task and propose a concise yet effective Seq-DeepFake
Transformer (SeqFakeFormer). To better reflect real-world deepfake data
distributions, we further apply various perturbations on the original
Seq-DeepFake dataset and construct the more challenging Sequential DeepFake
dataset with perturbations (Seq-DeepFake-P). To exploit deeper correlation
between images and sequences when facing Seq-DeepFake-P, a dedicated
Seq-DeepFake Transformer with Image-Sequence Reasoning (SeqFakeFormer++) is
devised, which builds stronger correspondence between image-sequence pairs for
more robust Seq-DeepFake detection.
Comment: Extension of our ECCV 2022 paper: arXiv:2207.02204. Code:
https://github.com/rshaojimmy/SeqDeepFak
Object Detection in 20 Years: A Survey
Object detection, as one of the most fundamental and challenging problems in
computer vision, has received great attention in recent years. Its development
in the past two decades can be regarded as an epitome of computer vision
history. If we think of today's object detection as a technical aesthetics
under the power of deep learning, then turning back the clock 20 years we would
witness the wisdom of the cold weapon era. This paper extensively reviews 400+
papers on object detection in the light of its technical evolution, spanning
over a quarter-century's time (from the 1990s to 2019). A number of topics have
been covered in this paper, including the milestone detectors in history,
detection datasets, metrics, fundamental building blocks of the detection
system, speed-up techniques, and the recent state-of-the-art detection methods.
This paper also reviews some important detection applications, such as
pedestrian detection, face detection, text detection, etc., and makes an in-depth
analysis of their challenges as well as technical improvements in recent years.
Comment: This work has been submitted to the IEEE TPAMI for possible
publicatio
Characteristic Regularisation for Super-Resolving Face Images
Existing facial image super-resolution (SR) methods focus mostly on improving "artificially down-sampled" low-resolution (LR) imagery. Such SR models, although strong at handling artificial LR images, often suffer from a significant performance drop on genuine LR test data. Previous unsupervised domain adaptation (UDA) methods address this issue by training a model using unpaired genuine LR and HR data as well as a cycle consistency loss formulation. However, this renders the model overstretched with two tasks: consistifying the visual characteristics and enhancing the image resolution. Importantly, this makes the end-to-end model training ineffective due to the difficulty of back-propagating gradients through two concatenated CNNs. To solve this problem, we formulate a method that joins the advantages of conventional SR and UDA models. Specifically, we separate and control the optimisations for characteristics consistifying and image super-resolving by introducing Characteristic Regularisation (CR) between them. This task split makes the model training more effective and computationally tractable. Extensive evaluations demonstrate the performance superiority of our method over state-of-the-art SR and UDA models on both genuine and artificial LR facial imagery data.