ReLoop2: Building Self-Adaptive Recommendation Models via Responsive Error Compensation Loop
Industrial recommender systems face the challenge of operating in
non-stationary environments, where data distribution shifts arise from evolving
user behaviors over time. To tackle this challenge, a common approach is to
periodically re-train or incrementally update deployed deep models with newly
observed data, resulting in a continual training process. However, the
conventional learning paradigm of neural networks relies on iterative
gradient-based updates with a small learning rate, making it slow for large
recommendation models to adapt. In this paper, we introduce ReLoop2, a
self-correcting learning loop that facilitates fast model adaptation in online
recommender systems through responsive error compensation. Inspired by the
slow-fast complementary learning system observed in human brains, we propose an
error memory module that directly stores error samples from incoming data
streams. These stored samples are subsequently leveraged to compensate for
model prediction errors during testing, particularly under distribution shifts.
The error memory module is designed with fast access capabilities and undergoes
continual refreshing with newly observed data samples during the model serving
phase to support fast model adaptation. We evaluate the effectiveness of
ReLoop2 on three open benchmark datasets as well as a real-world production
dataset. The results demonstrate the potential of ReLoop2 in enhancing the
responsiveness and adaptiveness of recommender systems operating in
non-stationary environments.
Comment: Accepted by KDD 2023. See the project page at
https://xpai.github.io/ReLoo
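The error memory idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `ErrorMemory`, the FIFO refresh policy, the k-nearest-neighbour lookup, and the blending `weight` are all assumptions chosen only to show how stored error samples might compensate a slow base model at serving time.

```python
import math
from collections import deque


class ErrorMemory:
    """Hypothetical fixed-size store of (features, residual error) pairs."""

    def __init__(self, capacity=1000, k=3):
        # A bounded deque drops the oldest samples first (assumed refresh policy).
        self.buffer = deque(maxlen=capacity)
        self.k = k

    def write(self, features, label, prediction):
        # Store how far the base model was from the observed label.
        self.buffer.append((tuple(features), label - prediction))

    def read(self, features):
        # Average residual of the k nearest stored samples; 0.0 if memory is empty.
        if not self.buffer:
            return 0.0
        query = tuple(features)
        nearest = sorted(self.buffer, key=lambda item: math.dist(item[0], query))
        nearest = nearest[: self.k]
        return sum(err for _, err in nearest) / len(nearest)


def compensated_predict(base_predict, memory, features, weight=0.5):
    # Blend the base prediction with the retrieved error, clipped to [0, 1].
    p = base_predict(features) + weight * memory.read(features)
    return min(1.0, max(0.0, p))


def base_model(features):
    # Stand-in for a deployed model that adapts slowly (assumption).
    return 0.2


memory = ErrorMemory(capacity=100, k=3)
before = compensated_predict(base_model, memory, (1.0, 0.0))
memory.write((1.0, 0.0), label=1.0, prediction=base_model((1.0, 0.0)))
after = compensated_predict(base_model, memory, (1.0, 0.0))
```

In this sketch the base model is never retrained; only the memory is refreshed, so the corrected prediction moves toward the newly observed label as soon as one error sample is written.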
Making, Managing and Experiencing ‘the Now’: Digital Media and the Compression and Pacing of ‘Real-Time’
Digital media time is commonly described as ‘real-time’. But what does this term refer to? How is ‘real-time’ made, managed and experienced? This paper explores these questions, drawing on interviews with UK-based digital media professionals. Its specific concern is with how accounts of the time of digital media indicate a particular, yet supple, temporality, which emphasises ‘the now’. I draw on current literature that explores how real-time is a temporality capable of being stretched and condensed, or variously compressed and paced. While much of this literature focuses on the technological fabrication of real-time, I explore how ‘the now’ is produced through the interplay between human and non-human practices. Through discussion of the interviews, the paper concentrates on social, cultural and affective dimensions of ‘the now’, fleshing out more technologically focused work and contributing to an understanding of a prevalent way in which time is organised in contemporary digital societies.
Error resilience and concealment techniques for high-efficiency video coding
This thesis investigates the problem of robust coding and error concealment in High Efficiency Video Coding (HEVC). After a review of the current state of the art, a simulation study of error robustness revealed that HEVC has weak protection against network losses, with a significant impact on video quality. Based on this evidence, the first contribution of this work is a new method that reduces the temporal dependencies between motion vectors, improving the decoded video quality without compromising compression efficiency. The second contribution of this thesis is a two-stage approach for reducing the mismatch of temporal predictions when video streams are received with errors or lost data. At the encoding stage, the reference pictures are dynamically distributed based on a constrained Lagrangian rate-distortion optimization to reduce the number of predictions from a single reference. At the streaming stage, a prioritization algorithm, based on spatial dependencies, selects a reduced set of motion vectors to be transmitted as side information to reduce mismatched motion predictions at the decoder. The problem of error concealment-aware video coding is also investigated to enhance the overall error robustness. A new approach based on scalable coding and optimal error concealment selection is proposed, where the optimal error concealment modes are found by simulating transmission losses, followed by a saliency-weighted optimisation. Moreover, recovery residual information is encoded using a rate-controlled enhancement layer. Both are transmitted to the decoder to be used in case of data loss. Finally, an adaptive error resilience scheme is proposed to dynamically predict the video stream that achieves the highest decoded quality for a particular loss case. A neural network selects among the various video streams, encoded with different levels of compression efficiency and error protection, based on information from the video signal, the coded stream and the transmission network. Overall, the new robust video coding methods investigated in this thesis yield consistent quality gains in comparison with other existing methods, including those implemented in the HEVC reference software. Furthermore, the proposed methods also achieve a better trade-off between coding efficiency and error robustness.
Enabling Cross-Camera Collaboration for Video Analytics on Distributed Smart Cameras
Overlapping cameras offer exciting opportunities to view a scene from
different angles, allowing for more advanced, comprehensive and robust
analysis. However, existing visual analytics systems for multi-camera streams
are mostly limited to (i) per-camera processing and aggregation and (ii)
workload-agnostic centralized processing architectures. In this paper, we
present Argus, a distributed video analytics system with cross-camera
collaboration on smart cameras. We identify multi-camera, multi-target tracking
as the primary task of multi-camera video analytics and develop a novel
technique that avoids redundant, processing-heavy identification tasks by
leveraging object-wise spatio-temporal association in the overlapping fields of
view across multiple cameras. We further develop a set of techniques to perform
these operations across distributed cameras without cloud support at low
latency by (i) dynamically ordering the camera and object inspection sequence
and (ii) flexibly distributing the workload across smart cameras, taking into
account network transmission and heterogeneous computational capacities.
Evaluation on three real-world overlapping camera datasets with two Nvidia
Jetson devices shows that Argus reduces the number of object identifications
and end-to-end latency by up to 7.13x and 2.19x (4.86x and 1.60x compared to
the state of the art), while achieving comparable tracking quality.
Comment: 18 pages, under review
Dynamic and Super-Personalized Media Ecosystem Driven by Generative AI: Unpredictable Plays Never Repeating The Same
This paper introduces a media service model that exploits artificial
intelligence (AI) video generators at the receiver end. This proposal deviates
from the traditional multimedia ecosystem, which relies entirely on in-house
production, by shifting part of the content creation onto the receiver. We
bring a semantic process into the framework, allowing the distribution network
to provide service elements that prompt the content generator, rather than
distributing encoded data of fully finished programs. The service elements
include fine-tailored text descriptions, lightweight image data of some
objects, or application programming interfaces, comprehensively referred to as
semantic sources, and the user terminal translates the received semantic data
into video frames. Empowered by the random nature of generative AI, the users
could then experience super-personalized services accordingly. The proposed
idea covers situations in which the user receives element packages from
different service providers: a sequence of packages over time, or multiple
packages at the same time. Given promised in-context coherence and content
integrity, the combinatory dynamics will amplify the service diversity,
allowing the users to always chance upon new experiences. This work
particularly aims at short-form videos and advertisements, where users
easily tire of seeing the same frame sequence every time. In
those use cases, the content provider's role will be recast from that of a
full producer to that of scripting semantic sources. Overall, this work
explores a new form of media ecosystem facilitated by receiver-embedded
generative models, featuring both random content dynamics and enhanced delivery
efficiency simultaneously.
Comment: 13 pages, 7 figures
Big Data Security (Volume 3)
After a short description of the key concepts of big data, the book explores the secrecy and security threats posed especially by cloud-based data storage. It delivers conceptual frameworks and models along with case studies of recent technology.
Generative AI-driven Semantic Communication Networks: Architecture, Technologies and Applications
Generative artificial intelligence (GAI) has emerged as a rapidly burgeoning
field demonstrating significant potential in creating diverse content
intelligently and automatically. To support such artificial
intelligence-generated content (AIGC) services, future communication systems
should fulfill much more stringent requirements (including data rate,
throughput, latency, etc.) with limited yet precious spectrum resources. To
tackle this challenge, semantic communication (SemCom), which dramatically reduces
resource consumption by extracting and transmitting semantics, has been deemed
a revolutionary communication scheme. Advanced GAI algorithms equip
SemCom with sophisticated intelligence for model training, knowledge base
construction and channel adaptation. Furthermore, GAI algorithms also play an
important role in the management of SemCom networks. In this survey, we first
give an overview of the basics of GAI and SemCom as well as the synergies of the two
technologies. In particular, the GAI-driven SemCom framework is presented, where
many GAI models for information creation, SemCom-enabled information
transmission and information effectiveness for AIGC are discussed separately.
We then delve into GAI-driven SemCom network management, involving
novel management layers, knowledge management, and resource allocation.
Finally, we envision several promising use cases, i.e., autonomous driving,
smart city, and the Metaverse, for a more comprehensive exploration.
Towards Real-World Data Streams for Deep Continual Learning
Continual Learning deals with Artificial Intelligence agents striving to learn from a never-ending
stream of data. Recently, Deep Continual Learning has focused on the design of new strategies to
endow Artificial Neural Networks with the ability to learn continuously without forgetting previous
knowledge. In fact, the learning process of any Artificial Neural Network model is well known to
lack sufficient stability to preserve existing knowledge when learning new information. This
phenomenon, called catastrophic forgetting or simply forgetting, is considered one of the main
obstacles to the design of effective Continual Learning agents. However, existing strategies designed
to mitigate forgetting have been evaluated on a restricted set of Continual Learning scenarios. The
most used one is, by far, the Class-Incremental scenario applied to object detection tasks. Even
though they drove interest in Continual Learning, Class-Incremental scenarios strongly constrain the
properties of the data stream, thus limiting their ability to model real-world environments.
The core of this thesis concerns the introduction of three Continual Learning data streams, whose
design is centered around specific properties of real-world environments. First, we propose the
Class-Incremental with Repetition scenario, which builds a data stream including both the introduction
of new concepts and the repetition of previous ones. Repetition is naturally present in many
environments and it constitutes an important source of information. Second, we formalize the
Continual Pre-Training scenario, which leverages a data stream of unstructured knowledge to keep
a pre-trained model updated over time. One important objective of this scenario is to study how to
continuously build general, robust representations that do not strongly depend on the specific task
to be solved. This is a fundamental property of real-world agents, which build cross-task knowledge
and then adapt it to specific needs. Third, we study Continual Learning scenarios where data
streams are composed of temporally correlated data. Temporal correlation is ubiquitous and lies
at the foundation of most environments we, as humans, experience during our lives. We leverage
Recurrent Neural Networks as our main model, due to their intrinsic ability to model temporal
correlations. We discovered that, when applied to recurrent models, Continual Learning strategies
behave in an unexpected manner. This highlights the limits of the current experimental validation,
mostly focused on Computer Vision tasks.
Ultimately, the introduction of new data streams contributed to deepening our understanding of
how Artificial Neural Networks learn continuously. We discovered that forgetting strongly depends
on the properties of the data stream and we observed large changes from one data stream to
another. Moreover, when forgetting is mild, we were able to effectively mitigate it with simple
strategies, or even without any specific ones. Loosening the focus on forgetting allows us to turn our
attention to other interesting problems, outlined in this thesis, like (i) separation between continual
representation learning and quick adaptation to novel tasks, (ii) robustness to unbalanced data
streams and (iii) ability to continuously learn temporal correlations. These objectives currently
defy existing strategies and will likely represent the next challenge for Continual Learning research.