621 research outputs found
Machine Learning for Multimedia Communications
Machine learning is revolutionizing the way multimedia information is processed and transmitted to users. After intensive and powerful training, some impressive efficiency/accuracy improvements have been made all over the transmission pipeline. For example, the high model capacity of the learning-based architectures enables us to accurately model the image and video behavior such that tremendous compression gains can be achieved. Similarly, error concealment, streaming strategy or even user perception modeling have widely benefited from the recent learning-oriented developments. However, learning-based algorithms often imply drastic changes to the way data are represented or consumed, meaning that the overall pipeline can be affected even though a subpart of it is optimized. In this paper, we review the recent major advances that have been proposed all across the transmission chain, and we discuss their potential impact and the research challenges that they raise.
From Capture to Display: A Survey on Volumetric Video
Volumetric video, which offers immersive viewing experiences, is gaining
increasing prominence. With its six degrees of freedom, it provides viewers
with greater immersion and interactivity compared to traditional videos.
Despite their potential, volumetric video services pose significant
challenges. This survey conducts a comprehensive review of the existing
literature on volumetric video. We first provide a general framework of
volumetric video services, followed by a discussion on prerequisites for
volumetric video, encompassing representations, open datasets, and quality
assessment metrics. Then we delve into the current methodologies for each stage
of the volumetric video service pipeline, detailing capturing, compression,
transmission, rendering, and display techniques. Lastly, we explore various
applications enabled by this pioneering technology and we present an array of
research challenges and opportunities in the domain of volumetric video
services. This survey aspires to provide a holistic understanding of this
burgeoning field and shed light on potential future research trajectories,
aiming to bring the vision of volumetric video to fruition.
Codage réseau pour des applications multimédias avancées (Network Coding for Advanced Multimedia Applications)
Network coding is a paradigm that allows an efficient use of the capacity of communication networks. It maximizes the throughput in a multi-hop multicast communication and reduces the delay. In this thesis, we focus our attention on the integration of the network coding framework into multimedia applications, and in particular into advanced systems that provide enhanced video services to the users. Our contributions concern several instances of advanced multimedia communications: an efficient framework for transmission of a live stream making joint use of network coding and multiple description coding; a novel transmission strategy for lossy wireless networks that guarantees a trade-off between loss resilience and short delay based on a rate-distortion optimized scheduling of the video frames, which we also extended to the case of interactive multi-view streaming; a distributed social caching system that, using network coding in conjunction with the knowledge of the users' preferences in terms of views, is able to select a replication scheme that provides a high video quality by accessing only other members of the social group, without incurring the access cost associated with a connection to a central server and without exchanging large tables of metadata to keep track of the replicated parts; and, finally, a study on using blind source separation techniques to reduce the overhead incurred by network coding schemes based on error-detecting techniques such as parity coding and message digest generation. All our contributions are aimed at using network coding to enhance the quality of video transmission in terms of perceived distortion and delay.
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of
application areas and domains that this technology promises to serve.
Typically, fundamental design decisions involved in big data systems design
include choosing appropriate storage and computing infrastructures. In this age
of heterogeneous systems that integrate different technologies for an optimized
solution to a specific real-world problem, big data systems are not an exception
to any such rule. As far as the storage aspect of any big data system is
concerned, the primary facet in this regard is a storage infrastructure and
NoSQL seems to be the right technology that fulfills its requirements. However,
every big data application has variable data characteristics and thus, the
corresponding data fits into a different data model. This paper presents
feature and use case analysis and comparison of the four main data models
namely document oriented, key value, graph and wide column. Moreover, a feature
analysis of 80 NoSQL solutions has been provided, elaborating on the criteria
and points that a developer must consider while making a possible choice.
Typically, big data storage needs to communicate with the execution engine and
other processing and visualization technologies to create a comprehensive
solution. This brings the second facet of big data storage, big data file
formats, into the picture. The second half of the research paper compares the
advantages, shortcomings and possible use cases of available big data file
formats for Hadoop, which is the foundation for most big data computing
technologies. Decentralized storage and blockchain are seen as the next
generation of big data storage, and their challenges and future prospects are
also discussed.
Reducing Internet Latency : A Survey of Techniques and their Merit
Bob Briscoe, Anna Brunstrom, Andreas Petlund, David Hayes, David Ros, Ing-Jyh Tsang, Stein Gjessing, Gorry Fairhurst, Carsten Griwodz, Michael Welzl. Peer reviewed. Preprin
Dünaamiline kiiruse jaotamine interaktiivses mitmevaatelises video vaatevahetuse ennustamineses (Dynamic rate allocation for interactive multi-view video with view-switch prediction)
In Interactive Multi-View Video (IMVV), the video is captured by a number of
cameras positioned in an array, and the camera views are transmitted to users. The
user can interact with the transmitted video content by choosing viewpoints (views
from different cameras in the array), expecting minimum transmission delay while
switching among views. View switching delay is one of the primary concerns
addressed in this thesis, whose contribution is to minimize the transmission delay
of the new view-switch frame through a novel process for selecting the predicted
view and compressing it with transmission efficiency in mind. The work mainly
considers real-time IMVV streaming, and view switching is modeled as a discrete
Markov chain whose transition probabilities are derived from a Zipf distribution,
which provides the information for view switch prediction. To eliminate the
Round-Trip Time (RTT) transmission delay, Quantization Parameters (QP) are
adaptively allocated to the remaining redundant transmitted frames to keep the
view switching time to a minimum, trading off video quality over the RTT time
span. The experimental results show that the proposed method outperforms existing
methods on PSNR and view switching delay, for better viewing quality.
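
As a rough illustration of the view-switch model described in this abstract, the sketch below builds a discrete Markov chain over camera views with Zipf-shaped transition probabilities. The abstract does not say how the Zipf ranks are assigned, so the ranking by camera distance (rank 1 means staying on the current view), the exponent `s`, and the function names `zipf_transition_matrix` and `likely_switch_targets` are all assumptions made for this example, not the thesis's actual method.

```python
def zipf_transition_matrix(num_views, s=1.0):
    """Return a row-stochastic matrix P, where P[i][j] is the probability
    of switching from view i to view j.

    Assumption (not from the abstract): target views are ranked by camera
    distance |i - j| + 1, and a rank r gets weight 1 / r**s (Zipf law).
    Views at the same distance share the same rank.
    """
    matrix = []
    for i in range(num_views):
        weights = [1.0 / (abs(i - j) + 1) ** s for j in range(num_views)]
        total = sum(weights)
        matrix.append([w / total for w in weights])  # normalize each row
    return matrix


def likely_switch_targets(matrix, current_view, k=2):
    """Return the k views (other than the current one) that the chain
    considers most likely to be selected next."""
    row = matrix[current_view]
    candidates = [j for j in range(len(row)) if j != current_view]
    # list.sort is stable, so ties keep ascending view index order.
    candidates.sort(key=lambda j: -row[j])
    return candidates[:k]


if __name__ == "__main__":
    P = zipf_transition_matrix(num_views=5)
    print(likely_switch_targets(P, current_view=2, k=2))
```

Under these assumptions, a sender could use the returned candidates to decide which neighbouring views to transmit redundantly, and at what quality, while a view-switch request is still in flight.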