Investigating the use of pretrained convolutional neural network on cross-subject and cross-dataset EEG emotion recognition
The electroencephalogram (EEG) is highly attractive in emotion recognition studies due to its resistance to deceptive actions of humans. This is one of the most significant advantages of brain signals in comparison to visual or speech signals in the emotion recognition context. A major challenge in EEG-based emotion recognition is that EEG recordings exhibit varying distributions for different people as well as for the same person at different time instances. This nonstationary nature of EEG limits its accuracy when subject independence is the priority. The aim of this study is to increase the subject-independent recognition accuracy by exploiting pretrained state-of-the-art Convolutional Neural Network (CNN) architectures. Unlike similar studies that extract spectral band power features from the EEG readings, raw EEG data is used in our study after applying windowing, pre-adjustments and normalization. Removing manual feature extraction from the training system avoids the risk of eliminating hidden features in the raw data and helps leverage the deep neural network's power in uncovering unknown features. To improve the classification accuracy further, a median filter is used to eliminate false detections along a prediction interval of emotions. This method yields a mean cross-subject accuracy of 86.56% and 78.34% on the Shanghai Jiao Tong University Emotion EEG Dataset (SEED) for two and three emotion classes, respectively. It also yields a mean cross-subject accuracy of 72.81% on the Database for Emotion Analysis using Physiological Signals (DEAP) and 81.8% on the Loughborough University Multimodal Emotion Dataset (LUMED) for two emotion classes. Furthermore, the recognition model trained on the SEED dataset was tested with the DEAP dataset, yielding a mean prediction accuracy of 58.1% across all subjects and emotion classes. Results show that, in terms of classification accuracy, the proposed approach is superior to, or on par with, the reference subject-independent EEG emotion recognition studies identified in the literature, and has limited complexity due to the elimination of the need for feature extraction.
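As a rough illustration of the pre-processing and post-processing steps described above, the following sketch windows and normalises raw multi-channel EEG and then median-filters a sequence of per-window class predictions. It is a minimal sketch in Python: the window length, step size, filter kernel, and the `make_windows`/`smooth_predictions` helpers are illustrative assumptions, not the paper's actual parameters or CNN.

```python
# Minimal sketch: window raw EEG, z-score normalise each window, and
# median-filter per-window predictions. All parameters are hypothetical.
import numpy as np
from scipy.signal import medfilt

def make_windows(eeg, win_len=256, step=128):
    """Slice a (channels, samples) EEG recording into overlapping windows."""
    windows = []
    for start in range(0, eeg.shape[1] - win_len + 1, step):
        w = eeg[:, start:start + win_len]
        # Per-window z-score normalisation (one plausible "pre-adjustment").
        w = (w - w.mean(axis=1, keepdims=True)) / (w.std(axis=1, keepdims=True) + 1e-8)
        windows.append(w)
    return np.stack(windows)

def smooth_predictions(labels, kernel=5):
    """Median-filter a sequence of per-window class labels to suppress
    isolated false detections along the prediction interval."""
    return medfilt(np.asarray(labels), kernel_size=kernel).astype(int)

# Example: two spurious label flips are removed by the median filter.
noisy = [0, 0, 0, 1, 0, 0, 0, 0, 2, 2, 2, 0, 2, 2, 2]
print(smooth_predictions(noisy))  # -> [0 0 0 0 0 0 0 0 2 2 2 2 2 2 2]
```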
Quality-aware adaptive delivery of multi-view video
Advances in video coding and networking technologies have paved the way for Multi-View Video (MVV) streaming. However, large amounts of data and dynamic network conditions result in frequent network congestion, which may prevent video packets from being delivered on time. As a consequence, the 3D viewing experience may be degraded significantly, unless quality-aware adaptation methods are deployed. No existing research work discusses MVV adaptation decision strategies or provides a detailed analysis of a dynamic network environment. This work addresses these issues for MVV streaming over HTTP for emerging multi-view displays. The effects of various adaptation decision strategies are evaluated and, as a result, a new quality-aware adaptation method is designed. The proposed method benefits from layer-based video coding in such a way that high Quality of Experience (QoE) is maintained in a cost-effective manner. Experimental results on MVV streaming using the proposed strategy show that the perceptual 3D video quality, under adverse network conditions, is enhanced significantly as a result of the proposed quality-aware adaptation.
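As a rough sketch of how a layer-based adaptation decision of this kind could work, the Python snippet below keeps every view's base layer and then greedily adds enhancement layers by quality-gain-per-bit until a throughput budget is exhausted. The `adapt_layers` helper, the per-layer rates, and the quality gains are all illustrative assumptions, not the paper's actual method or measurements.

```python
# Minimal sketch of a quality-aware layer selection under a bandwidth budget,
# assuming layered (base + enhancement) coding per view.
def adapt_layers(views, budget_kbps):
    """Keep every base layer, then greedily add enhancement layers with the
    best quality-gain-per-bit until the throughput budget is exhausted."""
    chosen = {v: 1 for v in views}               # layer count per view (base = 1)
    spent = sum(views[v][0][0] for v in views)   # base-layer rates
    # Candidate enhancement layers across all views, best value first.
    cands = [(gain / rate, v, i)
             for v, layers in views.items()
             for i, (rate, gain) in enumerate(layers[1:], start=1)]
    for _, v, i in sorted(cands, reverse=True):
        rate = views[v][i][0]
        # Only add layer i if layer i-1 is already selected (layer dependency).
        if chosen[v] == i and spent + rate <= budget_kbps:
            chosen[v] = i + 1
            spent += rate
    return chosen, spent

# views: {name: [(rate_kbps, quality_gain), ...]} from base to top layer.
views = {"V0": [(800, 30.0), (400, 2.0), (600, 1.5)],
         "V1": [(800, 30.0), (400, 2.5)]}
print(adapt_layers(views, budget_kbps=2500))  # -> ({'V0': 2, 'V1': 2}, 2400)
```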
Virtual transcendence experiences: Exploring technical and design challenges in multi-sensory environments
In this paper, we introduce the concept of Virtual Transcendence Experience (VTE) as a response to the interactions of several users sharing several immersive experiences through different media channels. To that end, we review the current body of knowledge that has led to the development of a VTE system. This is followed by a discussion of current technical and design challenges that could support the implementation of this concept. This discussion has informed the VTE framework (VTEf), which integrates different layers of experiences, including the role of each user and the technical challenges involved. We conclude this paper with suggestions for two scenarios and recommendations for the implementation of a system that could support VTEs.
Multi-view video coding via virtual view generation
In this paper, a multi-view video coding method via generation of virtual picture sequences is proposed. Pictures are synthesized for the sake of better exploitation of the redundancies between neighbouring views in a multi-view sequence. Pictures are synthesized through a 3D warping method to estimate certain views in a multi-view set. Depth maps and associated colour video sequences are used for view generation and tests. The H.264/AVC-based MVC draft software is used for coding colour videos and depth maps, as well as certain views which are predicted from the virtually generated views. Results for coding these views with the proposed method are compared against the reference H.264/AVC simulcast method under low-delay coding scenarios. The rate-distortion performance of the proposed method outperforms that of the reference method at all bit-rates.
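A minimal sketch of depth-image-based warping in the spirit of the 3D warping step described above, under the simplifying assumption of rectified views, so that warping reduces to a per-pixel horizontal disparity d = f·B/Z. The `warp_view` helper and the camera parameters are hypothetical; the paper's actual warping and hole handling are not reproduced here.

```python
# Minimal sketch: forward-warp a colour view to a horizontally shifted virtual
# viewpoint using its depth map. Rectified cameras are assumed, so the warp is
# a per-pixel horizontal disparity d = f * baseline / depth.
import numpy as np

def warp_view(color, depth, f=1000.0, baseline=0.05):
    """color: (H, W, 3) image, depth: (H, W) metric depth map.
    Unfilled pixels stay zero (disocclusion holes, left unhandled here)."""
    h, w = depth.shape
    virtual = np.zeros_like(color)
    disparity = np.round(f * baseline / np.maximum(depth, 1e-3)).astype(int)
    for y in range(h):
        for x in range(w):
            xv = x - disparity[y, x]
            if 0 <= xv < w:
                # Later writes overwrite earlier ones; a real implementation
                # would z-buffer so the nearest surface wins.
                virtual[y, xv] = color[y, x]
    return virtual
```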
Predicting head trajectories in 360° virtual reality videos
In this paper, a fixation-prediction-based saliency algorithm is used to predict the head movements of viewers watching virtual reality (VR) videos, by modelling the relationship between fixation predictions and recorded head movements. The saliency algorithm is applied to viewings faithfully recreated from recorded head movements. Spherical cross-correlation analysis is performed between predicted attention centres and actual viewing centres in order to identify prevalent lengths of predictable attention and how early they can be predicted. The results show that fixation-prediction-based saliency analysis correlates with head movements only for limited durations. Therefore, further classification of the durations where saliency analysis is predictive is required.
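As a simplified stand-in for the spherical cross-correlation analysis mentioned above, the sketch below scores each temporal lag by the mean cosine of the angular separation between predicted attention centres and recorded viewing centres, treated as unit vectors on the sphere. The `to_unit`/`lagged_similarity` helpers and the yaw/pitch parameterisation are illustrative assumptions, not the paper's exact analysis.

```python
# Minimal sketch: lagged spherical similarity between predicted attention
# centres and recorded head directions.
import numpy as np

def to_unit(yaw, pitch):
    """Convert yaw/pitch arrays (radians) to unit direction vectors, (N, 3)."""
    return np.stack([np.cos(pitch) * np.cos(yaw),
                     np.cos(pitch) * np.sin(yaw),
                     np.sin(pitch)], axis=1)

def lagged_similarity(pred_dirs, head_dirs, max_lag=30):
    """Mean cosine similarity for each lag, with the prediction leading the
    recorded head movement by `lag` frames. Higher = more predictive."""
    scores = {}
    for lag in range(max_lag + 1):
        p = pred_dirs[:len(pred_dirs) - lag]
        h = head_dirs[lag:]
        scores[lag] = float(np.mean(np.sum(p * h, axis=1)))
    return scores
```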
No-reference depth map quality evaluation model based on depth map edge confidence measurement in immersive video applications
When it comes to evaluating the perceptual quality of digital media for overall quality of experience assessment in immersive video applications, two main approaches typically stand out: subjective and objective quality evaluation. On one hand, subjective quality evaluation offers the best representation of perceived video quality assessed by real viewers. On the other hand, it consumes a significant amount of time and effort, due to the involvement of real users with lengthy and laborious assessment procedures. Thus, it is essential that an objective quality evaluation model is developed. The speed-up advantage offered by an objective quality evaluation model, which can predict the quality of rendered virtual views based on the depth maps used in the rendering process, allows for faster quality assessments for immersive video applications. This is particularly important given the lack of a suitable reference or ground truth for comparing the available depth maps, especially when live content services are offered in those applications. This paper presents a no-reference depth map quality evaluation model based on a proposed depth map edge confidence measurement technique to assist with accurately estimating the quality of rendered (virtual) views in immersive multi-view video content. The model is applied for depth image-based rendering in multi-view video format, providing comparable evaluation results to those existing in the literature, and often exceeding their performance.
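A minimal sketch in the spirit of the edge-confidence idea above: depth edges that coincide with a nearby colour edge are counted as confident, and the score is the confident fraction, requiring no reference depth map. The `edge_confidence` helper, the Sobel detector, and the thresholds are illustrative assumptions rather than the paper's exact measurement technique.

```python
# Minimal sketch: no-reference depth edge confidence, scored as the fraction
# of depth-edge pixels supported by a nearby colour edge.
import numpy as np
from scipy import ndimage

def edge_confidence(depth, gray, t_depth=10.0, t_color=20.0):
    """depth: (H, W) depth map, gray: (H, W) luma of the associated colour view."""
    d_mag = np.hypot(ndimage.sobel(depth.astype(float), axis=0),
                     ndimage.sobel(depth.astype(float), axis=1))
    c_mag = np.hypot(ndimage.sobel(gray.astype(float), axis=0),
                     ndimage.sobel(gray.astype(float), axis=1))
    d_mask = d_mag > t_depth
    # Dilate the colour-edge mask so slight misalignment is tolerated.
    c_mask = ndimage.binary_dilation(c_mag > t_color, iterations=2)
    if not d_mask.any():
        return 1.0  # no depth edges to contradict the colour image
    return float(np.count_nonzero(d_mask & c_mask) / np.count_nonzero(d_mask))
```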
Adaptive delivery of immersive 3D multi-view video over the Internet
The increase in Internet bandwidth and the developments in 3D video technology have paved the way for the delivery of 3D Multi-View Video (MVV) over the Internet. However, large amounts of data and dynamic network conditions result in frequent network congestion, which may prevent video packets from being delivered on time. As a consequence, the 3D video experience may well be degraded unless content-aware precautionary mechanisms and adaptation methods are deployed. In this work, a novel adaptive MVV streaming method is introduced which addresses future-generation 3D immersive MVV experiences with multi-view displays. When the user experiences network congestion that makes it necessary to perform adaptation, the rate-distortion-optimal set of views pre-determined by the server is truncated from the delivered MVV streams. In order to maintain a high Quality of Experience (QoE) during frequent network congestion, the proposed method involves the calculation of low-overhead additional metadata that is delivered to the client. The proposed adaptive 3D MVV streaming solution is tested using the MPEG Dynamic Adaptive Streaming over HTTP (MPEG-DASH) standard. Extensive objective and subjective evaluations are presented, showing that the proposed method provides significant quality enhancement under adverse network conditions.
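A rough sketch of the client-side adaptation step, assuming the server has pre-computed a rate-distortion-optimal truncation order and shipped it as the low-overhead metadata mentioned above. The `truncate_views` helper, view names, and bitrates are illustrative, not from the MPEG-DASH implementation described in the paper.

```python
# Minimal sketch: drop views in the server-advised order until the stream
# fits the currently measured throughput.
def truncate_views(view_rates, truncation_order, throughput_kbps):
    """view_rates: {view: kbps}; truncation_order: least-important view first.
    Returns the surviving view set and its total rate."""
    active = set(view_rates)
    total = sum(view_rates.values())
    for view in truncation_order:
        if total <= throughput_kbps:
            break
        active.discard(view)
        total -= view_rates[view]
    return active, total

view_rates = {"V0": 1200, "V1": 900, "V2": 900, "V3": 1200}
order = ["V2", "V1", "V3"]                 # metadata from the server
print(truncate_views(view_rates, order, throughput_kbps=3000))
# -> ({'V0', 'V3'}, 2400)
```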
Error concealment-aware encoding for robust video transmission
In this paper, an error concealment-aware encoding scheme is proposed to improve the quality of decoded video in broadcast environments prone to transmission errors and data loss. The proposed scheme is based on a scalable coding approach where the best error concealment (EC) methods to be used at the decoder are optimally determined at the encoder and signalled to the decoder through SEI messages. Such optimal EC modes are found by simulating transmission losses, followed by a Lagrangian optimisation of the signalling rate vs. EC distortion cost. A generalised saliency-weighted distortion is used, and the residue between coded frames and their EC substitutes is encoded using a rate-controlled enhancement layer. In case of data loss, the decoder uses the signalling information to improve the reconstruction quality. The simulation results show that the proposed method achieves consistent quality gains in comparison with other reference methods and previous works. Using only the EC mode signalling, i.e., without any residue transmitted in the enhancement layer, an average PSNR gain of up to 2.95 dB is achieved, while using the full EC-aware scheme, i.e., including the residue encoded in the enhancement layer, the proposed scheme outperforms other comparable methods, with PSNR gains of up to 3.79 dB.
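The encoder-side mode decision described above can be illustrated with a small Lagrangian cost comparison: simulate the loss, measure the distortion each EC mode would leave at the decoder, and pick the mode minimising J = D + λ·R, where R is the rate needed to signal the mode. The `select_ec_mode` helper and all numbers are hypothetical; the paper's saliency-weighted distortion and rate control are not reproduced.

```python
# Minimal sketch: pick the error-concealment mode with minimum Lagrangian
# cost J = D + lambda * R. Distortions, rates and lambda are hypothetical.
def select_ec_mode(ec_distortions, ec_rates, lam=0.1):
    """ec_distortions/ec_rates: per-mode simulated EC distortion and the bits
    needed to signal the mode (e.g., in an SEI message)."""
    costs = [d + lam * r for d, r in zip(ec_distortions, ec_rates)]
    return min(range(len(costs)), key=costs.__getitem__)

# Example: frame copy vs. motion-compensated EC vs. spatial interpolation.
d = [120.0, 80.0, 95.0]   # simulated EC distortion per mode
r = [2, 6, 4]             # signalling bits per mode
print(select_ec_mode(d, r, lam=0.1))  # -> 1 (motion-compensated EC)
```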
A two-stage approach for robust HEVC coding and streaming
The increased compression ratios achieved by the High Efficiency Video Coding (HEVC) standard lead to reduced robustness of coded streams, with increased susceptibility to network errors and consequent video quality degradation. This paper proposes a method based on a two-stage approach to improve the error robustness of HEVC streaming, by reducing temporal error propagation in case of frame loss. The prediction mismatch that occurs at the decoder after frame loss is reduced through the following two stages: (i) at the encoding stage, the reference pictures are dynamically selected based on constraining conditions and Lagrangian optimisation, which distributes the use of reference pictures by reducing the number of prediction units (PUs) that depend on a single reference; (ii) at the streaming stage, a motion vector (MV) prioritisation algorithm, based on spatial dependencies, selects an optimal subset of MVs to be transmitted, redundantly, as side information to reduce mismatched MV predictions at the decoder. The simulation results show that the proposed method significantly reduces the effect of temporal error propagation. Compared to the reference HEVC, the proposed reference picture selection method is able to improve the video quality at low packet loss rates (e.g., 1%) using the same bitrate, and achieves quality gains of up to 2.3 dB at a 10% packet loss ratio. It is shown, for instance, that the redundant MVs are able to boost the performance, achieving quality gains of 3 dB when compared to the reference HEVC, at the cost of a 4% increase in total bitrate.
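As a rough illustration of the streaming-stage MV prioritisation, the sketch below ranks motion vectors by how many spatially dependent prediction units rely on them and redundantly transmits the top subset that fits a small side-information budget. The `prioritise_mvs` helper, dependency counts, and bit costs are illustrative assumptions, not the paper's optimisation.

```python
# Minimal sketch: choose the MVs with the most dependent PUs for redundant
# transmission, subject to a side-information bit budget.
def prioritise_mvs(mv_dependencies, mv_bits, budget_bits):
    """mv_dependencies: {mv_id: number of spatially dependent PUs};
    mv_bits: {mv_id: bits to transmit the MV redundantly}."""
    chosen, spent = [], 0
    for mv in sorted(mv_dependencies, key=mv_dependencies.get, reverse=True):
        if spent + mv_bits[mv] <= budget_bits:
            chosen.append(mv)
            spent += mv_bits[mv]
    return chosen

deps = {"mv3": 7, "mv1": 5, "mv7": 2}
bits = {"mv3": 24, "mv1": 20, "mv7": 16}
print(prioritise_mvs(deps, bits, budget_bits=48))  # -> ['mv3', 'mv1']
```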