2,208 research outputs found
Fast Depth and Inter Mode Prediction for Quality Scalable High Efficiency Video Coding
The scalable high efficiency video coding (SHVC) standard is an extension of high efficiency video coding (HEVC) that introduces multiple layers and inter-layer prediction, and thus significantly increases coding complexity on top of the already complicated HEVC encoder. In inter prediction for quality SHVC, a coding tree unit can be recursively split into four depth levels, and at each depth level the encoder evaluates merge mode, inter2Nx2N, inter2NxN, interNx2N, interNxN, inter2NxnU, inter2NxnD, internLx2N, internRx2N, intra modes and the inter-layer reference (ILR) mode to determine the best possible mode. This exhaustive search achieves the highest coding efficiency, but also results in very high coding complexity; it is therefore crucial to improve coding speed while maintaining coding efficiency. In this research, we propose a new depth level and inter mode prediction algorithm for quality SHVC. First, the depth level candidates are predicted based on inter-layer correlation, spatial correlation and the degree of that correlation. Second, for a given depth candidate, mode prediction is divided into square and non-square mode prediction. Third, in square mode prediction, ILR and merge modes are predicted according to depth correlation, and are early terminated depending on whether the residual distribution follows a Gaussian distribution. Moreover, the ILR, merge and inter2Nx2N modes are early terminated based on significant differences in rate-distortion (RD) costs. Fourth, if the early termination condition is not satisfied, non-square modes are further predicted based on significant differences in the expected values of residual coefficients. Finally, inter-layer and spatial correlations are combined with the residual distribution to decide whether to early terminate depth selection. Experimental results demonstrate that, on average, the proposed algorithm achieves a time saving of 71.14%, with a bit rate increase of 1.27%.
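The abstract does not specify how the Gaussianity of the residual distribution is tested, so the following is only a minimal sketch of the idea: early-terminate a mode decision when the residual samples look Gaussian, here judged by sample skewness and excess kurtosis (both 0 for a true Gaussian). The function name and the tolerance thresholds are hypothetical illustrations, not the paper's method.

```python
import math

def residuals_look_gaussian(residuals, skew_tol=0.5, kurt_tol=1.0):
    """Hypothetical early-termination test: treat a residual block as
    Gaussian when its sample skewness and excess kurtosis are both close
    to 0 (their values for a true Gaussian distribution)."""
    n = len(residuals)
    mean = sum(residuals) / n
    var = sum((x - mean) ** 2 for x in residuals) / n
    if var == 0:
        return True  # constant residual: trivially terminate
    sd = math.sqrt(var)
    skew = sum((x - mean) ** 3 for x in residuals) / (n * sd ** 3)
    excess_kurt = sum((x - mean) ** 4 for x in residuals) / (n * var ** 2) - 3.0
    return abs(skew) < skew_tol and abs(excess_kurt) < kurt_tol
```

In an encoder, a check like this would gate whether the remaining (non-square) modes are evaluated at all; heavier-tailed or skewed residuals would fall through to the full mode search.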
Differentially Private Multi-Agent Planning for Logistic-like Problems
Planning is one of the main approaches used to improve agents' working
efficiency by making plans beforehand. However, during planning, agents face
the risk of having their private information leaked. This paper proposes a
novel strong privacy-preserving planning approach for logistic-like problems.
This approach outperforms existing approaches by addressing two challenges: 1)
simultaneously achieving strong privacy, completeness and efficiency, and 2)
addressing communication constraints. These two challenges are prevalent in
many real-world applications including logistics in military environments and
packet routing in networks. To tackle these two challenges, our approach adopts
the differential privacy technique, which can both guarantee strong privacy and
control communication overhead. To the best of our knowledge, this paper is the
first to apply differential privacy to the field of multi-agent planning as a
means of preserving the privacy of agents for logistic-like problems. We
theoretically prove the strong privacy and completeness of our approach and
empirically demonstrate its efficiency. We also theoretically analyze the
communication overhead of our approach and illustrate how differential privacy
can be used to control it.
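The abstract does not detail which differential-privacy mechanism the planner uses, but the canonical building block for numeric releases is the Laplace mechanism, sketched below: an agent publishes a statistic (here a count) perturbed with Laplace noise of scale sensitivity/epsilon. The function names and parameter values are illustrative assumptions, not the paper's construction.

```python
import random

def laplace_noise(scale, rng=random):
    """Draw from Laplace(0, scale) as the difference of two
    independent exponential variates with rate 1/scale."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(true_count, sensitivity=1.0, epsilon=0.5, rng=random):
    """epsilon-differentially-private release of a count via the
    classic Laplace mechanism: noise scale = sensitivity / epsilon."""
    return true_count + laplace_noise(sensitivity / epsilon, rng)
```

A smaller epsilon means stronger privacy but noisier releases; in a planning setting this trade-off is also what lets the designer bound how much useful (and hence transmitted) information each message carries, which is one way noise can double as a communication-control knob.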
Multilayered assembly of poly(vinylidene fluoride) and poly(methyl methacrylate) for achieving multi-shape memory effects
Learning Two-Stream CNN for Multi-Modal Age-related Macular Degeneration Categorization
This paper tackles automated categorization of Age-related Macular
Degeneration (AMD), a common macular disease among people over 50. Previous
research efforts mainly focus on AMD categorization with a single-modal input,
be it a color fundus image or an OCT image. By contrast, we consider AMD
categorization given a multi-modal input, a direction that is clinically
meaningful yet mostly unexplored. Contrary to the prior art that takes a
traditional approach of feature extraction plus classifier training that cannot
be jointly optimized, we opt for end-to-end multi-modal Convolutional Neural
Networks (MM-CNN). Our MM-CNN is instantiated by a two-stream CNN, with
spatially-invariant fusion to combine information from the fundus and OCT
streams. In order to visually interpret the contribution of the individual
modalities to the final prediction, we extend the class activation mapping
(CAM) technique to the multi-modal scenario. For effective training of MM-CNN,
we develop two data augmentation methods. One is GAN-based fundus / OCT image
synthesis, with our novel use of CAMs as conditional input of a high-resolution
image-to-image translation GAN. The other method is Loose Pairing, which pairs
a fundus image and an OCT image on the basis of their classes instead of eye
identities. Experiments on a clinical dataset consisting of 1,099 color fundus
images and 1,290 OCT images acquired from 1,099 distinct eyes verify the
effectiveness of the proposed solution for multi-modal AMD categorization.
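Of the two augmentation methods, Loose Pairing is simple enough to sketch: pair a fundus image with any OCT image of the same class, rather than insisting both come from the same eye. The data layout below ((image_id, class_label) tuples) and the function name are assumptions for illustration only.

```python
import random
from collections import defaultdict

def loose_pairs(fundus_items, oct_items, rng=None):
    """Loose Pairing (sketch): match each fundus image with a randomly
    chosen OCT image that shares its class label, ignoring eye identity.
    Items are (image_id, class_label) tuples."""
    rng = rng or random.Random(0)
    oct_by_class = defaultdict(list)
    for img_id, label in oct_items:
        oct_by_class[label].append(img_id)
    pairs = []
    for img_id, label in fundus_items:
        candidates = oct_by_class.get(label)
        if candidates:  # skip fundus images with no same-class OCT partner
            pairs.append((img_id, rng.choice(candidates), label))
    return pairs
```

Because pairing is by class rather than by eye, the number of usable training pairs grows roughly with the product of the per-class modality counts instead of being capped by the number of eyes imaged in both modalities.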