Modelling of content-aware indicators for effective determination of shot boundaries in compressed MPEG videos
In this paper, a content-aware approach is proposed to design multiple test conditions for shot cut detection, which are organized into a multiple-phase decision tree for abrupt cut detection and a finite state machine for dissolve detection. In comparison with existing approaches, our algorithm is characterized by two categories of content difference indicators and testing. While the first category indicates the content changes that are directly used for shot cut detection, the second category indicates the contexts under which the content change occurs. As a result, indications of frame differences are tested with context awareness to make the detection of shot cuts adaptive to both content and context changes. Evaluations announced by TRECVID 2007 indicate that our proposed algorithm achieved performance comparable to that of machine learning approaches, while using a simpler feature set and straightforward design strategies. This validates the effectiveness of modelling content-aware indicators for decision making, and provides a good alternative to conventional approaches to this topic.
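The idea of testing a frame difference against its context can be illustrated with a minimal sketch. This is not the authors' decision tree or their compressed-domain feature set. It assumes a hypothetical list `diffs` of per-frame content-difference scores in [0, 1], and the parameters `window`, `k` and `floor` are illustrative choices only.

```python
from statistics import mean, pstdev


def detect_cuts(diffs, window=5, k=3.0, floor=0.2):
    """Return indices i where diffs[i] looks like an abrupt cut.

    diffs[i] is a hypothetical content-difference score between frame i
    and frame i + 1 (for example a normalised histogram distance).
    """
    cuts = []
    for i, d in enumerate(diffs):
        # Context: neighbouring differences, excluding the candidate itself.
        lo, hi = max(0, i - window), min(len(diffs), i + window + 1)
        context = diffs[lo:i] + diffs[i + 1:hi]
        if not context:
            continue
        # Adaptive threshold: high local activity raises the bar, so a busy
        # scene needs a larger jump than a static one to count as a cut.
        threshold = max(floor, mean(context) + k * pstdev(context))
        if d > threshold:
            cuts.append(i)
    return cuts


quiet_scene = [0.02, 0.03, 0.02, 0.9, 0.03, 0.02, 0.02]
busy_scene = [0.5, 0.6, 0.55, 0.6, 0.5, 0.58, 0.52]
print(detect_cuts(quiet_scene))
print(detect_cuts(busy_scene))
```

In this sketch the same difference value can be accepted in a static scene and rejected in a high-motion one, which is the sense in which the test adapts to context as well as content.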
End-to-end Projector Photometric Compensation
Projector photometric compensation aims to modify a projector input image
such that it can compensate for disturbance from the appearance of projection
surface. In this paper, for the first time, we formulate the compensation
problem as an end-to-end learning problem and propose a convolutional neural
network, named CompenNet, to implicitly learn the complex compensation
function. CompenNet consists of a UNet-like backbone network and an autoencoder
subnet. Such an architecture encourages rich multi-level interactions between the
camera-captured projection surface image and the input image, and thus captures
both photometric and environment information of the projection surface. In
addition, the visual details and interaction information are carried to deeper
layers along the multi-level skip convolution layers. The architecture is of
particular importance for the projector compensation task, for which only a
small training dataset is allowed in practice. Another contribution we make is
a novel evaluation benchmark, which is independent of system setup and thus
quantitatively verifiable. To the best of our knowledge, no such benchmark was
previously available, because conventional evaluation requires the
hardware system to actually project the final results. Our key idea, motivated
by our end-to-end problem formulation, is to use a reasonable surrogate to
avoid the projection process and so remain setup-independent. Our method is
evaluated carefully on the benchmark, and the results show that our end-to-end
learning solution outperforms state-of-the-art methods both qualitatively and
quantitatively by a significant margin.
Comment: To appear in the 2019 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR). Source code and dataset are available at
https://github.com/BingyaoHuang/compenne
Does the Soul's sleep generate the Reason? The symbol's compensatory aspect at quantum-psychoid matrix with regard to the Reason's unilateralism. Excerpt by.
A symbol does not explain, says Jung. Rather, it lies beyond the dichotomy of binary logic, which seeks to perpetuate the limiting and restrictive diktat of tertium non datur and so forces a choice between two possibilities that in any case lie on the same nomological axis.
Channelized Hotelling observers for signal detection in stack-mode reading of volumetric images on medical displays with slow response time
Volumetric medical images are commonly read in stack-browsing mode. However, previous studies suggest that the slow temporal response of medical liquid crystal displays may degrade diagnostic accuracy (lesion detectability) at browsing rates as low as 10 frames per second (fps). Recently, a multi-slice channelized Hotelling observer (msCHO) model was proposed to estimate the detection performance in 3D images. This implementation of the msCHO restricted the analysis to the luminance of a display pixel at the end of the frame time (end-of-frame luminance) while ignoring the luminance transition within the frame time (intra-frame luminance). Such an approach cannot distinguish, for example, the common case of two displays whose temporal luminance profiles differ but whose end-of-frame luminance levels are the same. In order to overcome this limitation of the msCHO, we propose a new upsampled msCHO (umsCHO) which acts on images obtained using both the intra-frame and the end-of-frame luminance information. The two models are compared on a set of synthesized 3D images for a range of browsing rates (16.67, 25 and 50 fps). Our results demonstrate that, depending on the details of the luminance transition profiles, neglecting the intra-frame luminance information may lead to over- or underestimation of lesion detectability. Therefore, we argue that using the umsCHO rather than the msCHO model is more appropriate for estimating the detection performance in the stack-browsing mode.
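The limitation described above can be illustrated with a small sketch. It is not the msCHO or umsCHO model itself. The two transition profiles, the luminance values and the number of sub-frame samples `n` are all hypothetical, chosen only to show that two displays can agree on end-of-frame luminance while differing within the frame.

```python
def linear_ramp(start, target, n):
    """Hypothetical slow display: luminance rises linearly over the frame."""
    return [start + (target - start) * (k + 1) / n for k in range(n)]


def fast_step(start, target, n):
    """Hypothetical fast display: target luminance reached by the first sample."""
    return [float(target)] * n


def end_of_frame(samples):
    # The only value an end-of-frame analysis would see.
    return samples[-1]


def frame_average(samples):
    # A simple summary that uses the intra-frame samples as well.
    return sum(samples) / len(samples)


slow = linear_ramp(0.0, 100.0, 4)  # [25.0, 50.0, 75.0, 100.0]
fast = fast_step(0.0, 100.0, 4)    # [100.0, 100.0, 100.0, 100.0]

print(end_of_frame(slow), end_of_frame(fast))    # identical
print(frame_average(slow), frame_average(fast))  # different
```

Under these assumptions an analysis restricted to the last sample treats the two displays as equivalent, while any measure built on the sub-frame samples separates them.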
Indexing, browsing and searching of digital video
Video is a communications medium that normally brings together moving pictures with a synchronised audio track into a discrete piece or pieces of information. The size of a “piece” of video can variously be referred to as a frame, a shot, a scene, a clip, a programme or an episode, and these are distinguished by their lengths and by their composition. We shall return to the definition of each of these in section 4 of this chapter. In modern society, video is ver
Bodily awareness and novel multisensory features
According to the decomposition thesis, perceptual experiences resolve without remainder into their different modality-specific components. Contrary to this view, I argue that certain cases of multisensory integration give rise to experiences representing features of a novel type. Through the coordinated use of bodily awareness—understood here as encompassing both proprioception and kinaesthesis—and the exteroceptive sensory modalities, one becomes perceptually responsive to spatial features whose instances couldn’t be represented by any of the contributing modalities functioning in isolation. I develop an argument for this conclusion focusing on two cases: 3D shape perception in haptic touch and experiencing an object’s egocentric location in crossmodally accessible, environmental space