Image sequence analysis for emerging interactive multimedia services - The European COST 211 framework
Cataloged from PDF version of article.
Flexibility and efficiency of coding, content extraction,
and content-based search are key research topics in
the field of interactive multimedia. Ongoing ISO MPEG-4 and
MPEG-7 activities are targeting standardization to facilitate such
services. European COST Telecommunications activities provide
a framework for research collaboration. COST 211bis and COST
211ter activities have been instrumental in the definition and
development of the ITU-T H.261 and H.263 standards for videoconferencing
over ISDN and videophony over regular phone
lines, respectively. The group has also contributed significantly
to the ISO MPEG-4 activities. At present a significant effort
of the COST 211ter group activities is dedicated toward image
and video sequence analysis and segmentation—an important
technological aspect for the success of emerging object-based
MPEG-4 and MPEG-7 multimedia applications. The current
work of COST 211 is centered around the test model, called
the Analysis Model (AM). The essential feature of the AM is
its ability to fuse information from different sources to achieve
a high-quality object segmentation. The current information
sources are the intermediate results from frame-based (still) color
segmentation, motion vector based segmentation, and change-detection-based
segmentation. Motion vectors, which form the
basis for the motion vector based intermediate segmentation, are
estimated from consecutive frames. A recursive shortest spanning
tree (RSST) algorithm is used to obtain intermediate color and
motion vector based segmentation results. A rule-based region
processor fuses the intermediate results; a postprocessor further
refines the final segmentation output. The results of the current
AM are satisfactory; it is expected that there will be further
improvements of the AM within the COST 211 project.
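
The abstract names a recursive shortest spanning tree (RSST) algorithm for the intermediate segmentations but gives no details. The following is a minimal sketch of the general RSST idea on a grayscale image, not the Analysis Model's implementation: every pixel starts as its own region, and the two adjacent regions joined by the cheapest link are merged repeatedly until a target region count is reached. The function name `rsst_segment`, the 4-connectivity, and the size-weighted squared-mean-difference link cost are assumptions made for illustration.

```python
def rsst_segment(image, n_regions):
    """Greedy RSST-style merging on a grayscale image (list of rows).

    Returns a label map with the same shape as `image`.
    """
    h, w = len(image), len(image[0])
    # Region id -> [sum of pixel values, pixel count]; one region per pixel at start.
    stats = {y * w + x: [float(image[y][x]), 1] for y in range(h) for x in range(w)}
    parent = list(range(h * w))

    def find(i):
        # Follow parent links to the id of the region that absorbed pixel i.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # 4-connected adjacency between regions.
    neighbours = {i: set() for i in stats}
    for y in range(h):
        for x in range(w):
            i = y * w + x
            if x + 1 < w:
                neighbours[i].add(i + 1)
                neighbours[i + 1].add(i)
            if y + 1 < h:
                neighbours[i].add(i + w)
                neighbours[i + w].add(i)

    def cost(a, b):
        # Assumed link cost: squared mean difference weighted by region sizes.
        (sa, na), (sb, nb) = stats[a], stats[b]
        diff = sa / na - sb / nb
        return diff * diff * na * nb / (na + nb)

    while len(stats) > max(1, n_regions):
        # Brute-force search for the cheapest link; fine for a small sketch.
        a, b = min(
            ((a, b) for a in neighbours for b in neighbours[a] if a < b),
            key=lambda p: (cost(*p), p),
        )
        # Merge region b into region a and rewire b's links to a.
        stats[a][0] += stats[b][0]
        stats[a][1] += stats[b][1]
        del stats[b]
        parent[b] = a
        for c in neighbours.pop(b):
            neighbours[c].discard(b)
            if c != a:
                neighbours[c].add(a)
                neighbours[a].add(c)

    return [[find(y * w + x) for x in range(w)] for y in range(h)]


labels = rsst_segment([[10, 10, 200, 200], [10, 12, 198, 200]], n_regions=2)
```

In this toy input the dark left half and the bright right half end up as the two remaining regions. A practical implementation would keep the links in a priority queue rather than rescanning them on every merge.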
Efficient MRF Energy Propagation for Video Segmentation via Bilateral Filters
Segmentation of an object from a video is a challenging task in multimedia
applications. Depending on the application, automatic or interactive methods
are desired; however, regardless of the application type, efficient computation
of video object segmentation is crucial for time-critical applications;
specifically, mobile and interactive applications require near real-time
efficiencies. In this paper, we address the problem of video segmentation from
the perspective of efficiency. We initially redefine the problem of video
object segmentation as the propagation of MRF energies along the temporal
domain. For this purpose, a novel and efficient method is proposed to propagate
MRF energies throughout the frames via bilateral filters without using any
global texture, color or shape model. A recently presented bi-exponential filter
is utilized for efficiency, and a novel technique is developed to
dynamically solve graph-cuts for varying, non-lattice graphs in a general linear
filtering scenario. These improvements are evaluated in both automatic and
interactive video segmentation scenarios. Moreover, in addition to the
efficiency, segmentation quality is also tested both quantitatively and
qualitatively. Indeed, for some challenging examples, significant time
efficiency is observed without loss of segmentation quality.
Comment: IEEE Transactions on Multimedia (Volume: 16, Issue: 5, Aug. 2014)
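
To make the idea of propagating energies between frames with a bilateral filter more concrete, here is a rough sketch under stated assumptions. It uses a plain brute-force joint bilateral filter, not the bi-exponential filter or the graph-cut machinery the abstract mentions, and it is not the paper's method. The name `propagate_energy`, the grayscale frames, and the parameters `radius`, `sigma_s`, and `sigma_r` are hypothetical choices for illustration.

```python
import math


def propagate_energy(prev_energy, prev_frame, next_frame,
                     radius=1, sigma_s=1.0, sigma_r=10.0):
    """Carry a per-pixel energy map from one frame to the next.

    Each output pixel is a weighted average of nearby energies in the
    previous frame. Weights fall off with spatial distance (sigma_s) and
    with the intensity difference between the new pixel and the old
    neighbour (sigma_r).
    """
    h, w = len(next_frame), len(next_frame[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            num = den = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        ws = math.exp(-(dx * dx + dy * dy) / (2 * sigma_s ** 2))
                        dr = next_frame[y][x] - prev_frame[yy][xx]
                        wr = math.exp(-(dr * dr) / (2 * sigma_r ** 2))
                        num += ws * wr * prev_energy[yy][xx]
                        den += ws * wr
            out[y][x] = num / den if den > 0 else 0.0
    return out


# A dark object (value 10) grows one pixel to the right between frames.
prev_frame = [[10, 10, 200, 200]]
prev_energy = [[1.0, 1.0, 0.0, 0.0]]   # 1.0 marks the object in the old frame
next_frame = [[10, 10, 10, 200]]
next_energy = propagate_energy(prev_energy, prev_frame, next_frame)
```

The newly covered pixel inherits a high energy because it resembles the old object pixels next to it, while the bright background pixel stays near zero. No global color, texture, or shape model is involved, which matches the spirit of the abstract, though the actual formulation may differ.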
Video matching using DC-image and local features
This paper presents a framework for video matching based on local features extracted from the DC-image of MPEG compressed videos, without decompression. The relevant arguments and supporting evidence are discussed for developing video similarity techniques that work directly on compressed videos, without decompression, and especially on small images. Two experiments are carried out to support this. The first compares the DC-image and the I-frame in terms of matching performance and the corresponding computational complexity. The second compares local features and global features in video matching, especially in the compressed domain and with small images. The results confirm that the DC-image, despite its highly reduced size, is promising, as it produces matching precision at least similar to (if not better than) that of the full I-frame. Also, SIFT, as a local feature, outperforms most of the standard global features in precision. On the other hand, its computational complexity is relatively higher, but it is still within the real-time margin. There are also various optimisations that can be made to reduce this computational complexity.
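
For readers unfamiliar with the DC-image, the sketch below shows what it looks like in the pixel domain. In an intra-coded 8x8 block the DC coefficient of the DCT is proportional to the block's mean intensity, so a DC-image is roughly the frame downsampled by block averaging. The paper reads these values from the compressed stream without decoding. This sketch instead computes the equivalent block means from an already decoded grayscale frame, purely as an illustration. The function name `dc_image` and the handling of partial edge blocks (they are dropped) are assumptions.

```python
def dc_image(frame, block=8):
    """Approximate a DC-image by averaging each block x block tile.

    `frame` is a list of rows of grayscale values. Partial tiles at the
    right and bottom edges are ignored.
    """
    h, w = len(frame), len(frame[0])
    area = block * block
    return [
        [
            sum(frame[y + dy][x + dx] for dy in range(block) for dx in range(block)) / area
            for x in range(0, w - block + 1, block)
        ]
        for y in range(0, h - block + 1, block)
    ]


# A 16x16 frame whose right half is bright yields a 2x2 DC-image.
frame = [[100 if x >= 8 else 0 for x in range(16)] for y in range(16)]
small = dc_image(frame)
```

A 16x16 frame shrinks to 2x2 here, which is the 64-fold size reduction that makes the matching cost question in the abstract interesting.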
- …