561 research outputs found

    Automatic detection and extraction of artificial text in video

    Get PDF
    A significant challenge in large multimedia databases is the provision of efficient means for semantic indexing and retrieval of visual information. Artificial text in video is normally generated in order to supplement or summarise the visual content and thus is an important carrier of information that is highly relevant to the content of the video. As such, it is a potential ready-to-use source of semantic information. In this paper we present an algorithm for detection and localisation of artificial text in video using a horizontal difference magnitude measure and morphological processing. The result of character segmentation, based on a modified version of the Wolf-Jolion algorithm [1][2] is enhanced using smoothing and multiple binarisation. The output text is input to an “off-the-shelf” noncommercial OCR. Detection, localisation and recognition results for a 20min long MPEG-1 encoded television programme are presented

    Bilateral filter in image processing

    Get PDF
    The bilateral filter is a nonlinear filter that does spatial averaging without smoothing edges. It has shown to be an effective image denoising technique. It also can be applied to the blocking artifacts reduction. An important issue with the application of the bilateral filter is the selection of the filter parameters, which affect the results significantly. Another research interest of bilateral filter is acceleration of the computation speed. There are three main contributions of this thesis. The first contribution is an empirical study of the optimal bilateral filter parameter selection in image denoising. I propose an extension of the bilateral filter: multi resolution bilateral filter, where bilateral filtering is applied to the low-frequency sub-bands of a signal decomposed using a wavelet filter bank. The multi resolution bilateral filter is combined with wavelet thresholding to form a new image denoising framework, which turns out to be very effective in eliminating noise in real noisy images. The second contribution is that I present a spatially adaptive method to reduce compression artifacts. To avoid over-smoothing texture regions and to effectively eliminate blocking and ringing artifacts, in this paper, texture regions and block boundary discontinuities are first detected; these are then used to control/adapt the spatial and intensity parameters of the bilateral filter. The test results prove that the adaptive method can improve the quality of restored images significantly better than the standard bilateral filter. The third contribution is the improvement of the fast bilateral filter, in which I use a combination of multi windows to approximate the Gaussian filter more precisely

    Shape representation and coding of visual objets in multimedia applications — An overview

    Get PDF
    Emerging multimedia applications have created the need for new functionalities in digital communications. Whereas existing compression standards only deal with the audio-visual scene at a frame level, it is now necessary to handle individual objects separately, thus allowing scalable transmission as well as interactive scene recomposition by the receiver. The future MPEG-4 standard aims at providing compression tools addressing these functionalities. Unlike existing frame-based standards, the corresponding coding schemes need to encode shape information explicitly. This paper reviews existing solutions to the problem of shape representation and coding. Region and contour coding techniques are presented and their performance is discussed, considering coding efficiency and rate-distortion control capability, as well as flexibility to application requirements such as progressive transmission, low-delay coding, and error robustnes

    Image interpolation using Shearlet based iterative refinement

    Get PDF
    This paper proposes an image interpolation algorithm exploiting sparse representation for natural images. It involves three main steps: (a) obtaining an initial estimate of the high resolution image using linear methods like FIR filtering, (b) promoting sparsity in a selected dictionary through iterative thresholding, and (c) extracting high frequency information from the approximation to refine the initial estimate. For the sparse modeling, a shearlet dictionary is chosen to yield a multiscale directional representation. The proposed algorithm is compared to several state-of-the-art methods to assess its objective as well as subjective performance. Compared to the cubic spline interpolation method, an average PSNR gain of around 0.8 dB is observed over a dataset of 200 images

    NEW CHANGE DETECTION MODELS FOR OBJECT-BASED ENCODING OF PATIENT MONITORING VIDEO

    Get PDF
    The goal of this thesis is to find a highly efficient algorithm to compress patient monitoring video. This type of video mainly contains local motions and a large percentage of idle periods. To specifically utilize these features, we present an object-based approach, which decomposes input video into three objects representing background, slow-motion foreground and fast-motion foreground. Encoding these three video objects with different temporal scalabilities significantly improves the coding efficiency in terms of bitrate vs. visual quality. The video decomposition is built upon change detection which identifies content changes between video frames. To improve the robustness of capturing small changes, we contribute two new change detection models. The model built upon Markov random theory discriminates foreground containing the patient being monitored. The other model, called covariance test method, identifies constantly changing content by exploiting temporal correlation in multiple video frames. Both models show great effectiveness in constructing the defined video objects. We present detailed algorithms of video object construction, as well as experimental results on the object-based coding of patient monitoring video

    Combined Industry, Space and Earth Science Data Compression Workshop

    Get PDF
    The sixth annual Space and Earth Science Data Compression Workshop and the third annual Data Compression Industry Workshop were held as a single combined workshop. The workshop was held April 4, 1996 in Snowbird, Utah in conjunction with the 1996 IEEE Data Compression Conference, which was held at the same location March 31 - April 3, 1996. The Space and Earth Science Data Compression sessions seek to explore opportunities for data compression to enhance the collection, analysis, and retrieval of space and earth science data. Of particular interest is data compression research that is integrated into, or has the potential to be integrated into, a particular space or earth science data information system. Preference is given to data compression research that takes into account the scien- tist's data requirements, and the constraints imposed by the data collection, transmission, distribution and archival systems

    Video object segmentation and tracking.

    Get PDF
    Thesis (M.Sc.Eng.)-University of KwaZulu-Natal, 2005One of the more complex video processing problems currently vexing researchers is that of object segmentation. This involves identifying semantically meaningful objects in a scene and separating them from the background. While the human visual system is capable of performing this task with minimal effort, development and research in machine vision is yet to yield techniques that perform the task as effectively and efficiently. The problem is not only difficult due to the complexity of the mechanisms involved but also because it is an ill-posed problem. No unique segmentation of a scene exists as what is of interest as a segmented object depends very much on the application and the scene content. In most situations a priori knowledge of the nature of the problem is required, often depending on the specific application in which the segmentation tool is to be used. This research presents an automatic method of segmenting objects from a video sequence. The intent is to extract and maintain both the shape and contour information as the object changes dynamically over time in the sequence. A priori information is incorporated by requesting the user to tune a set of input parameters prior to execution of the algorithm. Motion is used as a semantic for video object extraction subject to the assumption that there is only one moving object in the scene and the only motion in the video sequence is that of the object of interest. It is further assumed that there is constant illumination and no occlusion of the object. A change detection mask is used to detect the moving object followed by morphological operators to refine the result. The change detection mask yields a model of the moving components; this is then compared to a contour map of the frame to extract a more accurate contour of the moving object and this is then used to extract the object of interest itself. Since the video object is moving as the sequence progresses, it is necessary to update the object over time. To accomplish this, an object tracker has been implemented based on the Hausdorff objectmatching algorithm. The dissertation begins with an overview of segmentation techniques and a discussion of the approach used in this research. This is followed by a detailed description of the algorithm covering initial segmentation, object tracking across frames and video object extraction. Finally, the semantic object extraction results for a variety of video sequences are presented and evaluated
    • 

    corecore