77 research outputs found

    Digital photo album management techniques: from one dimension to multi-dimension.

    Lu Yang. Thesis submitted in November 2004. Thesis (M.Phil.), Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 96-103). Abstracts in English and Chinese.

    Contents:
    Abstract
    Acknowledgement
    Chapter 1: Introduction
      1.1 Motivation
      1.2 Our Contributions
      1.3 Thesis Outline
    Chapter 2: Background Study
      2.1 MPEG-7 Introduction
      2.2 Image Analysis in CBIR Systems
        2.2.1 Color Information
        2.2.2 Color Layout
        2.2.3 Texture Information
        2.2.4 Shape Information
        2.2.5 CBIR Systems
      2.3 Image Processing in JPEG Frequency Domain
      2.4 Photo Album Clustering
    Chapter 3: Feature Extraction and Similarity Analysis
      3.1 Feature Set in Frequency Domain
        3.1.1 JPEG Frequency Data
        3.1.2 Our Feature Set
      3.2 Digital Photo Similarity Analysis
        3.2.1 Energy Histogram
        3.2.2 Photo Distance
    Chapter 4: 1-Dimensional Photo Album Management Techniques
      4.1 Photo Album Sorting
      4.2 Photo Album Clustering
      4.3 Photo Album Compression
        4.3.1 Variable IBP frames
        4.3.2 Adaptive Search Window
        4.3.3 Compression Flow
      4.4 Experiments and Performance Evaluations
    Chapter 5: High Dimensional Photo Clustering
      5.1 Traditional Clustering Techniques
        5.1.1 Hierarchical Clustering
        5.1.2 Traditional K-means
      5.2 Multidimensional Scaling
        5.2.1 Introduction
        5.2.2 Classical Scaling
      5.3 Our Interactive MDS-based Clustering
        5.3.1 Principal Coordinates from MDS
        5.3.2 Clustering Scheme
        5.3.3 Layout Scheme
      5.4 Experiments and Results
    Chapter 6: Conclusions
    Bibliography

    Deliverable D1.1 State of the art and requirements analysis for hypervideo

    This deliverable presents a state-of-the-art and requirements analysis report for hypervideo, authored as part of WP1 of the LinkedTV project. First, we present some use-case (viewer) scenarios in the LinkedTV project and, by analysing the distinctive needs and demands of each scenario, point out the technical requirements from a user-side perspective. We then study methods for the automatic and semi-automatic decomposition of audiovisual content in order to effectively support the annotation process. Since multimedia content comprises different types of information (visual, textual and audio), we report various methods for the analysis of these three streams. Finally, we present various annotation tools that could integrate the developed analysis results so as to effectively support users (video producers) in the semi-automatic linking of hypervideo content, and based on them we report on the initial progress in building the LinkedTV annotation tool. For each class of techniques discussed in the deliverable, we present evaluation results from applying one such method from the literature to a dataset well suited to the needs of the LinkedTV project, and we indicate the future technical requirements that should be addressed in order to achieve higher levels of performance (e.g., in terms of accuracy and time-efficiency), as necessary

    CONTENT-BASED IMAGE RETRIEVAL USING ENHANCED HYBRID METHODS WITH COLOR AND TEXTURE FEATURES

    Content-based image retrieval (CBIR) automatically retrieves images similar to a query image by using the visual contents (features) of the image, such as color, texture and shape. Effective CBIR depends on efficient feature extraction for indexing and on effective matching of the query image against the indexed images for retrieval. The main issue in CBIR is how to extract the features efficiently, because good features describe the image well and make image matching robust. This issue is the main motivation for this thesis, which develops a high-performance hybrid CBIR system in the spatial and frequency domains. We propose various approaches in which different techniques are fused to extract statistical color and texture features efficiently in both domains. In the spatial domain, statistical color histogram features are computed from the pixel distribution of Laplacian-filtered, sharpened images under different quantization schemes. However, a color histogram does not provide spatial information. The solution is a histogram refinement method, in which statistical features of the regions in the histogram bins of the filtered image are extracted. This leads to a high computational cost, which is reduced by dividing the image into sub-blocks of different sizes from which the color and texture features are extracted. To further improve performance, color and texture features are combined using sub-blocks because of the lower computational cost
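
    The sub-block color histogram idea in the abstract above can be illustrated with a minimal sketch. It is not the thesis's method: the uniform RGB quantization, the fixed square grid, and the function names are all assumptions made for illustration, and the Laplacian filtering and histogram refinement steps are omitted.

```python
def quantize(pixel, levels=4):
    """Map an (r, g, b) pixel with 0-255 channels to one of levels**3 color bins.

    Uniform quantization is an assumption; the abstract only says that
    "different quantization schemes" are used.
    """
    r, g, b = pixel
    step = 256 // levels
    return (r // step) * levels * levels + (g // step) * levels + (b // step)


def block_histogram(image, top, left, height, width, levels=4):
    """Normalized color histogram of one rectangular sub-block."""
    hist = [0] * (levels ** 3)
    for y in range(top, top + height):
        for x in range(left, left + width):
            hist[quantize(image[y][x], levels)] += 1
    n = height * width
    return [h / n for h in hist]


def subblock_features(image, grid=2, levels=4):
    """Concatenate per-block histograms so the feature keeps coarse spatial layout.

    `image` is a list of rows of (r, g, b) tuples; a hypothetical grid x grid
    split is used, and any remainder rows or columns are ignored.
    """
    rows, cols = len(image), len(image[0])
    bh, bw = rows // grid, cols // grid
    features = []
    for by in range(grid):
        for bx in range(grid):
            features.extend(
                block_histogram(image, by * bh, bx * bw, bh, bw, levels)
            )
    return features


if __name__ == "__main__":
    red, blue = (255, 0, 0), (0, 0, 255)
    image = [[red, red, blue, blue] for _ in range(4)]
    feats = subblock_features(image)
    print(len(feats), feats[48], feats[64 + 3])
```

    Two images with the same global colors but different layouts yield different feature vectors here, which is the spatial information a single global histogram lacks.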

    AUTOMATED FEATURE EXTRACTION AND CONTENT-BASED RETRIEVAL OF PATHOLOGY MICROSCOPIC IMAGES USING K-MEANS CLUSTERING AND CODE RUN-LENGTH PROBABILITY DISTRIBUTION

    The dissertation starts with an extensive literature survey on the current issues in content-based image retrieval (CBIR) research and on state-of-the-art theories, methodologies, and implementations, covering topics such as general information retrieval theories, imaging, image feature identification and extraction, feature indexing and multimedia database search, user-system interaction, relevance feedback, and performance evaluation. A general CBIR framework is proposed with three layers: image document space, feature space, and concept space. The framework emphasizes that while the projection from the image document space to the feature space is algorithmic and unrestricted, the connection between the feature space and the concept space is based on statistics instead of semantics. The scheme favors image features that do not rely on excessive assumptions about image content. As an attempt to design a new CBIR methodology following the above framework, k-means clustering color quantization is applied to pathology microscopic images, followed by code run-length probability distribution feature extraction. Kullback-Leibler divergence is used as the distance measure for feature comparison. For content-based retrieval, the distance between two images is defined as a function of all individual features. The process is highly automated and the system is capable of working effectively across different tissues without human interference. Possible improvements and future directions are discussed
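
    As a rough illustration of using Kullback-Leibler divergence as a feature distance, the sketch below compares two count histograms. The additive smoothing, the symmetrization, and the plain sum over features are assumptions made here; the abstract only states that the image distance is "a function of all individual features".

```python
import math


def normalize(counts, eps=1e-9):
    """Turn raw counts into a probability distribution.

    A small additive constant (an assumption, not from the abstract) keeps
    every bin positive so the logarithm below is always defined.
    """
    total = sum(counts) + eps * len(counts)
    return [(c + eps) / total for c in counts]


def kl_divergence(p, q):
    """D(p || q) = sum_i p_i * log(p_i / q_i), measured in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


def symmetric_kl(counts_a, counts_b):
    """Symmetrized divergence D(p || q) + D(q || p) between two histograms."""
    p = normalize(counts_a)
    q = normalize(counts_b)
    return kl_divergence(p, q) + kl_divergence(q, p)


def image_distance(features_a, features_b):
    """Hypothetical image distance: the sum of per-feature divergences."""
    return sum(symmetric_kl(a, b) for a, b in zip(features_a, features_b))


if __name__ == "__main__":
    a = [[10, 5, 1], [3, 3, 3]]
    b = [[1, 5, 10], [3, 3, 3]]
    print(image_distance(a, a), image_distance(a, b))
```

    Plain KL divergence is not symmetric, so a retrieval system has to either fix the argument order (query first, for example) or symmetrize as done here.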

    An object-based approach to retrieval of image and video content

    Promising new directions have been opened up for content-based visual retrieval in recent years. Object-based retrieval, which allows users to manipulate video objects as part of their searching and browsing interaction, is one of these. This thesis is part of a larger stream of research that investigates visual objects as a possible approach to advancing the use of semantics in content-based visual retrieval. The notion of using objects in video retrieval has been seen as desirable for some years, but only very recently has technology started to allow even very basic object-location functions on video. The main hurdles to greater use of objects in video retrieval are the overhead of object segmentation on large amounts of video and the question of whether objects can actually be used efficiently for multimedia retrieval. Despite this, there are already some examples of work that supports retrieval based on video objects. This thesis investigates an object-based approach to content-based visual retrieval. Its main research contributions are a study of shot boundary detection on compressed-domain video, in which a fast detection approach is proposed and evaluated, and a study of the use of objects in interactive image retrieval. An object-based retrieval framework is developed in order to investigate object-based retrieval on a corpus of natural images and video. This framework contains the entire processing chain required to analyse, index and interactively retrieve images and video via object-to-object matching. The experimental results indicate that object-based searching consistently outperforms image-based search using low-level features. This result goes some way towards validating the approach of allowing users to select objects as a basis for searching video archives when the information need dictates it as appropriate

    Video Indexing and Retrieval Techniques Using Novel Approaches to Video Segmentation, Characterization, and Similarity Matching

    Multimedia applications are spreading at an ever-increasing rate, introducing a number of challenging problems for the research community. The most significant and influential of these is effective access to stored data. Despite the popularity of keyword-based search in alphanumeric databases, it is inadequate for multimedia data because of their unstructured nature. A number of content-based access techniques have been developed in the context of image indexing and retrieval, and video retrieval systems are starting to gain wide attention. This work proposes a number of techniques constituting a fully content-based system for retrieving video data. These techniques primarily target the efficiency, reliability, scalability, extensibility, and effectiveness requirements of such applications. First, an abstract representation of the video stream, known as the DC sequence, is extracted. Second, to deal with the problem of video segmentation, an efficient neural network model is introduced. The novel use of the neural network improves reliability, while efficiency is achieved through the instantaneous use of the recall phase to identify shot boundaries. Third, the problem of key frame extraction is addressed using two efficient algorithms that adapt their selection decisions to the amount of activity found in each video shot, enabling the selection of a near-optimal expressive set of key frames. Fourth, the developed system employs an indexing scheme that supports two low-level features, color and texture, to represent video data. Finally, we propose, in the retrieval stage, a novel model for the video data matching task that integrates a number of human-based similarity factors. All our software implementations are in Java, which enables them to be used across heterogeneous platforms. The performance of the retrieval system has been evaluated, yielding a very good retrieval rate and accuracy, which demonstrates the effectiveness of the developed system

    Automatic object classification for surveillance videos.

    PhD thesis. The recent popularity of surveillance video systems, especially those located in urban scenarios, demands the development of visual techniques for monitoring purposes. A primary step towards intelligent surveillance video systems consists of automatic object classification, which remains an open research problem and the keystone for the development of more specific applications. Typically, object representation is based on the inherent visual features. However, psychological studies have demonstrated that human beings can routinely categorise objects according to their behaviour. The gap between the features automatically extracted by a computer, such as appearance-based features, and the concepts unconsciously perceived by human beings but unattainable for machines, such as behaviour features, is commonly known as the semantic gap. Consequently, this thesis proposes to narrow the semantic gap and bring together machine and human understanding for object classification. A Surveillance Media Management framework is proposed to automatically detect and classify objects by analysing the physical properties inherent in their appearance (machine understanding) and the behaviour patterns which require a higher level of understanding (human understanding). A probabilistic multimodal fusion algorithm then bridges the gap, performing an automatic classification that considers both machine and human understanding. The performance of the proposed Surveillance Media Management framework has been thoroughly evaluated on outdoor surveillance datasets. The experiments demonstrated that the combination of machine and human understanding substantially enhanced object classification performance. Finally, the inclusion of human reasoning and understanding provides the essential information to bridge the semantic gap towards smart surveillance video systems

    A rapid and robust method for shot boundary detection and classification in uncompressed MPEG video sequences

    Shot boundary detection and classification is the first and most important step for further analysis of video content. Shot transitions include abrupt changes and gradual changes. This paper proposes a rapid and robust method for shot boundary detection and classification in MPEG compressed sequences. We first partially decode only the I frames of a video sequence to generate DC images, then compute the histogram differences between these DC images to roughly locate shot boundaries. For abrupt changes, the shot boundary is then precisely located using the motion information of B frames. Gradual changes are located using the difference values of N successive I frames and classified by the change in the number of intra-coded macroblocks (MBs) in P frames. All features such as the number of MBs in frames are extracted from uncompressed video sequences. Experiments were carried out on the standard TRECVid video database and other data to assess the performance of the proposed method
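
    The rough detection step described above, comparing histograms of successive DC images, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the DC images are taken as already extracted and passed in as 2-D lists of 8-bit gray values, and the bin count, the L1 difference, and the fixed threshold are all hypothetical choices. The refinement with B-frame motion information and the gradual-change classification are not shown.

```python
def gray_histogram(image, bins=16, max_value=256):
    """Normalized gray-level histogram of a 2-D list of pixel values (0-255)."""
    hist = [0] * bins
    for row in image:
        for pixel in row:
            hist[pixel * bins // max_value] += 1
    total = sum(hist)
    return [h / total for h in hist]


def histogram_difference(h1, h2):
    """L1 distance between two normalized histograms (ranges from 0 to 2)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def candidate_boundaries(dc_images, threshold=0.5):
    """Indices i where the histogram changes sharply between image i-1 and i.

    `threshold` is a hypothetical fixed cut-off; a real system would tune it
    or adapt it to the sequence.
    """
    hists = [gray_histogram(img) for img in dc_images]
    return [
        i
        for i in range(1, len(hists))
        if histogram_difference(hists[i - 1], hists[i]) > threshold
    ]


if __name__ == "__main__":
    dark = [[10] * 4 for _ in range(4)]
    bright = [[200] * 4 for _ in range(4)]
    print(candidate_boundaries([dark, dark, bright, bright]))
```

    Because only I frames contribute DC images in the described method, a candidate index from this step localizes a boundary only to within one I-frame interval, which is why a finer second stage is needed.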