5 research outputs found

    A SURVEY ON WEB MULTIMEDIA MINING

    ABSTRACT Modern developments in digital media technologies have made transmitting and storing large amounts of multi/rich media data (e.g. text, images, music, video and their combination

    Object detection and activity recognition in digital image and video libraries

    This thesis is a comprehensive study of object-based image and video retrieval, specifically for car and human detection and activity recognition purposes. The thesis focuses on the problem of connecting low level features to high level semantics by developing relational object and activity presentations. With the rapid growth of multimedia information in the form of digital image and video libraries, there is an increasing need for intelligent database management tools. Traditional text based query systems that rely on manual annotation are impractical for today's large libraries, which require an efficient information retrieval system. For this purpose, a hierarchical information retrieval system is proposed where shape, color and motion characteristics of objects of interest are captured in compressed and uncompressed domains. The proposed retrieval method provides object detection and activity recognition at different resolution levels from low complexity to low false rates. The thesis first examines extraction of low level features from images and videos using intensity, color and motion of pixels and blocks. Local consistency based on these features and geometrical characteristics of the regions is used to group object parts. The problem of managing the segmentation process is solved by a new approach that uses object based knowledge in order to group the regions according to a global consistency. A new model-based segmentation algorithm is introduced that uses feedback from a relational representation of the object. The selected unary and binary attributes are further extended for application specific algorithms. Object detection is achieved by matching the relational graphs of objects with the reference model. The major advantages of the algorithm can be summarized as improving the object extraction by reducing the dependence on the low level segmentation process and combining the boundary and region properties.
The thesis then addresses the problem of object detection and activity recognition in the compressed domain in order to reduce computational complexity. New algorithms for object detection and activity recognition in JPEG images and MPEG videos are developed. It is shown that significant information can be obtained from the compressed domain in order to connect to high level semantics. Since our aim is to retrieve information from images and videos compressed using standard algorithms such as JPEG and MPEG, our approach differs from previous compressed domain object detection techniques where the compression algorithms are governed by characteristics of the object of interest to be retrieved. An algorithm is developed using the principal component analysis of MPEG motion vectors to detect human activities, namely walking, running, and kicking. Object detection in JPEG compressed still images and MPEG I frames is achieved by using DC-DCT coefficients of the luminance and chrominance values in the graph based object detection algorithm. The thesis finally addresses the problem of object detection in lower resolution and monochrome images. Specifically, it is demonstrated that the structural information of human silhouettes can be captured from AC-DCT coefficients
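To make the role of the DC-DCT coefficients concrete: under the DCT used by JPEG, the DC term of each 8x8 block equals the sum of the level-shifted samples divided by 8, so the grid of DC terms is a coarse, scaled thumbnail of the image. The sketch below is only an illustration of that fact, not the thesis's detection algorithm. It recomputes the DC grid from raw luminance samples rather than reading it from a real JPEG bitstream, and the function name `dc_coefficients` is hypothetical.

```python
from typing import List


def dc_coefficients(luma: List[List[int]], block: int = 8) -> List[List[float]]:
    """Return the DC term of each block x block tile of a luminance image.

    Assumptions (not from the source): samples are 8-bit, the JPEG level
    shift of 128 is applied, and partial tiles at the edges are ignored.
    For an orthonormal N x N DCT the DC term is sum(samples) / N.
    """
    height, width = len(luma), len(luma[0])
    grid = []
    for top in range(0, height - height % block, block):
        row = []
        for left in range(0, width - width % block, block):
            total = sum(
                luma[y][x] - 128
                for y in range(top, top + block)
                for x in range(left, left + block)
            )
            row.append(total / block)
        grid.append(row)
    return grid
```

For example, an 8x16 image whose left half is uniformly 128 and whose right half is uniformly 136 yields one row of two DC values, 0.0 and 64.0. A detector working on this grid sees 1/64 as many values as one working on pixels, which is the complexity saving the abstract refers to.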

    Efficient Techniques for Management and Delivery of Video Data

    The rapid advances in electronic imaging, storage, data compression, telecommunications, and networking technology have resulted in the widespread creation and use of digital videos in many important applications such as digital libraries, distance learning, public information systems, electronic commerce, movie on demand, etc. This brings about the need for management as well as delivery of video data. Organizing and managing video data, however, is much more complex than managing conventional text data due to their semantically rich and unstructured contents. Also, the enormous size of video files requires high communication bandwidth for data delivery. In this dissertation, I present the following techniques for video data management and delivery. Decomposing video into meaningful pieces (i.e., shots) is a fundamental step in handling the complicated contents of video data. Content-based video parsing techniques are presented and analyzed. In order to reduce the computation cost substantially, a non-sequential approach to shot boundary detection is investigated. Efficient browsing and indexing of video data are essential for video data management. Non-linear browsing and cost-effective indexing schemes for video data based on their contents are described and evaluated. Delivering long videos over limited bandwidth while satisfying various user requests is challenging. To reduce the demand on this bandwidth, a hybrid of two effective approaches, periodic broadcast and scheduled multicast, is discussed and simulated. Current techniques related to the above work are discussed thoroughly to explain their advantages and disadvantages and to motivate the new, improved schemes. A substantial number of experiments and simulations, together with the underlying concepts, are provided to compare the introduced techniques with existing ones. The results indicate that they outperform recent techniques by a significant margin.
I conclude the dissertation with a discussion of future research directions.
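The abstract does not spell out the non-sequential shot boundary detector, so the following is only one plausible reading, offered as a sketch: compare frames that are `step` apart instead of every adjacent pair, and bisect inside a window only when its two ends differ. The frame representation (a flat list of 8-bit intensities), the histogram distance, the `step` and `threshold` defaults, and all function names are assumptions made for illustration. The sketch also assumes hard cuts with at most one cut per window.

```python
from typing import List, Sequence

Frame = Sequence[int]  # hypothetical: flat list of 8-bit intensities


def histogram(frame: Frame, bins: int = 16) -> List[float]:
    """Normalized intensity histogram of one frame."""
    counts = [0] * bins
    for value in frame:
        counts[value * bins // 256] += 1
    total = len(frame) or 1
    return [c / total for c in counts]


def hist_distance(a: Frame, b: Frame, bins: int = 16) -> float:
    """Half the L1 distance between histograms, in the range [0, 1]."""
    ha, hb = histogram(a, bins), histogram(b, bins)
    return 0.5 * sum(abs(x - y) for x, y in zip(ha, hb))


def find_cuts(frames: Sequence[Frame], step: int = 8,
              threshold: float = 0.5) -> List[int]:
    """Return indices of the first frame of each new shot."""
    cuts: List[int] = []
    lo, last = 0, len(frames) - 1
    while lo < last:
        hi = min(lo + step, last)
        if hist_distance(frames[lo], frames[hi]) > threshold:
            # The window ends differ: bisect to locate the cut inside it.
            left, right = lo, hi
            while right - left > 1:
                mid = (left + right) // 2
                if hist_distance(frames[left], frames[mid]) > threshold:
                    right = mid
                else:
                    left = mid
            cuts.append(right)
            lo = right
        else:
            lo = hi  # no change across the window: skip it entirely
    return cuts
```

On a clip with no cuts this performs roughly one comparison per `step` frames rather than one per frame, which is where the computational saving would come from. Gradual transitions and multiple cuts inside one window are not handled by this sketch.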

    ADVISE: advanced digital video information segmentation engine.

    by Chung-Wing Ng.
    Thesis (M.Phil.)--Chinese University of Hong Kong, 2002.
    Includes bibliographical references (leaves 100-107).
    Abstracts in English and Chinese.

    Abstract
    Acknowledgment
    Table of Contents
    List of Tables
    List of Figures
    Chapter 1 Introduction
        1.1 Image-based Video Description
        1.2 Video Summary
        1.3 Video Matching
        1.4 Contributions
        1.5 Outline of Thesis
    Chapter 2 Literature Review
        2.1 Video Retrieval in Digital Video Libraries
            2.1.1 The VISION Project
            2.1.2 The INFORMEDIA Project
            2.1.3 Discussion
        2.2 Video Structuring
            2.2.1 Video Segmentation
            2.2.2 Color histogram Extraction
            2.2.3 Further Structuring
        2.3 XML Technologies
            2.3.1 XML Syntax
            2.3.2 Document Type Definition, DTD
            2.3.3 Extensible Stylesheet Language, XSL
        2.4 SMIL Technology
            2.4.1 SMIL Syntax
            2.4.2 Model of SMIL Applications
    Chapter 3 Overview of ADVISE
        3.1 Objectives
        3.2 System Architecture
            3.2.1 Video Preprocessing Module
            3.2.2 Web-based Video Retrieval Module
            3.2.3 Video Streaming Server
        3.3 Summary
    Chapter 4 Construction of Video Table-of-Contents (V-ToC)
        4.1 Video Structuring
            4.1.1 Terms and Definitions
            4.1.2 Regional Color Histograms
            4.1.3 Video Shot Boundaries Detection
            4.1.4 Video Groups Formation
            4.1.5 Video Scenes Formation
        4.2 Storage and Presentation
            4.2.1 Definition of XML Video Structure
            4.2.2 V-ToC Presentation Using XSL
        4.3 Evaluation of Video Structure
    Chapter 5 Video Summarization
        5.1 Terms and Definitions
        5.2 Video Features Used for Summarization
        5.3 Video Summarization Algorithm
            5.3.1 Combining Extracted Video Segments
            5.3.2 Scoring the Extracted Video Segments
            5.3.3 Selecting Extracted Video Segments
            5.3.4 Refining the Selection Result
        5.4 Video Summary in SMIL
        5.5 Evaluations
            5.5.1 Experiment 1: Percentages of Features Extracted
            5.5.2 Experiment 2: Evaluation of the Refinement Process
    Chapter 6 Video Matching Using V-ToC
        6.1 Terms and Definitions
        6.2 Video Features Used for Matching
        6.3 Non-ordered Tree Matching Algorithm
        6.4 Ordered Tree Matching Algorithms
        6.5 Evaluation of Video Matching
            6.5.1 Applying Non-ordered Tree Matching
            6.5.2 Applying Ordered Tree Matching
    Chapter 7 Conclusion
    Bibliography