21,149 research outputs found

    Indización automática de vídeo

    Get PDF
    After an initial discussion of the problems presented by image and video indexing as compared to text indexing, the authors describe some of the basic techniques for automatic video indexing. The content-based retrieval paradigm and some automatic methods for segmentation and key-frame identification are further described. Certain low-level parameters for identifying an image are also introduced. The authors discuss the drawbacks of such automatic methods based solely on the image and give examples from projects using accompanying nformation as well, such as audio and captions. The article concludes by pointing out that the use of standards, like Mpeg-7, can promote the development of new and richer applications based on video

    Evolution of Information Retrieval System: Critical Review of Multimedia Information Retrieval System Based On Content, Context, and Concept

    Get PDF
    In recent years the explosive growth of information affects the flood of information. The amount of information must be followed by the development of the effective Information Retrieval System (IRS) so that the information will be easily accessible and useful for the user. The source of Information contains various media format, beside text there is also image, audio, and video that called multimedia. A large number of multimedia information rise the Multimedia Information Retrieval System (MIRS). Most of MIRS today is monolithic or only using one media format like Google1 for text search, tineye2 for image search, youtube3 for video search or 4shared4 for music and audio search. There is a need of information in any kind of media, not only retrieve the document in text format, but also retrieve the document in an image, audio and video format at once from any kind media format of the query. This study reviews the evolution of IRS, regress from text-based to concept- based MIRS. Unified Multimedia Indexing technique is discussed along with Concept-based MIRS. This critical review concludes that the evolution of IRS follows three paces: content-based, context-based and concept-based. Each pace takes on indexing system and retrieval techniques to optimize information retrieved. The challenge is how to come up with a retrieval technique that can process unified MIRS in order to retrieve optimally the relevant document

    Content-based indexing of low resolution documents

    Get PDF
    In any multimedia presentation, the trend for attendees taking pictures of slides that interest them during the presentation using capturing devices is gaining popularity. To enhance the image usefulness, the images captured could be linked to image or video database. The database can be used for the purpose of file archiving, teaching and learning, research and knowledge management, which concern image search. However, the above-mentioned devices include cameras or mobiles phones have low resolution resulted from poor lighting and noise. Content-Based Image Retrieval (CBIR) is considered among the most interesting and promising fields as far as image search is concerned. Image search is related with finding images that are similar for the known query image found in a given image database. This thesis concerns with the methods used for the purpose of identifying documents that are captured using image capturing devices. In addition, the thesis also concerns with a technique that can be used to retrieve images from an indexed image database. Both concerns above apply digital image processing technique. To build an indexed structure for fast and high quality content-based retrieval of an image, some existing representative signatures and the key indexes used have been revised. The retrieval performance is very much relying on how the indexing is done. The retrieval approaches that are currently in existence including making use of shape, colour and texture features. Putting into consideration these features relative to individual databases, the majority of retrievals approaches have poor results on low resolution documents, consuming a lot of time and in the some cases, for the given query image, irrelevant images are obtained. The proposed identification and indexing method in the thesis uses a Visual Signature (VS). VS consists of the captures slides textual layout’s graphical information, shape’s moment and spatial distribution of colour. This approach, which is signature-based are considered for fast and efficient matching to fulfil the needs of real-time applications. The approach also has the capability to overcome the problem low resolution document such as noisy image, the environment’s varying lighting conditions and complex backgrounds. We present hierarchy indexing techniques, whose foundation are tree and clustering. K-means clustering are used for visual features like colour since their spatial distribution give a good image’s global information. Tree indexing for extracted layout and shape features are structured hierarchically and Euclidean distance is used to get similarity image for CBIR. The assessment of the proposed indexing scheme is conducted based on recall and precision, a standard CBIR retrieval performance evaluation. We develop CBIR system and conduct various retrieval experiments with the fundamental aim of comparing the accuracy during image retrieval. A new algorithm that can be used with integrated visual signatures, especially in late fusion query was introduced. The algorithm has the capability of reducing any shortcoming associated with normalisation in initial fusion technique. Slides from conferences, lectures and meetings presentation are used for comparing the proposed technique’s performances with that of the existing approaches with the help of real data. This finding of the thesis presents exciting possibilities as the CBIR systems is able to produce high quality result even for a query, which uses low resolution documents. In the future, the utilization of multimodal signatures, relevance feedback and artificial intelligence technique are recommended to be used in CBIR system to further enhance the performance

    Desafios e avanços na recuperação automática da informação audiovisual

    Get PDF
    Exposição sobre processos e métodos utilizados para a indexação e recuperação textual da informação semântica em vídeo, tendo como base a identificação e classificação do seu conteúdo visual e sonoro. Palavras-chave: Sistemas de Recuperação da Informação Visual. Indexação de vídeos. Recuperação do conteúdo audiovisual. Challenges and advancements in automatic retrieval of audiovisual information Abstract Presentation of  methods and processes applied to classification and retrieval of semantic information of video programs, through identification of sound and visual content. Keywords: Content Based Image Retrieval. Video indexing. Multimedia content retrieval

    Focused image search in the social Web.

    Get PDF
    Recently, social multimedia-sharing websites, which allow users to upload, annotate, and share online photo or video collections, have become increasingly popular. The user tags or annotations constitute the new multimedia meta-data . We present an image search system that exploits both image textual and visual information. First, we use focused crawling and DOM Tree based web data extraction methods to extract image textual features from social networking image collections. Second, we propose the concept of visual words to handle the image\u27s visual content for fast indexing and searching. We also develop several user friendly search options to allow users to query the index using words and image feature descriptions (visual words). The developed image search system tries to bridge the gap between the scalable industrial image search engines, which are based on keyword search, and the slower content based image retrieval systems developed mostly in the academic field and designed to search based on image content only. We have implemented a working prototype by crawling and indexing over 16,056 images from flickr.com, one of the most popular image sharing websites. Our experimental results on a working prototype confirm the efficiency and effectiveness of the methods, that we proposed

    Video Indexing and Retrieval Techniques Using Novel Approaches to Video Segmentation, Characterization, and Similarity Matching

    Get PDF
    Multimedia applications are rapidly spread at an ever-increasing rate introducing a number of challenging problems at the hands of the research community, The most significant and influential problem, among them, is the effective access to stored data. In spite of the popularity of keyword-based search technique in alphanumeric databases, it is inadequate for use with multimedia data due to their unstructured nature. On the other hand, a number of content-based access techniques have been developed in the context of image indexing and retrieval; meanwhile video retrieval systems start to gain wide attention, This work proposes a number of techniques constituting a fully content-based system for retrieving video data. These techniques are primarily targeting the efficiency, reliability, scalability, extensibility, and effectiveness requirements of such applications. First, an abstract representation of the video stream, known as the DC sequence, is extracted. Second, to deal with the problem of video segmentation, an efficient neural network model is introduced. The novel use of the neural network improves the reliability while the efficiency is achieved through the instantaneous use of the recall phase to identify shot boundaries. Third, the problem of key frames extraction is addressed using two efficient algorithms that adapt their selection decisions based on the amount of activity found in each video shot enabling the selection of a near optimal expressive set of key frames. Fourth, the developed system employs an indexing scheme that supports two low-level features, color and texture, to represent video data, Finally, we propose, in the retrieval stage, a novel model for performing video data matching task that integrates a number of human-based similarity factors. All our software implementations are in Java, which enables it to be used across heterogeneous platforms. The retrieval system performance has been evaluated yielding a very good retrieval rate and accuracy, which demonstrate the effectiveness of the developed system

    Analysis of shot boundary detection techniques on a large video test suite

    Get PDF
    This thesis investigates how content-based indexing and retrieval systems can be used to analyse digital video. We focus particularly on the challenge of applying colour-analysis methods to large amounts of heterogeneous television broadcast video. Content-based systems are those which attempt to automatically analyse image or video documents by identifying and indexing certain features present in the documents. These features may include colour and texture, shape, and spatial locations. Digital video has become hugely important through the widespread use of the Internet and the increasing number of digital content providers supplying the commercial and domestic markets. The challenge facing the indexing of digital video information in order to support browsing and retrieval by users, is to design systems that can accurately and automatically process large amounts of heterogeneous video. The basic segmentation of video material into shots and scenes is the basic operation in the analysis of video content. Although many published methods of detecting shot boundaries exist, it is d ifficult to compare and contrast the available techniques. This is due to several reasons. Firstly, full system implementation details are not always published and this can make recreation of the systems difficult. Secondly, most systems are evaluated on small, homogeneous sequences of video. These results give little indication how such systems would perform on a broader range of video content types, or indeed how differing content types can affect system performance. As part of an ongoing video indexing and browsing project, our research has focused on the application of different methods of video segmentation to a large and diverse digital video collection. A particular focus is to examine how different segmentation methods perform on different video content types. With this information, it is hoped to develop a system capable of accurately segmenting a wide range of broadcast video. Oilier areas addressed in this thesis include an investigation of evaluation methods for digital video indexing systems, and the use of adaptive thresholds for segmentation of video into shots and scenes

    Surgical Phase Recognition of Short Video Shots Based on Temporal Modeling of Deep Features

    Full text link
    Recognizing the phases of a laparoscopic surgery (LS) operation form its video constitutes a fundamental step for efficient content representation, indexing and retrieval in surgical video databases. In the literature, most techniques focus on phase segmentation of the entire LS video using hand-crafted visual features, instrument usage signals, and recently convolutional neural networks (CNNs). In this paper we address the problem of phase recognition of short video shots (10s) of the operation, without utilizing information about the preceding/forthcoming video frames, their phase labels or the instruments used. We investigate four state-of-the-art CNN architectures (Alexnet, VGG19, GoogleNet, and ResNet101), for feature extraction via transfer learning. Visual saliency was employed for selecting the most informative region of the image as input to the CNN. Video shot representation was based on two temporal pooling mechanisms. Most importantly, we investigate the role of 'elapsed time' (from the beginning of the operation), and we show that inclusion of this feature can increase performance dramatically (69% vs. 75% mean accuracy). Finally, a long short-term memory (LSTM) network was trained for video shot classification based on the fusion of CNN features with 'elapsed time', increasing the accuracy to 86%. Our results highlight the prominent role of visual saliency, long-range temporal recursion and 'elapsed time' (a feature so far ignored), for surgical phase recognition.Comment: 6 pages, 4 figures, 6 table
    corecore