32,596 research outputs found

    Multimedia search without visual analysis: the value of linguistic and contextual information

    Get PDF
    This paper addresses the focus of this special issue by analyzing the potential contribution of linguistic content and other non-image aspects to the processing of audiovisual data. It summarizes the various ways in which linguistic content analysis contributes to enhancing the semantic annotation of multimedia content, and, as a consequence, to improving the effectiveness of conceptual media access tools. A number of techniques are presented, including the time-alignment of textual resources, audio and speech processing, content reduction and reasoning tools, and the exploitation of surface features

    Deep supervised hashing for fast retrieval of radio image cubes

    Full text link
    The shear number of sources that will be detected by next-generation radio surveys will be astronomical, which will result in serendipitous discoveries. Data-dependent deep hashing algorithms have been shown to be efficient at image retrieval tasks in the fields of computer vision and multimedia. However, there are limited applications of these methodologies in the field of astronomy. In this work, we utilize deep hashing to rapidly search for similar images in a large database. The experiment uses a balanced dataset of 2708 samples consisting of four classes: Compact, FRI, FRII, and Bent. The performance of the method was evaluated using the mean average precision (mAP) metric where a precision of 88.5\% was achieved. The experimental results demonstrate the capability to search and retrieve similar radio images efficiently and at scale. The retrieval is based on the Hamming distance between the binary hash of the query image and those of the reference images in the database.Comment: 4 pages, 4 figure

    An affect-based video retrieval system with open vocabulary querying

    Get PDF
    Content-based video retrieval systems (CBVR) are creating new search and browse capabilities using metadata describing significant features of the data. An often overlooked aspect of human interpretation of multimedia data is the affective dimension. Incorporating affective information into multimedia metadata can potentially enable search using this alternative interpretation of multimedia content. Recent work has described methods to automatically assign affective labels to multimedia data using various approaches. However, the subjective and imprecise nature of affective labels makes it difficult to bridge the semantic gap between system-detected labels and user expression of information requirements in multimedia retrieval. We present a novel affect-based video retrieval system incorporating an open-vocabulary query stage based on WordNet enabling search using an unrestricted query vocabulary. The system performs automatic annotation of video data with labels of well defined affective terms. In retrieval annotated documents are ranked using the standard Okapi retrieval model based on open-vocabulary text queries. We present experimental results examining the behaviour of the system for retrieval of a collection of automatically annotated feature films of different genres. Our results indicate that affective annotation can potentially provide useful augmentation to more traditional objective content description in multimedia retrieval

    Image mining: trends and developments

    Get PDF
    [Abstract]: Advances in image acquisition and storage technology have led to tremendous growth in very large and detailed image databases. These images, if analyzed, can reveal useful information to the human users. Image mining deals with the extraction of implicit knowledge, image data relationship, or other patterns not explicitly stored in the images. Image mining is more than just an extension of data mining to image domain. It is an interdisciplinary endeavor that draws upon expertise in computer vision, image processing, image retrieval, data mining, machine learning, database, and artificial intelligence. In this paper, we will examine the research issues in image mining, current developments in image mining, particularly, image mining frameworks, state-of-the-art techniques and systems. We will also identify some future research directions for image mining

    Discrete Multi-modal Hashing with Canonical Views for Robust Mobile Landmark Search

    Full text link
    Mobile landmark search (MLS) recently receives increasing attention for its great practical values. However, it still remains unsolved due to two important challenges. One is high bandwidth consumption of query transmission, and the other is the huge visual variations of query images sent from mobile devices. In this paper, we propose a novel hashing scheme, named as canonical view based discrete multi-modal hashing (CV-DMH), to handle these problems via a novel three-stage learning procedure. First, a submodular function is designed to measure visual representativeness and redundancy of a view set. With it, canonical views, which capture key visual appearances of landmark with limited redundancy, are efficiently discovered with an iterative mining strategy. Second, multi-modal sparse coding is applied to transform visual features from multiple modalities into an intermediate representation. It can robustly and adaptively characterize visual contents of varied landmark images with certain canonical views. Finally, compact binary codes are learned on intermediate representation within a tailored discrete binary embedding model which preserves visual relations of images measured with canonical views and removes the involved noises. In this part, we develop a new augmented Lagrangian multiplier (ALM) based optimization method to directly solve the discrete binary codes. We can not only explicitly deal with the discrete constraint, but also consider the bit-uncorrelated constraint and balance constraint together. Experiments on real world landmark datasets demonstrate the superior performance of CV-DMH over several state-of-the-art methods

    Slovenian Virtual Gallery on the Internet

    Get PDF
    The Slovenian Virtual Gallery (SVG) is a World Wide Web based multimedia collection of pictures, text, clickable-maps and video clips presenting Slovenian fine art from the gothic period up to the present days. Part of SVG is a virtual gallery space where pictures hang on the walls while another part is devoted to current exhibitions of selected Slovenian art galleries. The first version of this application was developed in the first half of 1995. It was based on a file system for storing all the data and custom developed software for search, automatic generation of HTML documents, scaling of pictures and remote management of the system. Due to the fast development of Web related tools a new version of SVG was developed in 1997 based on object-oriented relational database server technology. Both implementations are presented and compared in this article with issues related to the transion between the two versions. At the end, we will also discuss some extensions to SVG. We will present the GUI (Graphical User Interface) developed specially for presentation of current exhibitions over the Web which is based on GlobalView panoramic navigation extension to developed Internet Video Server (IVS). And since SVG operates with a lot of image data, we will confront with the problem of Image Content Retrieval

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research
    corecore