82,058 research outputs found

    Augmenting conversations through context-aware multimedia retrieval based on speech recognition

    Get PDF
    Future’s environments will be sensitive and responsive to the presence of people to support them carrying out their everyday life activities, tasks and rituals, in an easy and natural way. Such interactive spaces will use the information and communication technologies to bring the computation into the physical world, in order to enhance ordinary activities of their users. This paper describes a speech-based spoken multimedia retrieval system that can be used to present relevant video-podcast (vodcast) footage, in response to spontaneous speech and conversations during daily life activities. The proposed system allows users to search the spoken content of multimedia files rather than their associated meta-information and let them navigate to the right portion where queried words are spoken by facilitating within-medium searches of multimedia content through a bag-of-words approach. Finally, we have studied the proposed system on different scenarios by using vodcasts in English from various categories, as the targeted multimedia, and discussed how it would enhance people’s everyday life activities by different scenarios including education, entertainment, marketing, news and workplace

    Learning Multimodal Latent Attributes

    Get PDF
    Abstract—The rapid development of social media sharing has created a huge demand for automatic media classification and annotation techniques. Attribute learning has emerged as a promising paradigm for bridging the semantic gap and addressing data sparsity via transferring attribute knowledge in object recognition and relatively simple action classification. In this paper, we address the task of attribute learning for understanding multimedia data with sparse and incomplete labels. In particular we focus on videos of social group activities, which are particularly challenging and topical examples of this task because of their multi-modal content and complex and unstructured nature relative to the density of annotations. To solve this problem, we (1) introduce a concept of semi-latent attribute space, expressing user-defined and latent attributes in a unified framework, and (2) propose a novel scalable probabilistic topic model for learning multi-modal semi-latent attributes, which dramatically reduces requirements for an exhaustive accurate attribute ontology and expensive annotation effort. We show that our framework is able to exploit latent attributes to outperform contemporary approaches for addressing a variety of realistic multimedia sparse data learning tasks including: multi-task learning, learning with label noise, N-shot transfer learning and importantly zero-shot learning

    Query generation from multiple media examples

    Get PDF
    This paper exploits an unified media document representation called feature terms for query generation from multiple media examples, e.g. images. A feature term refers to a value interval of a media feature. A media document is therefore represented by a frequency vector about feature term appearance. This approach (1) facilitates feature accumulation from multiple examples; (2) enables the exploration of text-based retrieval models for multimedia retrieval. Three statistical criteria, minimised chi-squared, minimised AC/DC rate and maximised entropy, are proposed to extract feature terms from a given media document collection. Two textual ranking functions, KL divergence and a BM25-like retrieval model, are adapted to estimate media document relevance. Experiments on the Corel photo collection and the TRECVid 2006 collection show the effectiveness of feature term based query in image and video retrieval

    Study on QoS support in 802.11e-based multi-hop vehicular wireless ad hoc networks

    Get PDF
    Multimedia communications over vehicular ad hoc networks (VANET) will play an important role in the future intelligent transport system (ITS). QoS support for VANET therefore becomes an essential problem. In this paper, we first study the QoS performance in multi-hop VANET by using the standard IEEE 802.11e EDCA MAC and our proposed triple-constraint QoS routing protocol, Delay-Reliability-Hop (DeReHQ). In particular, we evaluate the DeReHQ protocol together with EDCA in highway and urban areas. Simulation results show that end-to-end delay performance can sometimes be achieved when both 802.11e EDCA and DeReHQ extended AODV are used. However, further studies on cross-layer optimization for QoS support in multi-hop environment are required

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Get PDF
    Based on the information provided by European projects and national initiatives related to multimedia search as well as domains experts that participated in the CHORUS Think-thanks and workshops, this document reports on the state of the art related to multimedia content search from, a technical, and socio-economic perspective. The technical perspective includes an up to date view on content based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark inititiatives to measure the performance of multimedia search engines. From a socio-economic perspective we inventorize the impact and legal consequences of these technical advances and point out future directions of research

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges
    • …
    corecore