1,676 research outputs found

    Human-machine cooperation in large-scale multimedia retrieval : a survey

    Get PDF
    Large-Scale Multimedia Retrieval(LSMR) is the task to fast analyze a large amount of multimedia data like images or videos and accurately find the ones relevant to a certain semantic meaning. Although LSMR has been investigated for more than two decades in the fields of multimedia processing and computer vision, a more interdisciplinary approach is necessary to develop an LSMR system that is really meaningful for humans. To this end, this paper aims to stimulate attention to the LSMR problem from diverse research fields. By explaining basic terminologies in LSMR, we first survey several representative methods in chronological order. This reveals that due to prioritizing the generality and scalability for large-scale data, recent methods interpret semantic meanings with a completely different mechanism from humans, though such humanlike mechanisms were used in classical heuristic-based methods. Based on this, we discuss human-machine cooperation, which incorporates knowledge about human interpretation into LSMR without sacrificing the generality and scalability. In particular, we present three approaches to human-machine cooperation (cognitive, ontological, and adaptive), which are attributed to cognitive science, ontology engineering, and metacognition, respectively. We hope that this paper will create a bridge to enable researchers in different fields to communicate about the LSMR problem and lead to a ground-breaking next generation of LSMR systems

    Human-Machine Cooperation in Large-Scale Multimedia Retrieval: A Survey

    Get PDF
    Large-Scale Multimedia Retrieval(LSMR) is the task to fast analyze a large amount of multimedia data like images or videos and accurately find the ones relevant to a certain semantic meaning. Although LSMR has been investigated for more than two decades in the fields of multimedia processing and computer vision, a more interdisciplinary approach is necessary to develop an LSMR system that is really meaningful for humans. To this end, this paper aims to stimulate attention to the LSMR problem from diverse research fields. By explaining basic terminologies in LSMR, we first survey several representative methods in chronological order. This reveals that due to prioritizing the generality and scalability for large-scale data, recent methods interpret semantic meanings with a completely different mechanism from humans, though such humanlike mechanisms were used in classical heuristic-based methods. Based on this, we discuss human-machine cooperation, which incorporates knowledge about human interpretation into LSMR without sacrificing the generality and scalability. In particular, we present three approaches to human-machine cooperation (cognitive, ontological, and adaptive), which are attributed to cognitive science, ontology engineering, and metacognition, respectively. We hope that this paper will create a bridge to enable researchers in different fields to communicate about the LSMR problem and lead to a ground-breaking next generation of LSMR systems

    Automatic semantic video annotation in wide domain videos based on similarity and commonsense knowledgebases

    Get PDF
    In this paper, we introduce a novel framework for automatic Semantic Video Annotation. As this framework detects possible events occurring in video clips, it forms the annotating base of video search engine. To achieve this purpose, the system has to able to operate on uncontrolled wide-domain videos. Thus, all layers have to be based on generic features. This framework aims to bridge the "semantic gap", which is the difference between the low-level visual features and the human's perception, by finding videos with similar visual events, then analyzing their free text annotation to find a common area then to decide the best description for this new video using commonsense knowledgebases. Experiments were performed on wide-domain video clips from the TRECVID 2005 BBC rush standard database. Results from these experiments show promising integrity between those two layers in order to find expressing annotations for the input video. These results were evaluated based on retrieval performance

    CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap

    Get PDF
    After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in multimedia search engines, we have identified and analyzed gaps within European research effort during our second year. In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio- economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal challenges

    Evolution of Information Retrieval System: Critical Review of Multimedia Information Retrieval System Based On Content, Context, and Concept

    Get PDF
    In recent years the explosive growth of information affects the flood of information. The amount of information must be followed by the development of the effective Information Retrieval System (IRS) so that the information will be easily accessible and useful for the user. The source of Information contains various media format, beside text there is also image, audio, and video that called multimedia. A large number of multimedia information rise the Multimedia Information Retrieval System (MIRS). Most of MIRS today is monolithic or only using one media format like Google1 for text search, tineye2 for image search, youtube3 for video search or 4shared4 for music and audio search. There is a need of information in any kind of media, not only retrieve the document in text format, but also retrieve the document in an image, audio and video format at once from any kind media format of the query. This study reviews the evolution of IRS, regress from text-based to concept- based MIRS. Unified Multimedia Indexing technique is discussed along with Concept-based MIRS. This critical review concludes that the evolution of IRS follows three paces: content-based, context-based and concept-based. Each pace takes on indexing system and retrieval techniques to optimize information retrieved. The challenge is how to come up with a retrieval technique that can process unified MIRS in order to retrieve optimally the relevant document
    corecore