8,356 research outputs found

    TRECVID: benchmarking the effectiveness of information retrieval tasks on digital video

    Get PDF
    Many research groups worldwide are now investigating techniques which can support information retrieval on archives of digital video and as groups move on to implement these techniques they inevitably try to evaluate the performance of their techniques in practical situations. The difficulty with doing this is that there is no test collection or any environment in which the effectiveness of video IR or video IR sub-tasks, can be evaluated and compared. The annual series of TREC exercises has, for over a decade, been benchmarking the effectiveness of systems in carrying out various information retrieval tasks on text and audio and has contributed to a huge improvement in many of these. Two years ago, a track was introduced which covers shot boundary detection, feature extraction and searching through archives of digital video. In this paper we present a summary of the activities in the TREC Video track in 2002 where 17 teams from across the world took part

    TRECVID 2003 - an overview

    Get PDF

    The TREC-2002 video track report

    Get PDF
    TREC-2002 saw the second running of the Video Track, the goal of which was to promote progress in content-based retrieval from digital video via open, metrics-based evaluation. The track used 73.3 hours of publicly available digital video (in MPEG-1/VCD format) downloaded by the participants directly from the Internet Archive (Prelinger Archives) (internetarchive, 2002) and some from the Open Video Project (Marchionini, 2001). The material comprised advertising, educational, industrial, and amateur films produced between the 1930's and the 1970's by corporations, nonprofit organizations, trade associations, community and interest groups, educational institutions, and individuals. 17 teams representing 5 companies and 12 universities - 4 from Asia, 9 from Europe, and 4 from the US - participated in one or more of three tasks in the 2001 video track: shot boundary determination, feature extraction, and search (manual or interactive). Results were scored by NIST using manually created truth data for shot boundary determination and manual assessment of feature extraction and search results. This paper is an introduction to, and an overview of, the track framework - the tasks, data, and measures - the approaches taken by the participating groups, the results, and issues regrading the evaluation. For detailed information about the approaches and results, the reader should see the various site reports in the final workshop proceedings

    Implementation and analysis of several keyframe-based browsing interfaces to digital video

    Get PDF
    In this paper we present a variety of browsing interfaces for digital video information. The six interfaces are implemented on top of Físchlár, an operational recording, indexing, browsing and playback system for broadcast TV programmes. In developing the six browsing interfaces, we have been informed by the various dimensions which can be used to distinguish one interface from another. For this we include layeredness (the number of “layers” of abstraction which can be used in browsing a programme), the provision or omission of temporal information (varying from full timestamp information to nothing at all on time) and visualisation of spatial vs. temporal aspects of the video. After introducing and defining these dimensions we then locate some common browsing interfaces from the literature in this 3-dimensional “space” and then we locate our own six interfaces in this same space. We then present an outline of the interfaces and include some user feedback

    Adaptive Information Cluster at Dublin City University

    Get PDF
    The Adaptive Information Cluster (AIC) is a collaboration between Dublin City University and University College Dublin, and in the AIC at DCU, we investigate and develop as one stream of our research activities, various content analysis tools that can automatically index and structure video information. This includes movies or CCTV footage and the motivation is to support useful searching and browsing features for the envisaged end-users of such systems. We bring in the HCI perspective to this highly-technically-oriented research by brainstorming, generating scenarios, sketching and prototyping the user-interfaces to the resulting video retrieval systems we develop, and we conduct usability studies to better understand the usage and opinions of such systems so as to guide the future direction of our technological research

    Robust Evolution Of Contingent Cooperation In Pure One-Shot Prisoners' Dilemmas. Part II: Evolutionary Dynamics & Testable Predictions

    Get PDF
    ROC curves from the signal detection literature are used in an evolutionary analysis of one-shot and repeated prisoners' dilemmas: showing if there is any discounting of future payoffs, or any cost of searching for an additional partner, then cooperative players who contingently participate - in terms of who to play with or when to exit - cannot survive when most other players unconditionally defect; even when contingent participators only interact with themselves by perfectly detecting their own type. However, quite different results hold for players who act contingently, not in terms of whether to play or exit, but rather in terms of how to act with any given partner. There is a form of contingent cooperation in one-shot prisoners' dilemmas (called CD behavior) that will robustly evolve through any payoff monotonic process, such as replicator dynamics. That is, whenever CD-players can detect their own type better than pure chance, they are guaranteed to evolve from any initial population - eventually to a unique evolutionarily stable population composed entirely of contingent cooperators - provided the fear payoff difference is less than the sum of greed and cooperation payoff differences. The adaptive capabilities just described hold for pure one-shot prisoners' dilemmas: meaning no repeated interactions or pairings in any generation are involved; no information or third party reports about past behavior are involved, all signal information arises only from symptoms detected after two strangers meet for the first time; and no subjective preferences for altruism, fairness, equity, reciprocity, or morality affect the raw evolutionary dynamics. Testable predictions are also derived that agree with a large body of experimental data built up since the prisoners dilemma was first introduced in 1950. They describe how the CD-players' equilibrium probability of cooperating changes: depending on the relative size of fear, greed, and cooperation payoff differences; and depending on the players' history of communication, especially when face-to-face discussion is involved. --prisoners' dilemma,cooperation,Nash equilibrium,evolutionary stability,replicator dynamics,signal detection,ROC curves,experiment

    Collaborative video searching on a tabletop

    Get PDF
    Almost all system and application design for multimedia systems is based around a single user working in isolation to perform some task yet much of the work for which we use computers to help us, is based on working collaboratively with colleagues. Groupware systems do support user collaboration but typically this is supported through software and users still physically work independently. Tabletop systems, such as the DiamondTouch from MERL, are interface devices which support direct user collaboration on a tabletop. When a tabletop is used as the interface for a multimedia system, such as a video search system, then this kind of direct collaboration raises many questions for system design. In this paper we present a tabletop system for supporting a pair of users in a video search task and we evaluate the system not only in terms of search performance but also in terms of user–user interaction and how different user personalities within each pair of searchers impacts search performance and user interaction. Incorporating the user into the system evaluation as we have done here reveals several interesting results and has important ramifications for the design of a multimedia search system

    Single Shot Temporal Action Detection

    Full text link
    Temporal action detection is a very important yet challenging problem, since videos in real applications are usually long, untrimmed and contain multiple action instances. This problem requires not only recognizing action categories but also detecting start time and end time of each action instance. Many state-of-the-art methods adopt the "detection by classification" framework: first do proposal, and then classify proposals. The main drawback of this framework is that the boundaries of action instance proposals have been fixed during the classification step. To address this issue, we propose a novel Single Shot Action Detector (SSAD) network based on 1D temporal convolutional layers to skip the proposal generation step via directly detecting action instances in untrimmed video. On pursuit of designing a particular SSAD network that can work effectively for temporal action detection, we empirically search for the best network architecture of SSAD due to lacking existing models that can be directly adopted. Moreover, we investigate into input feature types and fusion strategies to further improve detection accuracy. We conduct extensive experiments on two challenging datasets: THUMOS 2014 and MEXaction2. When setting Intersection-over-Union threshold to 0.5 during evaluation, SSAD significantly outperforms other state-of-the-art systems by increasing mAP from 19.0% to 24.6% on THUMOS 2014 and from 7.4% to 11.0% on MEXaction2.Comment: ACM Multimedia 201
    corecore