K-Space at TRECVid 2007
In this paper we describe K-Space's participation in TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches including logistic regression and support vector machines (SVMs). Finally, we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance.
The first of the two systems was a "shot" based interface, where the results from a query were presented as a ranked list of shots. The second interface was "broadcast" based, where results were presented as a ranked list of broadcasts. Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
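The abstract names logistic regression and SVM classifiers combined by early and late fusion, but gives neither the actual features nor the fusion weights, so what follows is only a minimal sketch of the two fusion strategies, using scikit-learn with random placeholder visual and audio features.

# Minimal sketch of early vs. late fusion for a single concept detector,
# with placeholder visual/audio features and binary concept labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X_visual = rng.normal(size=(200, 64))   # stand-in for visual descriptors
X_audio = rng.normal(size=(200, 20))    # stand-in for audio descriptors
y = rng.integers(0, 2, size=200)        # concept present / absent

# Early fusion: concatenate modalities and train one classifier.
X_early = np.hstack([X_visual, X_audio])
early_scores = SVC(probability=True).fit(X_early, y).predict_proba(X_early)[:, 1]

# Late fusion: one classifier per modality, then combine their scores.
vis_clf = SVC(probability=True).fit(X_visual, y)
aud_clf = LogisticRegression(max_iter=1000).fit(X_audio, y)
late_scores = 0.5 * vis_clf.predict_proba(X_visual)[:, 1] \
            + 0.5 * aud_clf.predict_proba(X_audio)[:, 1]

Early fusion lets a single classifier see cross-modal interactions, while late fusion keeps each modality's classifier independent and merges their scores, here with an equal-weight average chosen purely for illustration.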
Action Recognition in Videos: from Motion Capture Labs to the Web
This paper presents a survey of human action recognition approaches based on
visual data recorded from a single video camera. We propose an organizing
framework which puts in evidence the evolution of the area, with techniques
moving from heavily constrained motion capture scenarios towards more
challenging, realistic, "in the wild" videos. The proposed organization is
based on the representation used as input for the recognition task, emphasizing the hypotheses assumed and, consequently, the constraints imposed on the type of video that each technique is able to address. Making these hypotheses and constraints explicit makes the framework particularly useful for selecting a method for a given application. Another advantage of the proposed organization is that it allows the newest approaches to be categorized seamlessly alongside traditional ones, while providing an insightful perspective on the evolution of the action recognition task up to now. That perspective is the basis for the discussion at the end of the paper, where we also present the main open issues in the area.
Comment: Preprint submitted to CVIU, survey paper, 46 pages, 2 figures, 4 tables
SAVASA project @ TRECVID 2012: interactive surveillance event detection
In this paper we describe our participation in the interactive surveillance event detection task at TRECVid 2012. The system we developed comprised individual classifiers brought together behind a simple video search interface that enabled users to select relevant segments based on down-sampled animated GIFs. Two types of user, 'experts' and 'end users', performed the evaluations. Due to time constraints we focussed on three events (ObjectPut, PersonRuns and Pointing) and two of the five available cameras (1 and 3). Results from the interactive runs, as well as a discussion of the performance of the underlying retrospective classifiers, are presented.
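The abstract does not describe how the animated previews were generated; purely as an illustration, a down-sampled GIF for one candidate segment could be produced along these lines (OpenCV and imageio assumed; file names, frame ranges and sampling parameters are hypothetical).

# Illustrative sketch: turn a video segment into a small animated GIF
# so a reviewer can quickly judge whether it contains the target event.
import cv2
import imageio

def segment_to_gif(video_path, start_frame, end_frame, out_path,
                   frame_step=5, scale=0.25, fps=8):
    cap = cv2.VideoCapture(video_path)
    cap.set(cv2.CAP_PROP_POS_FRAMES, start_frame)
    frames = []
    for idx in range(start_frame, end_frame):
        ok, frame = cap.read()
        if not ok:
            break
        if (idx - start_frame) % frame_step == 0:                 # temporal down-sampling
            small = cv2.resize(frame, None, fx=scale, fy=scale)   # spatial down-sampling
            frames.append(cv2.cvtColor(small, cv2.COLOR_BGR2RGB))
    cap.release()
    imageio.mimsave(out_path, frames, fps=fps)

# Hypothetical usage for one retrieved segment from camera 1:
# segment_to_gif("camera1.avi", 1200, 1400, "segment_1200.gif")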
Computational illumination for high-speed in vitro Fourier ptychographic microscopy
We demonstrate a new computational illumination technique that achieves large
space-bandwidth-time product, for quantitative phase imaging of unstained live
samples in vitro. Microscope lenses can have either a large field of view (FOV)
or high resolution, but not both. Fourier ptychographic microscopy (FPM) is a new
computational imaging technique that circumvents this limit by fusing
information from multiple images taken with different illumination angles. The
result is a gigapixel-scale image having both wide FOV and high resolution,
i.e. large space-bandwidth product (SBP). FPM has enormous potential for
revolutionizing microscopy and has already found application in digital
pathology. However, it suffers from long acquisition times (on the order of
minutes), limiting throughput. Faster capture times would not only improve
imaging speed, but also allow studies of live samples, where motion artifacts
degrade results. In contrast to fixed (e.g. pathology) slides, live samples are
continuously evolving at various spatial and temporal scales. Here, we present
a new source coding scheme, along with real-time hardware control, to achieve
0.8 NA resolution across a 4x FOV with sub-second capture times. We propose an
improved algorithm and new initialization scheme, which allow robust phase
reconstruction over long time-lapse experiments. We present the first FPM
results for both growing and confluent in vitro cell cultures, capturing videos
of subcellular dynamical phenomena in popular cell lines undergoing division
and migration. Our method opens up FPM to applications with live samples, for
observing rare events in both space and time.
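The abstract does not spell out the improved reconstruction algorithm or the source-coding scheme; purely to illustrate the conventional Fourier ptychography update it builds on, the sketch below runs one sequential pass over single-LED low-resolution images, assuming a binary circular pupil and pre-computed Fourier-space centres for each illumination angle.

# Minimal sketch of one sequential Fourier ptychography update pass,
# assuming a binary pupil support and a centred object spectrum estimate.
import numpy as np

def fpm_update_pass(obj_ft, pupil, low_res_imgs, k_centres):
    # obj_ft       : (N, N) complex high-resolution object spectrum (centred)
    # pupil        : (n, n) binary coherent-transfer-function support (0/1)
    # low_res_imgs : sequence of (n, n) measured low-resolution intensities
    # k_centres    : sequence of (row, col) pupil-window centres in obj_ft
    n = pupil.shape[0]
    for img, (r0, c0) in zip(low_res_imgs, k_centres):
        rows = slice(r0 - n // 2, r0 + n // 2)
        cols = slice(c0 - n // 2, c0 + n // 2)
        # Forward model: crop the spectrum through the pupil, go to image space.
        patch = obj_ft[rows, cols] * pupil
        field = np.fft.ifft2(np.fft.ifftshift(patch))
        # Enforce the measured amplitude, keep the estimated phase.
        field = np.sqrt(img) * np.exp(1j * np.angle(field))
        new_patch = np.fft.fftshift(np.fft.fft2(field))
        # Write the corrected spectrum back only inside the pupil support.
        obj_ft[rows, cols] = obj_ft[rows, cols] * (1 - pupil) + new_patch * pupil
    return obj_ft

The paper's source-coding scheme reduces the number of exposures needed, which is how it reaches sub-second capture; the one-image-per-illumination-angle loop above is only the conventional baseline such a scheme would accelerate.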
- …