K-Space at TRECVid 2007
In this paper we describe K-Space participation in
TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as Face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches including logistic regression and support vector machines (SVM). Finally we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance.
The first of the two systems was a "shot" based interface,
where the results from a query were presented as a ranked
list of shots. The second interface was "broadcast" based,
where results were presented as a ranked list of broadcasts.
Both systems made use of the outputs of our high-level feature submission as well as low-level visual features.
A location-aware embedding technique for accurate landmark recognition
The current state of the research in landmark recognition highlights the good
accuracy which can be achieved by embedding techniques, such as Fisher vector
and VLAD. However, these techniques do not exploit spatial information, i.e.,
they consider all the features and the corresponding descriptors without
embedding their location in the image. This paper presents a new variant of the
well-known VLAD (Vector of Locally Aggregated Descriptors) embedding technique
which accounts, to a certain degree, for the location of features. The driving
motivation comes from the observation that, usually, the most interesting part
of an image (e.g., the landmark to be recognized) is almost at the center of
the image, while the features at the borders are irrelevant and do not
depend on the landmark. The proposed variant, called locVLAD (location-aware
VLAD), computes the mean of the two global descriptors: the VLAD executed on
the entire original image, and the one computed on a cropped image which
removes a certain percentage of the image borders. This simple variant shows an
accuracy greater than that of the existing state-of-the-art approach. Experiments are
conducted on two public datasets (ZuBuD and Holidays) which are used both for
training and testing. Moreover, a more balanced version of ZuBuD is proposed. Comment: 6 pages, 5 figures, ICDSC 201
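The aggregation step the abstract describes can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes SIFT-like local descriptors with known keypoint coordinates and precomputed k-means centroids, and the function names and the signed-square-root normalization are standard VLAD conventions rather than details taken from the paper.

```python
import numpy as np

def vlad(descriptors, centroids):
    # Assign each local descriptor to its nearest centroid and
    # accumulate residuals (descriptor - centroid) per cluster.
    k, d = centroids.shape
    agg = np.zeros((k, d))
    assign = np.argmin(
        ((descriptors[:, None, :] - centroids[None, :, :]) ** 2).sum(-1),
        axis=1)
    for i, c in enumerate(assign):
        agg[c] += descriptors[i] - centroids[c]
    v = agg.ravel()
    # Signed square-root and L2 normalization, as commonly used with VLAD.
    v = np.sign(v) * np.sqrt(np.abs(v))
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def loc_vlad(keypoints, descriptors, centroids, img_w, img_h, border=0.1):
    # locVLAD as described in the abstract: average the VLAD of the full
    # image with the VLAD of a central crop that discards a percentage
    # of the image borders (border fraction here is an assumed parameter).
    x0, y0 = border * img_w, border * img_h
    x1, y1 = img_w - x0, img_h - y0
    mask = [(x0 <= x <= x1) and (y0 <= y <= y1) for x, y in keypoints]
    v_full = vlad(descriptors, centroids)
    v_crop = vlad(descriptors[mask], centroids)
    return (v_full + v_crop) / 2.0
```

Averaging the two descriptors down-weights border features (which appear only in the full-image VLAD) without discarding them entirely.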
K-Space at TRECVID 2008
In this paper we describe K-Space's participation in
TRECVid 2008 in the interactive search task. For 2008
the K-Space group performed one of the largest interactive
video information retrieval experiments conducted
in a laboratory setting. We had three institutions participating
in a multi-site multi-system experiment. In
total 36 users participated, 12 each from Dublin City
University (DCU, Ireland), University of Glasgow (GU,
Scotland) and Centrum Wiskunde and Informatica (CWI,
the Netherlands). Three user interfaces were developed,
two from DCU which were also used in 2007 as well as
an interface from GU. All interfaces leveraged the same
search service. Using a Latin squares arrangement, each
user completed 12 topics, leading in total to 6 runs per
site, 18 in total. We officially submitted for evaluation 3
of these runs to NIST with an additional expert run using
a 4th system. Our submitted runs performed around
the median. In this paper we will present an overview of
the search system utilized, the experimental setup and a
preliminary analysis of our results.
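A Latin squares arrangement, as mentioned above, balances order effects: each topic appears in each presentation position exactly once across users. A minimal sketch of the idea, with a hypothetical mapping of 12 users to topic orders (the actual assignment of users, systems, and topics in the K-Space experiment is not specified in the abstract):

```python
def latin_square(n):
    # n x n grid where row i is the cyclic shift of 0..n-1 by i:
    # every symbol occurs exactly once in each row and each column.
    return [[(i + j) % n for j in range(n)] for i in range(n)]

# Hypothetical use: row u gives the order in which user u sees the
# 12 topics, so every topic occupies every position exactly once
# across the 12 users at a site.
square = latin_square(12)
user_topic_order = {u: square[u] for u in range(12)}
```

The cyclic-shift construction is the simplest Latin square; counterbalanced designs often use balanced variants, but the row/column property shown here is what removes the ordering confound.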
Computational illumination for high-speed in vitro Fourier ptychographic microscopy
We demonstrate a new computational illumination technique that achieves large
space-bandwidth-time product, for quantitative phase imaging of unstained live
samples in vitro. Microscope lenses can have either a large field of view (FOV)
or high resolution, but not both. Fourier ptychographic microscopy (FPM) is a new
computational imaging technique that circumvents this limit by fusing
information from multiple images taken with different illumination angles. The
result is a gigapixel-scale image having both wide FOV and high resolution,
i.e. large space-bandwidth product (SBP). FPM has enormous potential for
revolutionizing microscopy and has already found application in digital
pathology. However, it suffers from long acquisition times (on the order of
minutes), limiting throughput. Faster capture times would not only improve
imaging speed, but also allow studies of live samples, where motion artifacts
degrade results. In contrast to fixed (e.g. pathology) slides, live samples are
continuously evolving at various spatial and temporal scales. Here, we present
a new source coding scheme, along with real-time hardware control, to achieve
0.8 NA resolution across a 4x FOV with sub-second capture times. We propose an
improved algorithm and new initialization scheme, which allow robust phase
reconstruction over long time-lapse experiments. We present the first FPM
results for both growing and confluent in vitro cell cultures, capturing videos
of subcellular dynamical phenomena in popular cell lines undergoing division
and migration. Our method opens up FPM to applications with live samples, for
observing rare events in both space and time.
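The "gigapixel-scale" claim can be checked with a back-of-the-envelope space-bandwidth-product calculation. The wavelength and FOV diameter below are assumptions (green illumination and the nominal field of a 4x objective with a 22 mm field number), not figures from the abstract; only the 0.8 synthetic NA is stated there.

```python
import math

wavelength_um = 0.5              # assumed illumination wavelength
na = 0.8                         # synthetic NA reported in the abstract
fov_diameter_um = 22_000 / 4     # assumed: 4x objective, 22 mm field number

resolution_um = wavelength_um / (2 * na)   # Abbe-type resolution limit
nyquist_pixel_um = resolution_um / 2       # sampling at Nyquist
fov_area_um2 = math.pi * (fov_diameter_um / 2) ** 2
sbp_pixels = fov_area_um2 / nyquist_pixel_um ** 2

print(f"{sbp_pixels / 1e9:.2f} gigapixels")  # on the order of 1 gigapixel
```

Under these assumptions the FOV-area-over-pixel-area ratio comes out near one gigapixel, consistent with the abstract's characterization of FPM output.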