User-centred interface design for cross-language information retrieval
This paper reports on the user-centered design methodology and
techniques used to elicit user requirements, and on how these requirements informed the first phase of the user interface design for a Cross-Language Information Retrieval System. We describe a set of factors involved in the analysis of the collected data and, finally, discuss the implications of the findings for user interface design.
Dance-the-music : an educational platform for the modeling, recognition and audiovisual monitoring of dance steps using spatiotemporal motion templates
In this article, a computational platform is presented, entitled "Dance-the-Music", that can be used in a dance educational context to explore and learn the basics of dance steps. By introducing a method based on spatiotemporal motion templates, the platform makes it possible to train basic step models from sequentially repeated dance figures performed by a dance teacher. Movements are captured with an optical motion capture system. The teachers' models can be visualized from a first-person perspective to instruct students how to perform the specific dance steps in the correct manner. Moreover, recognition algorithms based on a template matching method can determine the quality of a student's performance in real time by means of multimodal monitoring techniques. The results of an evaluation study suggest that Dance-the-Music is effective in helping dance students to master the basics of dance figures.
Adaptive Nonparametric Image Parsing
In this paper, we present an adaptive nonparametric solution to the image
parsing task, namely annotating each image pixel with its corresponding
category label. For a given test image, first, a locality-aware retrieval set
is extracted from the training data based on super-pixel matching similarities,
which are augmented with feature extraction for better differentiation of local
super-pixels. Then, the category of each super-pixel is initialized by the
majority vote of the k-nearest-neighbor super-pixels in the retrieval set.
Instead of fixing k as in traditional non-parametric approaches, here we
propose a novel adaptive nonparametric approach which determines the
sample-specific k for each test image. In particular, k is adaptively set to
be the number of the fewest nearest super-pixels which the images in the
retrieval set can use to get the best category prediction. Finally, the initial
super-pixel labels are further refined by contextual smoothing. Extensive
experiments on challenging datasets demonstrate the superiority of the new
solution over other state-of-the-art nonparametric solutions.
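The k-selection step described in this abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names `knn_vote` and `pick_adaptive_k`, the squared Euclidean distance, the leave-one-out scoring, and the toy feature vectors are all assumptions. It picks the smallest candidate k that gives the best label prediction on the retrieval set itself.

```python
from collections import Counter


def knn_vote(query, samples, k):
    """Majority label among the k samples nearest to `query`.

    `samples` is a list of (feature_tuple, label) pairs; distance is
    squared Euclidean (an assumption made for this sketch).
    """
    ranked = sorted(
        samples,
        key=lambda s: sum((a - b) ** 2 for a, b in zip(query, s[0])),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


def pick_adaptive_k(retrieval_set, candidate_ks):
    """Smallest k whose leave-one-out accuracy on the retrieval set is best."""
    best_k, best_acc = None, -1.0
    for k in sorted(candidate_ks):
        hits = 0
        for i, (feat, label) in enumerate(retrieval_set):
            rest = retrieval_set[:i] + retrieval_set[i + 1:]
            hits += knn_vote(feat, rest, k) == label
        acc = hits / len(retrieval_set)
        if acc > best_acc:  # strict comparison: ties keep the smaller k
            best_k, best_acc = k, acc
    return best_k, best_acc


# Hypothetical super-pixel features (2-D for readability) with labels.
retrieval_set = [
    ((0.0, 0.0), "sky"), ((0.0, 1.0), "sky"), ((1.0, 0.0), "sky"),
    ((5.0, 5.0), "grass"), ((5.0, 6.0), "grass"), ((6.0, 5.0), "grass"),
]
k, accuracy = pick_adaptive_k(retrieval_set, [1, 3, 5])
label = knn_vote((0.2, 0.1), retrieval_set, k)
```

The chosen k would then be used to initialize the labels of the test image's super-pixels before the contextual smoothing step.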
The CHORUS gap analysis on user-centered methodology for design and evaluation of multi-media information access systems
CHORUS is a Coordination Action, a specific type of project funded by the European Commission under its research programmes, intended to bring together research projects with common goals in the field of search technologies for digital audio-visual content, one of the strategic objectives of the current research framework programme. CHORUS coordinates a number of research projects in the general area of audio-visual and multi-media information access and management.
The most important single contribution of the CHORUS work plan will be to provide a survey of the field and a roadmap with a gap analysis for the realisation of viable audio-visual search engines by European partners. This is done by several means. CHORUS organises Think-Tanks with industrial participation, focussed workshops to treat specific questions, and more general conferences for academic discussions. CHORUS is now in its final phase, and is currently preparing its final report together with a final conference to mark its publication.
Finding new music: a diary study of everyday encounters with novel songs
This paper explores how we, as individuals, purposefully or serendipitously encounter 'new music' (that is, music that we haven't heard before) and relates these behaviours to music information retrieval activities such as music searching and music discovery via recommender systems. Forty-one participants took part in a three-day diary study, in which they recorded all incidents that brought them into contact with new music. The diaries were analyzed using a Grounded Theory approach. The results of this analysis are discussed with respect to location, time, and whether the music encounter was actively sought or occurred passively. Based on these results, we outline design implications for music information retrieval software, and suggest an extension of 'laid back' searching.
Counterfactual Estimation and Optimization of Click Metrics for Search Engines
Optimizing an interactive system against a predefined online metric is
particularly challenging when the metric is computed from user feedback such
as clicks and payments. The key challenge is the counterfactual nature: in the
case of Web search, any change to a component of the search engine may result
in a different search result page for the same query, but we normally cannot
infer reliably from search log how users would react to the new result page.
Consequently, it appears impossible to accurately estimate online metrics that
depend on user feedback, unless the new engine is run to serve users and
compared with a baseline in an A/B test. This approach, while valid and
successful, is unfortunately expensive and time-consuming. In this paper, we
propose to address this problem using causal inference techniques, under the
contextual-bandit framework. This approach effectively allows one to run
(potentially infinitely) many A/B tests offline from search log, making it
possible to estimate and optimize online metrics quickly and inexpensively.
Focusing on an important component in a commercial search engine, we show how
these ideas can be instantiated and applied, and obtain very promising results
that suggest the wide applicability of these techniques
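To make the idea of running an A/B test offline from a search log more concrete, here is a minimal sketch of inverse propensity scoring (IPS), a standard estimator in the contextual-bandit setting. The abstract does not say which estimator the authors use, so treating it as IPS is an assumption, as are the function name `ips_estimate`, the log record layout, and the toy data. The sketch also assumes the logging policy was randomized and that its action probabilities were recorded.

```python
def ips_estimate(log, target_policy):
    """Estimate the average reward a new policy would have obtained.

    `log` is a list of (context, action, reward, logging_prob) records,
    where `logging_prob` is the probability with which the logging policy
    chose `action`. `target_policy(context, action)` returns the
    probability that the new policy would choose `action`.
    """
    if not log:
        raise ValueError("log must contain at least one record")
    total = 0.0
    for context, action, reward, logging_prob in log:
        if logging_prob <= 0.0:
            raise ValueError("logged actions need a positive probability")
        # Re-weight each logged reward by how much more (or less) likely
        # the new policy is to take the logged action.
        total += reward * target_policy(context, action) / logging_prob
    return total / len(log)


# Hypothetical log: two queries, two rankers "A" and "B", chosen uniformly
# at random (probability 0.5 each); reward 1.0 means the user clicked.
log = [
    ("q1", "A", 1.0, 0.5),
    ("q1", "B", 0.0, 0.5),
    ("q2", "A", 0.0, 0.5),
    ("q2", "B", 1.0, 0.5),
]


def always_a(context, action):
    return 1.0 if action == "A" else 0.0


def per_query(context, action):
    best = {"q1": "A", "q2": "B"}
    return 1.0 if action == best[context] else 0.0


value_always_a = ips_estimate(log, always_a)
value_per_query = ips_estimate(log, per_query)
```

Because the same log can be re-weighted for any number of candidate policies, each call to `ips_estimate` plays the role of one offline A/B comparison.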
Exploiting Image-trained CNN Architectures for Unconstrained Video Classification
We conduct an in-depth exploration of different strategies for doing event
detection in videos using convolutional neural networks (CNNs) trained for
image classification. We study different ways of performing spatial and
temporal pooling, feature normalization, choice of CNN layers as well as choice
of classifiers. Making judicious choices along these dimensions led to a very
significant increase in performance over more naive approaches that have been
used till now. We evaluate our approach on the challenging TRECVID MED'14
dataset with two popular CNN architectures pretrained on ImageNet. On this
MED'14 dataset, our methods, based entirely on image-trained CNN features, can
outperform several state-of-the-art non-CNN models. Our proposed late fusion of
CNN- and motion-based features can further increase the mean average precision
(mAP) on MED'14 from 34.95% to 38.74%. The fusion approach achieves the
state-of-the-art classification performance on the challenging UCF-101 dataset
- …
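Late fusion, as mentioned in the abstract above, generally means combining the output scores of separately trained classifiers. A minimal sketch under stated assumptions follows: the weighted-average rule, the function names `late_fusion` and `average_precision`, and all scores are made up for illustration, and none of it reproduces the paper's fusion scheme or its reported numbers.

```python
def late_fusion(scores_a, scores_b, weight=0.5):
    """Weighted average of two classifiers' scores for the same videos."""
    if len(scores_a) != len(scores_b):
        raise ValueError("score lists must have the same length")
    return [weight * a + (1.0 - weight) * b for a, b in zip(scores_a, scores_b)]


def average_precision(scores, labels):
    """Average precision of a ranking; `labels` are 1 (positive) or 0."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, is_positive) in enumerate(ranked, start=1):
        if is_positive:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


# Toy scores for four videos and one event class (made-up values).
labels = [1, 0, 1, 0]
cnn_scores = [0.9, 0.8, 0.3, 0.2]     # hypothetical CNN-based classifier
motion_scores = [0.4, 0.1, 0.7, 0.6]  # hypothetical motion-based classifier

fused_scores = late_fusion(cnn_scores, motion_scores)
ap_cnn = average_precision(cnn_scores, labels)
ap_motion = average_precision(motion_scores, labels)
ap_fused = average_precision(fused_scores, labels)
```

Mean average precision (mAP) is the mean of such per-class average precision values. In this toy case each classifier ranks one negative above a positive, and averaging their scores fixes both mistakes.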