37,557 research outputs found
Multimodal music information processing and retrieval: survey and future challenges
Towards improving the performance in various music information processing
tasks, recent studies exploit different modalities able to capture diverse
aspects of music. Such modalities include audio recordings, symbolic music
scores, mid-level representations, motion, and gestural data, video recordings,
editorial or cultural tags, lyrics and album cover arts. This paper critically
reviews the various approaches adopted in Music Information Processing and
Retrieval and highlights how multimodal algorithms can help Music Computing
applications. First, we categorize the related literature based on the
application they address. Subsequently, we analyze existing information fusion
approaches, and we conclude with the set of challenges that Music Information
Retrieval and Sound and Music Computing research communities should focus in
the next years
K-Space at TRECVid 2007
In this paper we describe K-Space participation in
TRECVid 2007. K-Space participated in two tasks, high-level feature extraction and interactive search. We present our approaches for each of these activities and provide a brief analysis of our results. Our high-level feature submission utilized multi-modal low-level features which included visual, audio and temporal elements. Specific concept detectors (such as Face detectors) developed by K-Space partners were also used. We experimented with different machine learning approaches including logistic regression and support vector machines (SVM). Finally we also experimented with both early and late fusion for feature combination. This year we also participated in interactive search, submitting 6 runs. We developed two interfaces which both utilized the same retrieval functionality. Our objective was to measure the effect of context, which was supported to different degrees in each interface, on user performance.
The first of the two systems was a ‘shot’ based interface,
where the results from a query were presented as a ranked
list of shots. The second interface was ‘broadcast’ based,
where results were presented as a ranked list of broadcasts.
Both systems made use of the outputs of our high-level feature submission as well as low-level visual features
The relationship between IR and multimedia databases
Modern extensible database systems support multimedia data through ADTs. However, because of the problems with multimedia query formulation, this support is not sufficient.\ud
\ud
Multimedia querying requires an iterative search process involving many different representations of the objects in the database. The support that is needed is very similar to the processes in information retrieval.\ud
\ud
Based on this observation, we develop the miRRor architecture for multimedia query processing. We design a layered framework based on information retrieval techniques, to provide a usable query interface to the multimedia database.\ud
\ud
First, we introduce a concept layer to enable reasoning over low-level concepts in the database.\ud
\ud
Second, we add an evidential reasoning layer as an intermediate between the user and the concept layer.\ud
\ud
Third, we add the functionality to process the users' relevance feedback.\ud
\ud
We then adapt the inference network model from text retrieval to an evidential reasoning model for multimedia query processing.\ud
\ud
We conclude with an outline for implementation of miRRor on top of the Monet extensible database system
Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)
The implicit objective of the biennial "international - Traveling Workshop on
Interactions between Sparse models and Technology" (iTWIST) is to foster
collaboration between international scientific teams by disseminating ideas
through both specific oral/poster presentations and free discussions. For its
second edition, the iTWIST workshop took place in the medieval and picturesque
town of Namur in Belgium, from Wednesday August 27th till Friday August 29th,
2014. The workshop was conveniently located in "The Arsenal" building within
walking distance of both hotels and town center. iTWIST'14 has gathered about
70 international participants and has featured 9 invited talks, 10 oral
presentations, and 14 posters on the following themes, all related to the
theory, application and generalization of the "sparsity paradigm":
Sparsity-driven data sensing and processing; Union of low dimensional
subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph
sensing/processing; Blind inverse problems and dictionary learning; Sparsity
and computational neuroscience; Information theory, geometry and randomness;
Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?;
Sparse machine learning and inference.Comment: 69 pages, 24 extended abstracts, iTWIST'14 website:
http://sites.google.com/site/itwist1
- …