3,344 research outputs found

    Towards an All-Purpose Content-Based Multimedia Information Retrieval System

    Full text link
    The growth of multimedia collections - in terms of size, heterogeneity, and variety of media types - necessitates systems that are able to conjointly deal with several forms of media, especially when it comes to searching for particular objects. However, existing retrieval systems are organized in silos and treat different media types separately. As a consequence, retrieval across media types is either not supported at all or subject to major limitations. In this paper, we present vitrivr, a content-based multimedia information retrieval stack. As opposed to the keyword search approach implemented by most media management systems, vitrivr makes direct use of the object's content to facilitate different types of similarity search, such as Query-by-Example or Query-by-Sketch, for and, most importantly, across different media types - namely, images, audio, videos, and 3D models. Furthermore, we introduce a new web-based user interface that enables easy-to-use, multimodal retrieval from and browsing in mixed media collections. The effectiveness of vitrivr is shown on the basis of a user study that involves different query and media types. To the best of our knowledge, the full vitrivr stack is unique in that it is the first multimedia retrieval system that seamlessly integrates support for four different types of media. As such, it paves the way towards an all-purpose, content-based multimedia information retrieval system

    Mobile Augmented Reality: User Interfaces, Frameworks, and Intelligence

    Get PDF
    Mobile Augmented Reality (MAR) integrates computer-generated virtual objects with physical environments for mobile devices. MAR systems enable users to interact with MAR devices, such as smartphones and head-worn wearables, and perform seamless transitions from the physical world to a mixed world with digital entities. These MAR systems support user experiences using MAR devices to provide universal access to digital content. Over the past 20 years, several MAR systems have been developed, however, the studies and design of MAR frameworks have not yet been systematically reviewed from the perspective of user-centric design. This article presents the first effort of surveying existing MAR frameworks (count: 37) and further discuss the latest studies on MAR through a top-down approach: (1) MAR applications; (2) MAR visualisation techniques adaptive to user mobility and contexts; (3) systematic evaluation of MAR frameworks, including supported platforms and corresponding features such as tracking, feature extraction, and sensing capabilities; and (4) underlying machine learning approaches supporting intelligent operations within MAR systems. Finally, we summarise the development of emerging research fields and the current state-of-the-art, and discuss the important open challenges and possible theoretical and technical directions. This survey aims to benefit both researchers and MAR system developers alike.Peer reviewe

    Identification of expressive descriptors for style extraction in music analysis using linear and nonlinear models

    Get PDF
    La formalización de las interpretaciones expresivas aún se considera relevante debido a la complejidad de la música. La interpretación expresiva forma un aspecto importante de la música, teniendo en cuenta diferentes convenciones como géneros o estilos que una interpretación puede desarrollar con el tiempo. Modelar la relación entre las expresiones musicales y los aspectos estructurales de la información acústica requiere una base probabilística y estadística mínima para la robustez, validación y reproducibilidad de aplicaciones computacionales. Por lo tanto, es necesaria una relación cohesiva y una justificación sobre los resultados. Esta tesis se sustenta en la teoría y aplicaciones de modelos discriminativos y generativos en el marco del aprendizaje de maquina y la relación de procedimientos sistemáticos con los conceptos de la musicología utilizando técnicas de procesamiento de señales y minería de datos. Los resultados se validaron mediante pruebas estadísticas y una experimentación no paramétrica con la implementación de un conjunto de métricas para medir aspectos acústicos y temporales de archivos de audio para entrenar un modelo discriminativo y mejorar el proceso de síntesis de un modelo neuronal profundo. Adicionalmente, el modelo implementado presenta la oportunidad para la aplicación de procedimientos sistemáticos, automatización de transcripciones usando notación musical, entrenamiento de habilidades auditivas para estudiantes de música y mejorar la implementación de redes neuronales profundas usando CPU en lugar de GPU debido a las ventajas de las redes convolucionales para el procesamiento de archivos de audio como vectores o matriz con una secuencia de notas.MaestríaMagister en Ingeniería Electrónic

    Symbolic and Visual Retrieval of Mathematical Notation using Formula Graph Symbol Pair Matching and Structural Alignment

    Get PDF
    Large data collections containing millions of math formulae in different formats are available on-line. Retrieving math expressions from these collections is challenging. We propose a framework for retrieval of mathematical notation using symbol pairs extracted from visual and semantic representations of mathematical expressions on the symbolic domain for retrieval of text documents. We further adapt our model for retrieval of mathematical notation on images and lecture videos. Graph-based representations are used on each modality to describe math formulas. For symbolic formula retrieval, where the structure is known, we use symbol layout trees and operator trees. For image-based formula retrieval, since the structure is unknown we use a more general Line of Sight graph representation. Paths of these graphs define symbol pairs tuples that are used as the entries for our inverted index of mathematical notation. Our retrieval framework uses a three-stage approach with a fast selection of candidates as the first layer, a more detailed matching algorithm with similarity metric computation in the second stage, and finally when relevance assessments are available, we use an optional third layer with linear regression for estimation of relevance using multiple similarity scores for final re-ranking. Our model has been evaluated using large collections of documents, and preliminary results are presented for videos and cross-modal search. The proposed framework can be adapted for other domains like chemistry or technical diagrams where two visually similar elements from a collection are usually related to each other

    Virtual Reality Games for Motor Rehabilitation

    Get PDF
    This paper presents a fuzzy logic based method to track user satisfaction without the need for devices to monitor users physiological conditions. User satisfaction is the key to any product’s acceptance; computer applications and video games provide a unique opportunity to provide a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in UnrealTournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature that suggests physiological measurements are needed. We show that it is possible to use a software only method to estimate user emotion
    • …
    corecore