
    Deep Cross-Modal Correlation Learning for Audio and Lyrics in Music Retrieval

    Deep cross-modal learning has demonstrated excellent performance in cross-modal multimedia retrieval, where the aim is to learn joint representations of different data modalities. Unfortunately, little research focuses on cross-modal correlation learning that takes into account the temporal structure of modalities such as audio and lyrics. Motivated by the inherently temporal structure of music, we learn the deep sequential correlation between audio and lyrics. In this work, we propose a deep cross-modal correlation learning architecture involving two-branch deep neural networks for the audio modality and the text modality (lyrics). Data in the two modalities are mapped into the same canonical space, where inter-modal canonical correlation analysis is used as the objective function to measure the similarity of temporal structures. This is the first study to use deep architectures for learning the temporal correlation between audio and lyrics. A pre-trained Doc2Vec model followed by fully-connected layers represents the lyrics. The audio branch makes two significant contributions: i) we propose an end-to-end network that learns the cross-modal correlation between audio and lyrics, performing feature extraction and correlation learning simultaneously so that the joint representation reflects temporal structure; and ii) for feature extraction, we represent an audio signal as a short sequence of local summaries (VGG16 features) and apply a recurrent neural network to compute a compact feature that better captures the temporal structure of music audio. Experimental results on using audio to retrieve lyrics and lyrics to retrieve audio verify the effectiveness of the proposed deep correlation learning architecture in cross-modal music retrieval.
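
    The PyTorch sketch below illustrates the two-branch design just described: a recurrent audio branch over per-segment VGG16 features and a fully-connected lyrics branch over Doc2Vec vectors, trained into a shared space. This is a minimal sketch under assumptions: all layer sizes are illustrative, and a cosine-similarity loss stands in for the paper's canonical correlation analysis objective.

```python
# Minimal sketch of the two-branch cross-modal architecture. Layer sizes,
# the GRU choice, and the cosine-similarity objective are assumptions; the
# paper's actual objective is inter-modal canonical correlation analysis.
import torch
import torch.nn as nn

class AudioBranch(nn.Module):
    """Maps a sequence of per-segment VGG16 features to a joint embedding."""
    def __init__(self, vgg_dim=4096, hidden=512, embed=128):
        super().__init__()
        self.rnn = nn.GRU(vgg_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, embed)

    def forward(self, x):            # x: (batch, time, vgg_dim)
        _, h = self.rnn(x)           # final hidden state summarizes the sequence
        return self.fc(h.squeeze(0))

class LyricsBranch(nn.Module):
    """Maps a pre-trained Doc2Vec lyrics vector to the same joint embedding."""
    def __init__(self, doc_dim=300, embed=128):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(doc_dim, 256), nn.ReLU(),
            nn.Linear(256, embed))

    def forward(self, d):            # d: (batch, doc_dim)
        return self.fc(d)

# One training step: pull matching audio/lyrics pairs together in the shared space.
audio_net, lyrics_net = AudioBranch(), LyricsBranch()
audio_seq = torch.randn(8, 20, 4096)   # 8 songs, 20 local VGG16 summaries each
doc_vecs = torch.randn(8, 300)         # one Doc2Vec vector per lyrics document
a, l = audio_net(audio_seq), lyrics_net(doc_vecs)
loss = 1 - nn.functional.cosine_similarity(a, l).mean()
loss.backward()
```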

    Saving Energy in Mobile Devices for On-Demand Multimedia Streaming -- A Cross-Layer Approach

    This paper proposes a novel energy-efficient multimedia delivery system called EStreamer. First, we study the relationship between the client's buffer size, burst-shaped TCP-based multimedia traffic, and the energy consumption of wireless network interfaces in smartphones. Based on this study, we design and implement EStreamer for constant-bit-rate and rate-adaptive streaming. EStreamer can improve battery lifetime by 3x, 1.5x and 2x while streaming over Wi-Fi, 3G and 4G, respectively.

    Comment: Accepted in ACM Transactions on Multimedia Computing, Communications and Applications (ACM TOMCCAP), November 201
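
    To make the buffer/burst/energy relationship concrete, here is a back-of-the-envelope Python sketch of buffer-driven burst shaping: a larger client buffer lets the radio download in short bursts and sleep in between. The `burst_schedule` helper and all parameter values are hypothetical illustrations, not EStreamer's actual interface.

```python
# Sketch of the burst-shaping intuition: fill the client buffer at full link
# rate, then let the wireless interface sleep while the player drains it.
# Parameter values and this helper are assumptions, not from the paper.

def burst_schedule(buffer_bytes, stream_rate_bps, link_rate_bps,
                   wakeup_overhead_s=0.5):
    """Return (burst_duration_s, sleep_duration_s) for one fill/drain cycle."""
    stream_rate = stream_rate_bps / 8   # bytes/s consumed by the player
    link_rate = link_rate_bps / 8       # bytes/s the radio can download
    # While downloading, the player keeps consuming, so the buffer fills
    # at the net rate (link - stream).
    burst = buffer_bytes / (link_rate - stream_rate)
    # The radio can sleep until the buffer drains, minus wake-up overhead.
    sleep = buffer_bytes / stream_rate - wakeup_overhead_s
    return burst, max(sleep, 0.0)

# Example: 4 MB buffer, 1 Mbps constant-bit-rate stream over a 20 Mbps link.
burst, sleep = burst_schedule(4 * 2**20, 1_000_000, 20_000_000)
print(f"download for {burst:.1f}s, radio sleeps for {sleep:.1f}s")
```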

    Context Aware Adaptable Applications - A global approach

    The requirements of current (mostly component-based) applications cannot be expressed without a ubiquitous and mobile part, for end-users as well as for machine-to-machine (M2M) applications. Such an evolution implies context management, in order to evaluate the consequences of mobility, together with mechanisms to adapt, or be adapted, to the new environment; such applications are then qualified as context-aware applications. The first part of this paper presents an overview of context and its management through application adaptation. It starts with a definition and proposes a model of context, and it surveys various techniques for adapting applications to the context, from self-adaptation to supervised approaches. The second part is an overview of architectures for adaptable applications; it focuses on platform-based solutions and shows the information flows between application, platform and context. Finally, it synthesizes these ideas in a proposed platform for adaptable context-aware applications called Kalimucho. We then present implementation tools for software components and a dataflow model used to implement the Kalimucho platform.
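
    As a rough illustration of the adaptation loop such a platform mediates, the Python sketch below shows a component reacting to a context change. The `Context` and `AdaptivePlayer` names and the bandwidth policy are hypothetical, not Kalimucho's actual API.

```python
# Sketch of a context-management loop: an observed context model notifies
# components, which adapt themselves when the environment changes.
# Class names, methods, and the threshold policy are all illustrative.

class Context:
    """Simplified context model: a dictionary of observed properties."""
    def __init__(self):
        self._props, self._listeners = {}, []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def update(self, key, value):
        self._props[key] = value
        for listener in self._listeners:
            listener(key, value)

class AdaptivePlayer:
    """Component that self-adapts between two rendering strategies."""
    def __init__(self, context):
        self.mode = "video"
        context.subscribe(self.on_context_change)

    def on_context_change(self, key, value):
        if key == "bandwidth_kbps":
            # Adaptation policy: degrade to audio-only on a weak link.
            self.mode = "video" if value >= 500 else "audio-only"
            print(f"bandwidth={value} kbps -> switching to {self.mode}")

ctx = Context()
player = AdaptivePlayer(ctx)
ctx.update("bandwidth_kbps", 1200)   # good link: keep video
ctx.update("bandwidth_kbps", 200)    # degraded link: adapt
```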

    A user perspective of quality of service in m-commerce

    This is the post-print version of the article. The official published version can be accessed from the link below. Copyright @ 2004 Springer Verlag.

    In an m-commerce setting, the underlying communication system will have to provide Quality of Service (QoS) in the presence of two competing factors: network bandwidth and, as the pressure grows to add value to the business-to-consumer (B2C) shopping experience by integrating multimedia applications, increasing data sizes. In this paper, developments in the area of QoS-dependent multimedia perceptual quality are reviewed and integrated with recent work on QoS for e-commerce. Based on previously identified user perceptual tolerance to varying multimedia QoS, we show that enhancing the m-commerce B2C user experience with multimedia, far from being an idealised scenario, is in fact feasible if perceptual considerations are employed.

    The sweet smell of success: Enhancing multimedia applications with olfaction

    This is the post-print version of the article. The official published version can be accessed from the link below. Copyright @ 2012 ACM.

    Olfaction, or smell, is one of the last challenges that multimedia applications have to conquer. As far as computerized smell is concerned, there are several difficulties to overcome, particularly those associated with the ambient nature of smell. In this article, we present results from an empirical study exploring users' perception of olfaction-enhanced multimedia displays. Findings show that olfaction significantly adds to the user's multimedia experience. Moreover, the use of olfaction leads to an increased sense of reality and relevance. Our results also show that users are tolerant of the interference and distortion effects caused by olfactory effects in multimedia.