31,155 research outputs found

    Speaker-following Video Subtitles

    Full text link
    We propose a new method for improving the presentation of subtitles in video (e.g. TV and movies). With conventional subtitles, the viewer has to constantly look away from the main viewing area to read the subtitles at the bottom of the screen, which disrupts the viewing experience and causes unnecessary eyestrain. Our method places on-screen subtitles next to the respective speakers to allow the viewer to follow the visual content while simultaneously reading the subtitles. We use novel identification algorithms to detect the speakers based on audio and visual information. Then the placement of the subtitles is determined using global optimization. A comprehensive usability study indicated that our subtitle placement method outperformed both conventional fixed-position subtitling and another previous dynamic subtitling method in terms of enhancing the overall viewing experience and reducing eyestrain

    Deep Cross-Modal Correlation Learning for Audio and Lyrics in Music Retrieval

    Get PDF
    Deep cross-modal learning has successfully demonstrated excellent performance in cross-modal multimedia retrieval, with the aim of learning joint representations between different data modalities. Unfortunately, little research focuses on cross-modal correlation learning where temporal structures of different data modalities such as audio and lyrics should be taken into account. Stemming from the characteristic of temporal structures of music in nature, we are motivated to learn the deep sequential correlation between audio and lyrics. In this work, we propose a deep cross-modal correlation learning architecture involving two-branch deep neural networks for audio modality and text modality (lyrics). Data in different modalities are converted to the same canonical space where inter modal canonical correlation analysis is utilized as an objective function to calculate the similarity of temporal structures. This is the first study that uses deep architectures for learning the temporal correlation between audio and lyrics. A pre-trained Doc2Vec model followed by fully-connected layers is used to represent lyrics. Two significant contributions are made in the audio branch, as follows: i) We propose an end-to-end network to learn cross-modal correlation between audio and lyrics, where feature extraction and correlation learning are simultaneously performed and joint representation is learned by considering temporal structures. ii) As for feature extraction, we further represent an audio signal by a short sequence of local summaries (VGG16 features) and apply a recurrent neural network to compute a compact feature that better learns temporal structures of music audio. Experimental results, using audio to retrieve lyrics or using lyrics to retrieve audio, verify the effectiveness of the proposed deep correlation learning architectures in cross-modal music retrieval

    Saving Energy in Mobile Devices for On-Demand Multimedia Streaming -- A Cross-Layer Approach

    Full text link
    This paper proposes a novel energy-efficient multimedia delivery system called EStreamer. First, we study the relationship between buffer size at the client, burst-shaped TCP-based multimedia traffic, and energy consumption of wireless network interfaces in smartphones. Based on the study, we design and implement EStreamer for constant bit rate and rate-adaptive streaming. EStreamer can improve battery lifetime by 3x, 1.5x and 2x while streaming over Wi-Fi, 3G and 4G respectively.Comment: Accepted in ACM Transactions on Multimedia Computing, Communications and Applications (ACM TOMCCAP), November 201

    Effect of oil palm empty fruit bunches (OPEFB) fibers to the compressive strength and water absorption of concrete

    Get PDF
    Growing popularity based on environmentally-friendly, low cost and lightweight building materials in the construction industry has led to a need to examine how these characteristics can be achieved and at the same time giving the benefit to the environment and maintain the material requirements based on the standards required. Recycling of waste generated from industrial and agricultural activities as measures of building materials is not only a viable solution to the problem of pollution but also to produce an economic design of building

    A Survey of Techniques For Improving Energy Efficiency in Embedded Computing Systems

    Full text link
    Recent technological advances have greatly improved the performance and features of embedded systems. With the number of just mobile devices now reaching nearly equal to the population of earth, embedded systems have truly become ubiquitous. These trends, however, have also made the task of managing their power consumption extremely challenging. In recent years, several techniques have been proposed to address this issue. In this paper, we survey the techniques for managing power consumption of embedded systems. We discuss the need of power management and provide a classification of the techniques on several important parameters to highlight their similarities and differences. This paper is intended to help the researchers and application-developers in gaining insights into the working of power management techniques and designing even more efficient high-performance embedded systems of tomorrow

    Context Aware Adaptable Applications - A global approach

    Get PDF
    Actual applications (mostly component based) requirements cannot be expressed without a ubiquitous and mobile part for end-users as well as for M2M applications (Machine to Machine). Such an evolution implies context management in order to evaluate the consequences of the mobility and corresponding mechanisms to adapt or to be adapted to the new environment. Applications are then qualified as context aware applications. This first part of this paper presents an overview of context and its management by application adaptation. This part starts by a definition and proposes a model for the context. It also presents various techniques to adapt applications to the context: from self-adaptation to supervised approached. The second part is an overview of architectures for adaptable applications. It focuses on platforms based solutions and shows information flows between application, platform and context. Finally it makes a synthesis proposition with a platform for adaptable context-aware applications called Kalimucho. Then we present implementations tools for software components and a dataflow models in order to implement the Kalimucho platform

    A user perspective of quality of service in m-commerce

    Get PDF
    This is the post-print version of the Article. The official published version can be accessed from the link below - Copyright @ 2004 Springer VerlagIn an m-commerce setting, the underlying communication system will have to provide a Quality of Service (QoS) in the presence of two competing factors—network bandwidth and, as the pressure to add value to the business-to-consumer (B2C) shopping experience by integrating multimedia applications grows, increasing data sizes. In this paper, developments in the area of QoS-dependent multimedia perceptual quality are reviewed and are integrated with recent work focusing on QoS for e-commerce. Based on previously identified user perceptual tolerance to varying multimedia QoS, we show that enhancing the m-commerce B2C user experience with multimedia, far from being an idealised scenario, is in fact feasible if perceptual considerations are employed
    • 

    corecore