350 research outputs found

    AXMEDIS 2007 Conference Proceedings

    The AXMEDIS International Conference series was established in 2005 and focuses on research, development and applications in the cross-media domain, exploring innovative technologies to meet the challenges of the sector. AXMEDIS2007 deals with all subjects and topics related to cross-media and digital-media content production, processing, management, standards, representation, sharing, interoperability, protection and rights management. It addresses the latest developments and future trends of the technologies and their applications, and their impact and exploitation within academic, business and industrial communities.

    Content-based visualisation to aid common navigation of musical audio


    Ubiquitous Integration and Temporal Synchronisation (UbiITS) framework: a solution for building complex multimodal data capture and interactive systems

    Contemporary Data Capture and Interactive Systems (DCISs) are tied in with various technical complexities such as multimodal data types, diverse hardware and software components, time synchronisation issues and distributed deployment configurations. Building these systems is inherently difficult and requires addressing these complexities before the intended functionality can be attained. The technical issues are often common and similar among diverse applications. This thesis presents the Ubiquitous Integration and Temporal Synchronisation (UbiITS) framework, a generic solution to address the technical complexities in building DCISs. The proposed solution is an abstract software framework that can be extended and customised to any application requirements. UbiITS includes all fundamental software components, techniques, system-level layer abstractions and a reference architecture as a collection to enable the systematic construction of complex DCISs. This work details four case studies to showcase the versatility and extensibility of the UbiITS framework’s functionalities and demonstrate how it was employed to successfully solve a range of technical requirements. In each case UbiITS operated as the core element of the application. Additionally, these case studies are novel systems in their own right in each of their domains. Longstanding technical issues such as flexibly integrating and interoperating multimodal tools, precise time synchronisation, etc., were resolved in each application by employing UbiITS. The framework enabled establishing a functional system infrastructure in these cases, essentially opening up new lines of research in each discipline that would not have been possible without the infrastructure provided by the framework. The thesis further presents a sample implementation of the framework on device firmware, exhibiting its capability to be directly implemented on a hardware platform. Summary metrics are also produced to establish the complexity, reusability, extendibility, implementation and maintainability characteristics of the framework.
    Funding: Engineering and Physical Sciences Research Council (EPSRC) grants - EP/F02553X/1, 114433 and 11394

    MediaSync: Handbook on Multimedia Synchronization

    This book provides an approachable overview of the most recent advances in the fascinating field of media synchronization (mediasync), gathering contributions from the most representative and influential experts. Understanding the challenges of this field in the current multi-sensory, multi-device, and multi-protocol world is not an easy task. The book revisits the foundations of mediasync, including theoretical frameworks and models, highlights ongoing research efforts, like hybrid broadband broadcast (HBB) delivery and users' perception modeling (i.e., Quality of Experience or QoE), and paves the way for the future (e.g., towards the deployment of multi-sensory and ultra-realistic experiences). Although many advances around mediasync have been devised and deployed, this area of research is getting renewed attention to overcome remaining challenges in the next-generation (heterogeneous and ubiquitous) media ecosystem. Given the significant advances in this research area, its current relevance and the multiple disciplines it involves, the availability of a reference book on mediasync becomes necessary. This book fills the gap in this context. In particular, it addresses key aspects and reviews the most relevant contributions within the mediasync research space, from different perspectives. Mediasync: Handbook on Multimedia Synchronization is the perfect companion for scholars and practitioners that want to acquire strong knowledge about this research area, and also approach the challenges behind ensuring the best mediated experiences, by providing the adequate synchronization between the media elements that constitute these experiences

    Intelligent Tools for Drum Loop Retrieval and Generation

    Large libraries of musical data are an increasingly common feature of contemporary computer-based music production practice, with producers often relying heavily on large, curated libraries of data such as loops and samples when making tracks. Drum loop libraries are a particularly common type of library in this context. However, their typically large size, coupled with often poor user interfaces, means navigating and exploring them in a fast, easy and enjoyable way is not always possible. Additionally, writing a drum part for a whole track out of many drum loops can be a laborious process, requiring manual editing of many drum loops. The aim of this thesis is to contribute novel techniques based on Music Information Retrieval (MIR) and machine learning that make the process of writing drum tracks using drum loops faster, easier and more enjoyable. We primarily focus on tools for drum loop library navigation and exploration, with additional work on assistive generation of drum loops. We contribute proof-of-concept and prototype tools, Groove Explorer and Groove Explorer 2, for drum loop library exploration based on an interface applying similarity-based visual arrangement of drum loops. Work on Groove Explorer suggested that there were limitations in the existing state-of-the-art approaches to drum loop similarity modelling that must be addressed for tools such as ours to be successful. This was verified via a perceptual study, which identified possible areas of improvement in similarity modelling. Following this, we develop and evaluate a set of novel models for drum loop analysis that capture rhythmic structure and the perceptually relevant qualities of microtiming. Drawing from this, a new approach to drum loop similarity modelling was verified in context as part of Groove Explorer 2, which we evaluated via a user study. The results indicated that our approach could make drum loop library exploration faster, easier and more enjoyable. We finally present an automatic drum loop generation system, jaki, that uses a novel approach for drum loop generation according to user constraints and that could extend Groove Explorer 2 as a drum loop editing and composition tool. Combined, these two systems could offer an end-to-end solution to improved writing of drum tracks.

    An Artificial Intelligence Approach to Concatenative Sound Synthesis

    Sound examples are included with this thesis. Technological advancement such as the increase in processing power, hard disk capacity and network bandwidth has opened up many exciting new techniques to synthesise sounds, one of which is Concatenative Sound Synthesis (CSS). CSS uses a data-driven method to synthesise new sounds from a large corpus of small sound snippets. This technique closely resembles the art of mosaicing, where small tiles are arranged together to create a larger image. A ‘target’ sound is often specified by users so that segments in the database that match those of the target sound can be identified and then concatenated together to generate the output sound. Whilst the practicality of CSS in synthesising sounds currently looks promising, there are still areas to be explored and improved, in particular the algorithm that is used to find the matching segments in the database. One of the main issues in CSS is the basis of similarity, as there are many perceptual attributes on which sound similarity can be based, for example timbre, loudness, rhythm, tempo and so on. An ideal CSS system needs to be able to decipher which of these perceptual attributes are anticipated by the users and then accommodate them by synthesising sounds that are similar with respect to the particular attribute. Failure to communicate the basis of sound similarity between the user and the CSS system generally results in output that mismatches the sound envisioned by the user. In order to understand how humans perceive sound similarity, several elements that affect sound similarity judgment were first investigated. Of the four elements tested (timbre, melody, loudness, tempo), it was found that the basis of similarity depends on a listener’s musical training: musicians based similarity on timbral information, whilst non-musicians relied on melodic information.
Thus, for the rest of the study, only features that represent the timbral information were included, as musicians are the target users for the findings of this study. Another issue with the current state of CSS systems is the user control flexibility, in particular during segment matching, where features can be assigned different weights depending on their importance to the search. Typically, the weights (in the existing CSS systems that support a weight-assigning mechanism) can only be assigned manually, resulting in a process that is both labour intensive and time consuming. Additionally, another problem was identified in this study, which is the lack of a mechanism to handle homosonic and equidistant segments. These conditions arise when too few features are compared, causing otherwise aurally different sounds to be represented by the same sonic values, or can also be a result of rounding off the values of the features extracted. This study addresses both of these problems through an extended use of Artificial Intelligence (AI). The Analytic Hierarchy Process (AHP) is employed to enable order-dependent feature selection, allowing weights to be assigned to each audio feature according to its relative importance. Concatenation distance is used to overcome the issues with homosonic and equidistant sound segments. The inclusion of AI results in a more intelligent system that can better handle tedious tasks and minimise human error, allowing users (composers) to worry less about mundane tasks and focus more on the creative aspects of music making. In addition to the above, this study also aims to enhance user control flexibility in a CSS system and improve similarity results. The key factors that affect the synthesis results of CSS were first identified and then included as parametric options which users can control in order to communicate their intended creations to the system.
Comprehensive evaluations were carried out to validate the feasibility and effectiveness of the proposed solutions (timbral-based feature set, AHP, and concatenation distance). The final part of the study investigates the relationship between perceived sound similarity and perceived sound interestingness. A new framework that integrates all these solutions, the query-based CSS framework, was then proposed. The proof-of-concept of this study, ConQuer, was developed based on this framework. This study has critically analysed the problems in existing CSS systems. Novel solutions have been proposed to overcome them and their effectiveness has been tested and discussed; these are also the main contributions of this study.
    Funding: Malaysian Ministry of Higher Education, Universiti Putra Malaysia

    Proceedings of the 7th Sound and Music Computing Conference

    Proceedings of the SMC2010 - 7th Sound and Music Computing Conference, July 21st - July 24th 2010

    Affect-based indexing and retrieval of multimedia data

    Digital multimedia systems are creating many new opportunities for rapid access to content archives. In order to explore these collections using search, the content must be annotated with significant features. An important and often overlooked aspect of human interpretation of multimedia data is the affective dimension. The hypothesis of this thesis is that affective labels of content can be extracted automatically from within multimedia data streams, and that these can then be used for content-based retrieval and browsing. A novel system is presented for extracting affective features from video content and mapping them onto a set of keywords with predetermined emotional interpretations. These labels are then used to demonstrate affect-based retrieval on a range of feature films. Because of the subjective nature of the words people use to describe emotions, an approach towards an open vocabulary query system utilizing the electronic lexical database WordNet is also presented. This gives flexibility for search queries to be extended to include keywords without predetermined emotional interpretations using a word-similarity measure. The thesis presents the framework and design for the affect-based indexing and retrieval system along with experiments, analysis, and conclusions.

    Exploiting tag information for search and personalization

    [no abstract]