    Style Recognition in Music with Context Free Grammars and Kolmogorov Complexity

    The Kolmogorov complexity of an object is uncomputable. Its definition, however, admits description methods for an object that are computable in some sense. Such a description method can then be used to quantify the bits of information needed to generate the object from scratch. We show that context-free grammars form a viable description method of this kind, and that the size of the grammar can be used to estimate the Kolmogorov complexity. We use this estimate to approximate the information distance between two musical strings. We also show that this distance measure can be used to recognize genre, composer, and style, and for music classification.
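
The idea of replacing an uncomputable complexity with the size of a computable description can be illustrated with a short sketch. Note the substitution: the abstract estimates complexity from the size of a context-free grammar, whereas this example uses the zlib compressor as a stand-in and computes the normalized compression distance, a common approximation of information distance. The function names `compressed_size` and `ncd` and the note strings are hypothetical and chosen only for illustration.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed input.

    This is a computable stand-in for Kolmogorov complexity; the abstract
    uses grammar size instead, which is NOT what is implemented here.
    """
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest shared structure; values near 1 suggest
    little shared structure.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


if __name__ == "__main__":
    # Toy "musical strings": note names encoded as ASCII (made-up data).
    theme = b"C D E C C D E C E F G E F G " * 20
    variation = b"C D E C C D E C E F G E F G " * 19 + b"G A G F E C G A G F E C "
    unrelated = bytes(range(256)) * 3

    print("theme vs variation:", ncd(theme, variation))
    print("theme vs unrelated:", ncd(theme, unrelated))
```

In a classification setting, such pairwise distances would typically be fed to a nearest-neighbour or clustering step; the choice of compressor strongly affects the result, so this should be read as a sketch rather than a reproduction of the paper's method.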

    Pattern Discovery from Biosequences

    In this thesis we have developed novel methods for analyzing biological data: the primary sequences of DNA and proteins, microarray-based gene expression data, and other functional genomics data. The main contribution is the development of the pattern discovery algorithm SPEXS, accompanied by several practical applications for analyzing real biological problems. To perform these biological studies, which integrate different types of biological data, we have developed a comprehensive web-based biological data analysis environment, Expression Profiler (http://ep.ebi.ac.uk/).

    A Parametric Sound Object Model for Sound Texture Synthesis

    This thesis deals with the analysis and synthesis of sound textures based on parametric sound objects. An overview is provided about the acoustic and perceptual principles of textural acoustic scenes, and technical challenges for analysis and synthesis are considered. Four essential processing steps for sound texture analysis are identified, and existing sound texture systems are reviewed, using the four-step model as a guideline. A theoretical framework for analysis and synthesis is proposed. A parametric sound object synthesis (PSOS) model is introduced, which is able to describe individual recorded sounds through a fixed set of parameters. The model, which applies to harmonic and noisy sounds, is an extension of spectral modeling and uses spline curves to approximate spectral envelopes, as well as the evolution of parameters over time. In contrast to standard spectral modeling techniques, this representation uses the concept of objects instead of concatenated frames, and it provides a direct mapping between sounds of different length. Methods for automatic and manual conversion are shown. An evaluation is presented in which the ability of the model to encode a wide range of different sounds has been examined. Although there are aspects of sounds that the model cannot accurately capture, such as polyphony and certain types of fast modulation, the results indicate that high quality synthesis can be achieved for many different acoustic phenomena, including instruments and animal vocalizations. In contrast to many other forms of sound encoding, the parametric model facilitates various techniques of machine learning and intelligent processing, including sound clustering and principal component analysis. Strengths and weaknesses of the proposed method are reviewed, and possibilities for future development are discussed.

    Literary review of content-based music recognition paradigms

    During the last few decades, a need for novel retrieval strategies for large audio databases emerged as millions of digital audio documents became accessible to everyone through the Internet. It became essential that users could search for songs about which they had no prior information, using only the content of the audio as a query. In practice this means that when a user hears an unknown song on the radio and wants more information about it, he or she can simply record a sample of the song with a mobile device and send it to a music recognition application as a query. Query results are then presented on the screen with all the necessary metadata, such as the song name and artist. Retrieval systems are expected to perform quickly and accurately against large databases that may contain millions of songs, which poses many challenges for researchers. This thesis is a literature review of audio retrieval paradigms that allow querying for songs using only their audio content, such as audio fingerprinting. It also addresses the typical problems and challenges of audio retrieval and compares how each of the proposed paradigms performs in these challenging scenarios.

    Music Synchronization, Audio Matching, Pattern Detection, and User Interfaces for a Digital Music Library System

    Over the last two decades, growing efforts to digitize our cultural heritage could be observed. Most of these digitization initiatives pursue one or both of the following goals: to conserve the documents, especially those threatened by decay, and to provide remote access on a grand scale. These trends are observable for music documents as well, and by now several digital music libraries are in existence. An important characteristic of these music libraries is an inherent multimodality resulting from the large variety of available digital music representations, such as scanned score, symbolic score, audio recordings, and videos. In addition, for each piece of music there exists not only one document of each type, but many. Considering and exploiting this multimodality and multiplicity, the DFG-funded digital library initiative PROBADO MUSIC aimed at developing a novel user-friendly interface for content-based retrieval, document access, navigation, and browsing in large music collections. The implementation of such a front end requires the multimodal linking and indexing of the music documents during preprocessing. As the considered music collections can be very large, automated or at least semi-automated calculation of these structures is advisable. The field of music information retrieval (MIR) is particularly concerned with the development of suitable procedures, and it was the goal of PROBADO MUSIC to include existing and newly developed MIR techniques to realize the envisioned digital music library system. In this context, the present thesis discusses the following three MIR tasks: music synchronization, audio matching, and pattern detection. We identify particular issues in these fields and provide algorithmic solutions as well as prototypical implementations. In music synchronization, for each position in one representation of a piece of music the corresponding position in another representation is calculated. This thesis focuses on the task of aligning scanned score pages of orchestral music with audio recordings. Here, a previously unconsidered piece of information is the textual specification of transposing instruments provided in the score. Our evaluations show that neglecting such information can result in a measurable loss of synchronization accuracy. Therefore, we propose an OCR-based approach for detecting and interpreting the transposition information in orchestral scores. For a given audio snippet, audio matching methods automatically calculate all musically similar excerpts within a collection of audio recordings. In this context, subsequence dynamic time warping (SSDTW) is a well-established approach as it allows for local and global tempo variations between the query and the retrieved matches. Moving to real-life digital music libraries with larger audio collections, however, the quadratic runtime of SSDTW results in untenable response times. To improve the response time, this thesis introduces a novel index-based approach to SSDTW-based audio matching. We combine the idea of inverted file lists introduced by Kurth and Müller (Efficient index-based audio matching, 2008) with the shingling techniques often used in the audio identification scenario. In pattern detection, all repeating patterns within one piece of music are determined. Usually, pattern detection operates on symbolic score documents and is often used in the context of computer-aided motivic analysis. Envisioned as a new feature of the PROBADO MUSIC system, this thesis proposes a string-based approach to pattern detection and a novel interactive front end for result visualization and analysis.
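
As a rough illustration of the subsequence dynamic time warping mentioned in this abstract, the following minimal sketch finds the best-matching region of a short query inside a longer sequence. It assumes one-dimensional numeric features and an absolute-difference local cost, which are simplifications; the thesis works with audio features and adds an index on top, neither of which is shown here. The function name `subsequence_dtw` is hypothetical.

```python
import math
from typing import Sequence, Tuple


def subsequence_dtw(query: Sequence[float],
                    series: Sequence[float]) -> Tuple[float, int, int]:
    """Return (cost, start, end) of the best match of `query` in `series`.

    The match may begin and end anywhere in `series` (indices inclusive).
    Runtime is O(len(query) * len(series)), i.e. the quadratic behaviour
    the abstract refers to.
    """
    n, m = len(query), len(series)
    if n == 0 or m == 0:
        raise ValueError("query and series must be non-empty")

    # cost[i][j]: cheapest alignment of query[:i+1] ending at series[j].
    # start[i][j]: index in `series` where that alignment begins.
    cost = [[math.inf] * m for _ in range(n)]
    start = [[0] * m for _ in range(n)]

    # First row: the match may start at any position at no extra cost.
    for j in range(m):
        cost[0][j] = abs(query[0] - series[j])
        start[0][j] = j

    for i in range(1, n):
        # First column: only vertical steps are possible.
        cost[i][0] = cost[i - 1][0] + abs(query[i] - series[0])
        start[i][0] = start[i - 1][0]
        for j in range(1, m):
            best_cost, best_start = min(
                (cost[i - 1][j - 1], start[i - 1][j - 1]),  # diagonal step
                (cost[i - 1][j], start[i - 1][j]),          # query advances
                (cost[i][j - 1], start[i][j - 1]),          # series advances
            )
            cost[i][j] = best_cost + abs(query[i] - series[j])
            start[i][j] = best_start

    # Last row: the match may end at any position; take the cheapest one.
    end = min(range(m), key=lambda j: cost[n - 1][j])
    return cost[n - 1][end], start[n - 1][end], end


if __name__ == "__main__":
    print(subsequence_dtw([1, 2, 3], [0, 0, 1, 2, 3, 0, 0]))
```

Because the first row is initialized without accumulating cost along the sequence, the query can latch onto any starting position, and the warping steps absorb local tempo differences between query and match.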

    The development of a discovery and control environment for networked audio devices based on a study of current audio control protocols

    This dissertation develops a standard device model for networked audio devices and introduces a novel discovery and control environment that uses the developed device model. The proposed standard device model is derived from a study of current audio control protocols. Both the functional capabilities and design principles of audio control protocols are investigated with an emphasis on Open Sound Control, SNMP and IEC-62379, AES64, CopperLan and UPnP. An abstract model of networked audio devices is developed, and the model is implemented in each of the previously mentioned control protocols. This model is also used within a novel discovery and control environment designed around a distributed associative memory termed an object space. This environment challenges the accepted notions of the functionality provided by a control protocol. The study concludes by comparing the salient features of the different control protocols encountered in this study. Different approaches to control protocol design are considered, and several design heuristics for control protocols are proposed

    Issues in time series querying.

    Lau Yung Hang. Thesis (M.Phil.), Chinese University of Hong Kong, 2005. Includes bibliographical references (leaves 78-82). Abstracts in English and Chinese.

    Contents:
    Abstract
    Acknowledgement
    List of Figures
    List of Tables
    List of Algorithms
    Chapter 1. Introduction
      1.1 Justifying the Need for US and DTW
      1.2 Motivating Examples
      1.3 Contributions
      1.4 Thesis Organization
    Chapter 2. Problem Definition
    Chapter 3. Preliminaries
      3.1 Time Warping Distance
      3.2 Constraints and Lower Bounding
      3.3 Uniform Scaling
        3.3.1 Lower bounding uniform scaling
    Chapter 4. Scaling and Time Warping
      4.1 Tightness of the lower bounds
      4.2 Experimental Evaluation
    Chapter 5. A Faster and more Flexible Approach
      5.1 The Enveloping Sequences Revisited
      5.2 Speeding up LB Distance Computation
      5.3 Experimental Evaluation
        5.3.1 Query Time Comparison
        5.3.2 Effect on Pruning Power
    Chapter 6. Indexing for SWM
      6.1 Related Work
        6.1.1 Fast subsequence matching
        6.1.2 Duality-based subsequence matching
        6.1.3 Nearest Neighbor Search
        6.1.4 Dimension Reduction
      6.2 Proposed Indexing for SWM
        6.2.1 Index construction algorithm
        6.2.2 Utilizing the index
        6.2.3 Nearest Neighbor Search
      6.3 Experimental Evaluation
        6.3.1 Range Queries
        6.3.2 One nearest neighbor search
        6.3.3 k-nearest neighbor search
    Chapter 7. Conclusion
    Bibliography

    Multiple Media Correlation: Theory and Applications

    This thesis introduces multiple media correlation, a new technology for the automatic alignment of multiple media objects such as text, audio, and video. This research began with the question: what can be learned when multiple multimedia components are analyzed simultaneously? Most ongoing research in computational multimedia has focused on queries, indexing, and retrieval within a single media type. Video is compressed and searched independently of audio; text is indexed without regard to temporal relationships it may have to other media data. Multiple media correlation provides a framework for locating and exploiting correlations between multiple, potentially heterogeneous, media streams. The goal is computed synchronization, the determination of temporal and spatial alignments that optimize a correlation function and indicate commonality and synchronization between media objects. The model also provides a basis for comparison of media in unrelated domains. There are many real-world applications for this technology, including speaker localization, musical score alignment, and degraded media realignment. Two applications, text-to-speech alignment and parallel text alignment, are described in detail with experimental validation. Text-to-speech alignment computes the alignment between a textual transcript and speech-based audio. The presented solutions are effective for a wide variety of content and are useful not only for retrieval of content, but also in support of automatic captioning of movies and video. Parallel text alignment provides a tool for the comparison of alternative translations of the same document that is particularly useful to the classics scholar interested in comparing translation techniques or styles. The results presented in this thesis include (a) new media models more useful in analysis applications, (b) a theoretical model for multiple media correlation, (c) two practical application solutions that have widespread applicability, and (d) Xtrieve, a multimedia database retrieval system that demonstrates this new technology and its application to information retrieval. This thesis demonstrates that computed alignment of media objects is practical and can provide immediate solutions to many information retrieval and content presentation problems. It also introduces a new area for research in media data analysis.