15 research outputs found

    “A Good Algorithm Does Not Steal – It Imitates”: The Originality Report as a Means of Measuring When a Music Generation Algorithm Copies Too Much

    Research on automatic music generation lacks consideration of the originality of musical outputs, creating risks of plagiarism and/or copyright infringement. We present the originality report – a set of analyses for measuring the extent to which an algorithm copies from the input music on which it is trained. First, a baseline is constructed, determining the extent to which human composers borrow from themselves and each other in some existing music corpus. Second, we apply a similar analysis to musical outputs of runs of the MAIA Markov and Music Transformer generation algorithms, and compare the results to the baseline. Third, we investigate how originality varies as a function of Music Transformer's training epoch. Results from the second analysis indicate that the originality of Music Transformer's output is below the 95% confidence interval of the baseline. Musicological interpretation of the analyses shows that the Transformer model obtained via the conventional stopping criterion produces single-note repetition patterns, resulting in outputs of low quality and originality, while in later training epochs the model tends to overfit, producing copies of excerpts of input pieces. We recommend the originality report as a new means of evaluating algorithm training processes and outputs in the future, and we question the reported success of language-based deep learning models for music generation. Supporting materials (code, dataset) will be made available via https://osf.io/96emr/.
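    The borrowing measurement at the heart of such a report can be illustrated with a toy n-gram overlap score. The function below is a minimal sketch, not the paper's actual analysis: it treats a generated piece as a pitch sequence and reports the fraction of its length-n pitch n-grams that do not occur verbatim anywhere in the training corpus (higher means more original). The function name and the choice of n are illustrative assumptions.

```python
def originality_score(output, corpus, n=5):
    """Fraction of length-n pitch n-grams in `output` that do NOT occur
    verbatim in `corpus` (a list of pitch sequences). Illustrative toy
    measure only; the originality report uses richer similarity analyses."""
    corpus_ngrams = set()
    for piece in corpus:
        for i in range(len(piece) - n + 1):
            corpus_ngrams.add(tuple(piece[i:i + n]))
    out_ngrams = [tuple(output[i:i + n]) for i in range(len(output) - n + 1)]
    if not out_ngrams:
        return 1.0  # nothing long enough to compare: treat as original
    copied = sum(1 for g in out_ngrams if g in corpus_ngrams)
    return 1.0 - copied / len(out_ngrams)
```

    A baseline, in this toy setting, would be the distribution of such scores when each human-composed piece in the corpus is scored against the remaining pieces; a generated output falling well below that distribution signals excessive copying.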

    Understanding and Compressing Music with Maximal Transformable Patterns

    We present a polynomial-time algorithm that discovers all maximal patterns in a point set, D ⊂ ℝᵏ, that are related by transformations in a user-specified class, F, of bijections over ℝᵏ. We also present a second algorithm that discovers the set of occurrences for each of these maximal patterns and then uses compact encodings of these occurrence sets to compute a losslessly compressed encoding of the input point set. This encoding takes the form of a set of pairs, E = {⟨P₁, T₁⟩, ⟨P₂, T₂⟩, …, ⟨P_ℓ, T_ℓ⟩}, where each ⟨Pᵢ, Tᵢ⟩ consists of a maximal pattern, Pᵢ ⊆ D, and a set, Tᵢ ⊂ F, of transformations that map Pᵢ onto other subsets of D. Each transformation is encoded by a vector of real values that uniquely identifies it within F, and the length of this vector is used as a measure of the complexity of F. We evaluate the new compression algorithm with three transformation classes of differing complexity, on the task of classifying folk-song melodies into tune families. The most complex of the classes tested includes all combinations of the musical transformations of transposition, inversion, retrograde, augmentation and diminution. We found that broadening the transformation class improved performance on this task. However, it did not, on average, improve compression factor, which may be due to the datasets (in this case, folk-song melodies) being too short and simple to benefit from the potentially greater number of pattern relationships that are discoverable with larger transformation classes.
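    The ⟨Pᵢ, Tᵢ⟩ encoding idea can be sketched for the simplest transformation class, translations of (onset, pitch) points; the abstract's algorithm handles far richer classes of bijections. The function below (an illustrative sketch with hypothetical names, not the paper's algorithm) computes the occurrence set Tᵢ for a given pattern: every translation vector t with pattern + t ⊆ dataset.

```python
def translation_occurrences(pattern, dataset):
    """Return all translation vectors t such that every point of `pattern`,
    shifted by t, lies in `dataset`. Points are (onset, pitch) tuples.
    Sketch for the translations-only transformation class."""
    dset = set(dataset)
    p0 = pattern[0]  # anchor: any valid t must map p0 into the dataset
    occurrences = []
    for q in dataset:
        t = tuple(b - a for a, b in zip(p0, q))
        if all(tuple(a + d for a, d in zip(p, t)) in dset for p in pattern):
            occurrences.append(t)
    return occurrences
```

    For a two-note motif repeated a major third higher, the occurrence set contains the identity translation and the shift onto the repetition; encoding the pattern once plus its (short) vector of translations is what makes the representation compressive.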

    New evaluation methods for automatic music generation

    Recent research in the field of automatic music generation lacks rigorous and comprehensive evaluation methods, creating plagiarism risks and partial understandings of generation performance. To contribute to evaluation methodology in this field, I first introduce the originality report for measuring the extent to which an algorithm copies from the input music. It starts with constructing a baseline to determine the extent to which human composers borrow from themselves and each other in some existing music corpus. I then apply a similar analysis to musical outputs of runs of the MAIA Markov and Music Transformer generation algorithms, and compare the results to the baseline. Results indicate that the originality of Music Transformer's output is below the 95% confidence interval of the baseline, while MAIA Markov stays within that interval. Second, I conduct a listening study to comparatively evaluate music generation systems along six musical dimensions: stylistic success, aesthetic pleasure, repetition or self-reference, melody, harmony, and rhythm. A range of models are used to generate 30-second excerpts in the style of Classical string quartets and classical piano improvisations. Fifty participants with relatively high musical knowledge rate unlabelled samples of computer-generated and human-composed excerpts. I use non-parametric Bayesian hypothesis testing to interpret the results. The results show that the strongest deep learning method, Music Transformer, has performance equivalent to that of a non-deep learning method, MAIA Markov, and that a significant gap remains between any algorithmic method and human-composed excerpts. Third, I introduce six musical features – statistical complexity, transitional complexity, arc score, tonality ambiguity, time intervals and onset jitters – to investigate correlations with the collected ratings. The results show that human-composed music remains at the same level of statistical complexity, while the computer-generated excerpts have either lower or higher statistical complexity and receive lower ratings. This thesis contributes to the evaluation methodology of automatic music generation by filling gaps in originality reporting, comparative evaluation and musicological analysis.

    Improving the running time of repeated pattern discovery in multidimensional representations of music

    Methods for discovering repeated patterns in music are important tools in computational music analysis. Repeated pattern discovery can be used in applications such as song classification and music generation in computational creativity. Multiple approaches to repeated pattern discovery have been developed, but many of the approaches do not work well with polyphonic music, that is, music where multiple notes occur at the same time. Music can be represented as a multidimensional dataset, where notes are represented as multidimensional points. Moving patterns in time and transposing their pitch can be expressed as translation. Multidimensional representations of music enable the use of algorithms that can effectively find repeated patterns in polyphonic music. The research on methods for repeated pattern discovery in multidimensional representations of music is largely based on the SIA and SIATEC algorithms. Multiple variants of both algorithms have been developed. Most of the variants use SIA or SIATEC directly and then use heuristic functions to identify the musically most important patterns. The variants thus do not typically provide improvements in running time. However, the running time of SIA and SIATEC can be impractical on large inputs. This thesis focuses on improving the running time of pattern discovery in multidimensional representations of music. The algorithms developed in this thesis are based on SIA and SIATEC. Two approaches to improving running time are investigated. The first approach involves the use of hashing, and the second approach is based on using filtering to avoid the computation of unimportant patterns altogether. Three novel algorithms are presented: SIAH, SIATECH, and SIATECHF. The SIAH and SIATECH algorithms, which use hashing, were found to provide great improvements in running time over the corresponding SIA and SIATEC algorithms. The use of filtering in SIATECHF was not found to significantly improve the running time of repeated pattern discovery.
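    The hashing idea behind the SIAH-style variants can be sketched compactly. SIA groups pairs of points by their difference vector, so that each group is a maximal translatable pattern; replacing SIA's sort-based grouping with a hash table keyed on the difference vector is the essence of the speed-up. The sketch below is an illustrative simplification under that assumption, not the thesis's implementation.

```python
from collections import defaultdict

def sia_hashed(points):
    """Group points by translation (difference) vector using a hash map.
    `points` is a list of (onset, pitch) tuples. Returns a dict mapping
    each difference vector to the maximal pattern of starting points that
    can be translated by that vector within the point set."""
    points = sorted(points)  # lexicographic order, as in SIA
    table = defaultdict(list)
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            vec = tuple(b - a for a, b in zip(points[i], points[j]))
            table[vec].append(points[i])
    return dict(table)
```

    Hashing makes the grouping step expected linear in the number of point pairs, whereas sorting the difference vectors, as in the original SIA, adds a logarithmic factor.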

    Machine Annotation of Traditional Irish Dance Music

    The work presented in this thesis is validated in experiments using 130 real-world field recordings of traditional music from sessions, classes, concerts and commercial recordings. Test audio includes solo and ensemble playing on a variety of instruments recorded in real-world settings such as noisy public sessions. Results are reported using standard measures from the field of information retrieval (IR), including accuracy, error, precision and recall, and the system is compared to alternative approaches for CBMIR common in the literature.

    Automatic identification of musical schemata

    This study was stimulated by the Galant musical schemata theory (GMST), an example-based learning and compositional practice that peaked in popularity around the early 18th century in Europe, suggesting a culturally defined classification of polyphonic patterns. Under the premises of the GMST, and by relating notions from psychology towards a cognitive model for musical schemata identification, an explanatory system based on music-analytical thought patterns was examined, aiming to describe the mental processes involved in three accumulative operations: a) the schematic analysis of music notation into a stream of salient musical elements and, eventually, GMST-related musical structures, providing the standard form of music notation interpretation for the examined model; b) the example-based learning of musical schemata definitions from annotated examples; and c) the discovery in corpora of musical schemata family types similar to those of the Galant style. The proposed music-analytical model was tested with a novel computational system performing three tasks accordingly: i) search, matching representations of Galant musical schemata prototypes and examining similarity models; ii) classification, classifying segments of schematic analysis according to musical schemata family-type definitions that are extracted and maintained utilising annotated examples and pattern detection methods; and iii) polyphonic pattern extraction, examining methods that form and categorise musical schemata structures. The proposed model was evaluated employing the technological research methodology, and computational experiments quantified the performance of the computational system implementing the aforementioned tasks by utilising Galant musical schemata-annotated datasets and task-oriented performance metrics. Results show a functional cognitive model for complex music-analytical operations with polyphonic patterns, suggesting methodological explanations as to how these may be addressed by the initiate. Based on the foundations established in this project, it may in the future become possible to develop computational tools with applications in music education and musicological research.

    Music Metadata Capture in the Studio from Audio and Symbolic Data

    Music Information Retrieval (MIR) tasks, in the main, are concerned with the accurate generation of one of a number of different types of music metadata (beat onsets, or melody extraction, for example). Almost always, they operate on fully mixed digital audio recordings. Commonly, this means that a large amount of signal processing effort is directed towards the isolation, and then identification, of certain highly relevant aspects of the audio mix. In some cases, results of one MIR algorithm are useful, if not essential, to the operation of another: a chord detection algorithm, for example, is highly dependent upon accurate pitch detection. Although not clearly defined in all cases, certain rules exist which we may take from music theory in order to assist the task: the particular note intervals which make up a specific chord, for example. On the question of generating accurate, low-level music metadata (e.g. chromatic pitch and score onset time), a potentially huge advantage lies in the use of multitrack, rather than mixed, audio recordings, in which the separate instrument recordings may be analysed in isolation. Additionally, in MIR, as in many other research areas currently, there is an increasing push towards the use of the Semantic Web for publishing metadata using the Resource Description Framework (RDF). Semantic Web technologies, though, also facilitate the querying of data via the SPARQL query language, as well as logical inferencing via the careful creation and use of Web Ontology Language (OWL) ontologies. This, in turn, opens up the intriguing possibility of deferring our decision regarding which particular type of MIR query to ask of our low-level music metadata until some point later down the line, long after all the heavy signal processing has been carried out.
    In this thesis, we describe an over-arching vision for an alternative MIR paradigm, built around the principles of early, studio-based metadata capture and the exploitation of open, machine-readable Semantic Web data. Using the specific example of structural segmentation, we demonstrate that by analysing multitrack rather than mixed audio, we are able to achieve a significant and quantifiable increase in the accuracy of our segmentation algorithm. We also provide details of a new multitrack audio dataset with structural segmentation annotations, created as part of this research and available for public use. Furthermore, we show that it is possible to fully implement a pair of pattern discovery algorithms (the SIA and SIATEC algorithms, which are highly applicable to, but not restricted to, symbolic music data analysis) using only Semantic Web technologies: the SPARQL query language, acting on RDF data, in tandem with a small OWL ontology. We describe the challenges encountered by taking this approach and the particular solution we have arrived at, and we evaluate the implementation both in terms of its execution time and within the wider context of our vision for a new MIR paradigm. (EPSRC studentship no. EP/505054/1.)

    CHORUS Deliverable 2.1: State of the Art on Multimedia Search Engines

    Based on the information provided by European projects and national initiatives related to multimedia search, as well as by domain experts who participated in the CHORUS think-tanks and workshops, this document reports on the state of the art in multimedia content search from both a technical and a socio-economic perspective. The technical perspective includes an up-to-date view on content-based indexing and retrieval technologies, multimedia search in the context of mobile devices and peer-to-peer networks, and an overview of current evaluation and benchmark initiatives to measure the performance of multimedia search engines. From a socio-economic perspective, we inventory the impact and legal consequences of these technical advances and point out future directions of research.