
    Computer-aided Melody Note Transcription Using the Tony Software: Accuracy and Efficiency

    We present Tony, a software tool for the interactive annotation of melodies from monophonic audio recordings, and evaluate its usability and the accuracy of its note extraction method. The scientific study of acoustic performances of melodies, whether sung or played, requires the accurate transcription of notes and pitches. To achieve the desired transcription accuracy for a particular application, researchers manually correct results obtained by automatic methods. Tony is an interactive tool aimed directly at making this correction task efficient. It provides (a) state-of-the-art algorithms for pitch and note estimation, (b) visual and auditory feedback for easy error-spotting, (c) an intelligent graphical user interface through which the user can rapidly correct estimation errors, and (d) extensive export functions enabling further processing in other applications. We show that Tony's built-in automatic note transcription method compares favourably with existing tools. We report annotation times for a set of 96 solo vocal recordings and study the effects of the piece, the number of edits made, and the annotator's increasing mastery of the software. Tony is open source software, with source code and compiled binaries for Windows, Mac OS X and Linux available from https://code.soundsoftware.ac.uk/projects/tony/
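
    Tony's pitch and note estimation is built around the pYIN algorithm. As a rough illustration of the kind of frame-wise pitch tracking such a front end performs, the sketch below uses librosa's independent pYIN implementation; the filename and pitch bounds are assumptions for the example, and this is not Tony's own code.

```python
# Minimal pYIN-style pitch-tracking sketch using librosa's independent
# implementation of the algorithm; "vocal_take.wav" is a placeholder.
import numpy as np
import librosa

y, sr = librosa.load("vocal_take.wav", sr=None, mono=True)

# Frame-wise fundamental frequency plus voicing decisions.
f0, voiced_flag, voiced_prob = librosa.pyin(
    y,
    fmin=librosa.note_to_hz("C2"),  # assumed lower bound for singing
    fmax=librosa.note_to_hz("C6"),  # assumed upper bound
    sr=sr,
)

# Quantize voiced frames to MIDI numbers as a crude stand-in for Tony's
# note segmentation stage (illustration only; Tony does far more).
midi = np.full(len(f0), -1)
midi[voiced_flag] = np.round(librosa.hz_to_midi(f0[voiced_flag]))
times = librosa.times_like(f0, sr=sr)
```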

    Deep Learning Techniques for Music Generation -- A Survey

    This paper is a survey and an analysis of different ways of using deep learning (deep artificial neural networks) to generate musical content. We propose a methodology based on five dimensions for our analysis:

    - Objective: What musical content is to be generated (e.g., melody, polyphony, accompaniment or counterpoint)? For what destination and use: to be performed by humans (a musical score) or by a machine (an audio file)?
    - Representation: What concepts are manipulated (e.g., waveform, spectrogram, note, chord, meter and beat)? What format is used (e.g., MIDI, piano roll or text)? How is the representation encoded (e.g., scalar, one-hot or many-hot)?
    - Architecture: What type(s) of deep neural network are used (e.g., feedforward network, recurrent network, autoencoder or generative adversarial network)?
    - Challenge: What are the limitations and open challenges (e.g., variability, interactivity and creativity)?
    - Strategy: How do we model and control the generation process (e.g., single-step feedforward, iterative feedforward, sampling or input manipulation)?

    For each dimension, we conduct a comparative analysis of various models and techniques, and we propose a tentative multidimensional typology. This typology is bottom-up, based on the analysis of many existing deep-learning-based systems for music generation selected from the relevant literature. These systems are described and used to exemplify the various choices of objective, representation, architecture, challenge and strategy. The last section includes some discussion and prospects.

    Comment: 209 pages. This paper is a simplified version of the book: J.-P. Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music Generation, Computational Synthesis and Creative Systems, Springer, 201
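
    As a concrete illustration of the survey's encoding dimension (see the list above), the sketch below builds a one-hot piano-roll frame for a melody note and a many-hot frame for a chord; the toy melody, chord and pitch range are assumptions chosen for the example.

```python
# Sketch of two encoding choices discussed in the survey: one-hot frames
# for a monophonic melody, many-hot frames for simultaneous pitches.
import numpy as np

PITCH_RANGE = 128  # full MIDI pitch range

def one_hot(pitch: int) -> np.ndarray:
    """One-hot frame: exactly one active pitch (a melody note)."""
    v = np.zeros(PITCH_RANGE, dtype=np.float32)
    v[pitch] = 1.0
    return v

def many_hot(pitches: list[int]) -> np.ndarray:
    """Many-hot frame: several simultaneous pitches (a chord)."""
    v = np.zeros(PITCH_RANGE, dtype=np.float32)
    v[pitches] = 1.0
    return v

# A toy melody as MIDI numbers, one time step per note -> piano roll.
melody = [60, 62, 64, 65]                            # C4 D4 E4 F4
piano_roll = np.stack([one_hot(p) for p in melody])  # shape (4, 128)
c_major = many_hot([60, 64, 67])                     # C-E-G in one frame
```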

    Arabic Music Genre Identification

    Published by Semarak Ilmu Publishing. This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial 4.0 International License (CC BY-NC), https://creativecommons.org/licenses/by-nc/4.0/

    Music Information Retrieval (MIR) is an application area of data science crucial for tasks such as recommendation systems, genre identification, fingerprinting, and novelty assessment. Machine learning techniques such as clustering, classification and similarity scoring are used to analyse digital music records and identify their various properties for these tasks. Music is represented digitally using diverse transformations and has been clustered and classified successfully for Western music. Eastern music, however, poses a challenge, although some techniques have succeeded in clustering and classifying Turkish and Persian music. This research evaluates the performance of machine learning algorithms on Arabic music pre-labelled with its Arabic genre (Maqam). The study introduces new data representations of the Arabic music dataset and identifies the most suitable machine learning methods and future enhancements. Peer reviewed
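
    The abstract does not specify which features or classifiers were evaluated, so the following is only a generic sketch of the kind of pipeline such studies typically compare: clip-level MFCC statistics fed to an off-the-shelf classifier. The file names and Maqam labels are hypothetical placeholders.

```python
# Generic audio-classification baseline sketch (not the paper's method):
# clip-level MFCC statistics fed to a random forest.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier

def mfcc_stats(path: str) -> np.ndarray:
    """Fixed-length clip descriptor: per-coefficient MFCC mean and std."""
    y, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=20)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Hypothetical recordings labelled with their Maqam.
files = ["rast_01.wav", "bayati_01.wav", "rast_02.wav", "bayati_02.wav"]
labels = ["Rast", "Bayati", "Rast", "Bayati"]

X = np.stack([mfcc_stats(f) for f in files])
clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, labels)
```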

    On-the-fly synthesizer programming with rule learning

    This manuscript explores the automatic programming of sound synthesis algorithms within the context of the performative artistic practice known as live coding. Writing source code in an improvised way to create music or visuals became an instrument the moment affordable computers were able to perform real-time sound synthesis with languages that keep their interpreter running. Ever since, live coding has entailed the real-time programming of synthesis algorithms. One possibility for this purpose is an algorithm that automatically creates variations from a few presets selected by the user. However, the need for real-time feedback and the small size of the datasets (which may even be collected mid-performance) are constraints that make existing automatic sound synthesizer programmers and learning algorithms unfeasible. Moreover, such algorithms are designed not to create variations of a sound but to find the synthesizer parameters that match a given target sound. Other approaches create representations of the space of possible sounds, allowing the user to explore it by means of interactive evolution; although these systems are exploration-oriented, they require longer run-times.

    This thesis investigates inductive rule learning for on-the-fly synthesizer programming, an approach conceptually different from those found in both the synthesizer programming and the live coding literature. Rule models offer interpretability and allow working directly with the parameter values of the synthesis algorithms (even with symbolic data), making preprocessing unnecessary. RuLer, the proposed learning algorithm, receives a dataset containing user-labeled combinations of parameter values of a synthesis algorithm. Among the combinations sharing the same label, it analyses patterns based on dissimilarity and describes them as an IF-THEN rule model. The algorithm parameters control what is considered a pattern; since patterns are the basis for inducing new parameter settings, they also control the degree of consistency of the induced settings with respect to the original input data. An algorithm (named FuzzyRuLer) is then presented that extends the IF-THEN rules to hyperrectangles, which in turn are used as the cores of membership functions. The resulting fuzzy rule model maps the entire input feature space; to that end, the algorithm generalizes the logical rules, resolving contradictions by following a maximum-volume heuristic.

    Throughout the manuscript it is discussed how, when machine learning algorithms are used as creative tools, the glitches, errors or inaccuracies produced by the resulting models are sometimes desirable, as they can offer novel, unpredictable results.

    The evaluation of the algorithms follows two paths. The first focuses on user tests. The second, reflecting the fact that this work was carried out within a computer science department, provides a broader, domain-independent evaluation of the algorithms' performance using extrinsic benchmarks (i.e., not belonging to a synthesizer's domain) for cross-validation and minority oversampling. In oversampling tasks on imbalanced datasets, the algorithm yields state-of-the-art results; moreover, the synthetic points it produces are significantly different from those created by other algorithms and perform a (controlled) exploration of more distant regions.
    Finally, accompanying the research, various performances, concerts and an album were produced with the algorithms and examples of this thesis. The reviews received, and the collections in which the album has been featured, show a positive reception within the community. Together, these evaluations suggest that rule learning is both an effective method and a promising path for further research.
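
    The thesis itself is the authority on RuLer's exact procedure; purely as an illustration of the idea described above (merging same-label parameter combinations whose dissimilarity is small enough into IF-THEN rules that can induce new settings), here is a toy sketch. The parameter tuples, labels and threshold are invented for the example.

```python
# Toy dissimilarity-driven rule induction in the spirit of RuLer (not
# the published implementation): same-label synth-parameter settings
# that disagree in at most `max_diff` slots merge into one IF-THEN rule
# whose antecedent lists the allowed values per slot.
from itertools import combinations

def dissimilarity(a, b):
    """Number of parameter slots where two settings disagree."""
    return sum(x != y for x, y in zip(a, b))

def induce_rules(examples, max_diff=1):
    """examples: list of (parameter_tuple, label). Returns rules as
    ({slot_index: allowed_values}, label) pairs."""
    rules = []
    for (p, la), (q, lb) in combinations(examples, 2):
        if la == lb and dissimilarity(p, q) <= max_diff:
            antecedent = {i: {x, y} for i, (x, y) in enumerate(zip(p, q))}
            rules.append((antecedent, la))
    return rules

# Two "bright" presets differing only in one parameter merge into a rule
# covering (and thereby inducing) both values of that parameter.
data = [((0.2, 0.9, "saw"), "bright"), ((0.2, 0.7, "saw"), "bright")]
print(induce_rules(data))
```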

    Automatic Transcription of Bass Guitar Tracks applied for Music Genre Classification and Sound Synthesis

    Music recordings most often consist of multiple instrument signals which overlap in time and frequency. In the field of Music Information Retrieval (MIR), existing algorithms for the automatic transcription and analysis of music recordings aim to extract semantic information from such mixed audio signals. In recent years it has frequently been observed that algorithm performance is limited by this signal interference and the resulting loss of information. One common approach is to first apply source separation algorithms to isolate the present musical instrument signals before analyzing them individually; however, the performance of source separation algorithms depends strongly on the number of instruments and on the amount of spectral overlap.

    In this thesis, isolated instrumental tracks are analyzed in order to circumvent the challenges of source separation. The focus is on the development of instrument-centered signal processing algorithms for music transcription, musical analysis and sound synthesis. The electric bass guitar is chosen as the example instrument; its sound production principles are closely investigated and reflected in the algorithmic design.

    In the first part of this thesis, an automatic music transcription algorithm for electric bass guitar recordings is presented. The audio signal is interpreted as a sequence of sound events described by various parameters. In addition to the conventional score-level parameters of note onset, duration, loudness and pitch, instrument-specific parameters such as the playing techniques applied and the geometric position on the instrument fretboard are extracted. Evaluation experiments confirmed that the proposed transcription algorithm outperforms three state-of-the-art bass transcription algorithms on realistic bass guitar recordings. The estimation of the instrument-level parameters works with high accuracy, in particular for isolated note samples.

    In the second part of the thesis, it is investigated whether analysing only the bassline of a music piece allows its music genre to be classified automatically. Score-based audio features are proposed that quantify tonal, rhythmic and structural properties of basslines. Based on a novel dataset of 520 bassline transcriptions from 13 music genres, three approaches to music genre classification were compared. A rule-based classification system achieved a mean class accuracy of 64.8 % using only features extracted from the bassline of a music piece.

    The re-synthesis of bass guitar recordings from the previously extracted note parameters is studied in the third part of this thesis. Based on the physical modeling of string instruments, a novel sound synthesis algorithm tailored to the electric bass guitar is presented. The algorithm mimics different aspects of the instrument's sound production mechanism such as string excitation, string damping, string-fret collision, and the influence of the electro-magnetic pickup.
    Furthermore, a parametric audio coding approach is discussed that allows bass guitar tracks to be encoded and transmitted at a significantly smaller bit rate than conventional audio coding algorithms require. The results of several listening tests confirmed that a higher perceptual quality is achieved when the original bass guitar recordings are encoded and re-synthesized using the proposed parametric audio codec rather than conventional audio codecs at very low bit rate settings.
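
    The thesis's bass guitar model (string excitation, damping, string-fret collision, pickup behaviour) cannot be reconstructed from the abstract alone, but the basic physical-modeling principle it builds on, an excitation signal fed into a damped delay-line loop, can be shown with a minimal Karplus-Strong sketch; all constants here are assumptions for illustration.

```python
# Minimal Karplus-Strong plucked-string sketch: a noise burst excites a
# delay line whose averaged feedback mimics string damping. The thesis's
# bass guitar model is far more detailed; this only illustrates the
# underlying physical-modeling principle.
import numpy as np

def pluck(f0=41.2, sr=44100, seconds=2.0, damping=0.996):
    period = int(sr / f0)                     # delay length sets the pitch
    line = np.random.uniform(-1, 1, period)   # excitation: noise burst
    out = np.empty(int(sr * seconds))
    for n in range(out.size):
        out[n] = line[n % period]
        # Two-point average in the feedback loop models string losses.
        line[n % period] = damping * 0.5 * (line[n % period]
                                            + line[(n + 1) % period])
    return out

low_e = pluck(41.2)  # roughly the open E string of a bass guitar
```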

    Effect of nano black rice husk ash on the chemical and physical properties of porous concrete pavement

    Black rice husk is a waste product of the rice-milling industry, and its major inorganic constituent is silica. In this study, the effect of nano-sized black rice husk ash (BRHA) on the chemical and physical properties of porous concrete pavement was investigated. The BRHA, produced by uncontrolled burning at a rice factory, was ground in a laboratory mill with steel balls and steel rods, and four different grinding grades were examined. A BRHA dosage of 10% by weight of binder was used throughout the experiments. The chemical and physical properties of the nano-BRHA mixtures were evaluated using a fineness test, X-ray fluorescence spectrometry (XRF) and X-ray diffraction (XRD). In addition, a compressive strength test was used to evaluate the performance of the porous concrete pavement. The results show that the optimum grinding time was 63 hours, and that concrete made with nano black rice husk ash ground for 63 hours exhibited good strength.