On the Modeling of Musical Solos as Complex Networks
Notes in a musical piece are building blocks employed in non-random ways to
create melodies. It is the "interaction" among a limited number of notes that
allows constructing the variety of musical compositions that have been written
over the centuries and across different cultures. Networks are a modeling tool that
is commonly employed to represent a set of entities interacting in some way.
Thus, notes composing a melody can be seen as nodes of a network that are
connected whenever these are played in sequence. The outcome of such a process
results in a directed graph. By using complex network theory, some main metrics
of musical graphs can be measured, which characterize the related musical
pieces. In this paper, we define a framework to represent melodies as networks.
Then, we provide an analysis of a set of guitar solos performed by prominent
musicians. Results of this study indicate that the presented model can have an
impact on audio and multimedia applications such as music classification,
identification, e-learning, automatic music generation, and multimedia
entertainment.

Comment: to appear in Information Science, Elsevier. Please cite the paper
including this information. arXiv admin note: text overlap with
arXiv:1603.0497
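The note-to-graph construction described above can be illustrated with a minimal sketch; the note names and toy melody below are illustrative, not taken from the paper:

```python
from collections import defaultdict

def melody_to_digraph(notes):
    """Build a directed graph: an edge u -> v is created whenever note v
    is played immediately after note u; weights count repetitions."""
    edges = defaultdict(int)
    for u, v in zip(notes, notes[1:]):
        edges[(u, v)] += 1
    return dict(edges)

# Toy melody (an illustrative note sequence, not from any real solo)
solo = ["E", "G", "A", "E", "G", "B", "A", "G", "E"]
graph = melody_to_digraph(solo)

# Basic network metrics: distinct notes (nodes) and distinct transitions (edges)
nodes = {n for edge in graph for n in edge}
print(len(nodes), len(graph))
```

From a graph like this, standard complex-network metrics (degree distribution, clustering, path lengths) can be computed to characterize the piece.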
Linking Music Metadata
PhD thesis. The internet has facilitated music metadata production and distribution on an
unprecedented scale. A contributing factor in this data deluge is a change in the
authorship of this data from the expert few to the untrained crowd. The resulting
unordered flood of imperfect annotations provides challenges and opportunities in
identifying accurate metadata and linking it to the music audio in order to provide
a richer listening experience. We advocate novel adaptations of Dynamic Programming
for music metadata synchronisation, ranking and comparison. This thesis
introduces Windowed Time Warping, Greedy, and Constrained On-Line Time Warping
for synchronisation, and the Concurrence Factor for automatically ranking metadata.
We begin by examining the availability of various music metadata on the web.
We then review Dynamic Programming methods for aligning and comparing two
source sequences whilst presenting novel, specialised adaptations for efficient, real-time
synchronisation of music and metadata that improve speed and
accuracy over existing algorithms. The Concurrence Factor, which measures the
degree to which an annotation of a song agrees with its peers, is proposed to
utilise the wisdom of the crowds to establish a ranking system. This attribute uses
a combination of the standard Dynamic Programming methods Levenshtein Edit
Distance, Dynamic Time Warping, and Longest Common Subsequence to compare
annotations.
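Two of the standard Dynamic Programming measures named above can be sketched as follows; how the thesis combines them into the Concurrence Factor is not reproduced here, only the underlying DP recurrences:

```python
def levenshtein(a, b):
    """Edit distance via the classic DP table, kept to two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def lcs(a, b):
    """Longest common subsequence length via DP."""
    prev = [0] * (len(b) + 1)
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ca == cb
                       else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

# Two hypothetical annotations of the same lyric line
x = "hello darkness my old friend"
y = "hello darkness my olde friend"
print(levenshtein(x, y), lcs(x, y))
```

A low edit distance and a long common subsequence relative to string length both signal that two crowd-sourced annotations agree.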
We present a synchronisation application for applying the aforementioned methods
as well as a tablature-parsing application for mining and analysing guitar tablatures
from the web. We evaluate the Concurrence Factor as a ranking system on a large-scale
collection of guitar tablatures and lyrics, showing a correlation with accuracy
that is superior to existing methods currently used in internet search engines, which
are based on popularity and human ratings.

Funding: Engineering and Physical Sciences Research Council; travel grant from the Royal Engineering Society
ShredGP: Guitarist Style-Conditioned Tablature Generation
GuitarPro format tablatures are a type of digital music notation that
encapsulates information about guitar playing techniques and fingerings. We
introduce ShredGP, a GuitarPro tablature generative Transformer-based model
conditioned to imitate the style of four distinct iconic electric guitarists.
In order to assess the idiosyncrasies of each guitar player, we adopt a
computational musicology methodology by analysing features computed from the
tokens yielded by the DadaGP encoding scheme. Statistical analyses of these
features reveal significant differences between the four guitarists. We
trained two variants of the ShredGP model, one using a multi-instrument corpus,
the other using solo guitar data. We present a BERT-based model for guitar
player classification and use it to evaluate the generated examples. Overall,
results from the classifier show that ShredGP is able to generate content
congruent with the style of the targeted guitar player. Finally, we reflect on
prospective applications of ShredGP for human-AI music interaction.

Comment: Accepted for publication at CMMR 202
DadaGP: A Dataset of Tokenized GuitarPro Songs for Sequence Models
Originating in the Renaissance and burgeoning in the digital era, tablatures are a commonly used music notation system which provides explicit representations of instrument fingerings rather than pitches. GuitarPro has established itself as a widely used tablature format and software enabling musicians to edit and share songs for musical practice, learning, and composition. In this work, we present DadaGP, a new symbolic music dataset comprising 26,181 song scores in the GuitarPro format covering 739 musical genres, along with an accompanying tokenized format well-suited for generative sequence models such as the Transformer. The tokenized format is inspired by event-based MIDI encodings, often used in symbolic music generation models. The dataset is released with an encoder/decoder which converts GuitarPro files to tokens and back. We present results of a use case in which DadaGP is used to train a Transformer-based model to generate new songs in GuitarPro format. We discuss other relevant use cases for the dataset (guitar-bass transcription, music style transfer, and artist/genre classification) as well as ethical implications. DadaGP opens up the possibility of training GuitarPro score generators, fine-tuning models on custom data, creating new styles of music, building AI-powered songwriting apps, and supporting human-AI improvisation.
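An event-based tokenization in the spirit described above can be sketched as follows; the token names and structure here are illustrative assumptions, not the actual DadaGP vocabulary:

```python
def encode_note(string, fret, duration):
    """Turn one tablature note into a flat sequence of event tokens
    (hypothetical token names, for illustration only)."""
    return [f"string:{string}", f"fret:{fret}", f"duration:{duration}"]

def encode_measure(notes):
    """Encode a measure as a measure-boundary token followed by note events."""
    tokens = ["new_measure"]
    for n in notes:
        tokens += encode_note(*n)
    return tokens

# Toy riff: (string, fret, duration) triples on the low E string
riff = [(6, 0, "eighth"), (6, 3, "eighth"), (6, 5, "quarter")]
print(encode_measure(riff))
```

A flat token stream like this is what makes the format directly consumable by autoregressive sequence models such as the Transformer.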
Poisoning Retrieval Corpora by Injecting Adversarial Passages
Dense retrievers have achieved state-of-the-art performance in various
information retrieval tasks, but to what extent can they be safely deployed in
real-world applications? In this work, we propose a novel attack for dense
retrieval systems in which a malicious user generates a small number of
adversarial passages by perturbing discrete tokens to maximize similarity with
a provided set of training queries. When these adversarial passages are
inserted into a large retrieval corpus, we show that this attack is highly
effective at fooling these systems into retrieving them for queries that were not
seen by the attacker. More surprisingly, these adversarial passages can
directly generalize to out-of-domain queries and corpora with a high attack
success rate -- for instance, we find that 50 generated passages optimized on
Natural Questions can mislead >94% of questions posed in financial documents or
online forums. We also benchmark and compare a range of state-of-the-art dense
retrievers, both unsupervised and supervised. Although different systems
exhibit varying levels of vulnerability, we show they can all be successfully
attacked by injecting up to 500 passages, a small fraction compared to a
retrieval corpus of millions of passages.

Comment: EMNLP 2023. Our code is available at
https://github.com/princeton-nlp/corpus-poisonin
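The discrete-token perturbation idea can be sketched in miniature. This toy uses bag-of-words overlap as a stand-in for the dense embedding similarity the paper optimizes, and a brute-force greedy swap rather than the gradient-guided search a real attack would use:

```python
from collections import Counter

def similarity(passage_tokens, query_tokens):
    """Toy relevance score: bag-of-words overlap (a stand-in for the
    dot product between learned dense embeddings)."""
    p, q = Counter(passage_tokens), Counter(query_tokens)
    return sum(min(p[t], q[t]) for t in q)

def craft_passage(seed, queries, vocab, steps=10):
    """Greedy discrete perturbation: repeatedly apply the single token
    swap that most increases total similarity to the training queries."""
    passage = list(seed)
    for _ in range(steps):
        best = (0, None, None)  # (gain, position, token)
        base = sum(similarity(passage, q) for q in queries)
        for i in range(len(passage)):
            for tok in vocab:
                cand = passage[:i] + [tok] + passage[i + 1:]
                gain = sum(similarity(cand, q) for q in queries) - base
                if gain > best[0]:
                    best = (gain, i, tok)
        if best[1] is None:  # no swap improves similarity: stop
            break
        passage[best[1]] = best[2]
    return passage

# Hypothetical training queries and a tiny vocabulary
queries = [["who", "won", "the", "cup"], ["when", "was", "the", "cup", "won"]]
vocab = ["cup", "won", "the", "banana"]
adv = craft_passage(["aaa", "bbb", "ccc"], queries, vocab)
print(adv)
```

The crafted passage ends up saturated with query terms, which is why, once inserted into a corpus, it gets retrieved for queries it was never optimized against.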
Multimodal Dataset Distillation for Image-Text Retrieval
Dataset distillation methods offer the promise of reducing a large-scale
dataset down to a significantly smaller set of (potentially synthetic) training
examples, which preserve sufficient information for training a new model from
scratch. So far dataset distillation methods have been developed for image
classification. However, with the rise in capabilities of vision-language
models, and especially given the scale of datasets necessary to train these
models, the time is ripe to expand dataset distillation methods beyond image
classification. In this work, we take the first steps towards this goal by
expanding on the idea of trajectory matching to create a distillation method
for vision-language datasets. The key challenge is that vision-language
datasets do not have a set of discrete classes. To overcome this, our proposed
multimodal dataset distillation method jointly distills the images and their
corresponding language descriptions in a contrastive formulation. Since there
are no existing baselines, we compare our approach to three coreset selection
methods (strategic subsampling of the training dataset), which we adapt to the
vision-language setting. We demonstrate significant improvements on the
challenging Flickr30K and COCO retrieval benchmarks: the best coreset selection
method, which selects 1000 image-text pairs for training, is able to achieve only
5.6% image-to-text retrieval accuracy (recall@1); in contrast, our dataset
distillation approach almost doubles that with just 100 (an order of magnitude
fewer) training pairs.

Comment: 28 pages, 11 figure
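The contrastive formulation mentioned above can be sketched with a toy similarity matrix; this is a simplified symmetric InfoNCE-style loss, not the paper's exact training objective:

```python
import math

def info_nce(sim, row):
    """Cross-entropy over one row of an image-text similarity matrix:
    the matched caption is the positive, all other columns are negatives."""
    logits = sim[row]
    z = sum(math.exp(s) for s in logits)
    return -math.log(math.exp(logits[row]) / z)

def contrastive_loss(sim):
    """Symmetric image-to-text and text-to-image contrastive loss over
    an n x n similarity matrix whose diagonal holds matched pairs."""
    n = len(sim)
    t_sim = [[sim[j][i] for j in range(n)] for i in range(n)]  # transpose
    i2t = sum(info_nce(sim, i) for i in range(n)) / n
    t2i = sum(info_nce(t_sim, i) for i in range(n)) / n
    return 0.5 * (i2t + t2i)

# Toy 2x2 similarity matrix: diagonal entries are matched image-caption pairs
sim = [[5.0, 1.0],
       [0.5, 4.0]]
print(round(contrastive_loss(sim), 4))
```

Because the loss is defined over pairwise similarities rather than class labels, distilled images and distilled captions can be optimized jointly even without discrete classes.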
Function Based Design-by-Analogy: A Functional Vector Approach to Analogical Search
Design-by-analogy is a powerful approach to augment traditional concept generation methods by expanding the set of generated ideas using similarity relationships from solutions to analogous problems. While the concept of design-by-analogy has been known for some time, few actual methods and tools exist to assist designers in systematically seeking and identifying analogies from general data sources, databases, or repositories, such as patent databases. A new method for extracting functional analogies from data sources has been developed to provide this capability, here based on a functional basis rather than form or conflict descriptions. Building on past research, we utilize a functional vector space model (VSM) to quantify analogous similarity of an idea's functionality. We quantitatively evaluate the functional similarity between represented design problems and, in this case, patent descriptions of products. We also develop document parsing algorithms to reduce text descriptions of the data sources down to the key functions, for use in the functional similarity analysis and functional vector space modeling. To do this, we apply Zipf's law on word count order reduction to reduce the words within the documents down to the applicable functionally critical terms, thus providing a mapping process for function based search. The reduction of a document into functional analogous words enables the matching to novel ideas that are functionally similar, which can be customized in various ways. This approach thereby provides relevant sources of design-by-analogy inspiration. As a verification of the approach, two original design problem case studies illustrate the distance range of analogical solutions that can be extracted. This range extends from very near-field, literal solutions to far-field cross-domain analogies.

Funding: National Science Foundation (U.S.) (Grant CMMI-0855326); National Science Foundation (U.S.) (Grant CMMI-0855510); National Science Foundation (U.S.) (Grant CMMI-0855293); SUTD-MIT International Design Centre (IDC)
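The functional VSM comparison can be sketched in miniature. The function-term vocabulary below is a hypothetical stand-in for the functional basis terms, and the simple term filter stands in for the Zipf-based word-count reduction:

```python
from collections import Counter
import math

# Hypothetical functional-basis vocabulary (illustrative, not the
# actual functional basis term set used in the paper)
FUNCTION_TERMS = {"convert", "transmit", "store", "regulate", "separate"}

def functional_vector(text):
    """Reduce a document to counts over functionally critical terms,
    standing in for the Zipf-based word-count reduction."""
    return Counter(w for w in text.lower().split() if w in FUNCTION_TERMS)

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[t] * v[t] for t in u)
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# Toy design problem vs. toy patent description
problem = functional_vector("convert rotational energy and store it")
patent = functional_vector("a flywheel to store and convert energy")
print(round(cosine(problem, patent), 3))
```

Ranking patents by this functional cosine score is what surfaces analogies ranging from near-field to far-field cross-domain solutions.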