Non-pitched percussion instrument classifier performance on our "in-lab" test set across different window lengths.
Non-pitched percussion musical instrument classifier performance in a real-world playtime setting (12 participants).
Bootle Band user interface for setting the probability threshold of the LGBM model.
SHAP results.
"Class 0" = tambourines, "Class 1" = shakers, "Class 2" = castanets, "Class 3" = noise. (PDF)
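A minimal sketch of how per-class SHAP attributions like these could be computed for an LGBM model; the model and feature matrix below are synthetic placeholders, not the paper's trained classifier or data.

```python
import numpy as np
import lightgbm as lgb
import shap

# Placeholder data standing in for the paper's 13-dimensional feature rows
# (12 MFCCs + signal entropy) and its four class labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 13))
y = rng.integers(0, 4, size=400)  # 0=tambourine, 1=shaker, 2=castanet, 3=noise

clf = lgb.LGBMClassifier().fit(X, y)

# TreeExplainer supports LightGBM; for multiclass models it returns one
# attribution array per class, matching the "Class 0..3" labels above.
explainer = shap.TreeExplainer(clf)
shap_values = explainer.shap_values(X)
```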
Data splits for the "in-lab" dataset: training, validation, and testing.
Samples calculated with a ≈93 ms window and 50% overlap.
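For reference, the ≈93 ms figure follows from the 44.1 kHz sampling rate and 4096-sample DFT given in the spectrogram caption below; a quick sketch of the framing arithmetic:

```python
SR, WIN = 44100, 4096          # sampling rate (Hz), window length (samples)
HOP = WIN // 2                 # 50% overlap -> 2048-sample hop

print(1000 * WIN / SR)         # ~92.88 ms per window, i.e. the "~93 ms" above
n = 10 * SR                    # e.g., a 10 s recording
print(1 + (n - WIN) // HOP)    # number of frames (samples) it yields: 214
```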
At-home deployed dataset from a usability study with 12 families (≈93 ms window and 50% overlap).
List of extracted features.
Italics indicate features selected by NCA. (PDF)
Confusion matrix analysis for the LGBM model using a 93 ms window.
LGBM parameters optimized using Optuna.
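A minimal sketch of the kind of Optuna study that could produce such a parameter table; the search space and the synthetic data are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
import optuna
import lightgbm as lgb
from sklearn.model_selection import cross_val_score

# Synthetic placeholder features/labels (13 features, 4 classes).
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 13))
y = rng.integers(0, 4, size=400)

def objective(trial: optuna.Trial) -> float:
    # Illustrative search space; the paper's tuned values may differ.
    params = {
        "num_leaves": trial.suggest_int("num_leaves", 15, 255),
        "learning_rate": trial.suggest_float("learning_rate", 1e-3, 0.3, log=True),
        "min_child_samples": trial.suggest_int("min_child_samples", 5, 100),
    }
    clf = lgb.LGBMClassifier(**params)
    return cross_val_score(clf, X, y, cv=3, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```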
While the musical instrument classification task is well studied, there remains a gap in identifying non-pitched percussion instruments, which have greater overlap in frequency bands and more variation in sound quality and play style than pitched instruments. In this paper, we present a musical instrument classifier for detecting tambourines, maracas, and castanets, instruments that are often used in early childhood music education. We generated a dataset with diverse instruments (e.g., brand, materials, construction) played in different locations with varying background noise and play styles. We conducted sensitivity analyses to optimize feature selection, windowing time, and model selection. We deployed and evaluated our best model in a mixed reality music application with 12 families in a home setting. Our dataset comprised over 369,000 samples recorded in-lab and 35,361 samples recorded with families in a home setting. We observed the Light Gradient Boosting Machine (LGBM) model to perform best using an approximately 93 ms window with only 12 mel-frequency cepstral coefficients (MFCCs) and signal entropy. Our best LGBM model achieved over 84% accuracy across all three instrument families in-lab and over 73% accuracy when deployed to the home. To our knowledge, this dataset of over 369,000 non-pitched instrument samples is the first of its kind. This work also suggests that a low-dimensional feature space is sufficient for the recognition of non-pitched instruments. Lastly, real-world deployment and testing of the algorithms with participants of diverse physical and cognitive abilities was an important contribution toward more inclusive design practices. This paper lays the technological groundwork for a mixed reality music application that can detect children's use of non-pitched percussion instruments to support early childhood music education and play.
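To make the pipeline concrete, here is a minimal sketch of the feature extraction the abstract describes (12 MFCCs plus signal entropy over ≈93 ms windows with 50% overlap) feeding an LGBM classifier. Interpreting "signal entropy" as per-frame spectral entropy, and all function and parameter names, are assumptions for illustration, not the authors' released code.

```python
import numpy as np
import librosa
import lightgbm as lgb

SR = 44100                 # sampling rate (Hz)
WIN = 4096                 # 4096 / 44100 ~= 92.9 ms window
HOP = WIN // 2             # 50% overlap

def extract_features(audio: np.ndarray) -> np.ndarray:
    """Return one row per ~93 ms frame: 12 MFCCs + spectral entropy."""
    mfcc = librosa.feature.mfcc(y=audio, sr=SR, n_mfcc=12,
                                n_fft=WIN, hop_length=HOP)        # (12, frames)
    power = np.abs(librosa.stft(audio, n_fft=WIN, hop_length=HOP)) ** 2
    pmf = power / (power.sum(axis=0, keepdims=True) + 1e-12)      # per-frame pmf
    entropy = -(pmf * np.log2(pmf + 1e-12)).sum(axis=0)           # (frames,)
    return np.vstack([mfcc, entropy]).T                           # (frames, 13)

# Labels follow the SHAP caption: 0=tambourine, 1=shaker, 2=castanet, 3=noise.
# LGBMClassifier infers the multiclass objective from the four labels.
def train(X: np.ndarray, y: np.ndarray) -> lgb.LGBMClassifier:
    return lgb.LGBMClassifier().fit(X, y)
```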
Spectrogram of castanet (top), tambourine (middle), and shaker (bottom).
Parameters: 44.1 kHz sampling rate, Hanning window with 50% overlap, and 4096-sample DFT. (PDF)
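A minimal sketch of computing a spectrogram with the caption's settings (44.1 kHz audio, Hanning window, 50% overlap, 4096-sample DFT); the file path is a placeholder.

```python
import numpy as np
from scipy.io import wavfile
from scipy.signal import spectrogram

sr, audio = wavfile.read("castanet.wav")   # placeholder 44.1 kHz mono file
f, t, Sxx = spectrogram(audio.astype(float), fs=sr,
                        window="hann",            # Hanning window
                        nperseg=4096, noverlap=2048,  # 50% overlap
                        nfft=4096)                # 4096-sample DFT
Sxx_db = 10 * np.log10(Sxx + 1e-12)        # log-power for plotting
```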