Digital Filter Design Using Improved Artificial Bee Colony Algorithms
Digital filters are often used in digital signal processing applications. The design objective of a digital filter is to find the optimal set of filter coefficients that satisfies the desired specifications of magnitude and group delay responses. Evolutionary algorithms are population-based meta-heuristic algorithms inspired by the biological behaviors of species. Compared to gradient-based optimization algorithms such as steepest descent and Newton-like methods, these bio-inspired algorithms have the advantages of being less likely to get stuck at local optima and being independent of the starting point in the solution space. The limitations of evolutionary algorithms include the presence of control parameters, problem-specific tuning procedures, premature convergence, and slower convergence rates. The artificial bee colony (ABC) algorithm is a swarm-based search meta-heuristic algorithm inspired by the foraging behaviors of honey bee colonies, with the benefit of relatively few control parameters. In its original form, the ABC algorithm has certain limitations, such as a low convergence rate and an insufficient balance between exploration and exploitation in the search equations. In this dissertation, an ABC-AMR algorithm is proposed by incorporating an adaptive modification rate (AMR) into the original ABC algorithm. The AMR increases the convergence rate by adjusting the balance between exploration and exploitation in the search equations through an adaptive determination of the number of parameters to be updated in every iteration. A constrained ABC-AMR algorithm is also developed for solving constrained optimization problems. Many real-world problems require the simultaneous optimization of more than one conflicting objective. Multiobjective (MO) optimization produces a set of non-dominated feasible solutions called the Pareto front instead of a single optimum solution.
For multiobjective optimization, if a decision maker’s preferences can be incorporated during the optimization process, the search can be confined to the region of interest instead of covering the entire region. In this dissertation, two algorithms are developed for such incorporation. The first one is a reference-point-based MOABC algorithm in which a decision maker’s preferences are included in the optimization process as the reference point. The second one is a physical-programming-based MOABC algorithm in which physical programming is used for setting the region of interest of a decision maker. The four developed algorithms are applied to solve digital filter design problems. The ABC-AMR algorithm is used to design Types 3 and 4 linear phase FIR differentiators, and the results are compared to those obtained by the original ABC algorithm, three improved ABC algorithms, and the Parks-McClellan algorithm. The constrained ABC-AMR algorithm is applied to the design of sparse Type 1 linear phase FIR filters of filter orders 60, 70 and 80, and the results are compared to three state-of-the-art design methods. The reference-point-based multiobjective ABC algorithm is used to design asymmetric lowpass, highpass, bandpass and bandstop FIR filters, and the results are compared to those obtained by the preference-based multiobjective differential evolution algorithm. The physical-programming-based multiobjective ABC algorithm is used to design IIR lowpass, highpass and bandpass filters, and the results are compared to three state-of-the-art design methods. Based on the obtained design results, the four design algorithms are shown to be competitive with the state-of-the-art design methods.
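The role of a modification rate in an ABC search step can be illustrated with a minimal sketch. It assumes the commonly used ABC update v_j = x_j + phi * (x_j - k_j), with phi drawn uniformly from [-1, 1], applied to each parameter with probability `mr`. The abstract does not state the adaptive rule that ABC-AMR uses to set this rate, so `mr` is a plain argument here, and the names `abc_candidate`, `neighbour`, `lower`, and `upper` are hypothetical.

```python
import random


def abc_candidate(x, neighbour, mr, lower, upper, rng=random):
    """Build a candidate from x with an ABC-style search step.

    Each parameter is perturbed with probability mr (the modification
    rate). If no parameter is selected, one is picked at random so the
    candidate always differs from x in at most the intended way.
    """
    n = len(x)
    candidate = list(x)
    changed = [j for j in range(n) if rng.random() < mr]
    if not changed:
        changed = [rng.randrange(n)]
    for j in changed:
        phi = rng.uniform(-1.0, 1.0)
        v = x[j] + phi * (x[j] - neighbour[j])
        # Clamp to the search bounds (an assumed, simple repair rule).
        candidate[j] = min(max(v, lower), upper)
    return candidate


rng = random.Random(0)
current = [0.5, -0.2, 0.1, 0.9]
other = [0.1, 0.3, -0.4, 0.2]
print(abc_candidate(current, other, mr=0.5, lower=-1.0, upper=1.0, rng=rng))
```

A small `mr` changes few coefficients per step, which favours exploitation, while a large `mr` changes many, which favours exploration. This is the balance that the abstract says the adaptive rate adjusts.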
Digital Filter Design Using Improved Teaching-Learning-Based Optimization
Digital filters are an important part of digital signal processing systems. Digital filters are divided into finite impulse response (FIR) digital filters and infinite impulse response (IIR) digital filters according to the length of their impulse responses. An FIR digital filter is easier to implement than an IIR digital filter because of its linear phase and stability properties. For an IIR digital filter, the poles, which are the roots of the denominator polynomial, are subject to stability constraints. In addition, digital filters can be categorized as one-dimensional or multi-dimensional according to the dimensions of the signal to be processed. For the design of IIR digital filters, traditional design methods have the disadvantages of easily falling into a local optimum and converging slowly. The Teaching-Learning-Based Optimization (TLBO) algorithm has been proven beneficial in a wide range of engineering applications. This dissertation therefore focuses on using TLBO and its improved algorithms to design five types of digital filters: linear phase FIR digital filters, multiobjective general FIR digital filters, multiobjective IIR digital filters, two-dimensional (2-D) linear phase FIR digital filters, and 2-D nonlinear phase FIR digital filters. Among them, linear phase FIR digital filters, 2-D linear phase FIR digital filters, and 2-D nonlinear phase FIR digital filters are optimized with single-objective TLBO algorithms; multiobjective general FIR digital filters are optimized with the multiobjective non-dominated TLBO (MOTLBO) algorithm; and multiobjective IIR digital filters are optimized with MOTLBO with Euclidean distance. The results of the five types of filter designs are compared to those obtained by other state-of-the-art design methods. In this dissertation, two major improvements are proposed to enhance the performance of the standard TLBO algorithm.
The first improvement is to replace the TLBO learner phase with gradient-based learning to reduce approximation errors and CPU time without sacrificing design accuracy for linear phase FIR digital filter design. The second improvement is to incorporate Manhattan distance to simplify the procedure of the multiobjective non-dominated TLBO (MOTLBO) algorithm for general FIR digital filter design. The design results obtained with the two improvements demonstrate their efficiency and effectiveness.
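For orientation, the teacher phase of standard TLBO can be sketched as follows. This is a minimal illustration of the textbook update (each learner moves toward the best solution and away from the scaled population mean, followed by greedy selection), not the improved variants of the dissertation. The function name `teacher_phase` and the sphere cost used as a stand-in for a filter approximation error are assumptions.

```python
import random


def teacher_phase(population, cost, rng=random):
    """Run one standard TLBO teacher phase and return the new population."""
    dim = len(population[0])
    teacher = min(population, key=cost)  # best learner acts as teacher
    mean = [sum(p[j] for p in population) / len(population)
            for j in range(dim)]
    new_population = []
    for learner in population:
        tf = rng.choice((1, 2))  # teaching factor, randomly 1 or 2
        trial = [learner[j] + rng.random() * (teacher[j] - tf * mean[j])
                 for j in range(dim)]
        # Greedy selection: keep the trial only if it lowers the cost.
        new_population.append(trial if cost(trial) < cost(learner) else learner)
    return new_population


def sphere(coeffs):
    """Stand-in cost; a real design would use a filter error measure."""
    return sum(c * c for c in coeffs)


rng = random.Random(0)
pop = [[rng.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(10)]
for _ in range(20):
    pop = teacher_phase(pop, sphere, rng=rng)
print(min(sphere(p) for p in pop))
```

The learner phase, which the first improvement replaces with gradient-based learning, is omitted here because the abstract does not describe how that replacement is carried out.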
Automatic Transcription of Bass Guitar Tracks applied for Music Genre Classification and Sound Synthesis
Music signals usually consist of a superposition of several individual
instruments. Most existing algorithms for the automatic transcription and
analysis of music recordings in the research field of Music Information
Retrieval (MIR) attempt to extract semantic information directly from these
mixed signals. In recent years, it has frequently been observed that the
performance of these algorithms is generally limited by the signal
superposition and the resulting loss of information. One possible solution
is to use source separation methods to acoustically isolate the
participating instruments before the analysis. At the current state of the
art, however, the performance of these methods is not always sufficient to
achieve a very good separation of the individual sources. This thesis
therefore examines only isolated instrumental recordings that are not
overlaid by other instruments. Using the electric bass guitar as an
example, analysis and sound synthesis algorithms specialized for the sound
production of this instrument are developed and evaluated. In the first
part of this thesis, an algorithm is presented that performs an automatic
transcription of bass guitar recordings. The audio signal is described by
a sequence of sound events that correspond to the notes played on the
instrument. In addition to the usual note parameters onset, duration,
loudness, and pitch, instrument-specific parameters such as the playing
techniques used and the string and fret position on the instrument are
extracted automatically. Evaluation experiments on two newly created audio
data sets show that, on a data set of realistic bass guitar recordings, the
presented transcription algorithm can achieve a higher recognition accuracy
than three existing state-of-the-art algorithms. The instrument-specific
parameters can be estimated with high quality, in particular for isolated
single notes. The second part of the thesis investigates how the music
genre can be inferred from a note representation of typical repeating
basslines. Audio features are extracted that quantitatively describe
various tonal, rhythmic, and structural properties of basslines. Using a
newly created data set of 520 typical basslines from 13 different music
genres, three different approaches to automatic genre classification were
compared. It was found that a rule-based classification method achieved a
mean recognition rate of 64.8 % based solely on the analysis of the
bassline of a piece of music. The re-synthesis of the original bass tracks
based on the extracted note parameters is examined in the third part of the
thesis. A new audio synthesis algorithm is presented that, based on the
principle of physical modeling, reproduces various aspects of the sound
production characteristic of the bass guitar, such as string excitation,
damping, collision between string and fret, and the pickup behavior.
Furthermore, a parametric audio coding approach is discussed that makes it
possible to transmit bass guitar tracks using only the estimated note-wise
parameters and to re-synthesize them on the decoder side. The results of
several listening tests show that the proposed synthesis algorithm enables
a re-synthesis of bass guitar recordings with a better sound quality than
transmitting the audio data with existing audio coding methods that are set
to very low bit rates.
Music recordings most often consist of multiple instrument signals, which
overlap in time and frequency. In the field of Music Information Retrieval
(MIR), existing algorithms for the automatic transcription and analysis of
music recordings aim to extract semantic information from mixed audio
signals. In recent years, it has frequently been observed that the
performance of these algorithms is limited by signal interference and the
resulting loss of information. One common approach to solve this problem is to first
apply source separation algorithms to isolate the present musical
instrument signals before analyzing them individually. The performance of
source separation algorithms strongly depends on the number of instruments
as well as on the amount of spectral overlap. In this thesis, isolated
instrumental tracks are analyzed in order to circumvent the challenges of
source separation. Instead, the focus is on the development of
instrument-centered signal processing algorithms for music transcription,
musical analysis, as well as sound synthesis. The electric bass guitar is
chosen as an example instrument. Its sound production principles are
closely investigated and considered in the algorithmic design. In the first
part of this thesis, an automatic music transcription algorithm for
electric bass guitar recordings will be presented. The audio signal is
interpreted as a sequence of sound events, which are described by various
parameters. In addition to the conventionally used score-level parameters
note onset, duration, loudness, and pitch, instrument-specific parameters
such as the applied instrument playing techniques and the geometric
position on the instrument fretboard will be extracted. Different
evaluation experiments confirmed that the proposed transcription algorithm
outperformed three state-of-the-art bass transcription algorithms for the
transcription of realistic bass guitar recordings. The estimation of the
instrument-level parameters works with high accuracy, in particular for
isolated note samples. The second part of the thesis investigates whether
analyzing only the bassline of a music piece is sufficient to classify its
music genre automatically. Different score-based audio features will be
proposed that quantify tonal, rhythmic, and
structural properties of basslines. Based on a novel data set of 520
bassline transcriptions from 13 different music genres, three approaches
for music genre classification were compared. A rule-based classification
system could achieve a mean class accuracy of 64.8 % by only taking
features into account that were extracted from the bassline of a music
piece. The re-synthesis of bass guitar recordings using the previously
extracted note parameters will be studied in the third part of this thesis.
Based on the physical modeling of string instruments, a novel sound
synthesis algorithm tailored to the electric bass guitar will be presented.
The algorithm mimics different aspects of the instrument’s sound
production mechanism such as string excitation, string damping, string-fret
collision, and the influence of the electromagnetic pickup. Furthermore, a
parametric audio coding approach will be discussed that allows bass guitar
tracks to be encoded and transmitted at a significantly lower bit rate than
with conventional audio coding algorithms. The results of different listening
tests confirmed that a higher perceptual quality can be achieved if the
original bass guitar recordings are encoded and re-synthesized using the
proposed parametric audio codec instead of being encoded using conventional
audio codecs at very low bit rate settings.
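To give a feel for string synthesis from note-level parameters, the sketch below uses a Karplus-Strong style plucked string. This is a deliberately simpler, swapped-in technique and not the synthesis algorithm of the thesis: it covers only a noise-burst excitation and a damped delay line, and it does not model string-fret collision or the pickup. The function name, the `damping` value, and the default sample rate are assumptions.

```python
import random


def karplus_strong(frequency_hz, duration_s, sample_rate=44100,
                   damping=0.996, seed=0):
    """Synthesize a plucked-string tone with a Karplus-Strong style loop."""
    rng = random.Random(seed)
    # The delay-line length sets the pitch: about one period of the note.
    period = max(2, round(sample_rate / frequency_hz))
    # Excitation: fill the delay line with a burst of uniform noise.
    delay_line = [rng.uniform(-1.0, 1.0) for _ in range(period)]
    samples = []
    for n in range(int(duration_s * sample_rate)):
        i = n % period
        nxt = (i + 1) % period
        samples.append(delay_line[i])
        # Averaging two neighbouring samples acts as a low-pass filter;
        # the damping factor makes the tone decay over time.
        delay_line[i] = damping * 0.5 * (delay_line[i] + delay_line[nxt])
    return samples


# Open E string of a four-string bass in standard tuning (about 41.2 Hz).
tone = karplus_strong(41.2, 0.5)
print(len(tone))
```

Only two note parameters, pitch and duration, drive this sketch, which hints at why transmitting note-wise parameters can need far fewer bits than transmitting the waveform itself.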
Reverberation: models, estimation and application
The use of reverberation models is required in many applications such as acoustic measurements,
speech dereverberation and robust automatic speech recognition. The aim of this thesis is to
investigate different models and propose a perceptually-relevant reverberation model with suitable
parameter estimation techniques for different applications.
Reverberation can be modelled in both the time and frequency domain. The model parameters
give direct information about both physical and perceptual characteristics. These characteristics
create a multidimensional parameter space of reverberation, which can to a large extent be captured
by a time-frequency domain model. In this thesis, the relationship between physical and perceptual
model parameters will be discussed. In the first application, an intrusive technique is proposed to
measure the reverberation, or reverberance (the perception of reverberation), and the colouration. The
room decay rate parameter is of particular interest.
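As background on what a room decay rate is, the sketch below estimates it from a known impulse response using Schroeder backward integration followed by a straight-line fit. This is a standard room-acoustics procedure used here as an assumption; the abstract does not say that the thesis measures the decay rate this way. The function name, the fitting range of -5 dB to -25 dB, and the synthetic impulse response are all illustrative.

```python
import math


def decay_rate_db_per_s(impulse_response, sample_rate,
                        start_db=-5.0, stop_db=-25.0):
    """Estimate the energy decay slope (dB per second) of an impulse response."""
    # Schroeder backward integration: energy remaining from sample n onward.
    edc = [0.0] * len(impulse_response)
    running = 0.0
    for n in range(len(impulse_response) - 1, -1, -1):
        running += impulse_response[n] ** 2
        edc[n] = running
    total = edc[0]
    # Collect the part of the decay curve inside the fitting range.
    points = []
    for n, e in enumerate(edc):
        if e <= 0.0:
            break
        level = 10.0 * math.log10(e / total)
        if stop_db <= level <= start_db:
            points.append((n / sample_rate, level))
    if len(points) < 2:
        raise ValueError("decay curve too short for the fitting range")
    # Least-squares line through the selected points.
    mean_t = sum(t for t, _ in points) / len(points)
    mean_l = sum(l for _, l in points) / len(points)
    num = sum((t - mean_t) * (l - mean_l) for t, l in points)
    den = sum((t - mean_t) ** 2 for t, _ in points)
    return num / den


# Synthetic response whose energy falls by 60 dB in 0.5 s.
fs = 8000
rt60 = 0.5
delta = 3.0 * math.log(10.0) / rt60
h = [math.exp(-delta * n / fs) * (1 if n % 2 == 0 else -1) for n in range(fs)]
slope = decay_rate_db_per_s(h, fs)
print(-60.0 / slope)  # reverberation time implied by the fitted slope
```

Because this procedure needs the impulse response, it is intrusive in the sense used above.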
In practical applications, a blind estimate of the decay rate of acoustic energy in a room
is required. A statistical model for the distribution of the decay rate of the reverberant signal
named the eagleMax distribution is proposed. The eagleMax distribution describes the reverberant
speech decay rates as a random variable that is the maximum of the room decay rates and anechoic
speech decay rates. Three methods were developed to estimate the mean room decay rate from
the eagleMax distributions alone. The estimated room decay rates form a reverberation model that
will be discussed in the context of room acoustic measurements, speech dereverberation and robust
automatic speech recognition individually.
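The "maximum of two decay rates" construction can be written out with elementary probability. The symbols below are assumptions introduced for illustration: X is the room decay rate, Y the anechoic speech decay rate, and Z the observed reverberant speech decay rate. The factorization also assumes that X and Y are independent, which the abstract does not state.

```latex
Z = \max(X, Y), \qquad
F_Z(z) = \Pr(X \le z,\; Y \le z) = F_X(z)\,F_Y(z), \qquad
f_Z(z) = f_X(z)\,F_Y(z) + F_X(z)\,f_Y(z)
```

Under these assumptions, the distribution of Z depends on both components, so the mean of Z is generally not the mean of X. This is one way to see why recovering the mean room decay rate from the observed distribution needs a dedicated estimator.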