10 research outputs found
Voice-signature-based Speaker Recognition
Magister Scientiae - MSc (Computer Science)
Personal identification and the protection of data are important issues because of the ubiquitousness of computing, and they have thus become interesting areas of research in the field of computer science. Previously, people used a variety of ways to identify an individual and to protect themselves, their property and their information, mostly by means of locks, passwords, smartcards and biometrics. Verifying individuals by their physical or behavioural features is more secure than using other data such as passwords or smartcards, because everyone has unique features which distinguish him or her from others. Furthermore, a person's biometrics are difficult to imitate or steal. Biometric technologies represent a significant component of a comprehensive digital identity solution and play an important role in security. The technologies that support identification and authentication of individuals are based on either their physiological or their behavioural characteristics. Live data, in this instance the human voice, is the topic of this research. The aim is to recognize a person's voice and to identify the user by verifying that his or her voice matches a record of his or her voice-signature in a system's database. To address the main research question, "What is the best way to identify a person by his or her voice signature?", design science research was employed. This methodology is used to develop an artefact for solving a problem. Initially, a pilot study was conducted using visual representations of voice signatures, to check whether it is possible to identify speakers without using feature extraction or matching methods. Subsequently, experiments were conducted with 6300 data sets derived from the Texas Instruments and Massachusetts Institute of Technology audio database. Two methods of feature extraction were considered, mel frequency cepstrum coefficient (MFCC) and linear prediction cepstral coefficient (LPCC) extraction, and for classification the Support Vector Machines (SVM) method was used. The methods were compared in terms of their effectiveness, and it was found that the system using mel frequency cepstrum coefficients for feature extraction gave marginally better results for speaker recognition.
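The classification stage of the MFCC-plus-SVM pipeline described in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's implementation: the feature vectors below are synthetic stand-ins (one Gaussian cluster per speaker) rather than real MFCCs, and scikit-learn's SVC is assumed as the SVM implementation.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-utterance MFCC feature vectors:
# each "speaker" is modelled as a Gaussian cluster in cepstral space.
n_per_speaker, n_coeffs = 200, 13
speaker_means = rng.normal(0, 2, size=(2, n_coeffs))
X = np.vstack([rng.normal(m, 1.0, size=(n_per_speaker, n_coeffs))
               for m in speaker_means])
y = np.repeat([0, 1], n_per_speaker)

# Train an SVM classifier on held-out splits and score it
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"speaker-ID accuracy on synthetic features: {acc:.2f}")
```

In a real system the rows of `X` would come from averaging or pooling MFCC frames per utterance; the SVM step itself is unchanged.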
Robust speaker recognition using both vocal source and vocal tract features estimated from noisy input utterances.
Wang, Ning. Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 106-115). Abstracts in English and Chinese.
Chapter 1 - Introduction
  1.1 Introduction to Speech and Speaker Recognition
  1.2 Difficulties and Challenges of Speaker Authentication
  1.3 Objectives and Thesis Outline
Chapter 2 - Speaker Recognition System
  2.1 Baseline Speaker Recognition System Overview
    2.1.1 Feature Extraction
    2.1.2 Pattern Generation and Classification
  2.2 Performance Evaluation Metric for Different Speaker Recognition Tasks
  2.3 Robustness of Speaker Recognition System
    2.3.1 Speech Corpus: CU2C
    2.3.2 Noise Database: NOISEX-92
    2.3.3 Mismatched Training and Testing Conditions
  2.4 Summary
Chapter 3 - Speaker Recognition System using both Vocal Tract and Vocal Source Features
  3.1 Speech Production Mechanism
    3.1.1 Speech Production: An Overview
    3.1.2 Acoustic Properties of Human Speech
  3.2 Source-filter Model and Linear Predictive Analysis
    3.2.1 Source-filter Speech Model
    3.2.2 Linear Predictive Analysis for Speech Signal
  3.3 Vocal Tract Features
  3.4 Vocal Source Features
    3.4.1 Source Related Features: An Overview
    3.4.2 Source Related Features: Technical Viewpoints
  3.5 Effects of Noises on Speech Properties
  3.6 Summary
Chapter 4 - Estimation of Robust Acoustic Features for Speaker Discrimination
  4.1 Robust Speech Techniques
    4.1.1 Noise Resilience
    4.1.2 Speech Enhancement
  4.2 Spectral Subtractive-Type Preprocessing
    4.2.1 Noise Estimation
    4.2.2 Spectral Subtraction Algorithm
  4.3 LP Analysis of Noisy Speech
    4.3.1 LP Inverse Filtering: Whitening Process
    4.3.2 Magnitude Response of All-pole Filter in Noisy Condition
    4.3.3 Noise Spectral Reshaping
  4.4 Distinctive Vocal Tract and Vocal Source Feature Extraction
    4.4.1 Vocal Tract Feature Extraction
    4.4.2 Source Feature Generation Procedure
    4.4.3 Subband-specific Parameterization Method
  4.5 Summary
Chapter 5 - Speaker Recognition Tasks & Performance Evaluation
  5.1 Speaker Recognition Experimental Setup
    5.1.1 Task Description
    5.1.2 Baseline Experiments
    5.1.3 Identification and Verification Results
  5.2 Speaker Recognition using Source-tract Features
    5.2.1 Source Feature Selection
    5.2.2 Source-tract Feature Fusion
    5.2.3 Identification and Verification Results
  5.3 Performance Analysis
Chapter 6 - Conclusion
  6.1 Discussion and Conclusion
  6.2 Suggestion of Future Work
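Section 4.2 of the outline covers spectral subtractive-type preprocessing. A minimal numpy sketch of magnitude spectral subtraction is shown below; the non-overlapping frames, rectangular windowing, and mean-magnitude noise estimate are simplifying assumptions for illustration, not the thesis's exact algorithm.

```python
import numpy as np

def spectral_subtract(noisy, noise_est, frame=256):
    """Frame-wise magnitude spectral subtraction (no overlap, for brevity)."""
    out = np.zeros_like(noisy)
    # Average magnitude spectrum of the separate noise estimate
    n_noise_frames = len(noise_est) // frame
    noise_mag = np.mean(
        [np.abs(np.fft.rfft(noise_est[i*frame:(i+1)*frame]))
         for i in range(n_noise_frames)], axis=0)
    for i in range(len(noisy) // frame):
        spec = np.fft.rfft(noisy[i*frame:(i+1)*frame])
        # Subtract the noise magnitude, floor at zero, keep the noisy phase
        mag = np.maximum(np.abs(spec) - noise_mag, 0.0)
        out[i*frame:(i+1)*frame] = np.fft.irfft(
            mag * np.exp(1j * np.angle(spec)), n=frame)
    return out

# Demo: a sine tone buried in white noise
rng = np.random.default_rng(1)
t = np.arange(8192) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)
noisy = clean + rng.normal(0, 0.5, size=t.size)
enhanced = spectral_subtract(noisy, rng.normal(0, 0.5, size=t.size))
err_before = np.mean((noisy - clean) ** 2)
err_after = np.mean((enhanced - clean) ** 2)
print(err_after < err_before)
```

Practical implementations add overlap-add windowing and an over-subtraction factor with a spectral floor to limit musical noise; those refinements are omitted here.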
The Rise of iWar: Identity, Information, and the Individualization of Modern Warfare
During a decade of global counterterrorism operations and two extended counterinsurgency campaigns, the United States was confronted with a new kind of adversary. Without uniforms, flags, and formations, the task of identifying and targeting these combatants represented an unprecedented operational challenge for which Cold War-era doctrinal methods were largely unsuited. This monograph examines the doctrinal, technical, and bureaucratic innovations that evolved in response to these new operational challenges. It discusses the transition from a conventionally focused, Cold War-era targeting process to one optimized for combating networks and conducting identity-based targeting. It analyzes the policy decisions and strategic choices that were the catalysts of this change and concludes with an in-depth examination of emerging technologies that are likely to shape how this mode of warfare will be waged in the future.
Robust speaker recognition in presence of non-trivial environmental noise (toward greater biometric security)
The aim of this thesis is to investigate speaker recognition in the presence of environmental noise and to develop a robust speaker recognition method. Recently, speaker recognition has been the object of considerable research due to its wide use in various areas. Despite major developments in this field, there are still many limitations and challenges. Environmental noises and their variations are high on the list of challenges, since it is impossible to provide a noise-free environment. A novel approach is proposed to address the issue of performance degradation in environmental noise. This approach is based on estimating the signal-to-noise ratio (SNR) and detecting the ambient noise in the recognition signal, then re-training the reference model for the claimed speaker to generate a new noise-adapted model that decreases the noise mismatch with the recognition utterances. This approach is termed "training on the fly" for robustness of speaker recognition under noisy environments. To detect the noise in the recognition signal, two techniques are proposed. The first generates an emulated noise from the estimated power spectrum of the original noise, using a 1/3 octave band filter bank and a white noise signal; this emulated noise becomes close enough to the original noise contained in the input (recognition) signal. The second extracts the noise from the input signal using a speech enhancement algorithm with spectral subtraction. The training-on-the-fly approach (using both techniques) has been examined using two feature approaches and two different kinds of artificial clean and noisy speech databases collected in different environments; furthermore, the speech samples were text-independent. The approach yields a significant improvement in performance when compared with conventional speaker recognition based on clean reference models.
Moreover, training on the fly based on noise extraction showed the best results for all types of noisy data.
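The SNR-estimation step that drives the retraining described above can be illustrated with a small sketch. The separation into a signal segment and a noise segment is assumed to be given here; real systems typically estimate the noise spectrum from non-speech frames of the same recording.

```python
import numpy as np

def snr_db(signal, noise):
    """SNR in dB given a signal estimate and a noise estimate."""
    return 10 * np.log10(np.mean(signal ** 2) / np.mean(noise ** 2))

rng = np.random.default_rng(2)
t = np.arange(16000) / 16000.0
tone = np.sin(2 * np.pi * 200 * t)       # stand-in for a clean speech segment
noise = rng.normal(0, 0.1, size=t.size)  # stand-in for ambient noise

# Powers here are ~0.5 (tone) and ~0.01 (noise), so roughly 17 dB
snr = snr_db(tone, noise)
print(f"estimated SNR: {snr:.1f} dB")
```

Once the SNR and noise estimates are in hand, the emulated or extracted noise can be mixed into the clean enrollment data at the matching level to retrain the claimed speaker's model.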
Sources of the communicative body
This study provides evidence for the warranted assertion that classroom practices will be enhanced by awareness of how non-linguistic modalities of the face, hands and vocal intonation contribute to cohesive and cooperative strategies within social groups. Both the history and observations of non-linguistic communication presented by this study suggest that visual, kinesic, and spatial comprehension create and influence social fields and common spaces, yet our language for these fields and spaces is impoverished. This knowledge has been submerged and marginalized through history. At the same time, through time, despite this submersion and marginalization, interrelational and intrarelational synchrony and dis-synchrony, centered on and by the communicative body, occur in social settings in ways that can be considered from both historical and observational perspectives. Building on recent theory by Damasio, Donald, Noddings, Grumet, Terdiman, and Nussbaum, the historical concepts and classroom observations presented here evidence that social values such as caring, loyalty, and generosity are sometimes understood, implicitly and explicitly, through the exchange, perception, and interpretation of non-linguistic signs. By understanding how the face and hands and rhythm and pitch of the voice create cohesive and cooperative social values in learning spaces - separate from racial, ethnic, and intellectual differences - this investigation recovers a submerged knowledge in order to offer a new logic for understanding social process. In turn, this new logic hopes to further transformational practice in the learning and teaching arts and sciences.