Optophone design: optical-to-auditory vision substitution for the blind
An optophone is a device that turns light into sound for the benefit of blind people. The present project is intended to produce a general-purpose optophone to be worn on the head about the house and in the street, to give the wearer a detailed description in sound of the scene he is facing. The device will therefore consist of an electronic camera, some signal-processing electronics, earphones, and a battery. The two major problems are the derivation of (a) the most suitable mapping from images to sounds, and (b) an algorithm to perform the mapping in real time on existing electronic components. This thesis concerns problem (a). Chapter 2 goes into the general scene-to-sound mapping problem in some detail and presents the work of earlier investigators. Chapter 3 discusses the design of tests to evaluate the performance of candidate mappings. A theoretical performance test (TPT) is derived. Chapter 4 applies the TPT to the most obvious mapping, the cartesian piano transform. Chapter 5 applies the TPT to a mapping based on the cosine transform. Chapter 6 attempts to derive a mapping by principal component analysis, using the inaccuracies of human sight and hearing and the statistical properties of real scenes and sounds. Chapter 7 presents a complete scheme, implemented in software, for representing digitised colour scenes by audible digitised stereo sound. Chapter 8 tries to decide how many numbers are required to specify a steady spectrum with no noticeable degradation. Chapter 9 looks at a scheme designed to produce more natural-sounding sounds related to more meaningful portions of the scene. This scheme maps windows in the scene to steady spectral patterns of short duration, the location of the window being conveyed by simulated free-field listening. Chapter 10 gives detailed recommendations as to further work.
Representation of Instantaneous and Short-Term Loudness in the Human Cortex.
Acoustic signals pass through numerous transforms in the auditory system before perceptual attributes such as loudness and pitch are derived. However, relatively little is known as to exactly when these transformations happen, and where, cortically or sub-cortically, they occur. In an effort to examine this, we investigated the latencies and locations of cortical entrainment to two transforms predicted by a model of loudness perception for time-varying sounds: the transforms were instantaneous loudness and short-term loudness, where the latter is hypothesized to be derived from the former and therefore should occur later in time. Entrainment of cortical activity was estimated from electro- and magneto-encephalographic (EMEG) activity, recorded while healthy subjects listened to continuous speech. There was entrainment to instantaneous loudness bilaterally at 45, 100, and 165 ms, in Heschl's gyrus, dorsal lateral sulcus, and Heschl's gyrus, respectively. Entrainment to short-term loudness was found in both the dorsal lateral sulcus and superior temporal sulcus at 275 ms. These results suggest that short-term loudness is derived from instantaneous loudness, and that this derivation occurs after processing in sub-cortical structures. This work was supported by an ERC Advanced Grant (230570, ‘Neurolex’) to WMW, and by MRC Cognition and Brain Sciences Unit (CBU) funding to WMW (U.1055.04.002.00001.01). Computing resources were provided by the MRC-CBU. This is the final version of the article. It first appeared from Frontiers via http://dx.doi.org/10.3389/fnins.2016.0018
Representation of statistical sound properties in human auditory cortex
The work carried out in this doctoral thesis investigated the representation of
statistical sound properties in human auditory cortex. It addressed four key aspects in
auditory neuroscience: the representation of different analysis time windows in
auditory cortex; mechanisms for the analysis and segregation of auditory objects;
information-theoretic constraints on pitch sequence processing; and the analysis of
local and global pitch patterns. The majority of the studies employed a parametric
design in which the statistical properties of a single acoustic parameter were altered
along a continuum, while keeping other sound properties fixed.
The thesis is divided into four parts. Part I (Chapter 1) examines principles of
anatomical and functional organisation that constrain the problems addressed. Part II
(Chapter 2) introduces approaches to digital stimulus design, principles of functional
magnetic resonance imaging (fMRI), and the analysis of fMRI data. Part III (Chapters
3-6) reports five experimental studies. Study 1 controlled the spectrotemporal
correlation in complex acoustic spectra and showed that activity in auditory
association cortex increases as a function of spectrotemporal correlation. Study 2
demonstrated a functional hierarchy of the representation of auditory object
boundaries and object salience. Studies 3 and 4 investigated cortical mechanisms for
encoding entropy in pitch sequences and showed that the planum temporale acts as a
computational hub, requiring more computational resources for sequences with high
entropy than for those with high redundancy. Study 5 provided evidence for a
hierarchical organisation of local and global pitch pattern processing in neurologically
normal participants. Finally, Part IV (Chapter 7) concludes with a general discussion
of the results and future perspectives
Comparative linear accuracy and reliability of cone beam CT derived 2-dimensional and 3-dimensional images constructed using an orthodontic volumetric rendering program.
The purpose of this project was to compare the accuracy and reliability of linear measurements made on 2D projections and 3D reconstructions using Dolphin 3D software (Chatsworth, CA) as compared to direct measurements made on human skulls. The linear dimensions between 6 bilateral and 8 mid-sagittal anatomical landmarks on 23 dentate dry human skulls were measured three times by multiple observers using a digital caliper to provide twenty orthodontic linear measurements. The skulls were stabilized and imaged via PSP digital cephalometry as well as CBCT. The PSP cephalograms were imported into Dolphin (Chatsworth, CA, USA) and the 3D volumetric data set was imported into Dolphin 3D (Version 2.3, Chatsworth, CA, USA). Using Dolphin 3D, planar cephalograms as well as 3D volumetric surface reconstructions (3D CBCT) were generated. The linear measurements between landmarks for each of the three modalities were then computed by a single observer three times. For 2D measurements, a one-way ANOVA for each measurement dimension was calculated as well as a post hoc Scheffe multiple comparison test with the anatomic distance as the control group. 3D measurements were compared to anatomic truth using Student's t test (p ≤ 0.05). The intraclass correlation coefficient (ICC) and absolute linear and percentage error were determined as indices of intraobserver reliability. Our results show that, for 2D mid-sagittal measurements, simulated LC images are accurate and similar to those from PSP images (except for Ba-Na), and that for bilateral measurements simulated LC measurements were similar to PSP but less accurate, underestimating dimensions by between 4.7% and 17%. For 3D volumetric renderings, two-thirds of CBCT measurements are statistically different from actual measurements; however, this is possibly not clinically relevant.
Computational Tonality Estimation: Signal Processing and Hidden Markov Models
PhD. This thesis investigates computational musical tonality estimation from an audio signal. We
present a hidden Markov model (HMM) in which relationships between chords and keys are
expressed as probabilities of emitting observable chords from a hidden key sequence. The model
is tested first using symbolic chord annotations as observations, and gives excellent global key
recognition rates on a set of Beatles songs.
The initial model is extended for audio input by using an existing chord recognition algorithm,
which allows it to be tested on a much larger database. We show that a simple model of the
upper partials in the signal improves percentage scores. We also present a variant of the HMM
which has a continuous observation probability density, but show that the discrete version gives
better performance.
Then follows a detailed analysis of the effects on key estimation and computation time of
changing the low level signal processing parameters. We find that much of the high frequency
information can be omitted without loss of accuracy, and significant computational savings can
be made by applying a threshold to the transform kernels. Results show that there is no single
ideal set of parameters for all music, but that tuning the parameters can make a difference to
accuracy.
We discuss methods of evaluating more complex tonal changes than a single global key, and
compare a metric that measures similarity to a ground truth to metrics that are rooted in music
retrieval. We show that the two measures give different results, and so recommend that the choice
of evaluation metric is determined by the intended application.
Finally we draw together our conclusions and use them to suggest areas for continuation of this
research, in the areas of tonality model development, feature extraction, evaluation methodology,
and applications of computational tonality estimation. Engineering and Physical
Sciences Research Council (EPSRC)
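The abstract above describes an HMM in which a hidden key sequence emits observable chords. The following is a minimal sketch of that idea, not the thesis's implementation: the two keys, the four chord symbols, and every probability value are illustrative assumptions, and Viterbi decoding is used here as a standard way to recover the most likely hidden sequence (the abstract does not say which decoding method was used).

```python
import math

# Hypothetical toy model: all names and numbers below are assumptions
# chosen for illustration, not values from the thesis.
KEYS = ["C major", "G major"]
START = {"C major": 0.5, "G major": 0.5}
TRANS = {
    "C major": {"C major": 0.9, "G major": 0.1},
    "G major": {"C major": 0.1, "G major": 0.9},
}
# Probability of each key emitting each observed chord.
EMIT = {
    "C major": {"C": 0.40, "F": 0.30, "G": 0.25, "D": 0.05},
    "G major": {"G": 0.40, "C": 0.30, "D": 0.25, "F": 0.05},
}


def estimate_keys(chords):
    """Return the most likely key for each chord (Viterbi, log domain)."""
    # Score of the best path ending in each key after the first chord.
    best = {k: math.log(START[k]) + math.log(EMIT[k][chords[0]]) for k in KEYS}
    backpointers = []
    for chord in chords[1:]:
        scores, pointers = {}, {}
        for key in KEYS:
            # Best previous key from which to arrive at `key`.
            prev = max(KEYS, key=lambda p: best[p] + math.log(TRANS[p][key]))
            scores[key] = (
                best[prev] + math.log(TRANS[prev][key]) + math.log(EMIT[key][chord])
            )
            pointers[key] = prev
        best = scores
        backpointers.append(pointers)
    # Trace back from the best final key.
    path = [max(KEYS, key=lambda k: best[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    print(estimate_keys(["C", "F", "C", "F", "D", "G", "D", "G"]))
```

With these toy numbers, a chord sequence that moves from C/F material to D/G material is decoded as a single key change, because the high self-transition probability discourages switching keys on isolated chords.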
Investigating the build-up of precedence effect using reflection masking
The auditory processing level involved in the build‐up of precedence [Freyman et al., J. Acoust. Soc. Am. 90, 874–884 (1991)] has been investigated here by employing reflection masked threshold (RMT) techniques. Given that RMT techniques are generally assumed to address lower levels of the auditory signal processing, such an approach represents a bottom‐up approach to the buildup of precedence. Three conditioner configurations measuring a possible buildup of reflection suppression were compared to the baseline RMT for four reflection delays ranging from 2.5–15 ms. No buildup of reflection suppression was observed for any of the conditioner configurations. Buildup of template (decrease in RMT for two of the conditioners), on the other hand, was found to be delay dependent. For five of six listeners, with reflection delay=2.5 and 15 ms, RMT decreased relative to the baseline. For 5‐ and 10‐ms delay, no change in threshold was observed. It is concluded that the low‐level auditory processing involved in RMT is not sufficient to realize a buildup of reflection suppression. This confirms suggestions that higher level processing is involved in PE buildup. The observed enhancement of reflection detection (RMT) may contribute to active suppression at higher processing levels
Comparative TMJ imaging accuracy using iCAT cone beam computerized tomography.
A blinded observational cross-sectional in vitro study was conducted to compare the diagnostic accuracy of observers viewing images made using cone beam computerized tomography (CBCT), panoramic radiography and linear tomography. The sample consisted of 37 TMJ articulations from 30 human skulls demonstrating either normal condylar morphology (n=19) or erosion of the lateral pole (n=18). The articulations were imaged using corrected angle linear tomography, normal and TMJ specific panoramic radiography and CBCT. Images and 10 re-reads were presented to 10 observers. Multiple CBCT multi-planar images were presented both statically and interactively. Intra-observer reliability was determined by weighted kappa (Kw) and diagnostic accuracy by the fitted area under the ROC curve (Az). Means were compared using ANOVA (p ≤ .05). Our results show CBCT images provide superior reliability and greater accuracy than corrected angle linear tomography and TMJ panoramic projections in the detection of condylar cortical erosion.