270 research outputs found

    Master of Science

    Get PDF
    Advances in silicon photonics are enabling hybrid integration of optoelectronic circuits alongside current complementary metal-oxide-semiconductor (CMOS) technologies. To fully exploit this integration, it is important to explore the effects of thermal gradients on optoelectronic devices. The sensitivity of optical components to temperature variation raises design issues in silicon-on-insulator (SOI) optoelectronic technology. The thermo-electric effect becomes problematic with the integration of hybrid optoelectronic systems, where heat is generated by the electrical components; through the thermo-optic effect, the optical signals are in turn affected, and compensation is necessary. To improve the capability of optical SOI designs, optical-wave-simulation models need to be integrated with the characteristic thermal operating environment to ensure proper operation. Exploiting the potential for compensation through resynthesis requires system-level temperature characterization. Thermal characterization within the physical-design-automation flow for hybrid optoelectronic technology enables device resynthesis and validation at the system level; it would additionally make thermally aware placement and routing possible. A simplified abstraction helps the active design process within the contemporary computer-aided design (CAD) flow when designing optoelectronic features. This thesis investigates an abstraction model to characterize the effect of a temperature gradient on optoelectronic circuit operation. To make the approach scalable, reduced-order computations are desired that effectively model the effect of temperature on an optoelectronic layout; this is achieved using an electrical analogy to heat flow. Given an optoelectronic circuit, we abstract thermal flow as a thermal resistance network and compute the temperature distribution throughout the layout. We then show how this thermal distribution across the optoelectronic system layout can be integrated within optoelectronic device- and system-level analysis tools.
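The electrical analogy can be illustrated with a toy example: model each device as a node, each heat path as a thermal resistance, and solve the resulting conductance system G·T = P for the steady-state temperature rise. The three-node layout, resistance values, and injected power below are hypothetical illustrations, a minimal sketch rather than the thesis's actual tool:

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    A = [row[:] for row in A]
    b = list(b)
    for k in range(n):
        p = max(range(k, n), key=lambda i: abs(A[i][k]))
        A[k], A[p] = A[p], A[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            f = A[i][k] / A[k][k]
            for j in range(k, n):
                A[i][j] -= f * A[k][j]
            b[i] -= f * b[k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(A[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / A[i][i]
    return x

# Toy layout: three devices in a row. Thermal resistances are hypothetical:
# 10 K/W between neighbours, 100 K/W from each device to ambient.
g_n, g_a = 1 / 10.0, 1 / 100.0
G = [  # thermal conductance matrix (temperatures relative to ambient)
    [g_a + g_n, -g_n,          0.0],
    [-g_n,      g_a + 2 * g_n, -g_n],
    [0.0,       -g_n,          g_a + g_n],
]
P = [0.1, 0.0, 0.0]  # 100 mW dissipated at node 0 (e.g. a driver circuit)
temps = solve_linear(G, P)  # steady-state temperature rise (K) per device
```

Summing the rows of G·T = P confirms the energy balance: the neighbour terms cancel, so the total heat flowing to ambient equals the injected power, and the temperature rise falls off monotonically away from the heat source.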

    A computational framework for sound segregation in music signals

    Get PDF
    Doctoral thesis in Electrical and Computer Engineering. Faculdade de Engenharia, Universidade do Porto. 200

    Glottal-synchronous speech processing

    No full text
    Glottal-synchronous speech processing is a field of speech science in which the pseudoperiodicity of voiced speech is exploited. Traditionally, speech processing involves segmenting and processing short speech frames of predefined length; this may fail to exploit the inherent periodic structure of voiced speech, which glottal-synchronous speech frames have the potential to harness. Glottal-synchronous frames are often derived from the glottal closure instants (GCIs) and glottal opening instants (GOIs). The SIGMA algorithm was developed for the detection of GCIs and GOIs from the electroglottograph signal with a measured accuracy of up to 99.59%. For GCI and GOI detection from speech signals, the YAGA algorithm provides a measured accuracy of up to 99.84%. Multichannel speech-based approaches are shown to be more robust to reverberation than single-channel algorithms. The GCIs are applied to real-world applications including speech dereverberation, where SNR is improved by up to 5 dB, and to prosodic manipulation, where the importance of voicing detection in glottal-synchronous algorithms is demonstrated by subjective testing. The GCIs are further exploited in a new area of data-driven speech modelling, providing new insights into speech production and a set of tools to aid deployment into real-world applications. The technique is shown to be applicable in areas of speech coding, identification and artificial bandwidth extension of telephone speech.
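As a rough illustration of the framing idea (not the SIGMA or YAGA algorithms themselves), glottal-synchronous processing can be sketched by cutting two-pitch-period frames anchored at successive GCIs. The signal, the idealised GCI positions, and the two-period frame convention below are all assumptions for illustration:

```python
import math

def glottal_synchronous_frames(signal, gcis):
    """Cut two-pitch-period frames, each anchored at a GCI and ending
    two GCIs later, so frame boundaries track the glottal cycle."""
    frames = []
    for i in range(len(gcis) - 2):
        frames.append(signal[gcis[i]:gcis[i + 2]])
    return frames

# Synthetic voiced-like signal: period of 80 samples, five periods.
period = 80
signal = [math.sin(2 * math.pi * n / period) for n in range(5 * period)]
gcis = list(range(0, 5 * period, period))  # idealised GCIs, one per cycle
frames = glottal_synchronous_frames(signal, gcis)
```

Unlike fixed-length framing, every frame here spans exactly two glottal cycles regardless of the local pitch, which is what glottal-synchronous algorithms exploit.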

    Speech Enhancement Using Speech Synthesis Techniques

    Full text link
    Traditional speech enhancement systems reduce noise by modifying the noisy signal to make it more like a clean signal, which suffers from two problems: under-suppression of noise and over-suppression of speech. These problems create distortions in enhanced speech and hurt the quality of the enhanced signal. We propose to utilize speech synthesis techniques for a higher-quality speech enhancement system. Synthesizing clean speech based on the noisy signal could produce outputs that are both noise-free and high quality. We first show that we can replace the noisy speech with its clean resynthesis from a previously recorded clean speech dictionary from the same speaker (concatenative resynthesis). Next, we show that using a speech synthesizer (vocoder) we can create a clean resynthesis of the noisy speech for more than one speaker. We term this parametric resynthesis (PR). PR can generate better prosody from noisy speech than a TTS system that uses textual information only. Additionally, we can use the high-quality speech generation capability of neural vocoders for better-quality speech enhancement. When trained on data from enough speakers, these vocoders can generate speech from unseen speakers, both male and female, with similar quality as seen speakers in training. Finally, we show that using neural vocoders we can achieve better objective signal and overall quality than the state-of-the-art speech enhancement systems and better subjective quality than an oracle mask-based system.
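The concatenative resynthesis idea can be sketched as a nearest-neighbour lookup into a clean-speech dictionary: each noisy frame is replaced by the closest clean frame, so the output contains only clean material. The frame features, dictionary entries, and squared-Euclidean distance below are hypothetical stand-ins for the system's actual learned representation:

```python
def concatenative_resynthesis(noisy_frames, clean_dict):
    """Replace each noisy frame with its nearest clean dictionary frame."""
    def dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return [min(clean_dict, key=lambda cf: dist(nf, cf))
            for nf in noisy_frames]

# Hypothetical 3-dimensional feature frames from a clean dictionary.
clean_dict = [(0.0, 1.0, 0.0), (1.0, 0.0, 1.0), (0.5, 0.5, 0.5)]
# "Noisy" observations: clean frames plus a small perturbation.
noisy = [(0.1, 0.9, 0.05), (0.9, 0.1, 1.1)]
resynth = concatenative_resynthesis(noisy, clean_dict)
```

In the actual system the matching between noisy input and clean dictionary is learned from data; this sketch only shows the substitution principle that makes the output noise-free by construction.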

    Harmonic Sinusoid Modeling of Tonal Music Events

    Get PDF
    This thesis presents the theory, implementation and applications of harmonic sinusoid modeling of pitched audio events. Harmonic sinusoid modeling is a parametric model that expresses an audio signal, or part of an audio signal, as a linear combination of concurrent slowly varying sinusoids, grouped together under harmonic frequency constraints. It extends sinusoid modeling with additional frequency constraints so that it is capable of directly modeling tonal sounds. This enables applications such as object-oriented audio manipulation, polyphonic transcription, and instrument/singer recognition with background music. The modeling system consists of an analyzer and a synthesizer. The analyzer extracts harmonic sinusoidal parameters from an audio waveform, while the synthesizer rebuilds an audio waveform from these parameters. Parameter estimation is based on a detecting-grouping-tracking framework: the detecting stage finds and estimates sinusoid atoms; the grouping stage collects concurrent atoms into harmonic groups; the tracking stage links the atom groups at different times to form continuous harmonic sinusoid tracks. Compared to the standard sinusoid model, the harmonic model focuses on harmonic groups of atoms rather than on isolated atoms, and therefore naturally represents tonal sounds. The synthesizer rebuilds the audio signal by interpolating the measured parameters along the found tracks. We propose the first application of the harmonic sinusoid model in digital audio editors. For audio editing, with tonal events directly represented by a parametric model, we can implement standard audio editing functionality on tonal events embedded in an audio signal, or invent new sound effects based on the model parameters themselves.
Possibilities for other applications are suggested at the end of this thesis. Financial support: European Commission, the Higher Education Funding Council for England, and Queen Mary, University of London.
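The grouping stage can be sketched as collecting detected atoms that fall near integer multiples of a candidate fundamental. The relative tolerance and the atom frequencies below are illustrative assumptions, not the thesis's actual grouping criterion:

```python
def group_harmonics(freqs, f0, tol=0.03):
    """Collect atoms whose frequency lies within a relative tolerance
    of some integer harmonic of the candidate fundamental f0."""
    group = []
    for f in freqs:
        h = round(f / f0)  # nearest harmonic number
        if h >= 1 and abs(f - h * f0) <= tol * h * f0:
            group.append((h, f))
    return group

# Detected sinusoid atoms (Hz): four near-harmonics of 220 Hz, one outlier.
atoms = [220.0, 441.0, 659.0, 880.0, 1000.0]
group = group_harmonics(atoms, f0=220.0)
```

The slightly detuned atoms at 441 Hz and 659 Hz are still accepted as harmonics 2 and 3, while the 1000 Hz atom falls between harmonics 4 and 5 and is left out, which is exactly what separates tonal-event atoms from unrelated ones.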

    Predictive Articulatory speech synthesis Utilizing Lexical Embeddings (PAULE)

    Get PDF
    The Predictive Articulatory speech synthesis Utilizing Lexical Embeddings (PAULE) model is a new model for controlling the articulatory speech synthesizer VocalTractLab (VTL) [15]. PAULE can synthesize German words. Word synthesis can be started either from a semantic vector that encodes the word meaning, together with the desired duration of the synthesis, or as a resynthesis of an audio file. The audio file may contain recordings of arbitrary speakers, while the resynthesis is always produced with the VTL's default speaker. The synthesis quality varies with the word meaning and the audio file. What is new about PAULE is its predictive approach: from the planned articulation it predicts the corresponding perceptual acoustics and derives the word meaning from them. Both the acoustics and the word meaning are implemented as metric vector spaces, so an error with respect to a desired target acoustics and target meaning can be computed and minimized. The minimized error is not the actual error that arises from synthesis with the VTL, but the error generated from the predictions of a predictive model. Although it is not the actual error, it can be used to improve the actual articulation. To keep the predictive model consistent with the actual acoustics, PAULE listens to itself. A central one-to-many problem in speech synthesis is that one acoustic output can be produced by many different articulations. PAULE resolves this one-to-many problem through prediction-error minimization, together with the constraint that the articulation be executed as stationary as possible and with as constant a force as possible.
PAULE works without any symbolic representation in the acoustics (phonemes) or in the articulation (motor gestures or targets). PAULE thus shows that spoken words can be modelled without a symbolic level of description; spoken language may therefore rest on a fundamentally different level of processing than written language. PAULE integrates experience incrementally, so it does not find the globally best articulation but locally good articulations. Internally, PAULE relies on artificial neural networks and the associated gradients, which are used for error correction. PAULE can neither synthesize whole sentences nor take somatosensory feedback into account; preliminary work exists on both and is to be integrated into future versions.
The Predictive Articulatory speech synthesis Utilizing Lexical Embeddings (PAULE) model is a new control model for the VocalTractLab (VTL) [15] speech synthesizer, a simulator of the human speech system. It is capable of synthesizing single words in the German language. The speech synthesis can be based on a target semantic vector or on target acoustics, i.e., a recorded word token. VTL is controlled by 30 parameters. These parameters have to be estimated for each time point during the production of a word, which is roughly every 2.5 milliseconds. The time series of these 30 control parameters (cps) of the VTL are the control parameter trajectories (cp-trajectories). The high dimensionality of the cp-trajectories, in combination with non-linear interactions, leads to a many-to-one mapping problem, where many sets of cp-trajectories produce highly similar synthesized audio. PAULE solves this many-to-one mapping problem by anticipating the effects of cp-trajectories and minimizing a semantic and acoustic error between this anticipation and a targeted meaning and acoustics.
The quality of the anticipation is improved by an outer loop, where PAULE listens to itself. PAULE has three central design features that distinguish it from other control models. First, PAULE does not use any symbolic units: neither motor primitives, articulatory targets, nor gestural scores on the movement side, and no phone or syllable representation on the acoustic side. Second, PAULE is a learning model that accumulates experience with articulated words. As a consequence, PAULE will not find a global optimum for the inverse kinematic optimization task it has to solve; instead, it finds a local optimum conditioned on its past experience. Third, PAULE uses gradient-based internal prediction errors of a predictive forward model to plan cp-trajectories for a given semantic or acoustic target. Thus, PAULE is an error-driven model that takes its previous experience into account. Pilot-study results indicate that PAULE is able to minimize a semantic and an acoustic error in the resynthesized audio. This allows PAULE to find cp-trajectories that are classified as the correct word by a classification model with an accuracy of 60 %, close to the 63 % accuracy for human recordings. Furthermore, PAULE seems to model vowel-to-vowel anticipatory coarticulation in terms of formant shifts correctly and can be compared to human electromagnetic articulography (EMA) recordings in a straightforward way. It is also possible to condition PAULE on already executed past cp-trajectories and to continue the cp-trajectories smoothly from the current state. As a side effect of developing PAULE, large amounts of training data for the VTL can be created through an automated segment-based approach. Next steps in the development of PAULE include adding a somatosensory feedback channel, extending PAULE from single words to the articulation of small utterances, and adding a thorough evaluation.
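The planning loop can be sketched with a toy differentiable forward model standing in for PAULE's learned predictive network: control parameters are adjusted by gradient descent on the *predicted* acoustic error, never on the actual synthesizer output. The forward model, learning rate, and target values below are all hypothetical:

```python
def forward_model(cp):
    # Hypothetical stand-in for the learned predictive model: maps
    # control parameters to predicted acoustic features (here, doubling).
    return [2.0 * c for c in cp]

def plan(cp, target, lr=0.05, steps=200):
    """Plan cp values by gradient descent on the predicted error."""
    for _ in range(steps):
        pred = forward_model(cp)
        # gradient of 0.5 * sum((pred - target)^2) w.r.t. cp;
        # d pred_i / d cp_i = 2 for this toy forward model
        grad = [2.0 * (p - t) for p, t in zip(pred, target)]
        cp = [c - lr * g for c, g in zip(cp, grad)]
    return cp

target_acoustics = [1.0, -0.5, 0.25]        # hypothetical acoustic target
planned = plan([0.0, 0.0, 0.0], target_acoustics)
```

The planned controls are those whose *predicted* acoustics match the target; in PAULE the "listening to itself" outer loop then keeps this internal prediction aligned with what the VTL actually produces.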

    Synthesis and Verification of Digital Circuits using Functional Simulation and Boolean Satisfiability.

    Full text link
    The semiconductor industry has long relied on the steady trend of transistor scaling, that is, the shrinking of the dimensions of silicon transistor devices, as a way to improve the cost and performance of electronic devices. However, several design challenges have emerged as transistors have become smaller. For instance, wires are not scaling as fast as transistors, and delay associated with wires is becoming more significant. Moreover, in the design flow for integrated circuits, accurate modeling of wire-related delay is available only toward the end of the design process, when the physical placement of logic units is known. Consequently, one can only know whether timing performance objectives are satisfied, i.e., if timing closure is achieved, after several design optimizations. Unless timing closure is achieved, time-consuming design-flow iterations are required. Given the challenges arising from increasingly complex designs, failing to quickly achieve timing closure threatens the ability of designers to produce high-performance chips that can match continually growing consumer demands. In this dissertation, we introduce powerful constraint-guided synthesis optimizations that take into account upcoming timing closure challenges and eliminate expensive design iterations. In particular, we use logic simulation to approximate the behavior of increasingly complex designs leveraging a recently proposed concept, called bit signatures, which allows us to represent a large fraction of a complex circuit's behavior in a compact data structure. By manipulating these signatures, we can efficiently discover a greater set of valid logic transformations than was previously possible and, as a result, enhance timing optimization. 
Based on the abstractions enabled through signatures, we propose a comprehensive suite of novel techniques: (1) a fast computation of circuit don't-cares that increases restructuring opportunities, (2) a verification methodology to prove the correctness of speculative optimizations that efficiently utilizes the computational power of modern multi-core systems, and (3) a physical synthesis strategy using signatures that re-implements sections of a critical path while minimizing perturbations to the existing placement. Our results indicate that logic simulation is effective in approximating the behavior of complex designs and enables a broader family of optimizations than previous synthesis approaches.
Ph.D. Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/61793/1/splaza_1.pd
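The signature idea can be sketched in a few lines: simulate the netlist on random input patterns packed bitwise into machine words, so each node's behaviour over many patterns becomes a single integer. Nodes with equal signatures are candidates for merging (a SAT check would still be needed to prove true equivalence). The gate-list encoding below is an illustrative assumption, not the dissertation's actual data structure:

```python
import random

def simulate_signatures(inputs, gates, n_vectors=64, seed=0):
    """Assign random input patterns (packed bitwise into ints) and
    propagate them through the netlist to get one signature per node."""
    rng = random.Random(seed)
    mask = (1 << n_vectors) - 1
    sig = {i: rng.getrandbits(n_vectors) for i in inputs}
    for name, op, a, b in gates:  # gates listed in topological order
        if op == "AND":
            sig[name] = sig[a] & sig[b]
        elif op == "OR":
            sig[name] = sig[a] | sig[b]
        elif op == "NOT":
            sig[name] = ~sig[a] & mask
    return sig

# f = a AND b versus g = NOT(NOT a OR NOT b): equivalent by De Morgan,
# so their signatures agree on every simulated input pattern.
gates = [
    ("f",  "AND", "a",  "b"),
    ("na", "NOT", "a",  None),
    ("nb", "NOT", "b",  None),
    ("o",  "OR",  "na", "nb"),
    ("g",  "NOT", "o",  None),
]
sigs = simulate_signatures(["a", "b"], gates)
```

One 64-bit word per node summarises 64 simulation vectors, which is what makes signature comparison cheap enough to screen a large circuit for restructuring candidates before invoking expensive verification.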

    Analysis of Total and Segmental Body Composition Relative to Fitness Performance Measures in Law Enforcement Recruits

    Get PDF
    International Journal of Exercise Science 15(4): 245-260, 2022. Law enforcement agencies often test the fitness performance and body composition of incoming recruits. This study investigated the relationships between whole-body and segmental body composition and fitness tests in law enforcement recruits. A retrospective analysis of 72 male and 11 female recruits was performed. Bioelectrical impedance analysis (BIA) variables were: lean mass (LM), upper-extremity lean mass (UELM), trunk LM, lower-extremity lean mass (LELM), fat mass (FM), upper-extremity fat mass (UEFM), trunk FM, and lower-extremity fat mass (LEFM). Fitness tests included: vertical jump (VJ), peak anaerobic power (PAPw), 75-yard pursuit run (75PR), push-ups, sit-ups, 2-kg medicine ball throw (MBT), and the multi-stage fitness test (MSFT). Partial correlations and ANCOVAs between quartiles assessed relationships between body composition and performance. Significant moderate-to-large relationships were found: LM, UELM, trunk LM, and LELM all related to PAPw (r = 0.500-0.558) and MBT (r = 0.494-0.526); FM, UEFM, trunk FM, and LEFM all related to VJ (r = -0.481 to -0.493), 75PR (r = 0.533-0.557), push-ups (r = -0.484 to -0.503), sit-ups (r = -0.435 to -0.449), and MSFT (r = -0.371 to -0.423). The highest LM quartile (4) had superior PAPw and MBT compared with LM quartiles 1-3. Higher FM quartiles performed more poorly in VJ, push-ups, and sit-ups. The 75PR quartiles 2, 3, and 4 were slower than quartile 1, and MSFT quartile 4 completed fewer shuttles. Total and segmental measures of LM and FM shared the same relationships: lower FM and higher LM related to better performance. Monitoring body composition could help guide training to optimize performance.
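The correlation analysis can be illustrated with a minimal Pearson-r computation. The lean-mass and power values below are hypothetical illustrative numbers, not the study's data, and a plain Pearson r is shown rather than the partial correlations the paper reports:

```python
def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return sxy / (sx * sy)

# Hypothetical recruits: lean mass (kg) against peak anaerobic power (W).
lean_mass = [55.0, 60.0, 65.0, 70.0, 75.0]
papw = [900.0, 1000.0, 1150.0, 1200.0, 1300.0]
r = pearson_r(lean_mass, papw)  # strongly positive, as in the study's LM-PAPw finding
```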