765 research outputs found
Industrial training programme at Kolej Universiti Teknologi Tun Hussein Onn: a study of the implementation of the assessment system
The study is entitled "Industrial Training Programme at Kolej Universiti Teknologi Tun Hussein Onn: A Study of the Implementation of the Assessment System". The sample consisted of 6 experts and 63 students involved in industrial training. Information was obtained using qualitative and quantitative methods. The data were analysed to examine the assessment methods in use and, in turn, to determine which parts of the assessment system need to be improved. Overall, most respondents were of the opinion that the existing assessment system needs to be improved and systematised in line with ISO 9000 : 2001. Based on the results obtained and on guidance from experts at the KUiTTHO Industrial Training Unit, an "Industrial Training Assessment Guidebook" was produced, containing concise guidance and appended forms that have been improved and modified. It is hoped that this product can be used in the future.
Models and analysis of vocal emissions for biomedical applications
This book of Proceedings collects the papers presented at the 3rd International Workshop on Models and Analysis of Vocal Emissions for Biomedical Applications, MAVEBA 2003, held 10-12 December 2003, Firenze, Italy. The workshop is organised every two years, and aims to stimulate contacts between specialists active in research and industrial developments, in the area of voice analysis for biomedical applications. The scope of the Workshop includes all aspects of voice modelling and analysis, ranging from fundamental research to all kinds of biomedical applications and related established and advanced technologies
SnakeGAN: A Universal Vocoder Leveraging DDSP Prior Knowledge and Periodic Inductive Bias
Generative adversarial network (GAN)-based neural vocoders have been widely
used in audio synthesis tasks due to their high generation quality, efficient
inference, and small computation footprint. However, it is still challenging to
train a universal vocoder which can generalize well to out-of-domain (OOD)
scenarios, such as unseen speaking styles, non-speech vocalization, singing,
and musical pieces. In this work, we propose SnakeGAN, a GAN-based universal
vocoder, which can synthesize high-fidelity audio in various OOD scenarios.
SnakeGAN takes a coarse-grained signal generated by a differentiable digital
signal processing (DDSP) model as prior knowledge, aiming at recovering
high-fidelity waveform from a Mel-spectrogram. We introduce periodic
nonlinearities through the Snake activation function and anti-aliased
representation into the generator, which further brings the desired inductive
bias for audio synthesis and significantly improves the extrapolation capacity
for universal vocoding in unseen scenarios. To validate the effectiveness of
our proposed method, we train SnakeGAN with only speech data and evaluate its
performance for various OOD distributions with both subjective and objective
metrics. Experimental results show that SnakeGAN significantly outperforms the
compared approaches and can generate high-fidelity audio samples including
unseen speakers with unseen styles, singing voices, instrumental pieces, and
nonverbal vocalization. Comment: Accepted by ICME 202
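The periodic nonlinearity mentioned in the abstract above can be illustrated with a small sketch. The abstract does not give a formula, so this assumes the commonly cited definition of the Snake activation, x + sin²(αx)/α. The function name `snake` and the fixed value of `alpha` are illustrative choices; how SnakeGAN sets or learns α is not stated in the abstract.

```python
import math


def snake(x: float, alpha: float = 1.0) -> float:
    """Snake activation under the assumed definition x + sin^2(alpha * x) / alpha."""
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    return x + math.sin(alpha * x) ** 2 / alpha


# Apply the activation to a few sample inputs (alpha chosen arbitrarily).
samples = [-2.0, -1.0, 0.0, 1.0, 2.0]
activated = [snake(x, alpha=2.0) for x in samples]
```

Under this definition the output is the identity plus a non-negative periodic term with period π/α, which is the kind of periodic inductive bias the abstract refers to.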
Pushing the boundaries of photoconductive sampling in solids
The advent of laser-based optical tools featuring few-cycle pulses with durations of less than a hundred femtoseconds in the late 1980s enabled scientists to initiate and observe the evolution of chemical reactions. This powerful approach combined the interactions of light and matter and unleashed an unprecedented metrology concept that tracks the interactions of atoms and molecules in their natural timescales. Electron wavepacket dynamics take place in the attosecond range, a thousand times faster than molecular dynamics. In optical terms, such processes typically last less than the half-cycle duration of optical fields. Consequently, the investigation of such electronic processes necessitates measurement techniques capable of resolving the oscillations of the electric field of light. The primary objective of this thesis is to develop and advance novel field characterisation techniques based on photoconductive sampling.
The first portion of this thesis addresses broadband field characterisation based on nonlinear photoconductive sampling. A theoretical analysis of current formation and localisation in solids is presented, prompting the fabrication of a heterostructured sample with the aim of enhancing the magnitude of the signal obtained from the measurement technique. A thorough proof-of-principle experiment is performed, whereby a significant enhancement in signal magnitude is established. As a consequence of signal improvement, the heterostructured sample reaches the desired stability regime earlier than its traditional bulk counterparts. Moreover, the performance of the heterostructured sample for field characterisation is compared to fused silica and benchmarked against the well-established technique of electro-optic sampling. These results pave the way towards field sampling in low pulse energy systems.
The following section details broadband field characterisation based on linear photoconductive sampling by employing tailored pulses from a waveform synthesiser. Visible-ultraviolet pulses are utilised to inject carriers in a common semiconductive material (gallium phosphide), enabling the complete characterisation of a mid-infrared test field. Furthermore, the technique is validated against electro-optic sampling. When compared to electro-optic sampling, the response function of linear photoconductive sampling is concerned with the intensity envelope of the gating field, relaxing the strict requisites on the temporal phase of the gate. The demonstrated results represent a significant achievement in extending field sampling techniques beyond 100 THz and towards the visible range.
Finally, a machine learning-based algorithm for denoising waveforms obtained from a laboratory setting is developed and implemented. The algorithm is based on a one-dimensional convolutional neural network, ideal for processing data presented on an evenly spaced grid. The model is compared with well-established methodologies, namely denoising via the fast Fourier transform and wavelet analysis, and exhibits excellent performance, extending the repertoire of tools typically used for combating noise.
The field characterisation methodologies presented in this thesis pave the way towards accessible and cost-effective field sampling techniques, enabling researchers to study field-induced electron dynamics in matter and usher in ultrafast optoelectronic signal processing towards the PHz range. In general, the field characterisation techniques presented occupy a small footprint, and the measurements take place in ambient air conditions, facilitating their integration in existing experimental infrastructures. With the aid of AI-accelerator chips, the machine learning tool developed in this thesis can be implemented during laboratory measurements as a concurrent denoising technique
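The abstract above names Fourier-based denoising as one of the established baselines against which the neural network is compared. A minimal sketch of that kind of baseline follows. It is not the thesis's method: the function names, the cutoff `keep_bins`, and the synthetic test waveform are all hypothetical, and a naive discrete Fourier transform is used only to keep the example dependency-free. In practice an FFT library would replace `dft` and `idft`.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse DFT, returning only the real part of each sample."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)).real / n
        for t in range(n)
    ]


def lowpass_denoise(signal, keep_bins):
    """Zero every frequency bin farther than keep_bins from DC, then invert."""
    n = len(signal)
    spectrum = dft(signal)
    for k in range(n):
        # min(k, n - k) folds the mirrored negative-frequency bins onto the positive ones.
        if min(k, n - k) > keep_bins:
            spectrum[k] = 0j
    return idft(spectrum)


# Hypothetical waveform: a slow oscillation plus a fast unwanted component.
n = 64
clean = [math.sin(2 * math.pi * 2 * t / n) for t in range(n)]
noisy = [c + 0.3 * math.sin(2 * math.pi * 20 * t / n) for t, c in enumerate(clean)]
denoised = lowpass_denoise(noisy, keep_bins=5)
```

A hard spectral cutoff like this only works when signal and noise occupy separate frequency bands, which is one reason to compare it against learned denoisers.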
DeepVOX: Discovering Features from Raw Audio for Speaker Recognition in Degraded Audio Signals
Automatic speaker recognition algorithms typically use pre-defined
filterbanks, such as Mel-Frequency and Gammatone filterbanks, for
characterizing speech audio. The design of these filterbanks is based on
domain knowledge and limited empirical observations. The resultant features,
therefore, may not generalize well to different types of audio degradation. In
this work, we propose a deep learning-based technique to induce the filterbank
design from vast amounts of speech audio. The purpose of such a filterbank is
to extract features robust to degradations in the input audio. To this effect,
a 1D convolutional neural network is designed to learn a time-domain filterbank
called DeepVOX directly from raw speech audio. Secondly, an adaptive triplet
mining technique is developed to efficiently mine the data samples best suited
to train the filterbank. Thirdly, a detailed ablation study of the DeepVOX
filterbanks reveals the presence of both vocal source and vocal tract
characteristics in the extracted features. Experimental results on VOXCeleb2,
NIST SRE 2008 and 2010, and Fisher speech datasets demonstrate the efficacy of
the DeepVOX features across a variety of audio degradations, multi-lingual
speech data, and varying-duration speech audio. The DeepVOX features also
improve the performance of existing speaker recognition algorithms, such as the
xVector-PLDA and the iVector-PLDA
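The abstract above mentions triplet mining for training but does not describe the adaptive scheme. As a rough illustration of the general idea only, the sketch below computes a standard margin-based triplet loss and picks the hardest negative for an anchor. The hardest-negative rule is a simple stand-in rather than the authors' adaptive technique, and the function names, the margin, and the toy embeddings are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length embedding vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Margin-based triplet loss: zero once the negative is margin farther than the positive."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def hardest_negative(anchor, negatives):
    """Stand-in mining rule: choose the negative closest to the anchor."""
    return min(negatives, key=lambda candidate: euclidean(anchor, candidate))


# Toy 2-D embeddings: one anchor, a same-speaker sample, two other-speaker samples.
anchor = (0.0, 0.0)
positive = (1.0, 0.0)
negatives = [(3.0, 0.0), (1.1, 0.0)]

hard = hardest_negative(anchor, negatives)
loss_hard = triplet_loss(anchor, positive, hard)
loss_easy = triplet_loss(anchor, positive, negatives[0])
```

The easy negative already satisfies the margin and contributes no loss, which illustrates why selecting informative triplets matters for training efficiency.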