6,071 research outputs found
Unsupervised Spoken Term Detection with Spoken Queries by Multi-level Acoustic Patterns with Varying Model Granularity
This paper presents a new approach for unsupervised Spoken Term Detection
with spoken queries using multiple sets of acoustic patterns automatically
discovered from the target corpus. The different pattern HMM
configurations (number of states per model, number of distinct models, number of
Gaussians per state) form a three-dimensional model granularity space. Different
sets of acoustic patterns automatically discovered on different points properly
distributed over this three-dimensional space are complementary to one another,
thus can jointly capture the characteristics of the spoken terms. By
representing the spoken content and spoken query as sequences of acoustic
patterns, a series of approaches for matching the pattern index sequences while
considering the signal variations are developed. In this way, not only the
on-line computation load can be reduced, but the signal distributions caused by
different speakers and acoustic conditions can be reasonably taken care of. The
results indicate that this approach significantly outperformed the unsupervised
feature-based DTW baseline by 16.16% in mean average precision on the TIMIT
corpus. Comment: Accepted by ICASSP 201
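The feature-based DTW baseline mentioned in the abstract above can be illustrated with a minimal sketch of classic dynamic time warping. This is not the paper's implementation: the function name `dtw_distance`, the Euclidean frame distance, and the toy one-dimensional "features" are all assumptions made for illustration.

```python
import math

def dtw_distance(query, segment):
    """Classic DTW alignment cost between two sequences of feature vectors.

    Hypothetical helper: frames are compared with Euclidean distance, and
    the whole query is aligned against the whole segment.
    """
    n, m = len(query), len(segment)
    inf = float("inf")
    # cost[i][j] = best cost of aligning query[:i] with segment[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(query[i - 1], segment[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j],      # repeat segment frame
                                 cost[i][j - 1],      # repeat query frame
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]

# Toy example: the segment is a time-stretched copy of the query.
query = [(0.0,), (1.0,), (2.0,)]
segment = [(0.0,), (1.0,), (1.0,), (2.0,)]
stretched_cost = dtw_distance(query, segment)
```

Because every frame pair is compared at search time, the cost grows with the product of the two lengths, which is the on-line computation load the abstract says is reduced by matching pattern index sequences instead.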
Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences
Speaking rate refers to the average number of phonemes within some unit time,
while the rhythmic patterns refer to duration distributions for realizations of
different phonemes within different phonetic structures. Both are key
components of prosody in speech, which is different for different speakers.
Models like cycle-consistent adversarial network (Cycle-GAN) and variational
auto-encoder (VAE) have been successfully applied to voice conversion tasks
without parallel data. However, due to the neural network architectures and
feature vectors chosen for these approaches, the length of the predicted
utterance has to be fixed to that of the input utterance, which limits the
flexibility in mimicking the speaking rates and rhythmic patterns for the
target speaker. On the other hand, sequence-to-sequence learning models have been used
to remove the above length constraint, but parallel training data are needed.
In this paper, we propose an approach utilizing a sequence-to-sequence model
trained with unsupervised Cycle-GAN to perform the transformation between the
phoneme posteriorgram sequences for different speakers. In this way, the length
constraint mentioned above is removed to offer rhythm-flexible voice conversion
without requiring parallel data. Preliminary evaluation on two datasets showed
very encouraging results. Comment: 8 pages, 6 figures, Submitted to SLT 201
Hafnium oxide-based ferroelectric thin-film transistor with a-InGaZnO channel fabricated at temperatures <= 350°C
HfO2-based ferroelectric materials integrated with oxide-based thin-film transistors have been considered as potential candidates for back-end-of-line compatible ferroelectric field-effect transistors, which can be vertically stacked on silicon CMOS circuits to realize high-density neural network applications. However, the formation of the ferroelectric orthorhombic phase in HfO2-based materials usually requires an annealing temperature of 400°C or higher. In this work, ferroelectric thin-film transistors (Fe-TFTs) were developed by monolithically integrating HfZrO2 (HZO) ferroelectric capacitors with amorphous indium-gallium-zinc oxide (a-IGZO) TFTs at a maximum processing temperature of 350°C on a glass substrate. A butterfly-shaped C-V curve was clearly observed in the low-temperature annealed metal-HZO-metal capacitor, indicating the formation of ferroelectricity in the HZO layer, as shown in Fig. 1. The positive and negative coercive voltages were 3 V and -2.4 V, respectively. The dielectric constant was 20.65. The field-effect mobility, threshold voltage, subthreshold swing and on/off current ratio of the a-IGZO TFT extracted from the transfer characteristics shown in Fig. 2 were 6.15 cm^2 V^-1 s^-1, 1.5 V, 0.1 V/dec and 4.3×10^7, respectively. Fig. 3 shows the transfer hysteresis curves of the low-temperature Fe-TFTs in a metal-ferroelectric-metal-insulator-semiconductor configuration. The Fe-TFTs exhibited large hysteresis memory windows of 2.8 V and 3.8 V when the area ratios between ferroelectric capacitors and gate insulators (A_FE / A_DE) were 1/8 and 1/12, respectively. The results show great potential for back-end-of-line compatible memory applications.
The Potential Economic Impact of Avian Flu Pandemic on Taiwan
This study analyzes the potential consequences of an outbreak of avian influenza (H5N1) on Taiwan's macro economy and individual industries. Both the Input-Output (IO) Analysis Model and the Computable General Equilibrium (CGE) Model are used to simulate the possible damage brought by lower domestic consumption, exports, and labor supply. The simulation results indicate that if the disease is confined within the poultry sector, the impact on real GDP is around -0.1% to -0.4%. Once it becomes a human-to-human pandemic, the IO analysis suggests that the potential impact on real GDP would be as much as -4.2% to -5.9%, while labor demand would decrease by 4.9% to 6.4%. In the CGE analysis, which allows for resource mobility and substitution through price adjustments, real GDP and labor demand would contract by 2.0% to 2.4% and 2.2% to 2.4%, respectively, and consumer prices would fall by 3%. At the sector level, the outbreak would not only damage the poultry sector and its upstream and downstream industries, but also affect service sectors including wholesale, retail, trade, air transportation, restaurants, and healthcare services. These results can be used to support public investment in animal disease control measures. Keywords: Avian Flu Pandemic, Input-output Model, Computable General Equilibrium Model, Livestock Production/Industries
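As a rough illustration of how an input-output simulation of a demand shock works, the sketch below solves the standard Leontief relation x = A x + d for a two-sector economy. Everything numeric here is a made-up assumption (the coefficient matrix `A`, the demand vectors, the 10% shock); none of it comes from the study, and the function name `leontief_output` is hypothetical. It covers only the IO side, not the CGE model with price adjustments.

```python
def leontief_output(A, demand, iterations=500):
    """Total output x satisfying x = A x + d, found by fixed-point iteration.

    A[i][j] is the (assumed) input from sector i needed per unit of output
    of sector j. The iteration sums the series d + A d + A^2 d + ..., which
    converges when the economy is productive (spectral radius of A below 1).
    """
    n = len(demand)
    x = list(demand)
    for _ in range(iterations):
        x = [demand[i] + sum(A[i][j] * x[j] for j in range(n))
             for i in range(n)]
    return x

# Hypothetical technical coefficients for two sectors.
A = [[0.2, 0.3],
     [0.1, 0.4]]

baseline = leontief_output(A, [100.0, 200.0])
# Assumed shock: final demand for sector 0 falls by 10%.
shocked = leontief_output(A, [90.0, 200.0])

# Percentage change in each sector's total output after the shock.
change_pct = [100.0 * (s - b) / b for s, b in zip(shocked, baseline)]
```

In this toy setup, output falls in both sectors even though only one sector's final demand was cut, which is the upstream and downstream propagation the abstract describes. Because the IO model holds prices and input proportions fixed, it leaves no room for substitution, consistent with the abstract reporting larger losses from the IO analysis than from the CGE analysis.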
Association of interleukin-18 gene polymorphism with asthma in Chinese patients
Like other allergic diseases, asthma results from multiple conditions. The onset and severity of asthma are mediated by both environmental and genetic factors. An important task in asthma research is to understand the genetic background and identify the genetic factors that contribute to the development and manifestations of asthma. Here, we investigated whether interleukin (IL)-18 single nucleotide polymorphisms (SNPs) are associated with asthma in Chinese patients. The IL-18 SNP was detected by polymerase chain reaction (PCR)-based restriction analysis in 201 patients with asthma and 60 normal controls. Significant differences were found in the genotype distribution of the IL-18 SNP between asthma patients and controls (P = 0.000003). The allelic frequency of the IL-18 gene also distinguished asthma patients from controls (P = 0.000066). The results revealed a significant difference between asthma patients and normal controls in the IL-18 SNP and a statistical correlation between the IL-18 polymorphism (105A/C) and asthma formation. We concluded that Chinese individuals who carry the C/C homozygote of the IL-18-105A/C gene polymorphism in coding regions may have a higher risk of developing asthma.
DAHiTrA: Damage Assessment Using a Novel Hierarchical Transformer Architecture
This paper presents DAHiTrA, a novel deep-learning model with hierarchical
transformers to classify building damages based on satellite images in the
aftermath of hurricanes. An automated building damage assessment provides
critical information for decision making and resource allocation for rapid
emergency response. Satellite imagery provides real-time, high-coverage
information and offers opportunities to inform large-scale post-disaster
building damage assessment. In addition, deep-learning methods have been shown to be
promising in classifying building damage. In this work, a novel
transformer-based network is proposed for assessing building damage. This
network leverages hierarchical spatial features of multiple resolutions and
captures temporal difference in the feature domain after applying a transformer
encoder on the spatial features. The proposed network achieves
state-of-the-art performance when tested on a large-scale disaster damage
dataset (xBD) for building localization and damage classification, as well as
on LEVIR-CD dataset for change detection tasks. In addition, we introduce a new
high-resolution satellite imagery dataset, Ida-BD (related to Hurricane Ida
in Louisiana in 2021), for domain adaptation to further evaluate
the capability of the model to be applied to newly damaged areas with scarce
data. The domain adaptation results indicate that the proposed model can be
adapted to a new event with only limited fine-tuning. Hence, the proposed model
advances the current state of the art through better performance and domain
adaptation. Also, Ida-BD provides a higher-resolution annotated dataset for
future studies in this field.
IoT-based Asset Management System for Healthcare-related Industries
The healthcare industry has been focusing efforts on optimizing inventory management procedures through the incorporation of Information and Communication Technology, in the form of tracking devices and data mining, to establish ideal inventory models. In this paper, a roadmap is developed towards a technological assessment of the Internet of Things (IoT) in the healthcare industry, 2010–2020. According to the roadmap, an IoT-based healthcare asset management system (IoT-HAMS) is proposed and developed based on Artificial Neural Network (ANN) and Fuzzy Logic (FL), incorporating IoT technologies for asset management to optimize the supply of resources.