Signatures of Valley Kondo Effect in Si/SiGe Quantum Dots
We report measurements consistent with the valley Kondo effect in Si/SiGe
quantum dots, evidenced by peaks in the conductance versus source-drain voltage
that show strong temperature dependence. The Kondo peaks show unusual behavior
in a magnetic field that we interpret as arising from the valley degree of
freedom. The interplay of valley and Zeeman splittings is suggested by the
presence of side peaks, revealing a zero-field valley splitting between 0.28
and 0.34 meV. A zero-bias conductance peak for non-zero magnetic field, a
phenomenon consistent with valley non-conservation in tunneling, is observed
in two samples. Comment: 16 pages, 7 figures
Actual Measurement and Analysis on Microbial Contamination in Central Air Conditioning System at a Venue in Dalian, China
Actual measurement and analysis were carried out on microbial contamination in the central air conditioning system at a venue in Dalian. By studying the microbial contamination in two air handling units with different thermal environments, we found that fungi and bacteria were commonly present on the filter surfaces, and the cell density distribution followed the trend center > against the wall > corner. The microbial pollution both bound in the dust and floating in the air was extremely serious. Comparing the two units, the fungus concentration was higher in Unit A than in Unit B, while the bacteria concentration was lower in Unit A than in Unit B. Candida spp. accounted for 80% of the sample in Unit A, while in Unit B Cladosporium spp. occupied up to 50%. Finally, based on the measurement and analysis results, methods for controlling microbial contamination in HVAC systems are proposed
Towards Selection of Text-to-speech Data to Augment ASR Training
This paper presents a method for selecting appropriate synthetic speech
samples from a given large text-to-speech (TTS) dataset as supplementary
training data for an automatic speech recognition (ASR) model. We trained a
neural network, which can be optimised using cross-entropy loss or ArcFace
loss, to measure the similarity of synthetic data to real speech. We found
that incorporating synthetic samples with considerable dissimilarity to real
speech, owing in part to lexical differences, into ASR training is crucial for
boosting recognition performance. Experimental results on the Librispeech test
sets indicate that, in order to maintain the same speech recognition accuracy
as when using all TTS data, our proposed solution can reduce the size of the
TTS data down below its , which is superior to several baseline methods
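The selection step described above can be sketched minimally. The scoring network itself is omitted here; `select_tts_samples` and its budget parameters are hypothetical illustrations, assuming each synthetic sample already carries a similarity-to-real score from a classifier trained with cross-entropy or ArcFace loss. Since the abstract reports that dissimilar samples also help ASR training, the sketch reserves part of the budget for low-scoring samples:

```python
import numpy as np

def select_tts_samples(scores, keep_fraction=0.5, dissimilar_share=0.3):
    """Select a subset of synthetic samples by a similarity-to-real score.

    scores: per-sample similarity scores (higher = more real-like).
    Keeps mostly high-scoring samples, but reserves a share of the
    budget for low-scoring (dissimilar) samples, since those are
    reported to help ASR training.  Returns sorted sample indices.
    """
    scores = np.asarray(scores)
    n_keep = int(len(scores) * keep_fraction)
    n_dissim = int(n_keep * dissimilar_share)
    order = np.argsort(scores)                 # ascending: most dissimilar first
    dissimilar = order[:n_dissim]              # lowest-scoring samples
    similar = order[::-1][:n_keep - n_dissim]  # highest-scoring samples
    return np.sort(np.concatenate([dissimilar, similar]))
```

The actual selection criterion in the paper may differ; this only illustrates budgeted selection from a score ranking.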
Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data
In this work, we extend the instruction-tuned Llama-2 model with end-to-end
general-purpose speech processing and reasoning abilities while maintaining the
wide range of LLM capabilities, without using any carefully curated paired
data. The proposed model can utilize audio prompts as a replacement for text
and sustain a conversation. Such a model also has extended cross-modal
capabilities such as being able to perform speech question answering, speech
translation, and audio summarization amongst many other closed and open-domain
tasks. This is unlike prior approaches in speech, in which LLMs are extended to
handle audio for a limited number of pre-designated tasks. Experiments show
that our end-to-end approach is on par with or outperforms a cascaded system
(speech recognizer + LLM) in terms of modeling the response to a prompt.
Furthermore, unlike a cascade, our approach shows the ability to interchange
text and audio modalities and utilize the prior context in a conversation to
provide better results
Identification of disulfidptosis-related subgroups and prognostic signatures in lung adenocarcinoma using machine learning and experimental validation
Background: Disulfidptosis is a newly identified form of cell death characterized by disulfide accumulation, which is independent of ATP depletion. However, the latent influence of disulfidptosis on the prognosis of lung adenocarcinoma (LUAD) patients and on tumor progression remains poorly understood. Methods: We conducted a multifaceted analysis of the transcriptional and genetic modifications in disulfidptosis regulators (DRs) specific to LUAD, followed by an evaluation of their expression configurations to define DR clusters. Harnessing the differentially expressed genes (DEGs) identified from these clusters, we formulated an optimal predictive model by combining 10 distinct machine learning algorithms across 101 unique combinations to compute the disulfidptosis score (DS). Patients were subsequently stratified into high- and low-DS cohorts based on the median DS value. We then performed an exhaustive comparison between these cohorts, focusing on somatic mutations, clinical attributes, tumor microenvironment, and treatment responsiveness. Finally, we empirically validated the biological implications of a critical gene, KYNU, through assays in LUAD cell lines. Results: We identified two DR clusters that showed substantial differences in overall survival (OS) and tumor microenvironment. We selected the "Least Absolute Shrinkage and Selection Operator (LASSO) + Random Survival Forest (RSF)" algorithm to develop the DS based on the average C-index across different cohorts. Our model effectively stratified LUAD patients into high- and low-DS subgroups, with the latter demonstrating superior OS, a reduced mutational landscape, enhanced immune status, and increased sensitivity to immunotherapy. Notably, the predictive accuracy of the DS outperformed published LUAD signatures and clinical features.
Finally, we validated DS expression using clinical samples and found that inhibiting KYNU suppressed LUAD cell proliferation, invasiveness, and migration in vitro. Conclusions: The DR-based scoring system that we developed enabled accurate prognostic stratification of LUAD patients and provides important insights into the molecular mechanisms of and treatment strategies for LUAD
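The median-split cohort assignment described in the Methods can be sketched in a few lines. The `stratify_by_score` helper is illustrative only, not the paper's code; it assumes a per-patient disulfidptosis score has already been computed by the trained model:

```python
import numpy as np

def stratify_by_score(scores):
    """Assign patients to 'high' or 'low' cohorts at the median score.

    scores: per-patient risk scores (e.g. a disulfidptosis score from
    a fitted survival model).  Scores strictly above the median are
    labeled 'high'; the rest 'low'.
    """
    scores = np.asarray(scores, dtype=float)
    median = np.median(scores)
    return np.where(scores > median, "high", "low")
```

In practice the split would feed into a survival comparison (e.g. a log-rank test) between the two cohorts.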
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model
Neural network pruning offers an effective method for compressing a
multilingual automatic speech recognition (ASR) model with minimal performance
loss. However, it entails several rounds of pruning and re-training that must
be run for each language. In this work, we propose the use of an adaptive
masking approach in two scenarios for pruning a multilingual ASR model
efficiently, resulting in either sparse monolingual models or a sparse
multilingual model (named Dynamic ASR Pathways). Our approach dynamically
adapts the sub-network, avoiding premature decisions about a fixed sub-network
structure. We show that our approach outperforms existing pruning methods when
targeting sparse monolingual models. Further, we illustrate that Dynamic ASR
Pathways jointly discovers and trains better sub-networks (pathways) of a
single multilingual model by adapting from different sub-network
initializations, thereby reducing the need for language-specific pruning
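The adaptive-masking idea can be sketched with simple magnitude-based pruning; the paper's actual criterion and schedule may differ, and `adaptive_magnitude_mask` is a hypothetical helper. The key point is that the mask is re-derived during training rather than fixed once, so pruned weights can re-enter the sub-network if their magnitude grows:

```python
import numpy as np

def adaptive_magnitude_mask(weights, sparsity):
    """Recompute a binary pruning mask from current weight magnitudes.

    weights: a weight matrix; sparsity: fraction of weights to zero out.
    Called periodically during training, so the surviving sub-network
    can change over time instead of being frozen after the first prune.
    """
    flat = np.abs(weights).ravel()
    k = int(len(flat) * sparsity)              # number of weights to prune
    if k == 0:
        return np.ones_like(weights)
    threshold = np.partition(flat, k - 1)[k - 1]
    return (np.abs(weights) > threshold).astype(weights.dtype)
```

Training would multiply weights by this mask in the forward pass while keeping the dense weights updated underneath.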
Prompting Large Language Models with Speech Recognition Abilities
Large language models have proven themselves highly flexible, able to solve a
wide range of generative tasks, such as abstractive summarization and
open-ended question answering. In this paper we extend the capabilities of
LLMs by directly attaching a small audio encoder, allowing them to perform
speech recognition. By directly prepending a sequence of audio embeddings to
the text token embeddings, the LLM can be converted to an automatic speech
recognition (ASR) system and be used in exactly the same manner as its textual
counterpart. Experiments on Multilingual LibriSpeech (MLS) show that
incorporating a conformer encoder into the open-sourced LLaMA-7B allows it to
outperform monolingual baselines by 18% and perform multilingual speech
recognition despite LLaMA being trained overwhelmingly on English text.
Furthermore, we perform ablation studies to investigate whether the LLM can be
completely frozen during training to maintain its original capabilities, as
well as the effects of scaling up the audio encoder and of increasing the
audio encoder striding to generate fewer embeddings. The results from these
studies show that multilingual ASR is possible even when the LLM is frozen or
when strides of almost 1 second are used in the audio encoder, opening up the
possibility for LLMs to operate on long-form audio
TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models
Automatic Speech Recognition (ASR) models need to be optimized for specific
hardware before they can be deployed on devices. This can be done by tuning
the model's hyperparameters or exploring variations in its architecture.
Re-training and re-validating models after making these changes can be a
resource-intensive task. This paper presents TODM (Train Once Deploy Many), a
new approach to efficiently train many sizes of hardware-friendly on-device
ASR models with GPU-hours comparable to those of a single training job. TODM
leverages insights from prior work on Supernets, where Recurrent Neural
Network Transducer (RNN-T) models share weights within a Supernet. It reduces
the layer sizes and widths of the Supernet to obtain subnetworks, yielding
smaller models suitable for all hardware types. We introduce a novel
combination of three techniques to improve the outcomes of the TODM Supernet:
adaptive dropouts, in-place Alpha-divergence knowledge distillation, and the
ScaledAdam optimizer. We validate our approach by comparing Supernet-trained
versus individually tuned Multi-Head State Space Model (MH-SSM) RNN-T models
on LibriSpeech. Results demonstrate that our TODM Supernet either matches or
surpasses the performance of manually tuned models, with up to 3% relative
improvement in word error rate (WER), while efficiently keeping the cost of
training many models at a small constant. Comment: Meta AI; Submitted to
ICASSP 202
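The Supernet weight-sharing idea (slicing smaller subnetworks out of one set of shared weights) can be sketched as follows. `extract_subnetwork` and its width/depth parameters are a toy stand-in, not TODM's exact recipe; the point is that every sub-model is a view into the same trained parameters, so no per-size retraining is needed:

```python
import numpy as np

def extract_subnetwork(supernet_layers, width_fraction, depth):
    """Slice a smaller sub-model out of shared Supernet weights.

    supernet_layers: list of (out_dim, in_dim) weight matrices.
    Keeps the first `depth` layers and crops each to the leading
    `width_fraction` of its output units, propagating the reduced
    width to the next layer's input dimension.
    """
    sub = []
    in_keep = supernet_layers[0].shape[1]     # keep the full input dim
    for W in supernet_layers[:depth]:
        out_keep = max(1, int(W.shape[0] * width_fraction))
        sub.append(W[:out_keep, :in_keep])
        in_keep = out_keep                    # next layer's input shrinks
    return sub
```

Training the Supernet then amounts to sampling such slices each step so that all deployable sizes are optimized jointly.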