114 research outputs found

    Signatures of Valley Kondo Effect in Si/SiGe Quantum Dots

    Get PDF
We report measurements consistent with the valley Kondo effect in Si/SiGe quantum dots, evidenced by peaks in the conductance versus source-drain voltage that show strong temperature dependence. The Kondo peaks show unusual behavior in a magnetic field that we interpret as arising from the valley degree of freedom. The interplay of valley and Zeeman splittings is suggested by the presence of side peaks, revealing a zero-field valley splitting between 0.28 and 0.34 meV. A zero-bias conductance peak at non-zero magnetic field, a phenomenon consistent with valley non-conservation in tunneling, is observed in two samples. Comment: 16 pages, 7 figures
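
    For orientation only (a standard back-of-envelope estimate, not taken from the paper): Kondo side peaks in the differential conductance sit at source-drain voltages set by the relevant level splitting, and the Zeeman splitting grows linearly with field with g ≈ 2 in silicon, so a ~0.3 meV valley splitting is comparable to the Zeeman energy at a field of roughly 2.6 T.

```latex
% Side peaks at eV_sd ~ +/- Delta for a level splitting Delta;
% Zeeman splitting for g ~ 2 (silicon):
\[
  eV_{\mathrm{sd}} \approx \pm\Delta , \qquad
  \Delta_Z = g\,\mu_B B \approx 0.116\ \mathrm{meV\,T^{-1}} \times B ,
\]
% so Delta_v ~ 0.3 meV matches Delta_Z near
% B ~ 0.3 / 0.116 ~ 2.6 T.
```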

    Actual Measurement and Analysis on Microbial Contamination in Central Air Conditioning System at a Venue in Dalian, China

    Get PDF
Actual measurement and analysis were carried out on microbial contamination in the central air conditioning system at a venue in Dalian. By studying the microbial contamination in two air handling units with different thermal environments, we found that fungi and bacteria were commonly present on the filter surfaces, with cell density distributed in the order center > against the wall > corner. Microbial pollution both adhering to the dust and suspended in the air was severe. Comparing the two units, fungal concentration was higher in Unit A than in Unit B, while bacterial concentration was lower in Unit A than in Unit B. Candida spp. accounted for 80% of the sample in Unit A, while in Unit B Cladosporium spp. accounted for up to 50%. Finally, based on the measurement and analysis results, methods for controlling microbial contamination in HVAC systems are proposed

    Towards Selection of Text-to-speech Data to Augment ASR Training

    Full text link
This paper presents a method for selecting appropriate synthetic speech samples from a given large text-to-speech (TTS) dataset as supplementary training data for an automatic speech recognition (ASR) model. We trained a neural network, which can be optimised using cross-entropy loss or ArcFace loss, to measure the similarity of synthetic data to real speech. We found that incorporating synthetic samples with considerable dissimilarity to real speech, owing in part to lexical differences, into ASR training is crucial for boosting recognition performance. Experimental results on the Librispeech test sets indicate that, in order to maintain the same speech recognition accuracy as when using all of the TTS data, our proposed solution can reduce the TTS data to below 30% of its original size, which is superior to several baseline methods
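
    As a minimal sketch of such a selection step (assuming a scorer already trained with cross-entropy to separate real from synthetic speech; all names here are illustrative, not the paper's code):

```python
import torch
import torch.nn as nn

class SimilarityScorer(nn.Module):
    """Small classifier scoring how close a synthetic utterance's
    features are to real speech (trainable with cross-entropy loss)."""
    def __init__(self, feat_dim: int = 80, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2),  # logits: [synthetic, real]
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, time, feat_dim) -> mean-pool over time
        return self.net(feats.mean(dim=1))

def select_tts_subset(scorer, tts_feats, keep_fraction=0.3):
    """Keep the synthetic samples least similar to real speech, per the
    paper's finding that dissimilar samples help ASR training most
    (the exact criterion here is an assumption)."""
    with torch.no_grad():
        probs = torch.softmax(scorer(tts_feats), dim=-1)[:, 1]  # P(real)
    k = max(1, int(keep_fraction * len(probs)))
    return torch.argsort(probs)[:k]  # indices, lowest similarity first
```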

    Towards General-Purpose Speech Abilities for Large Language Models Using Unpaired Data

    Full text link
In this work, we extend the instruction-tuned Llama-2 model with end-to-end general-purpose speech processing and reasoning abilities while maintaining the wide range of LLM capabilities, without using any carefully curated paired data. The proposed model can utilize audio prompts as a replacement for text and sustain a conversation. The model also gains extended cross-modal capabilities, such as speech question answering, speech translation, and audio summarization, amongst many other closed- and open-domain tasks. This is unlike prior approaches in speech, in which LLMs are extended to handle audio for a limited number of pre-designated tasks. Experiments show that our end-to-end approach is on par with or outperforms a cascaded system (speech recognizer + LLM) in terms of modeling the response to a prompt. Furthermore, unlike a cascade, our approach can interchange text and audio modalities and utilize the prior context of a conversation to provide better results
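
    A rough sketch of how audio can stand in for text anywhere in a prompt: build one embedding sequence from mixed-modality turns (a hypothetical helper, assuming a HuggingFace-style `get_input_embeddings` and a projection from encoder width to LLM width):

```python
import torch

def build_prompt_embeddings(llm, audio_encoder, proj, turns):
    """Concatenate a conversation whose turns are ("text", token_ids)
    or ("audio", features) into one embedding sequence, so audio
    prompts can replace text prompts anywhere in the context."""
    pieces = []
    for kind, value in turns:
        if kind == "text":
            pieces.append(llm.get_input_embeddings()(value))  # (1, T, D)
        else:
            pieces.append(proj(audio_encoder(value)))         # (1, Ta, D)
    return torch.cat(pieces, dim=1)  # feed to llm(inputs_embeds=...)
```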

    Identification of disulfidptosis-related subgroups and prognostic signatures in lung adenocarcinoma using machine learning and experimental validation

    Get PDF
Background: Disulfidptosis is a newly identified variant of cell death characterized by disulfide accumulation, which is independent of ATP depletion. Accordingly, the latent influence of disulfidptosis on the prognosis of lung adenocarcinoma (LUAD) patients and the progression of tumors remains poorly understood. Methods: We conducted a multifaceted analysis of the transcriptional and genetic modifications in disulfidptosis regulators (DRs) specific to LUAD, followed by an evaluation of their expression configurations to define DR clusters. Harnessing the differentially expressed genes (DEGs) identified from these clusters, we formulated an optimal predictive model by amalgamating 10 distinct machine learning algorithms across 101 unique combinations to compute the disulfidptosis score (DS). Patients were subsequently stratified into high and low DS cohorts based on the median DS value. We then performed an exhaustive comparison between these cohorts, focusing on somatic mutations, clinical attributes, tumor microenvironment, and treatment responsiveness. Finally, we empirically validated the biological implications of a critical gene, KYNU, through assays in LUAD cell lines. Results: We identified two DR clusters, which differed markedly in overall survival (OS) and tumor microenvironment. We selected the "Least Absolute Shrinkage and Selection Operator (LASSO) + Random Survival Forest (RSF)" algorithm to develop the DS based on the average C-index across different cohorts. Our model effectively stratified LUAD patients into high- and low-DS subgroups, with the latter demonstrating superior OS, a reduced mutational landscape, enhanced immune status, and increased sensitivity to immunotherapy. Notably, the predictive accuracy of the DS outperformed published LUAD signatures and clinical features. Finally, we validated DS expression using clinical samples and found that inhibiting KYNU suppressed LUAD cell proliferation, invasiveness, and migration in vitro. Conclusions: The DR-based scoring system that we developed enabled accurate prognostic stratification of LUAD patients and provides important insights into the molecular mechanisms and treatment strategies for LUAD
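
    A hedged sketch of one such combination, "LASSO + RSF", using scikit-survival as I understand its API; `top_k` and the gene-selection step are illustrative assumptions, not the paper's exact pipeline:

```python
import numpy as np
from sksurv.ensemble import RandomSurvivalForest
from sksurv.linear_model import CoxnetSurvivalAnalysis  # LASSO-penalised Cox
from sksurv.metrics import concordance_index_censored

def lasso_rsf_ds(X_train, y_train, X_val, top_k=20):
    """LASSO-style gene selection followed by a Random Survival Forest;
    returns risk scores used as the disulfidptosis score (DS)."""
    lasso = CoxnetSurvivalAnalysis(l1_ratio=1.0).fit(X_train, y_train)
    weights = np.abs(lasso.coef_).max(axis=1)  # per-gene coefficient path
    keep = np.argsort(weights)[-top_k:]        # hypothetical gene filter
    rsf = RandomSurvivalForest(n_estimators=200, random_state=0)
    rsf.fit(X_train[:, keep], y_train)
    return rsf.predict(X_val[:, keep])

# Median split into high-/low-DS groups, with the C-index available for
# comparing candidate algorithm combinations across cohorts:
# ds = lasso_rsf_ds(X_tr, y_tr, X_va)
# groups = np.where(ds > np.median(ds), "high-DS", "low-DS")
# cindex = concordance_index_censored(y_va["event"], y_va["time"], ds)[0]
```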

    Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of A Multilingual ASR Model

    Full text link
Neural network pruning offers an effective method for compressing a multilingual automatic speech recognition (ASR) model with minimal performance loss. However, it entails several rounds of pruning and re-training for each language. In this work, we propose the use of an adaptive masking approach in two scenarios for pruning a multilingual ASR model efficiently, resulting in either sparse monolingual models or a sparse multilingual model (named Dynamic ASR Pathways). Our approach dynamically adapts the sub-network, avoiding premature decisions about a fixed sub-network structure. We show that our approach outperforms existing pruning methods when targeting sparse monolingual models. Further, we illustrate that Dynamic ASR Pathways jointly discovers and trains better sub-networks (pathways) of a single multilingual model by adapting from different sub-network initializations, thereby reducing the need for language-specific pruning
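
    A minimal sketch of the adaptive-masking idea, with magnitude masks recomputed during training rather than fixed after one pruning pass (illustrative, not the paper's implementation):

```python
import torch

def recompute_masks(model, sparsity=0.7):
    """Build a fresh binary magnitude mask per weight matrix; calling
    this periodically lets the sub-network change as training proceeds."""
    masks = {}
    for name, p in model.named_parameters():
        if p.dim() < 2:                      # skip biases / norm params
            continue
        k = max(1, int(sparsity * p.numel()))
        threshold = p.abs().flatten().kthvalue(k).values
        masks[name] = (p.abs() > threshold).float()
    return masks

def apply_masks(model, masks):
    """Zero out pruned weights in place after each optimiser step."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])
```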

    Prompting Large Language Models with Speech Recognition Abilities

    Full text link
Large language models have proven themselves highly flexible, able to solve a wide range of generative tasks, such as abstractive summarization and open-ended question answering. In this paper we extend the capabilities of LLMs by directly attaching a small audio encoder, allowing them to perform speech recognition. By directly prepending a sequence of audio embeddings to the text token embeddings, the LLM can be converted to an automatic speech recognition (ASR) system and used in exactly the same manner as its textual counterpart. Experiments on Multilingual LibriSpeech (MLS) show that incorporating a conformer encoder into the open-source LLaMA-7B allows it to outperform monolingual baselines by 18% and perform multilingual speech recognition, despite LLaMA being trained overwhelmingly on English text. Furthermore, we perform ablation studies to investigate whether the LLM can be completely frozen during training to maintain its original capabilities, the effect of scaling up the audio encoder, and the effect of increasing the audio encoder striding to generate fewer embeddings. The results from these studies show that multilingual ASR is possible even when the LLM is frozen, or when strides of almost 1 second are used in the audio encoder, opening up the possibility for LLMs to operate on long-form audio
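
    A minimal sketch of the prepending step (assuming a HuggingFace-style LLM interface with `inputs_embeds`; the projection layer and dimensions are illustrative):

```python
import torch
import torch.nn as nn

class SpeechLLM(nn.Module):
    """Prepend audio-encoder outputs to the text token embeddings so a
    decoder-only LLM consumes [audio ...][text ...] as one sequence."""
    def __init__(self, llm, audio_encoder, audio_dim=512, llm_dim=4096):
        super().__init__()
        self.llm = llm                      # e.g. a (possibly frozen) LLaMA
        self.audio_encoder = audio_encoder  # e.g. a conformer with striding
        self.proj = nn.Linear(audio_dim, llm_dim)  # match embedding width

    def forward(self, audio_feats, text_ids):
        audio_emb = self.proj(self.audio_encoder(audio_feats))  # (B, Ta, D)
        text_emb = self.llm.get_input_embeddings()(text_ids)    # (B, Tt, D)
        return self.llm(inputs_embeds=torch.cat([audio_emb, text_emb], dim=1))
```

    A larger encoder stride shrinks the audio length Ta, which is what makes long-form audio feasible for a fixed LLM context.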

    TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-device ASR Models

    Full text link
Automatic Speech Recognition (ASR) models need to be optimized for specific hardware before they can be deployed on devices. This can be done by tuning the model's hyperparameters or exploring variations in its architecture. Re-training and re-validating models after making these changes can be a resource-intensive task. This paper presents TODM (Train Once Deploy Many), a new approach to efficiently train many sizes of hardware-friendly on-device ASR models with GPU-hours comparable to that of a single training job. TODM leverages insights from prior work on Supernets, where Recurrent Neural Network Transducer (RNN-T) models share weights within a Supernet. It reduces layer sizes and widths of the Supernet to obtain subnetworks, making them smaller models suitable for all hardware types. We introduce a novel combination of three techniques to improve the outcomes of the TODM Supernet: adaptive dropouts, in-place Alpha-divergence knowledge distillation, and the ScaledAdam optimizer. We validate our approach by comparing Supernet-trained versus individually tuned Multi-Head State Space Model (MH-SSM) RNN-T models on LibriSpeech. Results demonstrate that our TODM Supernet either matches or surpasses the performance of manually tuned models, with up to a 3% relative improvement in word error rate (WER), while efficiently keeping the cost of training many models at a small constant. Comment: Meta AI; Submitted to ICASSP 2024
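
    A toy illustration of the weight-sharing that makes "train once, deploy many" possible: sub-models of different widths are slices of one shared weight matrix (illustrative only, not the TODM code):

```python
import torch
import torch.nn as nn

class ElasticLinear(nn.Module):
    """Linear layer whose output width can be shrunk at run time by
    slicing the shared weights, so every sub-model trains the Supernet."""
    def __init__(self, in_dim, max_out):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(max_out, in_dim) * 0.02)
        self.bias = nn.Parameter(torch.zeros(max_out))

    def forward(self, x, out_dim=None):
        w, b = self.weight, self.bias
        if out_dim is not None:           # sample a smaller sub-network
            w, b = w[:out_dim], b[:out_dim]
        return x @ w.t() + b

# One supernet training step can touch several widths, so each deployable
# size gets gradient signal without a separate training job:
layer = ElasticLinear(256, max_out=1024)
x = torch.randn(8, 256)
outs = [layer(x, out_dim=w) for w in (256, 512, 1024)]
```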