225 research outputs found

    sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks

    Speech applications are expected to be low-power and robust under noisy conditions. An effective Voice Activity Detection (VAD) front-end lowers the computational need. Spiking Neural Networks (SNNs) are known to be biologically plausible and power-efficient. However, SNN-based VADs have yet to achieve noise robustness and often require large models for high performance. This paper introduces a novel SNN-based VAD model, referred to as sVAD, which features an auditory encoder with an SNN-based attention mechanism. In particular, it provides effective auditory feature representation through SincNet and 1D convolution, and improves noise robustness with attention mechanisms. The classifier uses Spiking Recurrent Neural Networks (sRNN) to exploit temporal speech information. Experimental results demonstrate that sVAD achieves remarkable noise robustness while maintaining low power consumption and a small footprint, making it a promising solution for real-world VAD applications. Comment: Accepted by ICASSP 202

    Origin and evolution of intercrystalline brine in the northern Qaidam Basin based on hydrochemistry and stable isotopes

    The Kunteyi Basin, located in northern Qaidam, is known as a significant potash ore deposit in China. Studying the origin of the potassium-rich intercrystalline brine is important for supporting the exploitation of potassium salts. In this study, the major ion concentrations and isotopic ratios (δ²H, δ¹⁸O, and δ¹¹B) of intercrystalline brine were used to analyze the evolution of the brine. The results show that the intercrystalline brine has a much higher concentration of total dissolved solids than the oil-field brine. Most of the ions are enriched except Ca²⁺ and Br⁻. The δ²H and δ¹⁸O values are strongly negative, while the δ¹¹B values are positive. Analysis of the CNa/CCl, CBr/CCl, and Cl/(Na + K + Mg) ratios and of the isotope ratios indicates that (1) atmospheric precipitation is the primary source of water in the brine; (2) the salinity of the brine is mainly influenced by halite dissolution; and (3) the study area was influenced by deep hydrothermal fluids. The thermal water recharged the Pleistocene layer, reacted with polyhalite, and formed Mg- and K-rich brine. The solution rose along the channel formed by the Shuangqiquan Fault and was supplied to the shallow intercrystalline brine.

    Enhancing Privacy-Preserving Intrusion Detection in Blockchain-Based Networks with Deep Learning

    Data transfer in sensitive industries such as healthcare presents significant challenges due to privacy issues, which make it difficult to collaborate and use machine learning effectively. This study explores these issues by looking at how hybrid learning approaches can be used to move models between users and consumers as well as within organizations. Blockchain technology is used, compensating participants with tokens, to provide privacy-preserving data collection and safe model transfer. The proposed approach combines Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) to create a privacy-preserving secure framework for predictive analytics. LSTM-GRU-based federated learning techniques are used for local model training. The approach uses blockchain to securely transmit data to a distributed, decentralised cloud server, guaranteeing data confidentiality and privacy using a variety of storage techniques. This architecture addresses privacy issues and encourages seamless cooperation by utilising hybrid learning, federated learning, and blockchain technology. The study contributes to bridging the gap between secure data transfer and effective deep learning, specifically within sensitive domains. Experimental results demonstrate an accuracy rate of 99.01%.

    Expression and Prognostic Significance of PD-L2 in Diffuse Large B-Cell Lymphoma

    Recent studies suggest that programmed death ligand-2 (PD-L2) constitutes an important antitumor immune response. Here, we investigated the relationship between PD-L2 expression and clinicopathological features in diffuse large B-cell lymphoma (DLBCL). Immunohistochemistry showed positive expression of PD-L2 in 45 of 181 newly diagnosed patients, including 14 cases with expression exclusively on tumor cells (TCs) and 31 cases with expression on both TCs and immune cells (ICs) in the tumor microenvironment (TME). In 21 recurrent patients, positive expression of PD-L2 was present in six cases, including two cases with expression exclusively on TCs and four cases with expression on both TCs and ICs in the TME. Patients with PD-L2 tumor proportion score (TPS) ≥1% exhibited a better ECOG performance status (PS) (ECOG PS score <2, P = 0.041), lower international prognostic index (IPI) score (P < 0.001), and early Ann Arbor stage (Ann Arbor stage I or II, P = 0.010). Similarly, patients with PD-L2 immune proportion score (IPS) ≥1% also exhibited a better ECOG PS (ECOG PS score <2, P = 0.006) and lower IPI score (P = 0.001). Survival analysis showed that patients with PD-L2 TPS ≥1% exhibited prolonged overall survival (OS) and progression-free survival (PFS). However, survival analysis showed no prognostic significance based on expression of PD-L2 on ICs in the TME. TC PD-L2 expression was significantly associated with OS (P = 0.041) and PFS (P = 0.001). In the multivariate analysis, TC PD-L2 expression was an independent prognostic risk factor for PFS (P = 0.013), but not for OS (P = 0.249). Furthermore, we found that higher TC and IC PD-L2 expression was associated with a higher objective response rate (ORR). Moreover, we demonstrated that the expression level of PD-L2 was positively correlated with the expression status of the M1 macrophage marker CD86. Our findings highlight PD-L2 as a promising therapeutic target in DLBCL.

    LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression

    In long context scenarios, large language models (LLMs) face three main challenges: higher computational/financial cost, longer latency, and inferior performance. Some studies reveal that the performance of LLMs depends on both the density and the position of the key (question-relevant) information in the input prompt. Inspired by these findings, we propose LongLLMLingua for prompt compression, which improves LLMs' perception of the key information and thereby addresses the three challenges simultaneously. We conduct evaluation on a wide range of long context scenarios including single-/multi-document QA, few-shot learning, summarization, synthetic tasks, and code completion. The experimental results show that prompts compressed by LongLLMLingua achieve higher performance at much lower cost. The latency of the end-to-end system is also reduced. For example, on the NaturalQuestions benchmark, LongLLMLingua gains a performance boost of up to 17.1% over the original prompt with ~4x fewer tokens as input to GPT-3.5-Turbo. It yields cost savings of $28.5 and $27.4 per 1,000 samples on the LongBench and ZeroScrolls benchmarks, respectively. Additionally, when compressing prompts of ~10k tokens at a compression rate of 2x-10x, LongLLMLingua can speed up the end-to-end system by 1.4x-3.8x. Our code is available at https://aka.ms/LLMLingua

    Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos

    Human-Object Interaction (HOI) recognition in videos is important for analyzing human activity. Most existing work focuses on visual features and usually suffers from occlusion in real-world scenarios. The problem is further complicated when multiple people and objects are involved in HOIs. Considering that geometric features such as human pose and object position provide meaningful information for understanding HOIs, we argue for combining the benefits of both visual and geometric features in HOI recognition, and propose a novel Two-level Geometric feature-informed Graph Convolutional Network (2G-GCN). The geometric-level graph models the interdependency between geometric features of humans and objects, while the fusion-level graph further fuses them with visual features of humans and objects. To demonstrate the novelty and effectiveness of our method in challenging scenarios, we propose a new multi-person HOI dataset (MPHOI-72). Extensive experiments on the MPHOI-72 (multi-person HOI), CAD-120 (single-human HOI) and Bimanual Actions (two-hand HOI) datasets demonstrate superior performance compared to state-of-the-art methods. Comment: Accepted by ECCV 202