sVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks
Speech applications are expected to be low-power and robust under noisy
conditions. An effective Voice Activity Detection (VAD) front-end lowers the
computational need. Spiking Neural Networks (SNNs) are known to be biologically
plausible and power-efficient. However, SNN-based VADs have yet to achieve
noise robustness and often require large models for high performance. This
paper introduces a novel SNN-based VAD model, referred to as sVAD, which
features an auditory encoder with an SNN-based attention mechanism.
Particularly, it provides effective auditory feature representation through
SincNet and 1D convolution, and improves noise robustness with attention
mechanisms. The classifier utilizes Spiking Recurrent Neural Networks (sRNN) to
exploit temporal speech information. Experimental results demonstrate that
sVAD achieves remarkable noise robustness while maintaining low power
consumption and a small footprint, making it a promising solution for
real-world VAD applications.
Comment: Accepted by ICASSP 202
Origin and evolution of intercrystalline brine in the northern Qaidam Basin based on hydrochemistry and stable isotopes
The Kunteyi Basin, located in the northern Qaidam Basin, hosts a significant potash ore deposit in China. Understanding the origin of its potassium-rich intercrystalline brine is important for supporting the exploitation of potassium salts. In this study, the major ion concentrations and isotopic ratios (δ2H, δ18O, and δ11B) of intercrystalline brine were used to analyze the evolution of the brine. The results show that the intercrystalline brine has a much higher concentration of total dissolved solids than the oil-field brine. Most of the ions are enriched except Ca2+ and Br−. The δ2H and δ18O values are strongly negative, whereas the δ11B values are positive. Analysis of the CNa/CCl, CBr/CCl, and Cl/(Na + K + Mg) ratios, together with the isotope ratios, indicates that (1) atmospheric precipitation is the primary source of water in the brine; (2) the salinity of the brine is mainly influenced by halite dissolution; and (3) the study area was influenced by deep hydrothermal fluids. The thermal water recharged the Pleistocene layer, reacted with polyhalite, and formed Mg- and K-rich brine. The solution rose along the channel formed by the Shuangqiquan Fault and was supplied to the shallow intercrystalline brine.
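The ion-ratio reasoning in this abstract can be illustrated with a minimal sketch that converts mass concentrations to molar concentrations and computes a molar Na/Cl ratio. The sample concentrations and the function names are hypothetical and are not data or code from the study. The only chemistry assumed is that halite (NaCl) dissolution releases Na and Cl in a 1:1 molar proportion, so a ratio close to 1 is consistent with halite dissolution.

```python
# Molar masses in g/mol (standard atomic weights, rounded).
MOLAR_MASS = {"Na": 22.99, "Cl": 35.45}


def to_mol_per_l(ion: str, grams_per_l: float) -> float:
    """Convert a mass concentration (g/L) to a molar concentration (mol/L)."""
    return grams_per_l / MOLAR_MASS[ion]


def molar_na_cl_ratio(na_g_per_l: float, cl_g_per_l: float) -> float:
    """Molar Na/Cl ratio; pure halite dissolution would give a value of 1."""
    return to_mol_per_l("Na", na_g_per_l) / to_mol_per_l("Cl", cl_g_per_l)


# Hypothetical brine sample (g/L), for illustration only.
sample = {"Na": 110.0, "Cl": 190.0}
ratio = molar_na_cl_ratio(sample["Na"], sample["Cl"])
print(f"molar Na/Cl = {ratio:.2f}")
```

A ratio well below 1 in such a calculation would suggest that some chloride is balanced by other cations (for example K or Mg), which is why the abstract also considers Cl/(Na + K + Mg).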
Enhancing Privacy-Preserving Intrusion Detection in Blockchain-Based Networks with Deep Learning
Data transfer in sensitive industries such as healthcare presents significant challenges due to privacy issues, which makes it difficult to collaborate and use machine learning effectively. These issues are explored in this study by looking at how hybrid learning approaches can be used to move models between users and consumers as well as within organizations. Blockchain technology is used, compensating participants with tokens, to provide privacy-preserving data collection and safe model transfer. The proposed approach combines Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU) to create a privacy-preserving secure framework for predictive analytics. LSTM-GRU-based federated learning techniques are used for local model training. The approach uses blockchain to securely transmit data to a distributed, decentralised cloud server, guaranteeing data confidentiality and privacy using a variety of storage techniques. This architecture addresses privacy issues and encourages seamless cooperation by utilising hybrid learning, federated learning, and blockchain technology. The study contributes to bridging the gap between secure data transfer and effective deep learning, specifically within sensitive domains. Experimental results demonstrate an impressive accuracy rate of 99.01%
Expression and Prognostic Significance of PD-L2 in Diffuse Large B-Cell Lymphoma
Recent studies suggest that programmed death ligand-2 (PD-L2) constitutes an important antitumor immune response. Here, we investigated the relationship between PD-L2 expression and clinicopathological features in diffuse large B-cell lymphoma (DLBCL). Immunohistochemistry showed that positive expression of PD-L2 was observed in 45 of 181 newly diagnosed patients, including 14 cases with expression exclusively on tumor cells (TCs) and 31 cases with expression on both TCs and immune cells (ICs) in the tumor microenvironment (TME). In 21 recurrent patients, positive expression of PD-L2 was present in six cases, including two cases with expression exclusively on TCs and four cases with expression on both TCs and ICs in the TME. Patients with PD-L2 tumor proportion score (TPS) ≥1% exhibited a better ECOG performance status (PS) (ECOG PS score <2, P = 0.041), lower international prognostic index (IPI) score (P < 0.001), and early Ann Arbor stage (Ann Arbor stage I or II, P = 0.010). Similarly, patients with PD-L2 immune proportion score (IPS) ≥1% also exhibited a better ECOG PS (ECOG PS score <2, P = 0.006) and lower IPI score (P = 0.001). Survival analysis showed that patients with PD-L2 TPS ≥1% exhibited prolonged overall survival (OS) and progression-free survival (PFS). However, survival analysis showed no prognostic significance based on expression of PD-L2 on ICs in the TME. TC PD-L2 expression was significantly associated with OS (P = 0.041) and PFS (P = 0.001). In the multivariate analysis, TC PD-L2 expression was an independent prognostic risk factor for PFS (P = 0.013), but not for OS (P = 0.249). Furthermore, we found that higher TC and IC PD-L2 expression was associated with a higher objective response rate (ORR). Moreover, we demonstrated that the expression level of PD-L2 was positively correlated with the expression status of the M1 macrophage marker CD86. Our findings highlight PD-L2 as a promising therapeutic target in DLBCL.
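The case counts reported in this abstract can be turned into positivity rates with a short worked calculation. The sketch below uses only the counts stated above; the function and variable names are illustrative and do not come from the study.

```python
def positive_rate(positive: int, total: int) -> float:
    """Percentage of cases with positive PD-L2 expression."""
    return 100.0 * positive / total


# Counts as reported in the abstract.
new_tc_only, new_tc_and_ic, new_total = 14, 31, 181  # newly diagnosed
rec_tc_only, rec_tc_and_ic, rec_total = 2, 4, 21     # recurrent

# Subgroup counts should add up to the reported positive totals.
new_positive = new_tc_only + new_tc_and_ic  # 45
rec_positive = rec_tc_only + rec_tc_and_ic  # 6

print(f"newly diagnosed: {positive_rate(new_positive, new_total):.1f}%")  # 24.9%
print(f"recurrent:       {positive_rate(rec_positive, rec_total):.1f}%")  # 28.6%
```

These percentages are descriptive only; the abstract does not report a statistical comparison between the newly diagnosed and recurrent groups.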
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
In long context scenarios, large language models (LLMs) face three main
challenges: higher computational/financial cost, longer latency, and inferior
performance. Some studies reveal that the performance of LLMs depends on both
the density and the position of the key information (question relevant) in the
input prompt. Inspired by these findings, we propose LongLLMLingua for prompt
compression towards improving LLMs' perception of the key information to
simultaneously address the three challenges. We conduct evaluation on a wide
range of long context scenarios including single-/multi-document QA, few-shot
learning, summarization, synthetic tasks, and code completion. The experimental
results show that prompts compressed by LongLLMLingua achieve higher performance
at much lower cost. The latency of the end-to-end system is also reduced. For
example, on the NaturalQuestions benchmark, LongLLMLingua gains a performance boost
of up to 17.1% over the original prompt with ~4x fewer tokens as input to
GPT-3.5-Turbo. It yields cost savings of $28.5 and $27.4 per 1,000
samples on the LongBench and ZeroScrolls benchmarks, respectively.
Additionally, when compressing prompts of ~10k tokens at a compression rate of
2x-10x, LongLLMLingua can accelerate the end-to-end system by 1.4x-3.8x. Our
code is available at https://aka.ms/LLMLingua
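The figures quoted in this abstract can be made concrete with a small arithmetic sketch. It assumes that a compression rate of Nx means the compressed prompt has roughly 1/N of the original tokens, and it spreads the reported per-1,000-sample savings evenly across samples. The function names are hypothetical and are not part of the LLMLingua code base.

```python
def compressed_length(original_tokens: int, compression_rate: float) -> int:
    """Approximate token count after compression (assumes rate = original / compressed)."""
    if compression_rate < 1:
        raise ValueError("compression_rate must be >= 1")
    return round(original_tokens / compression_rate)


def saving_per_sample(saving_per_thousand: float) -> float:
    """Average saving per sample, given a saving reported per 1,000 samples."""
    return saving_per_thousand / 1000


# Prompt size and compression rates taken from the abstract (~10k tokens, 2x-10x).
for rate in (2, 4, 10):
    print(f"{rate}x -> about {compressed_length(10_000, rate)} tokens")

# Reported savings per 1,000 samples, in US dollars.
for benchmark, saving in (("LongBench", 28.5), ("ZeroScrolls", 27.4)):
    print(f"{benchmark}: about ${saving_per_sample(saving):.4f} per sample")
```

Under these assumptions, a ~10k-token prompt shrinks to between roughly 1,000 and 5,000 tokens across the stated 2x-10x range.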
Geometric Features Informed Multi-person Human-object Interaction Recognition in Videos
Human-Object Interaction (HOI) recognition in videos is important for
analyzing human activity. Most existing work focuses on visual features and
usually suffers from occlusion in real-world scenarios. This problem is
further complicated when multiple people and objects are involved in HOIs.
Considering that geometric features such as human pose and object position provide
meaningful information for understanding HOIs, we argue for combining the benefits of
both visual and geometric features in HOI recognition, and propose a novel
Two-level Geometric feature-informed Graph Convolutional Network (2G-GCN). The
geometric-level graph models the interdependency between geometric features of
humans and objects, while the fusion-level graph further fuses them with visual
features of humans and objects. To demonstrate the novelty and effectiveness of
our method in challenging scenarios, we propose a new multi-person HOI dataset
(MPHOI-72). Extensive experiments on MPHOI-72 (multi-person HOI), CAD-120
(single-human HOI) and Bimanual Actions (two-hand HOI) datasets demonstrate our
superior performance compared to state-of-the-art methods.
Comment: Accepted by ECCV 202