138 research outputs found
MIR-GAN: Refining Frame-Level Modality-Invariant Representations with Adversarial Network for Audio-Visual Speech Recognition
Leveraging Modality-specific Representations for Audio-visual Speech Recognition via Reinforcement Learning
Audio-visual speech recognition (AVSR) has achieved remarkable success in
improving the noise robustness of speech recognition. Mainstream methods
focus on fusing audio and visual inputs to obtain modality-invariant
representations. However, such representations are prone to over-reliance on
the audio modality, which is much easier to recognize than the video modality
in clean conditions. As a result, the AVSR model underestimates the importance
of the visual stream in the face of noise corruption. To this end, we leverage
visual modality-specific representations to provide stable complementary
information for the AVSR task. Specifically, we propose a reinforcement
learning (RL) based framework called MSRL, where the agent dynamically
harmonizes modality-invariant and modality-specific representations in the
auto-regressive decoding process. We customize a reward function directly
related to task-specific metrics (i.e., word error rate), which encourages
MSRL to effectively explore the optimal integration strategy. Experimental
results on the LRS3 dataset show that the proposed method achieves
state-of-the-art performance in both clean and various noisy conditions.
Furthermore, we demonstrate that MSRL generalizes better than other baselines
when the test set contains unseen noises.
Comment: Accepted by AAAI202
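The abstract above ties the reward to word error rate (WER) but does not give the reward formula. As a minimal sketch only, the following computes WER as word-level edit distance divided by reference length and uses its negative as a sentence-level reward. The names `word_error_rate` and `wer_reward`, and the choice of negative WER, are illustrative assumptions, not the authors' implementation.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def wer_reward(reference: str, hypothesis: str) -> float:
    """Hypothetical reward: lower WER gives higher reward (assumption)."""
    return -word_error_rate(reference, hypothesis)
```

Under this assumption a perfect transcript receives a reward of 0 and every word error lowers it; a real system might instead use a baseline-subtracted or step-wise reward.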
Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition
12 pages, 5 figures. Accepted by IJCAI 2023.
Paeonol Ameliorates Glucose and Lipid Metabolism in Experimental Diabetes by Activating Akt
Our previous study showed that paeonol (Pae) could lower blood glucose levels in diabetic mice. There are also a few reports of its potential use for diabetes treatment. However, the role of Pae in regulating glucose and lipid metabolism in diabetes remains largely unknown. Considering the critical role of the serine/threonine kinase Akt (protein kinase B) in glucose and lipid metabolism, we explored whether Pae could improve glucose and lipid metabolism disorders via Akt. Here, we found that Pae attenuated fasting blood glucose, glycosylated serum protein, serum cholesterol and triglyceride (TG), and hepatic glycogen, cholesterol and TG levels in diabetic mice. Moreover, Pae enhanced glucokinase (GCK) and low-density lipoprotein receptor (LDLR) protein expression and increased the phosphorylation of Akt. In insulin-resistant HepG2 cells, Pae increased glucose uptake and decreased lipid accumulation. In addition, Pae elevated LDLR and GCK expression as well as Akt phosphorylation, consistent with the in vivo results. Knockdown and inhibition experiments on Akt revealed that Pae regulated LDLR and GCK expression through activation of Akt. Finally, a molecular docking assay indicated that a stable hydrogen bond formed between Pae and Akt2. These experiments suggest that Pae ameliorates glucose and lipid metabolism disorders and that the underlying mechanism is closely related to the activation of Akt.
Lipids, lipid-lowering agents, and inflammatory bowel disease: a Mendelian randomization study
Background: To assess the causal role of lipid traits and lipid-lowering agents in inflammatory bowel disease (IBD). Methods: Univariable Mendelian randomization (MR) and multivariable MR (MVMR) analyses were conducted to evaluate the causal association between low-density lipoprotein cholesterol (LDL-C), triglycerides (TG), high-density lipoprotein cholesterol (HDL-C) and IBD. Drug-targeted MR analyzed the effects of lipid-lowering drugs on IBD, and network MR was used to analyze potential mediation effects. Results: The levels of HDL-C had an inverse relationship with the risk of Crohn’s disease (CD, OR: 0.85, 95% CI: 0.73-0.98, P = 0.024). In MVMR, inverse relationships were found for all three outcomes. Drug-targeted MR analyses showed that, with a one-SD LDL-C decrease predicted by variants at or near proprotein convertase subtilisin/kexin type 9 (PCSK9), the ORs for IBD, ulcerative colitis (UC) and CD were 1.75 (95% CI: 1.13-2.69, P = 0.011), 2.1 (95% CI: 1.28-3.42, P = 0.003) and 2.24 (95% CI: 1.11-4.5, P = 0.024), respectively. With a one-SD LDL-C decrease predicted by variants at or near cholesteryl ester transfer protein (CETP), the OR for CD was 0.12 (95% CI: 0.03-0.51, P = 0.004). Network MR showed that HDL-C mediated the causal pathway from variants at or near CETP to CD. Conclusion: Our study suggested a causal association between HDL-C and IBD, UC and CD. Genetically proxied inhibition of PCSK9 increased the risk of IBD, UC and CD, while inhibition of CETP decreased the risk of CD. Further studies are needed to clarify the long-term effect of lipid-lowering drugs on gastrointestinal disorders.
A Novel Unsupervised Video Anomaly Detection Framework Based on Optical Flow Reconstruction and Erased Frame Prediction
Reconstruction-based and prediction-based approaches are widely used for video anomaly detection (VAD) in smart city surveillance applications. However, neither of these approaches can effectively utilize the rich contextual information that exists in videos, which makes it difficult to accurately perceive anomalous activities. In this paper, we exploit the idea of training a model with the “Cloze Test” strategy from natural language processing (NLP) and introduce a novel unsupervised learning framework to encode both motion and appearance information at an object level. Specifically, to store the normal modes of video activity reconstructions, we first design an optical stream memory network with skip connections. Secondly, we build a space–time cube (STC) for use as the basic processing unit of the model and erase a patch in the STC to form the frame to be reconstructed. This enables a so-called “incomplete event (IE)” to be completed. On this basis, a conditional autoencoder is utilized to capture the high correspondence between optical flow and STC. The model predicts erased patches in IEs based on the context of the preceding and following frames. Finally, we employ a generative adversarial network (GAN)-based training method to improve the performance of VAD. By distinguishing the predicted erased optical flow and erased video frame, the anomaly detection results are shown to be more reliable with our proposed method, which can help reconstruct the original video in IE. Comparative experiments conducted on the benchmark UCSD Ped2, CUHK Avenue, and ShanghaiTech datasets demonstrate AUROC scores reaching 97.7%, 89.7%, and 75.8%, respectively.
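The results above are reported as AUROC. As a minimal sketch of what that metric measures, the following computes AUROC from per-frame anomaly scores and binary labels as the probability that a randomly chosen anomalous frame scores higher than a randomly chosen normal one, counting ties as one half. The function name `auroc` and the convention that label 1 means anomalous are assumptions for illustration; the abstract does not describe the authors' evaluation code.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUROC: P(score of anomalous frame > score of normal frame)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # Assumed convention: label 1 = anomalous frame, label 0 = normal frame.
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one anomalous and one normal sample")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties contribute half a win
    return wins / (len(positives) * len(negatives))
```

This quadratic-time form is meant for clarity; a rank-based computation gives the same value more efficiently on large test sets.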