Mobile AR Depth Estimation: Challenges & Prospects -- Extended Version
Metric depth estimation plays an important role in mobile augmented reality
(AR). With accurate metric depth, we can achieve more realistic user
interactions such as object placement and occlusion detection. While
specialized hardware like LiDAR shows promise, its restricted availability
(only on selected high-end mobile devices) and performance limitations, such
as limited range and sensitivity to the environment, make it less ideal.
Monocular depth estimation, on the other hand, relies solely on mobile
cameras, which are ubiquitous, making it a promising alternative for mobile AR.
In this paper, we investigate the challenges and opportunities of achieving
accurate metric depth estimation in mobile AR. We tested four different
state-of-the-art monocular depth estimation models on a newly introduced
dataset (ARKitScenes) and identified three types of challenges: hardware-,
data-, and model-related. Furthermore, our research provides
promising future directions to explore and solve those challenges. These
directions include (i) using more hardware-related information from the mobile
device's camera and other available sensors, (ii) capturing high-quality data
to reflect real-world AR scenarios, and (iii) designing a model architecture to
utilize the new information.
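One concrete way direction (i) could play out, sketched here as a toy and not taken from the paper: a relative monocular depth map can be rescaled to metric units using a few sparse metric anchors, e.g. depths of tracked feature points reported by the AR framework. All names below are illustrative.

```python
# Illustrative sketch (not from the paper): recover metric scale for a
# relative monocular depth prediction using sparse metric anchor depths,
# such as those an AR framework reports for tracked feature points.

def median_scale(pred_rel, anchors_metric):
    """Estimate a single scale factor as the median ratio between sparse
    metric anchor depths and the relative predictions at the same pixels.

    pred_rel:       relative depth values at the anchor pixels
    anchors_metric: metric depths (metres) at those same pixels
    """
    ratios = sorted(m / r for m, r in zip(anchors_metric, pred_rel))
    mid = len(ratios) // 2
    if len(ratios) % 2:
        return ratios[mid]
    return 0.5 * (ratios[mid - 1] + ratios[mid])

# Toy usage: the relative predictions are roughly half the metric truth.
rel = [1.0, 2.0, 4.0]
metric = [2.1, 3.9, 8.0]
s = median_scale(rel, metric)           # median ratio, robust to outliers
metric_map = [s * d for d in rel]       # rescaled, now in metres
```

The median (rather than a mean or least-squares fit) keeps a single bad anchor from corrupting the scale estimate.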
Exploring Spatio-Temporal Representations by Integrating Attention-based Bidirectional-LSTM-RNNs and FCNs for Speech Emotion Recognition
Automatic emotion recognition from speech, which is an important and challenging task in the field of affective computing, heavily relies on the effectiveness of the speech features for classification. Previous approaches to emotion recognition have mostly focused on the extraction of carefully hand-crafted features. How to model spatio-temporal dynamics for speech emotion recognition effectively is still under active investigation. In this paper, we propose a method to tackle the problem of emotion-relevant feature extraction from speech by leveraging Attention-based Bidirectional Long Short-Term Memory Recurrent Neural Networks with fully convolutional networks in order to automatically learn the best spatio-temporal representations of speech signals. The learned high-level features are then fed into a deep neural network (DNN) to predict the final emotion. The experimental results on the Chinese Natural Audio-Visual Emotion Database (CHEAVD) and the Interactive Emotional Dyadic Motion Capture (IEMOCAP) corpora show that our method provides more accurate predictions compared with other existing emotion recognition algorithms.
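The attention mechanism described above summarises a variable-length sequence of frame features into one utterance-level vector. A minimal sketch of that pooling step, assuming plain lists in place of the paper's BiLSTM tensors:

```python
import math

# Illustrative sketch (not the paper's exact model): softmax attention
# pooling over per-frame features, as used to collapse frame-level BiLSTM
# outputs into a single utterance-level vector before classification.

def attention_pool(frames, scores):
    """Softmax-normalise frame scores and return the weighted sum of frames.

    frames: list of feature vectors, one per time step
    scores: one unnormalised relevance score per frame
    """
    m = max(scores)                              # subtract max for stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    weights = [e / z for e in exps]
    dim = len(frames[0])
    return [sum(w * f[i] for w, f in zip(weights, frames)) for i in range(dim)]

# Toy usage: the second frame gets a much higher score, so the pooled
# vector lies close to that frame.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
pooled = attention_pool(frames, [0.0, 5.0, 0.0])
```

In the full model the scores themselves are produced by a small learned layer over each frame; here they are fixed to keep the example self-contained.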
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
Latent Diffusion Models (LDMs) have achieved remarkable results in
synthesizing high-resolution images. However, the iterative sampling process is
computationally intensive and leads to slow generation. Inspired by Consistency
Models (Song et al.), we propose Latent Consistency Models (LCMs), enabling
swift inference with minimal steps on any pre-trained LDMs, including Stable
Diffusion (Rombach et al.). Viewing the guided reverse diffusion process as
solving an augmented probability flow ODE (PF-ODE), LCMs are designed to
directly predict the solution of such ODE in latent space, mitigating the need
for numerous iterations and allowing rapid, high-fidelity sampling. Efficiently
distilled from pre-trained classifier-free guided diffusion models, a
high-quality 768×768, 2~4-step LCM takes only 32 A100 GPU hours to train.
Furthermore, we introduce Latent Consistency Fine-tuning (LCF), a novel method
that is tailored for fine-tuning LCMs on customized image datasets. Evaluation
on the LAION-5B-Aesthetics dataset demonstrates that LCMs achieve
state-of-the-art text-to-image generation performance with few-step inference.
Project Page: https://latent-consistency-models.github.io
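The core idea, that a consistency function maps any point on the PF-ODE trajectory directly to its endpoint instead of iterating a solver, can be illustrated with a toy linear ODE where the exact map is known in closed form. This is a didactic analogy, not LCM itself:

```python
import math

# Toy illustration (not LCM itself): for the linear ODE dx/dt = -x, an
# Euler solver needs many small steps to carry x(t) back to x(0), while
# the exact "consistency function" f(x, t) = x * e^t jumps to the endpoint
# in one evaluation. LCMs learn such a direct map for the diffusion
# PF-ODE in latent space.

def euler_to_origin(x_t, t, n_steps):
    """Integrate dx/dt = -x backward in time from t to 0 with n Euler steps."""
    dt = t / n_steps
    x = x_t
    for _ in range(n_steps):
        x = x + dt * x          # reversed time s = t - tau gives dx/ds = +x
    return x

def consistency_fn(x_t, t):
    """Exact solution map: returns x(0) from any point (x_t, t) in one shot."""
    return x_t * math.exp(t)

x0 = 2.0
t = 1.0
x_t = x0 * math.exp(-t)         # simulate the forward flow out to time t
one_shot = consistency_fn(x_t, t)        # single evaluation
iterated = euler_to_origin(x_t, t, 1000) # 1000 solver steps, still approximate
```

The one-shot map is exact while the iterated solver only converges as the step count grows, which is precisely the trade-off LCM distillation removes.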
Melodic Phrase Segmentation By Deep Neural Networks
Automated melodic phrase detection and segmentation is a classical task in
content-based music information retrieval and also the key towards automated
music structure analysis. However, traditional methods still cannot satisfy
practical requirements. In this paper, we explore and adapt various neural
network architectures to see if they can be generalized to work with the
symbolic representation of music and produce satisfactory melodic phrase
segmentation. The main issue of applying deep-learning methods to phrase
detection is the sparse labeling problem of training sets. We propose two
tailored label-engineering schemes with corresponding training techniques for
different neural networks in order to make decisions at a sequential level.
Experimental results show that the CNN-CRF architecture performs best,
offering finer segmentation and faster training, while CNN, Bi-LSTM-CNN, and
Bi-LSTM-CRF are acceptable alternatives.
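One plausible form the label engineering could take (my reconstruction for illustration, not necessarily the paper's scheme): phrase boundaries are rare positives in a long note sequence, so a network trained on raw 0/1 targets mostly sees zeros. Widening each boundary into a small window of positive labels densifies the training signal.

```python
# Illustrative sketch of boundary-label widening for the sparse-label
# problem: each phrase boundary and its neighbours within `radius` time
# steps are marked positive, instead of a single isolated 1.

def widen_labels(length, boundaries, radius=1):
    """Return one 0/1 label per time step, with each boundary position and
    its neighbours within `radius` steps set to 1."""
    labels = [0] * length
    for b in boundaries:
        for i in range(max(0, b - radius), min(length, b + radius + 1)):
            labels[i] = 1
    return labels

# Toy usage: a 10-step melody with phrase boundaries at steps 3 and 8.
labels = widen_labels(10, [3, 8], radius=1)   # → dense targets around each boundary
```

At inference time the widened predictions would be collapsed back to single boundary positions, e.g. by taking the peak of each positive run.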
An edge cloud and Fibonacci-Diffie-Hellman encryption scheme for secure printer data transmission
Network printers face increasing security threats from network attacks that can lead to sensitive information leakage and data tampering. To address these risks, we propose a novel Fibonacci-Diffie-Hellman (FIB-DH) encryption scheme using edge cloud collaboration. Our approach utilizes properties of third-order Fibonacci matrices combined with the Diffie-Hellman key exchange to encrypt printer data transmissions. The encrypted data is transmitted via edge cloud servers and verified by the receiver using inverse Fibonacci transforms. Our experiments demonstrate that the FIB-DH scheme can effectively improve printer data transmission security against common attacks compared to conventional methods. The results show reduced vulnerabilities to leakage and tampering attacks in our approach. This work provides an innovative application of cryptographic techniques to strengthen security for network printer communications.
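A hedged sketch of how a matrix-based Diffie-Hellman exchange of this kind can work (my reconstruction, not the paper's exact scheme): the public base is a third-order Fibonacci (tribonacci) Q-matrix, private keys are exponents, and since powers of one matrix commute, both parties derive the same shared matrix, which can then seed a symmetric cipher for the printer traffic. The prime and keys below are toy values.

```python
# Hedged sketch (reconstruction, not the paper's exact FIB-DH scheme):
# Diffie-Hellman over powers of a third-order Fibonacci Q-matrix mod a prime.

P = 1_000_003                               # toy public prime modulus
Q = [[1, 1, 1], [1, 0, 0], [0, 1, 0]]       # tribonacci Q-matrix (assumed base)

def mat_mul(a, b, p):
    """3x3 matrix product mod p."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) % p
             for j in range(3)] for i in range(3)]

def mat_pow(m, e, p):
    """Square-and-multiply exponentiation of a 3x3 matrix mod p."""
    result = [[int(i == j) for j in range(3)] for i in range(3)]  # identity
    while e:
        if e & 1:
            result = mat_mul(result, m, p)
        m = mat_mul(m, m, p)
        e >>= 1
    return result

# Toy exchange: a and b are the two parties' private keys.
a, b = 123457, 654321
pub_a = mat_pow(Q, a, P)                    # A publishes Q^a
pub_b = mat_pow(Q, b, P)                    # B publishes Q^b
shared_a = mat_pow(pub_b, a, P)             # A computes (Q^b)^a
shared_b = mat_pow(pub_a, b, P)             # B computes (Q^a)^b — same matrix
```

Both sides arrive at Q^(ab) mod P without ever transmitting a private key, exactly as in scalar Diffie-Hellman.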
Association of Geriatric Nutritional Risk Index with Mortality in Hemodialysis Patients: A Meta-Analysis of Cohort Studies
Background/Aims: Geriatric nutritional risk index (GNRI) was developed as a “nutrition-related” risk index and was reported in different populations as associated with the risk of all-cause and cardiovascular morbidity and mortality. Therefore, GNRI can be used to classify patients according to a risk of complications in relation to conditions associated with protein-energy wasting (PEW). However, not all reports pointed to the prognostic ability of the GNRI. The purpose of this study was to assess the associations of GNRI with mortality in chronic hemodialysis patients. Methods: We electronically searched original articles published in peer-reviewed journals from their inception to September 2018 in the PubMed, Embase, and Cochrane Library databases. The primary outcome was all-cause and cardiovascular mortality. We pooled unadjusted and adjusted odds ratios (ORs) with 95% confidence intervals (95% CIs) using Review Manager 5.3 software. Results: A total of 10,739 patients from 19 cohort studies published from 2010 to 2018 were included. A significant negative association was found between the GNRI and all-cause mortality in patients with chronic hemodialysis (OR, 0.90; 95% CI, 0.84-0.97, p=0.004) (per unit increase) and (OR, 2.15; 95% CI, 1.88-2.46, p<0.00001) (low vs. high GNRI). Moreover, there was also a significant negative association between the GNRI (per unit increase) and cardiovascular events (OR, 0.98; 95% CI, 0.97-1.00, p=0.01), as well as cardiovascular mortality (OR, 0.89; 95% CI, 0.80-0.99, p=0.03). Conclusion: Our findings supported the hypothesis that the low GNRI is associated with an increased risk of all-cause and cardiovascular mortality in chronic hemodialysis patients. Based on our literature review, GNRI has been found to be an effective tool for identifying patients with nutrition-related risk of all-cause and cardiovascular disease.
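The pooling step behind results like these is standard fixed-effect inverse-variance meta-analysis on the log-OR scale, the approach tools such as Review Manager implement. A minimal sketch with made-up study values (not the paper's data):

```python
import math

# Illustrative sketch (toy data, not the study's computation): fixed-effect
# inverse-variance pooling of odds ratios. The SE of log(OR) is recovered
# from each study's 95% CI width, studies are weighted by 1/SE^2, and the
# pooled OR with its 95% CI is returned.

def pool_odds_ratios(ors_with_ci):
    """Each item is (OR, ci_low, ci_high); returns (pooled OR, lo, hi)."""
    num = den = 0.0
    for or_, lo, hi in ors_with_ci:
        se = (math.log(hi) - math.log(lo)) / (2 * 1.96)  # CI spans ±1.96 SE
        w = 1.0 / se**2                                  # inverse-variance weight
        num += w * math.log(or_)
        den += w
    pooled = num / den
    se_pooled = math.sqrt(1.0 / den)
    return (math.exp(pooled),
            math.exp(pooled - 1.96 * se_pooled),
            math.exp(pooled + 1.96 * se_pooled))

# Toy usage with three hypothetical per-unit-increase studies.
or_pooled, lo, hi = pool_odds_ratios([(0.90, 0.80, 1.01),
                                      (0.85, 0.70, 1.03),
                                      (0.95, 0.88, 1.02)])
```

Working on the log scale makes the OR distribution approximately normal, which is what justifies the weighted average and the symmetric CI before exponentiating back.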