72 research outputs found
A Dual Sensor Computational Camera for High Quality Dark Videography
Videos captured under low-light conditions suffer from severe noise. A
variety of efforts have been devoted to image/video noise suppression and have
made substantial progress. However, in extremely dark scenarios, severe photon
starvation hampers precise noise modeling. Instead, developing an imaging
system that collects more photons is a more effective way to capture
high-quality video under low illumination. In this paper, we propose to build a
dual-sensor camera that additionally collects photons in the near-infrared
(NIR) band, and to exploit the correlation between the RGB and NIR spectra to
perform high-quality reconstruction from noisy dark video pairs. In hardware,
we build a compact dual-sensor camera that captures RGB and NIR videos
simultaneously. Computationally, we propose a dual-channel multi-frame
attention network (DCMAN) that exploits spatial-temporal-spectral priors to
reconstruct the low-light RGB and NIR videos. In addition, we build a
high-quality paired RGB-NIR video dataset, based on which the approach can
be adapted to different sensors simply by training the DCMAN model on
simulated noisy input generated with a physical-process-based CMOS noise model.
Experiments on both synthetic and real videos validate the performance of this
compact dual-sensor camera design and the corresponding reconstruction
algorithm in dark videography.
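The physical-process-based noise simulation mentioned above can be illustrated with a minimal sketch. This toy model (Poisson shot noise, Gaussian read noise, quantization) and all parameter values are assumptions for illustration, not the paper's calibrated CMOS model:

```python
import numpy as np

def simulate_cmos_noise(clean, photons_per_unit=20.0, read_sigma=2.0,
                        bit_depth=8, rng=None):
    """Corrupt a clean [0, 1] frame with a simple physics-based CMOS noise
    model: Poisson shot noise plus Gaussian read noise, then quantization.
    Parameter values are illustrative, not taken from the paper."""
    rng = np.random.default_rng() if rng is None else rng
    photons = rng.poisson(clean * photons_per_unit)                  # shot noise
    electrons = photons + rng.normal(0.0, read_sigma, clean.shape)   # read noise
    levels = 2 ** bit_depth - 1
    digital = np.clip(electrons / photons_per_unit, 0.0, 1.0)
    return np.round(digital * levels) / levels                       # quantization

clean = np.full((4, 4), 0.5)
noisy = simulate_cmos_noise(clean, rng=np.random.default_rng(0))
```

Lowering `photons_per_unit` mimics darker scenes, where the Poisson term dominates and simple Gaussian noise assumptions break down.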
Lightweight High-Speed Photography Built on Coded Exposure and Implicit Neural Representation of Videos
Compact cameras that record high-speed scenes at high resolution are in high
demand, but the required bandwidth often leads to bulky, heavy systems, which
limits their application on low-capacity platforms. Adopting a coded-exposure
setup to encode a frame sequence into a single blurry snapshot and retrieve the
latent sharp video afterward can serve as a lightweight solution. However,
restoring motion from blur is challenging due to the severe ill-posedness of
motion-blur decomposition, the intrinsic ambiguity in motion direction, and the
diversity of motions in natural videos. In this work, by combining the
classical coded-exposure imaging technique with emerging implicit neural
representations for videos, we embed motion-direction cues into the blurry
image during the imaging process and develop a novel self-recursive neural
network that sequentially retrieves the latent video sequence from the blurry
image using those embedded cues. To validate the effectiveness and efficiency
of the proposed framework, we conduct extensive experiments on benchmark
datasets and real captured blurry images. The results demonstrate that our
framework significantly outperforms existing methods in quality and
flexibility. The code for our work is available at
https://github.com/zhihongz/BDINR
Comment: 19 pages, 10 figures
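The coded-exposure imaging step, which encodes a frame sequence into one blurry snapshot under a binary shutter code, can be sketched as follows; the specific code pattern and normalization below are illustrative assumptions, not the paper's design:

```python
import numpy as np

def coded_exposure_snapshot(frames, code):
    """Collapse a sharp frame sequence (T, H, W) into one coded blurry
    snapshot by opening the shutter only at time slots where the binary
    code is 1. Normalizing by the number of open slots keeps brightness
    comparable across codes."""
    frames = np.asarray(frames, dtype=float)
    code = np.asarray(code, dtype=float).reshape(-1, 1, 1)
    return (frames * code).sum(axis=0) / code.sum()

frames = np.stack([np.full((2, 2), t) for t in range(8)])  # toy "video"
code = np.array([1, 0, 1, 1, 0, 0, 1, 0])                  # illustrative code
snapshot = coded_exposure_snapshot(frames, code)
```

The restoration task is then the inverse problem: given `snapshot` and `code`, recover the sharp `frames`, which the paper addresses with an implicit neural representation.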
Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts?
In this paper, we explain the inference logic of large language models (LLMs)
as a set of symbolic concepts. Many recent studies have found that traditional
DNNs usually encode sparse symbolic concepts. However, because an LLM has many
more parameters than traditional DNNs, whether it also encodes sparse symbolic
concepts remains an open problem. Therefore, in this paper, we propose to
disentangle the inference score of an LLM on dialogue tasks into a small
number of symbolic concepts. We verify that these sparse concepts accurately
estimate the LLM's inference scores on arbitrarily masked states of the input
sentence. We also evaluate the transferability of the concepts encoded by an
LLM and verify that symbolic concepts usually exhibit high transferability
across similar input sentences. More crucially, these symbolic concepts can be
used to explain the exact causes of the LLM's prediction errors.
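The idea of decomposing inference scores over masked input states can be illustrated with an interaction-style decomposition; the toy `score` function and this particular decomposition are illustrative assumptions, not the paper's exact formulation:

```python
from itertools import chain, combinations

def subsets(s):
    """All subsets of an index tuple, including the empty set."""
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def interaction_effects(tokens, score):
    """Decompose a model's inference score into interaction effects over
    token subsets: I(S) = sum over T subset of S of (-1)^(|S|-|T|) v(T),
    where v(T) is the score with only tokens in T unmasked. Summing I(S)
    over all S recovers v(full input) exactly. `score` stands in for a
    real call to a masked LLM."""
    idx = tuple(range(len(tokens)))
    v = {S: score(S) for S in subsets(idx)}
    return {S: sum((-1) ** (len(S) - len(T)) * v[T] for T in subsets(S))
            for S in subsets(idx)}

# Toy score: additive effects on tokens 0 and 1, plus one pairwise interaction.
score = lambda S: (0 in S) * 1.0 + (1 in S) * 2.0 + (0 in S and 1 in S) * 0.5
I = interaction_effects(["a", "b"], score)
# Recovers I[(0,)] = 1.0, I[(1,)] = 2.0, and the pairwise concept I[(0, 1)] = 0.5.
```

In this picture, "sparse symbolic concepts" corresponds to most `I[S]` values being near zero, with a small number of subsets carrying nearly all of the score.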
Generalized Activation via Multivariate Projection
Activation functions are essential to introduce nonlinearity into neural
networks, with the Rectified Linear Unit (ReLU) often favored for its
simplicity and effectiveness. Motivated by the structural similarity between a
shallow Feedforward Neural Network (FNN) and a single iteration of the
Projected Gradient Descent (PGD) algorithm, a standard approach for solving
constrained optimization problems, we consider ReLU as a projection from R onto
the nonnegative half-line R+. Building on this interpretation, we extend ReLU
by substituting it with a generalized projection operator onto a convex cone,
such as the Second-Order Cone (SOC) projection, thereby naturally extending it
to a Multivariate Projection Unit (MPU), an activation function with multiple
inputs and multiple outputs. We further prove that FNNs activated by SOC
projections have greater expressive power than those using ReLU. Experimental
evaluations on widely adopted architectures further corroborate the MPU's
effectiveness compared with a broad range of existing activation functions.
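The projection view of activations can be sketched directly: ReLU is the elementwise projection onto R+, and one natural MPU candidate replaces it with the closed-form projection onto the second-order cone. The snippet below is a minimal sketch of that projection, not the paper's full architecture:

```python
import numpy as np

def relu(x):
    """ReLU as the elementwise projection of R onto the half-line R+."""
    return np.maximum(x, 0.0)

def soc_projection(x, t):
    """Closed-form projection of a point (x, t) in R^n x R onto the
    second-order cone {(x, t): ||x|| <= t}, a multivariate generalization
    of ReLU (one candidate MPU)."""
    norm = np.linalg.norm(x)
    if norm <= t:                  # already inside the cone
        return x.copy(), t
    if norm <= -t:                 # inside the polar cone: project to origin
        return np.zeros_like(x), 0.0
    alpha = (norm + t) / 2.0       # otherwise land on the cone boundary
    return alpha * x / norm, alpha

x, t = np.array([3.0, 4.0]), 0.0   # ||x|| = 5 > t, so projection is needed
px, pt = soc_projection(x, t)      # result lies on the boundary: ||px|| == pt
```

With n = 0 inputs the SOC degenerates to R+ and the projection reduces to scalar ReLU, which is the structural link the abstract builds on.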
CUTS: Neural Causal Discovery from Irregular Time-Series Data
Causal discovery from time-series data has been a central task in machine
learning. Recently, Granger causality inference is gaining momentum due to its
good explainability and high compatibility with emerging deep neural networks.
However, most existing methods assume structured input data and degrade
greatly when encountering data with randomly missing entries or non-uniform
sampling frequencies, which hampers their application in real scenarios. To
address this issue, we present CUTS, a neural Granger causal discovery
algorithm that jointly imputes unobserved data points and builds causal
graphs by plugging two mutually boosting modules into an iterative framework:
(i) a latent data prediction stage, which designs a Delayed Supervision Graph
Neural Network (DSGNN) to hallucinate and register unstructured data that may
be high-dimensional and have complex distributions; and (ii) a causal graph
fitting stage, which builds a causal adjacency matrix from the imputed data
under a sparsity penalty. Experiments show that CUTS effectively infers causal
graphs from unstructured time-series data, with significantly superior
performance to existing methods. Our approach constitutes a promising step
towards applying causal discovery to real applications with non-ideal
observations.
Comment: https://openreview.net/forum?id=UG8bQcD3Em
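The causal-graph-fitting idea can be sketched on regular toy data. The snippet below substitutes a plain least-squares fit with magnitude thresholding for the paper's sparsity-penalized stage, and omits the DSGNN imputation stage entirely; all names and thresholds are illustrative assumptions:

```python
import numpy as np

def granger_adjacency(series, lag=1, thresh=0.2):
    """Toy Granger-style graph fitting: regress each series on the lagged
    values of all series and keep large coefficients as edges. A plain
    least-squares fit with a magnitude threshold stands in for a proper
    sparsity penalty."""
    X = np.asarray(series, dtype=float)        # shape (T, n)
    past, future = X[:-lag], X[lag:]
    # future ~= past @ coef, so coef[j, i] weights series j's past on series i
    coef, *_ = np.linalg.lstsq(past, future, rcond=None)
    return np.abs(coef) > thresh               # A[j, i]: edge j -> i

rng = np.random.default_rng(0)
T = 400
x0 = rng.normal(size=T)                        # driver: white noise
x1 = np.empty(T)
x1[0] = 0.0
x1[1:] = 0.9 * x0[:-1] + 0.1 * rng.normal(size=T - 1)  # x0 Granger-causes x1
A = granger_adjacency(np.stack([x0, x1], axis=1))
# Expect an edge 0 -> 1 and no edge 1 -> 0.
```

CUTS's contribution is precisely that this fitting step still works when `series` has missing entries or irregular timestamps, because the imputation stage fills them in iteratively.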
CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation
Since the natural language processing (NLP) community started to make large
language models (LLMs), such as GPT-4, act as critics to evaluate the quality
of generated texts, most existing works have only trained a critique
generation model of a specific scale on specific datasets. We argue that a
comprehensive investigation of the key factors of LLM-based evaluation models,
such as their scaling properties, is lacking, so it remains inconclusive
whether these models have the potential to replace GPT-4's evaluation in
practical scenarios. In this paper, we propose a new critique generation model
called CritiqueLLM, which includes a dialogue-based prompting method for
collecting high-quality referenced / reference-free evaluation data.
Experimental results show that our model can achieve evaluation performance
comparable to GPT-4, especially in system-level correlations, and even
outperforms GPT-4 in 3 out of 8 tasks in a challenging reference-free setting.
We conduct a detailed analysis showing promising scaling properties of our
model in the quality of generated critiques. We also demonstrate that our
generated critiques can act as scalable feedback to directly improve the
generation quality of LLMs.
Comment: 18 pages, 5 figures
Associations between pan-immune-inflammation value and abdominal aortic calcification: a cross-sectional study
Background: Abdominal aortic calcification (AAC) pathogenesis is intricately linked with inflammation. The pan-immune-inflammation value (PIV) has emerged as a potential biomarker, reflecting systemic inflammatory states and assisting in the prognosis of diverse diseases. This research aimed to explore the association between PIV and AAC.
Methods: Employing data from the National Health and Nutrition Examination Survey (NHANES), this cross-sectional analysis used weighted multivariable regression models to ascertain the relationship between PIV and AAC. Trend tests probed the relationship across PIV quartiles and AAC. The study also incorporated subgroup analyses and interaction tests to determine associations within specific subpopulations. Additionally, least absolute shrinkage and selection operator (LASSO) regression and multivariable logistic regression were used for characteristic selection to construct prediction models, which were visualized with nomograms. The receiver operating characteristic (ROC) curve, calibration plots, and decision curve analysis (DCA) were applied to evaluate predictive performance.
Results: In the cohort of 3,047 participants, a distinct positive correlation was observed between PIV and AAC. After full adjustment, a 100-unit increment in PIV was linked to an elevation of 0.055 points in the AAC score (β = 0.055, 95% CI: 0.014-0.095). Categorizing PIV into quartiles revealed an ascending trend: as PIV quartiles increased, AAC scores rose (β values in Quartiles 2, 3, and 4: 0.122, 0.437, and 0.658, respectively; P for trend < 0.001). Concurrently, a marked rise in severe AAC (SAAC) prevalence was noted (OR values for Quartiles 2, 3, and 4: 1.635, 1.842, and 2.572, respectively; P for trend < 0.01). Individuals aged 60 or above and those with a history of diabetes exhibited a heightened association. After characteristic selection, models for predicting AAC and SAAC were constructed. The AUC of the AAC model was 0.74 (95% CI = 0.71-0.77) and the AUC of the SAAC model was 0.84 (95% CI = 0.80-0.87). According to the calibration plots and DCA, both models showed high accuracy and clinical benefit.
Conclusion: These findings illuminate a potential correlation between elevated PIV and the presence of AAC. Our models indicate the potential utility of PIV combined with other simple predictors in the assessment and management of individuals with AAC.
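As commonly defined in the literature, PIV combines four routine blood-count measurements; a minimal sketch, with hypothetical example values (all counts in 10^9 cells/L, not taken from the NHANES cohort used in this study):

```python
def pan_immune_inflammation_value(neutrophils, platelets, monocytes, lymphocytes):
    """PIV as commonly defined in the literature:
    (neutrophil x platelet x monocyte counts) / lymphocyte count,
    with all counts expressed in 10^9 cells/L."""
    return neutrophils * platelets * monocytes / lymphocytes

# Hypothetical complete-blood-count values for illustration only.
piv = pan_immune_inflammation_value(
    neutrophils=4.0, platelets=250.0, monocytes=0.5, lymphocytes=2.0
)  # -> 250.0
```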
xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein
Protein language models have shown remarkable success in learning biological
information from protein sequences. However, most existing models are limited
by either autoencoding or autoregressive pre-training objectives, which makes
them struggle to handle protein understanding and generation tasks
concurrently. We propose a unified protein language model, xTrimoPGLM, to
address these two types of tasks simultaneously through an innovative
pre-training framework. Our key technical contribution is an exploration of the
compatibility and the potential for joint optimization of the two types of
objectives, which has led to a strategy for training xTrimoPGLM at an
unprecedented scale of 100 billion parameters and 1 trillion training tokens.
Our extensive experiments reveal that 1) xTrimoPGLM significantly outperforms
other advanced baselines in 18 protein understanding benchmarks across four
categories. The model also facilitates an atomic-resolution view of protein
structures, leading to an advanced 3D structural prediction model that
surpasses existing language-model-based tools. 2) xTrimoPGLM can not only
generate de novo protein sequences following the principles of natural ones,
but also perform programmable generation after supervised fine-tuning (SFT)
on curated sequences. These results highlight the substantial capability and
versatility of xTrimoPGLM in understanding and generating protein sequences,
contributing to the evolving landscape of foundation models in protein
science.