72 research outputs found
A Dual Sensor Computational Camera for High Quality Dark Videography
Videos captured under low-light conditions suffer from severe noise. A
variety of efforts have been devoted to image/video noise suppression and have
made substantial progress. However, in extremely dark scenarios, severe photon
starvation hampers precise noise modeling. Instead, developing an imaging
system that collects more photons is a more effective way to capture
high-quality video under low illumination. In this paper, we propose to build a
dual-sensor camera that additionally collects photons in the near-infrared
(NIR) band, and to exploit the correlation between the RGB and NIR spectra to
perform high-quality reconstruction from noisy dark video pairs. In hardware,
we build a compact dual-sensor camera that captures RGB and NIR videos
simultaneously. Computationally, we propose a dual-channel multi-frame
attention network (DCMAN) that exploits spatial-temporal-spectral priors to
reconstruct the low-light RGB and NIR videos. In addition, we build a
high-quality paired RGB-NIR video dataset, based on which the approach can
be adapted to different sensors simply by training the DCMAN model on
simulated noisy input generated with a physical-process-based CMOS noise model.
Experiments on both synthetic and real videos validate the performance of this
compact dual-sensor camera design and the corresponding reconstruction
algorithm in dark videography.
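The physical-process-based noise simulation mentioned above can be illustrated with a minimal sketch. This toy model (Poisson shot noise, Gaussian read noise, quantization) and all parameter values are assumptions for illustration, not the paper's calibrated CMOS model:

```python
import numpy as np

def simulate_cmos_noise(clean, photons_per_unit=20.0, read_sigma=2.0,
                        bit_depth=8, rng=None):
    """Corrupt a clean [0, 1] frame with a simple physics-based CMOS noise
    model: Poisson shot noise plus Gaussian read noise, then quantization.
    Parameter values are illustrative, not taken from the paper."""
    rng = np.random.default_rng() if rng is None else rng
    photons = rng.poisson(clean * photons_per_unit)                  # shot noise
    electrons = photons + rng.normal(0.0, read_sigma, clean.shape)   # read noise
    levels = 2 ** bit_depth - 1
    digital = np.clip(electrons / photons_per_unit, 0.0, 1.0)
    return np.round(digital * levels) / levels                       # quantization

clean = np.full((4, 4), 0.5)
noisy = simulate_cmos_noise(clean, rng=np.random.default_rng(0))
```

Lowering `photons_per_unit` mimics darker scenes, where the Poisson term dominates and simple Gaussian noise assumptions break down.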
Lightweight High-Speed Photography Built on Coded Exposure and Implicit Neural Representation of Videos
Compact cameras that record high-speed scenes at high resolution are in high
demand, but the required bandwidth often leads to bulky, heavy systems, which
limits their application on low-capacity platforms. Adopting a coded-exposure
setup to encode a frame sequence into a single blurry snapshot and retrieve the
latent sharp video afterward can serve as a lightweight solution. However,
restoring motion from blur is challenging due to the severe ill-posedness of
motion-blur decomposition, the intrinsic ambiguity in motion direction, and the
diversity of motions in natural videos. In this work, by combining the
classical coded-exposure imaging technique with emerging implicit neural
representations for videos, we embed motion-direction cues into the blurry
image during the imaging process and develop a novel self-recursive neural
network that sequentially retrieves the latent video sequence from the blurry
image using those embedded cues. To validate the effectiveness and efficiency
of the proposed framework, we conduct extensive experiments on benchmark
datasets and real captured blurry images. The results demonstrate that our
framework significantly outperforms existing methods in quality and
flexibility. The code for our work is available at
https://github.com/zhihongz/BDINR
Comment: 19 pages, 10 figures
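The coded-exposure imaging step, which encodes a frame sequence into one blurry snapshot under a binary shutter code, can be sketched as follows; the specific code pattern and normalization below are illustrative assumptions, not the paper's design:

```python
import numpy as np

def coded_exposure_snapshot(frames, code):
    """Collapse a sharp frame sequence (T, H, W) into one coded blurry
    snapshot by opening the shutter only at time slots where the binary
    code is 1. Normalizing by the number of open slots keeps brightness
    comparable across codes."""
    frames = np.asarray(frames, dtype=float)
    code = np.asarray(code, dtype=float).reshape(-1, 1, 1)
    return (frames * code).sum(axis=0) / code.sum()

frames = np.stack([np.full((2, 2), t) for t in range(8)])  # toy "video"
code = np.array([1, 0, 1, 1, 0, 0, 1, 0])                  # illustrative code
snapshot = coded_exposure_snapshot(frames, code)
```

The restoration task is then the inverse problem: given `snapshot` and `code`, recover the sharp `frames`, which the paper addresses with an implicit neural representation.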
Can the Inference Logic of Large Language Models be Disentangled into Symbolic Concepts?
In this paper, we explain the inference logic of large language models (LLMs)
as a set of symbolic concepts. Many recent studies have found that traditional
DNNs usually encode sparse symbolic concepts. However, because an LLM has many
more parameters than traditional DNNs, whether it also encodes sparse symbolic
concepts remains an open problem. Therefore, in this paper, we propose to
disentangle the inference score of an LLM on dialogue tasks into a small
number of symbolic concepts. We verify that these sparse concepts accurately
estimate the LLM's inference scores on arbitrarily masked states of the input
sentence. We also evaluate the transferability of the concepts encoded by an
LLM and verify that symbolic concepts usually exhibit high transferability
across similar input sentences. More crucially, these symbolic concepts can be
used to explain the exact causes of the LLM's prediction errors.
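The idea of decomposing inference scores over masked input states can be illustrated with an interaction-style decomposition; the toy `score` function and this particular decomposition are illustrative assumptions, not the paper's exact formulation:

```python
from itertools import chain, combinations

def subsets(s):
    """All subsets of an index tuple, including the empty set."""
    return chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))

def interaction_effects(tokens, score):
    """Decompose a model's inference score into interaction effects over
    token subsets: I(S) = sum over T subset of S of (-1)^(|S|-|T|) v(T),
    where v(T) is the score with only tokens in T unmasked. Summing I(S)
    over all S recovers v(full input) exactly. `score` stands in for a
    real call to a masked LLM."""
    idx = tuple(range(len(tokens)))
    v = {S: score(S) for S in subsets(idx)}
    return {S: sum((-1) ** (len(S) - len(T)) * v[T] for T in subsets(S))
            for S in subsets(idx)}

# Toy score: additive effects on tokens 0 and 1, plus one pairwise interaction.
score = lambda S: (0 in S) * 1.0 + (1 in S) * 2.0 + (0 in S and 1 in S) * 0.5
I = interaction_effects(["a", "b"], score)
# Recovers I[(0,)] = 1.0, I[(1,)] = 2.0, and the pairwise concept I[(0, 1)] = 0.5.
```

In this picture, "sparse symbolic concepts" corresponds to most `I[S]` values being near zero, with a small number of subsets carrying nearly all of the score.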
Generalized Activation via Multivariate Projection
Activation functions are essential to introduce nonlinearity into neural
networks, with the Rectified Linear Unit (ReLU) often favored for its
simplicity and effectiveness. Motivated by the structural similarity between a
shallow Feedforward Neural Network (FNN) and a single iteration of the
Projected Gradient Descent (PGD) algorithm, a standard approach for solving
constrained optimization problems, we consider ReLU as a projection from R onto
the nonnegative half-line R+. Building on this interpretation, we extend ReLU
by substituting it with a generalized projection operator onto a convex cone,
such as the Second-Order Cone (SOC) projection, thereby naturally extending it
to a Multivariate Projection Unit (MPU), an activation function with multiple
inputs and multiple outputs. We further prove that FNNs activated by SOC
projections have greater expressive power than those using ReLU. Experimental
evaluations on widely adopted architectures further corroborate the MPU's
effectiveness compared with a broad range of existing activation functions.
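The projection view of activations can be sketched directly: ReLU is the elementwise projection onto R+, and one natural MPU candidate replaces it with the closed-form projection onto the second-order cone. The snippet below is a minimal sketch of that projection, not the paper's full architecture:

```python
import numpy as np

def relu(x):
    """ReLU as the elementwise projection of R onto the half-line R+."""
    return np.maximum(x, 0.0)

def soc_projection(x, t):
    """Closed-form projection of a point (x, t) in R^n x R onto the
    second-order cone {(x, t): ||x|| <= t}, a multivariate generalization
    of ReLU (one candidate MPU)."""
    norm = np.linalg.norm(x)
    if norm <= t:                  # already inside the cone
        return x.copy(), t
    if norm <= -t:                 # inside the polar cone: project to origin
        return np.zeros_like(x), 0.0
    alpha = (norm + t) / 2.0       # otherwise land on the cone boundary
    return alpha * x / norm, alpha

x, t = np.array([3.0, 4.0]), 0.0   # ||x|| = 5 > t, so projection is needed
px, pt = soc_projection(x, t)      # result lies on the boundary: ||px|| == pt
```

With n = 0 inputs the SOC degenerates to R+ and the projection reduces to scalar ReLU, which is the structural link the abstract builds on.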
CUTS: Neural Causal Discovery from Irregular Time-Series Data
Causal discovery from time-series data has been a central task in machine
learning. Recently, Granger causality inference is gaining momentum due to its
good explainability and high compatibility with emerging deep neural networks.
However, most existing methods assume structured input data and degrade
greatly when encountering data with randomly missing entries or non-uniform
sampling frequencies, which hampers their application in real scenarios. To
address this issue, we present CUTS, a neural Granger causal discovery
algorithm that jointly imputes unobserved data points and builds causal
graphs by plugging two mutually boosting modules into an iterative framework:
(i) a latent data prediction stage, which designs a Delayed Supervision Graph
Neural Network (DSGNN) to hallucinate and register unstructured data that may
be high-dimensional and have complex distributions; and (ii) a causal graph
fitting stage, which builds a causal adjacency matrix from the imputed data
under a sparsity penalty. Experiments show that CUTS effectively infers causal
graphs from unstructured time-series data, with significantly superior
performance to existing methods. Our approach constitutes a promising step
towards applying causal discovery to real applications with non-ideal
observations.
Comment: https://openreview.net/forum?id=UG8bQcD3Em
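The causal-graph-fitting idea can be sketched on regular toy data. The snippet below substitutes a plain least-squares fit with magnitude thresholding for the paper's sparsity-penalized stage, and omits the DSGNN imputation stage entirely; all names and thresholds are illustrative assumptions:

```python
import numpy as np

def granger_adjacency(series, lag=1, thresh=0.2):
    """Toy Granger-style graph fitting: regress each series on the lagged
    values of all series and keep large coefficients as edges. A plain
    least-squares fit with a magnitude threshold stands in for a proper
    sparsity penalty."""
    X = np.asarray(series, dtype=float)        # shape (T, n)
    past, future = X[:-lag], X[lag:]
    # future ~= past @ coef, so coef[j, i] weights series j's past on series i
    coef, *_ = np.linalg.lstsq(past, future, rcond=None)
    return np.abs(coef) > thresh               # A[j, i]: edge j -> i

rng = np.random.default_rng(0)
T = 400
x0 = rng.normal(size=T)                        # driver: white noise
x1 = np.empty(T)
x1[0] = 0.0
x1[1:] = 0.9 * x0[:-1] + 0.1 * rng.normal(size=T - 1)  # x0 Granger-causes x1
A = granger_adjacency(np.stack([x0, x1], axis=1))
# Expect an edge 0 -> 1 and no edge 1 -> 0.
```

CUTS's contribution is precisely that this fitting step still works when `series` has missing entries or irregular timestamps, because the imputation stage fills them in iteratively.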
CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation
Since the natural language processing (NLP) community started to make large
language models (LLMs), such as GPT-4, act as critics to evaluate the quality
of generated texts, most existing works have only trained a critique
generation model of a specific scale on specific datasets. We argue that a
comprehensive investigation of the key factors of LLM-based evaluation models,
such as their scaling properties, is lacking, so it remains inconclusive
whether these models have the potential to replace GPT-4's evaluation in
practical scenarios. In this paper, we propose a new critique generation model
called CritiqueLLM, which includes a dialogue-based prompting method for
collecting high-quality referenced / reference-free evaluation data.
Experimental results show that our model can achieve evaluation performance
comparable to GPT-4, especially in system-level correlations, and even
outperforms GPT-4 in 3 out of 8 tasks in a challenging reference-free setting.
We conduct a detailed analysis showing promising scaling properties of our
model in the quality of generated critiques. We also demonstrate that our
generated critiques can act as scalable feedback to directly improve the
generation quality of LLMs.
Comment: 18 pages, 5 figures
Associations between pan-immune-inflammation value and abdominal aortic calcification: a cross-sectional study
Background: Abdominal aortic calcification (AAC) pathogenesis is intricately linked with inflammation. The pan-immune-inflammation value (PIV) has emerged as a potential biomarker, reflecting systemic inflammatory states and assisting in the prognosis of diverse diseases. This research aimed to explore the association between PIV and AAC.
Methods: Employing data from the National Health and Nutrition Examination Survey (NHANES), this cross-sectional analysis used weighted multivariable regression models to ascertain the relationship between PIV and AAC. Trend tests probed the relationship across PIV quartiles and AAC. The study also incorporated subgroup analyses and interaction tests to determine associations within specific subpopulations. Additionally, least absolute shrinkage and selection operator (LASSO) regression and multivariable logistic regression were used for characteristic selection to construct prediction models, which were visualized with nomograms. The receiver operating characteristic (ROC) curve, calibration plots, and decision curve analysis (DCA) were applied to evaluate predictive performance.
Results: In the cohort of 3,047 participants, a distinct positive correlation was observed between PIV and AAC. After full adjustment, a 100-unit increment in PIV was linked to an elevation of 0.055 points in the AAC score (β = 0.055, 95% CI: 0.014-0.095). Categorizing PIV into quartiles revealed an ascending trend: as PIV quartiles increased, AAC scores rose (β values in Quartiles 2, 3, and 4: 0.122, 0.437, and 0.658, respectively; P for trend < 0.001). Concurrently, a marked rise in severe AAC (SAAC) prevalence was noted (OR values for Quartiles 2, 3, and 4: 1.635, 1.842, and 2.572, respectively; P for trend < 0.01). Individuals aged 60 or above and those with a history of diabetes exhibited a heightened association. After characteristic selection, models for predicting AAC and SAAC were constructed. The AUC of the AAC model was 0.74 (95% CI = 0.71-0.77) and the AUC of the SAAC model was 0.84 (95% CI = 0.80-0.87). According to the calibration plots and DCA, both models showed high accuracy and clinical benefit.
Conclusion: These findings illuminate a potential correlation between elevated PIV and the presence of AAC. Our models indicate the potential utility of PIV combined with other simple predictors in the assessment and management of individuals with AAC.
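As commonly defined in the literature, PIV combines four routine blood-count measurements; a minimal sketch, with hypothetical example values (all counts in 10^9 cells/L, not taken from the NHANES cohort used in this study):

```python
def pan_immune_inflammation_value(neutrophils, platelets, monocytes, lymphocytes):
    """PIV as commonly defined in the literature:
    (neutrophil x platelet x monocyte counts) / lymphocyte count,
    with all counts expressed in 10^9 cells/L."""
    return neutrophils * platelets * monocytes / lymphocytes

# Hypothetical complete-blood-count values for illustration only.
piv = pan_immune_inflammation_value(
    neutrophils=4.0, platelets=250.0, monocytes=0.5, lymphocytes=2.0
)  # -> 250.0
```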
xTrimoPGLM: Unified 100B-Scale Pre-trained Transformer for Deciphering the Language of Protein
Protein language models have shown remarkable success in learning biological
information from protein sequences. However, most existing models are limited
by either autoencoding or autoregressive pre-training objectives, which makes
them struggle to handle protein understanding and generation tasks
concurrently. We propose a unified protein language model, xTrimoPGLM, to
address these two types of tasks simultaneously through an innovative
pre-training framework. Our key technical contribution is an exploration of the
compatibility and the potential for joint optimization of the two types of
objectives, which has led to a strategy for training xTrimoPGLM at an
unprecedented scale of 100 billion parameters and 1 trillion training tokens.
Our extensive experiments reveal that 1) xTrimoPGLM significantly outperforms
other advanced baselines in 18 protein understanding benchmarks across four
categories. The model also facilitates an atomic-resolution view of protein
structures, leading to an advanced 3D structural prediction model that
surpasses existing language-model-based tools. 2) xTrimoPGLM can not only
generate de novo protein sequences following the principles of natural ones,
but also perform programmable generation after supervised fine-tuning (SFT)
on curated sequences. These results highlight the substantial capability and
versatility of xTrimoPGLM in understanding and generating protein sequences,
contributing to the evolving landscape of foundation models in protein
science.