537 research outputs found
Model-free controlled variable selection via data splitting
Addressing the simultaneous identification of contributory variables while
controlling the false discovery rate (FDR) in high-dimensional data is a
crucial statistical challenge. In this paper, we propose a novel model-free
variable selection procedure within the sufficient dimension reduction framework via a
data-splitting technique. The variable selection problem is first converted to
a least squares procedure with several response transformations. We construct a
series of statistics with global symmetry property and leverage the symmetry to
derive a data-driven threshold aimed at error rate control. Our approach achieves
finite-sample and asymptotic FDR control under mild theoretical conditions.
Numerical experiments confirm that our procedure attains satisfactory FDR control
and higher power than existing methods. Comment: 55 pages, 5 figures, 6 tables
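The data-driven threshold described above follows the general symmetry (mirror-statistic) recipe used by data-splitting selection procedures. Below is a minimal sketch of that generic idea only, not the paper's exact construction; the statistics W and the `symmetry_threshold` helper are illustrative assumptions.

```python
import numpy as np

def symmetry_threshold(W, q=0.1):
    """Data-driven threshold for symmetric statistics W: under the null,
    W_j is (approximately) symmetric about zero, so #{W_j <= -t} estimates
    the number of false discoveries among {W_j >= t}. Generic mirror-statistic
    rule, not the paper's exact procedure."""
    candidates = np.sort(np.abs(W[W != 0]))
    for t in candidates:
        fdp_hat = (1 + np.sum(W <= -t)) / max(np.sum(W >= t), 1)
        if fdp_hat <= q:
            return t
    return np.inf

# Toy usage with synthetic statistics: 90 nulls around 0, 10 signals around 4.
rng = np.random.default_rng(0)
W = np.concatenate([rng.normal(0.0, 1.0, 90), rng.normal(4.0, 1.0, 10)])
t = symmetry_threshold(W, q=0.1)
selected = np.where(W >= t)[0]  # indices passing the threshold
```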
High-dimensional Inference and FDR Control for Simulated Markov Random Fields
Identifying important features linked to a response variable is a fundamental
task in various scientific domains. This article explores statistical inference
for simulated Markov random fields in high-dimensional settings. We introduce a
methodology based on Markov Chain Monte Carlo Maximum Likelihood Estimation
(MCMC-MLE) with Elastic-net regularization. Under mild conditions on the MCMC
sampler, the penalized MCMC-MLE estimator achieves consistency. We
propose a decorrelated score test, establishing both its asymptotic normality
and that of a one-step estimator, along with the associated confidence
interval. Furthermore, we construct two false discovery rate control procedures
via the asymptotic behaviors for both p-values and e-values. Comprehensive
numerical simulations confirm the theoretical validity of the proposed methods
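For context, the two FDR procedures mentioned above operate on p-values and e-values, respectively. The sketch below shows only the standard BH and e-BH rejection rules, assuming the p-values and e-values have already been computed; the function names are illustrative and are not the paper's decorrelated-score implementation.

```python
import numpy as np

def benjamini_hochberg(pvals, q=0.1):
    """Standard BH step-up rule: reject the k smallest p-values, where k is
    the largest index with p_(k) <= q * k / m."""
    p = np.asarray(pvals)
    m = len(p)
    order = np.argsort(p)
    passed = np.nonzero(p[order] <= q * np.arange(1, m + 1) / m)[0]
    if passed.size == 0:
        return np.array([], dtype=int)
    return order[: passed.max() + 1]

def e_bh(evals, q=0.1):
    """e-BH rule: reject the k largest e-values, where k is the largest index
    with e_(k) >= m / (q * k)."""
    e = np.asarray(evals)
    m = len(e)
    order = np.argsort(-e)
    ok = np.nonzero(e[order] >= m / (q * np.arange(1, m + 1)))[0]
    if ok.size == 0:
        return np.array([], dtype=int)
    return order[: ok.max() + 1]
```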
Learning to Check: Unleashing Potentials for Self-Correction in Large Language Models
Self-correction has achieved impressive results in enhancing the style and
security of the generated output from large language models (LLMs). However,
recent studies suggest that self-correction might be limited or even
counterproductive in reasoning tasks due to LLMs' difficulties in identifying
logical mistakes.
In this paper, we aim to enhance the self-checking capabilities of LLMs by
constructing training data for checking tasks. Specifically, we apply the Chain
of Thought (CoT) methodology to self-checking tasks, utilizing fine-grained
step-level analyses and explanations to assess the correctness of reasoning
paths. We propose a specialized checking format called "Step CoT Check".
Following this format, we construct a checking-correction dataset that includes
detailed step-by-step analysis and checking. Then we fine-tune LLMs to enhance
their error detection and correction abilities.
Our experiments demonstrate that fine-tuning with the "Step CoT Check" format
significantly improves the self-checking and self-correction abilities of LLMs
across multiple benchmarks. This approach outperforms other formats, especially
in locating the position of errors, with greater benefits observed in larger
models.
For reproducibility, all the datasets and code are provided at
https://github.com/bammt/Learn-to-check
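The exact "Step CoT Check" schema is defined by the paper and its repository; the sketch below only illustrates, with hypothetical field names, what a step-level checking-correction training example of this kind might look like.

```python
import json

# Hypothetical illustration of a step-level checking example; the real
# "Step CoT Check" format is specified in the paper and repository.
example = {
    "question": "Tom has 3 boxes with 4 apples each. He eats 2 apples. How many are left?",
    "solution_steps": [
        "Step 1: 3 boxes * 4 apples = 12 apples.",
        "Step 2: 12 - 2 = 9 apples left.",  # deliberate arithmetic error
    ],
    "check": [
        {"step": 1, "verdict": "correct", "analysis": "3 * 4 = 12 is right."},
        {"step": 2, "verdict": "incorrect", "analysis": "12 - 2 = 10, not 9."},
    ],
    "correction": "Step 2 should be: 12 - 2 = 10 apples left.",
}

print(json.dumps(example, indent=2))
```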
CaMML: Context-Aware Multimodal Learner for Large Models
In this work, we introduce the Context-Aware MultiModal Learner (CaMML) for
tuning large multimodal models (LMMs). CaMML, a lightweight module, is crafted
to seamlessly integrate multimodal contextual samples into large models,
thereby empowering the model to derive knowledge from analogous,
domain-specific, up-to-date information and make grounded inferences.
Importantly, CaMML is highly scalable and can efficiently handle lengthy
multimodal context examples owing to its hierarchical design. Based on CaMML,
we have developed two multimodal models, CaMML-7B and CaMML-13B, that have
shown exceptional performance across an array of benchmark datasets for
multimodal tasks. Remarkably, CaMML-13B achieves state-of-the-art performance
on more than ten widely recognized multimodal benchmark datasets, surpassing
LLaVA-1.5 (13B) by a noticeable margin without integrating any external
resources. Moreover, we have conducted extensive ablation studies to inspect
the inner workings of CaMML and performed qualitative analyses to showcase its
effectiveness in handling challenging real-world cases. Code and models are
available at https://github.com/amazon-science/camml. Comment: Preprint
Smart Agent-Based Modeling: On the Use of Large Language Models in Computer Simulations
Computer simulations offer a robust toolset for exploring complex systems
across various disciplines. A particularly impactful approach within this realm
is Agent-Based Modeling (ABM), which harnesses the interactions of individual
agents to emulate intricate system dynamics. ABM's strength lies in its
bottom-up methodology, illuminating emergent phenomena by modeling the
behaviors of individual components of a system. Yet, ABM has its own set of
challenges, notably the difficulty of encoding natural language instructions
and common sense in mathematical equations or rules. This paper seeks to
transcend these boundaries by integrating Large Language Models (LLMs) like GPT
into ABM. This integration yields a novel framework, Smart Agent-Based
Modeling (SABM). Building upon the concept of smart agents -- entities
characterized by their intelligence, adaptability, and computational ability --
we explore the use of LLM-powered agents to simulate real-world scenarios with
increased nuance and realism. In this comprehensive
exploration, we elucidate the state of the art of ABM, introduce SABM's
potential and methodology, and present three case studies (source codes
available at https://github.com/Roihn/SABM), demonstrating the SABM methodology
and validating its effectiveness in modeling real-world systems. Furthermore,
we cast a vision towards several aspects of the future of SABM, anticipating a
broader horizon for its applications. Through this endeavor, we aspire to
redefine the boundaries of computer simulations, enabling a more profound
understanding of complex systems.
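As a rough illustration of the SABM idea of LLM-powered agents, one simulation step might look like the sketch below; the `query_llm` stub and the `SmartAgent` class are assumptions for this sketch, not the SABM API from the repository.

```python
import random

def query_llm(prompt: str) -> str:
    """Placeholder for an LLM call; SABM's actual interface lives in its repo."""
    return random.choice(["cooperate", "defect"])

class SmartAgent:
    """A minimal 'smart agent': it keeps a memory and asks an LLM for actions."""
    def __init__(self, name, persona):
        self.name, self.persona, self.memory = name, persona, []

    def decide(self, observation: str) -> str:
        prompt = (f"You are {self.name}, {self.persona}.\n"
                  f"Recent memory: {self.memory[-3:]}\n"
                  f"Observation: {observation}\nChoose an action:")
        action = query_llm(prompt)
        self.memory.append((observation, action))
        return action

# One simulation step: each agent observes the shared state and acts.
agents = [SmartAgent("A1", "a cautious trader"), SmartAgent("A2", "a risk taker")]
state = "market is volatile"
actions = {a.name: a.decide(state) for a in agents}
print(actions)
```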
CARF expression-based screening of steatosis-modulating compounds
Introduction: Steatosis, the excessive accumulation of fat in the liver, is a stressed state of the liver caused by various factors such as obesity, diabetes, alcohol consumption, certain medications, and non-alcoholic fatty liver disease. While it is generally considered a benign condition, prolonged steatosis often progresses to serious liver diseases including liver fibrosis, nonalcoholic steatohepatitis and hepatocarcinoma. Whereas drug development is an expensive and long process, anti-steatosis natural compounds are anticipated to be useful for managing this condition and preventing its progression to complicated lethal pathologies. The CARF protein has been shown to play a key role in regulating the cellular response to stress, directing cells to apoptosis, senescence, or malignant transformation at low, high, and super-high levels, respectively. Most recently, CARF expression has been shown to serve as a quantitative marker of the stress response [1,2].
Methods: We used CARF expression screening in liver-derived cells (HepG2) as an assay to select compounds with steatosis-modulating activity. Cells were treated with free fatty acid (FFA) and analyzed for CARF expression by Western blotting and immunostaining with specific antibodies raised in our laboratory. In parallel assays, cells were subjected to Nile Red (NR) staining. We also used an additional marker, Mortalin, which has been shown to regulate liver fibrosis, HCC and its recurrence.
Results: HepG2 cells treated with a non-toxic concentration of FFA showed downregulation of CARF, suggesting its role in lipid metabolism, in line with a recent report [3]. We investigated whether this phenomenon could be used as an assay system to screen anti-steatosis compounds. FFA-treated HepG2 cells were subjected to 30 small molecules. Expression analysis revealed modulation of CARF expression with ~18 of the 30 compounds. Parallel analyses of FFA accumulation by NR staining showed a decrease in cells treated with ~14 of these 18 compounds. Several of these compounds showed similar structures and belonged to the withanolide class of phytochemicals. Furthermore, crude extracts from Ashwagandha containing a mixture of these withanolides showed a remarkable response, suggesting the use of CARF expression as a reliable reporter assay for anti-steatosis compound screening. Such compounds may offer a convenient and economical way to manage steatosis and related liver pathologies.
Conclusion: CARF expression-based screening of a small number of natural compounds led to the identification of candidate steatosis-modulating compounds that warrant further molecular analyses.
Evaluating and Inducing Personality in Pre-trained Language Models
Standardized and quantified evaluation of machine behaviors is a crux of
understanding LLMs. In this study, we draw inspiration from psychometric
studies by leveraging human personality theory as a tool for studying machine
behaviors. Originating as a philosophical quest for human behaviors, the study
of personality delves into how individuals differ in thinking, feeling, and
behaving. Toward building and understanding human-like social machines, we are
motivated to ask: Can we assess machine behaviors by leveraging human
psychometric tests in a principled and quantitative manner? If so, can we
induce a specific personality in LLMs? To answer these questions, we introduce
the Machine Personality Inventory (MPI) tool for studying machine behaviors;
MPI follows standardized personality tests, built upon the Big Five Personality
Factors (Big Five) theory and personality assessment inventories. By
systematically evaluating LLMs with MPI, we provide the first piece of evidence
demonstrating the efficacy of MPI in studying LLM behaviors. We further devise
a Personality Prompting (P^2) method to induce specific personalities in LLMs
in a controllable way, capable of producing diverse and verifiable behaviors.
We hope this work sheds light on future studies that adopt personality as an
essential indicator for various downstream tasks, and that it further motivates
research into equally intriguing human-like machine behaviors. Comment: Accepted at NeurIPS 2023 (Spotlight)
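MPI follows standardized Big Five inventories; the sketch below shows, with made-up items, how Likert-style responses could be aggregated into per-trait scores. The real MPI item bank and scoring rules are those of the paper, so everything here is an illustrative assumption.

```python
from collections import defaultdict

# Hypothetical items: (trait, reverse_keyed, statement). The actual MPI items
# and scoring are defined in the paper; this only shows the aggregation idea.
items = [
    ("Extraversion", False, "You are the life of the party."),
    ("Extraversion", True,  "You don't talk a lot."),
    ("Neuroticism",  False, "You get stressed out easily."),
]

def score(responses, items, scale=5):
    """responses[i] is a 1..scale Likert answer to items[i]; reverse-keyed
    items are flipped before averaging within each trait."""
    totals, counts = defaultdict(float), defaultdict(int)
    for r, (trait, reverse, _) in zip(responses, items):
        r = (scale + 1 - r) if reverse else r
        totals[trait] += r
        counts[trait] += 1
    return {t: totals[t] / counts[t] for t in totals}

print(score([5, 2, 3], items))  # per-trait mean scores
```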
DialCoT Meets PPO: Decomposing and Exploring Reasoning Paths in Smaller Language Models
Chain-of-Thought (CoT) prompting has proven to be effective in enhancing the
reasoning capabilities of Large Language Models (LLMs) with at least 100
billion parameters. However, it is ineffective or even detrimental when applied
to reasoning tasks in Smaller Language Models (SLMs) with fewer than 10 billion
parameters. To address this limitation, we introduce Dialogue-guided
Chain-of-Thought (DialCoT), which employs a dialogue format to generate
intermediate reasoning steps, guiding the model toward the final answer.
Additionally, we optimize the model's reasoning path selection using the
Proximal Policy Optimization (PPO) algorithm, further enhancing its reasoning
capabilities. Our method offers several advantages compared to previous
approaches. Firstly, we transform the process of solving complex reasoning
questions by breaking them down into a series of simpler sub-questions,
significantly reducing the task difficulty and making it more suitable for
SLMs. Secondly, we optimize the model's reasoning path selection through the
PPO algorithm. We conduct comprehensive experiments on four arithmetic
reasoning datasets, demonstrating that our method achieves significant
performance improvements compared to state-of-the-art competitors. Comment: Accepted to EMNLP 2023
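To make the dialogue-style decomposition concrete, here is a minimal sketch of sub-question generation and answering with a generic text-in/text-out model call; the actual DialCoT prompt format and the PPO-based path selection are specified in the paper, so the prompts and the `ask` stub below are assumptions.

```python
from typing import Callable

def dialogue_cot(question: str, ask: Callable[[str], str], max_turns: int = 5) -> str:
    """Decompose a question into sub-questions via a dialogue, then answer.
    `ask` is any text-in/text-out model call (e.g. an SLM)."""
    history = [f"Question: {question}"]
    for _ in range(max_turns):
        sub_q = ask("\n".join(history) + "\nNext sub-question (or say FINAL):")
        if sub_q.strip().upper().startswith("FINAL"):
            break
        sub_a = ask("\n".join(history) + f"\nSub-question: {sub_q}\nAnswer:")
        history += [f"Sub-question: {sub_q}", f"Answer: {sub_a}"]
    return ask("\n".join(history) + "\nFinal answer:")

# Toy usage with a stub model that immediately finishes and answers "14".
print(dialogue_cot("What is 3 * 4 + 2?",
                   ask=lambda p: "FINAL" if "sub-question" in p.lower() else "14"))
```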
Study on stability and bearing characteristics of macroscopic pressure arch of surrounding rock in western deep buried stope of China
In view of the loose and weak occurrence characteristics of the deeply buried, thick, weakly cemented strata in the western mining areas of China, the bearing characteristics and stability mechanism of the macroscopic surrounding rock pressure arch (SRPA) are studied. Firstly, considering the engineering characteristics of deep mining, an SRPA model with a trapezoidal load was constructed based on three-hinged arch theory; the equations for the shape characteristics, rise-span ratio, and arch thickness were derived, and the arch thickness under different stress paths was analyzed to characterize the bearing performance of the pressure arch. Secondly, the internal force distribution law and the type of instability damage were studied by establishing a two-dimensional bearing SRPA model through hingeless arch theory. The instability type and location can be accurately judged, and were verified by similar-material simulation. The results show that the rational arch axis of the SRPA is a downward-opening cubic parabola, and its rise-span ratio lies between 0.3 and 0.5. Increasing the rise-span ratio and the lateral pressure coefficient promotes the stable bearing capacity of the arch. The axial force distribution on the SRPA section is basically consistent with the arch axis, where the arch has the best bearing characteristics. Positive bending moments occur in the ranges of [0°, 30°] and [81°, 90°] on both sides of the symmetry axis, which are prone to tensile failure. The maximum shear force is concentrated at the arch waist and skewback, and these sections are prone to shear failure. The instability modes of the SRPA can be divided into "skewback—vault (arch waist)" and "vault (arch waist)—skewback". The research results provide theoretical guidance for mining roof management.
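For orientation, the notation below records only the standard definition of the rise-span ratio and a generic downward-opening cubic arch axis; the paper's specific coefficients follow from three-hinged arch theory under the trapezoidal load and are not reproduced here.

```latex
% Illustrative notation only; the exact coefficients come from the paper's
% three-hinged arch derivation under a trapezoidal load.
% Rise-span ratio of the pressure arch (rise f, span L), with the reported
% stable range:
\[
  k \;=\; \frac{f}{L}, \qquad 0.3 \le k \le 0.5 .
\]
% Generic cubic arch axis (downward-opening), with coefficients a_i fixed by
% the hinge (boundary) conditions and the load distribution:
\[
  y(x) \;=\; a_3 x^{3} + a_2 x^{2} + a_1 x + a_0 , \qquad 0 \le x \le L .
\]
```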
Seismic Waveform Inversion Using the Finite-Difference Contrast Source Inversion Method
This paper extends the finite-difference contrast source inversion method
to reconstruct the mass density for two-dimensional elastic wave inversion in the
framework of full-waveform inversion. The contrast source inversion method is a
nonlinear iterative method that alternately reconstructs the contrast sources and
the contrast function. One of the most outstanding advantages of this inversion
method is its high computational efficiency, since it does not need to solve a
full forward problem at each inversion iteration. Another attractive feature is
its strong capability for dealing with nonlinear inverse problems in an
inhomogeneous background medium, because a finite-difference operator is used to
represent the differential operator governing two-dimensional elastic wave
propagation. Additionally, multiplicative regularization and sequential
multifrequency inversion are employed to enhance the quality of the
reconstructions. Numerical results show that the inversion method performs
excellently in reconstructing objects embedded in either a homogeneous or an
inhomogeneous background medium.
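For reference, the classical contrast source inversion objective makes the alternating structure explicit. It is shown here in its generic form (following the standard acoustic CSI formulation), not the paper's finite-difference elastic variant, which replaces the integral operators with finite-difference operators acting on elastic wavefields.

```latex
% Generic CSI cost functional, for illustration only:
\[
  F(\chi, w_j) \;=\;
  \frac{\sum_j \lVert d_j - G_S\, w_j \rVert_S^2}{\sum_j \lVert d_j \rVert_S^2}
  \;+\;
  \frac{\sum_j \lVert \chi\, u_j^{\mathrm{inc}} - w_j + \chi\, G_D\, w_j \rVert_D^2}
       {\sum_j \lVert \chi\, u_j^{\mathrm{inc}} \rVert_D^2},
\]
% where \chi is the contrast function, w_j = \chi u_j are the contrast sources
% for source index j, d_j are the measured data, and G_S, G_D map contrast
% sources to the data and object domains. CSI alternately updates w_j and \chi
% to reduce F, avoiding a full forward solve at each iteration.
```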