Distribution-Free Tests of Independence in High Dimensions
We consider the testing of mutual independence among all entries in a
high-dimensional random vector based on independent observations. We study
two families of distribution-free test statistics, which include Kendall's tau
and Spearman's rho as important examples. We show that under the null
hypothesis the test statistics of these two families converge weakly to Gumbel
distributions, and we propose tests that control the type I error in the
high-dimensional setting where the dimension may greatly exceed the sample
size. We further show that the two tests are rate-optimal in terms of power
against sparse alternatives and that they outperform competitors in
simulations, especially when the dimension is large.

Comment: to appear in Biometrika
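For intuition, the extreme-value limit the abstract alludes to has the following general shape, sketched in LaTeX below with $d$ the dimension, $n$ the sample size, and $\hat\tau_{jk}$ the sample Kendall's tau between coordinates $j$ and $k$. The precise statistics and normalizing constants are those defined in the paper; the version shown here, modelled on classical max-type correlation tests, should be read as an illustrative assumption rather than the paper's exact result.

% Schematic Gumbel-type limit for a max-type rank-correlation statistic.
% Constants are illustrative; the paper defines the exact normalization.
\[
  T_n \;=\; \max_{1 \le j < k \le d} \, n\, \hat{\tau}_{jk}^{\,2}
  \qquad \text{(largest squared pairwise Kendall's tau, suitably scaled)}
\]
\[
  \Pr\!\bigl( T_n - 4\log d + \log\log d \le y \bigr)
  \;\longrightarrow\;
  \exp\!\Bigl( -\tfrac{1}{\sqrt{8\pi}}\, e^{-y/2} \Bigr),
  \qquad y \in \mathbb{R}.
\]
% Rejecting when T_n exceeds the corresponding Gumbel quantile gives an asymptotic
% level-alpha test whose critical value does not depend on the data distribution.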
UniTime: A Language-Empowered Unified Model for Cross-Domain Time Series Forecasting
Multivariate time series forecasting plays a pivotal role in contemporary web
technologies. In contrast to conventional methods that involve creating
dedicated models for specific time series application domains, this research
advocates for a unified model paradigm that transcends domain boundaries.
However, learning an effective cross-domain model presents the following
challenges. First, various domains exhibit disparities in data characteristics,
e.g., the number of variables, posing hurdles for existing models that impose
inflexible constraints on these factors. Second, the model may encounter
difficulties in distinguishing data from various domains, leading to suboptimal
performance in our assessments. Third, the diverse convergence rates of time
series domains can also result in compromised empirical performance. To address
these issues, we propose UniTime for effective cross-domain time series
learning. Concretely, UniTime can flexibly adapt to data with varying
characteristics. It also uses domain instructions and a Language-TS Transformer
to offer identification information and align two modalities. In addition,
UniTime employs masking to alleviate domain convergence speed imbalance issues.
Our extensive experiments demonstrate the effectiveness of UniTime in advancing
state-of-the-art forecasting performance and zero-shot transferability.
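As a structural illustration only (not the authors' implementation, and without UniTime's masking or language-model components), the Python sketch below shows one way a single model can pair a tokenized domain instruction with a series whose number of variables differs across domains; every class name and hyperparameter here is hypothetical.

# Hypothetical sketch: one forecaster conditioned on a textual "domain instruction"
# that tolerates a varying number of input variables across domains.
import torch
import torch.nn as nn

class CrossDomainForecaster(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=2, horizon=24, vocab_size=32000):
        super().__init__()
        self.value_proj = nn.Linear(1, d_model)                # embed each scalar observation
        self.instr_embed = nn.Embedding(vocab_size, d_model)   # embed instruction tokens
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, horizon)                # forecast per variable

    def forward(self, series, instr_ids):
        # series: (batch, n_vars, seq_len); n_vars may differ across domains because
        # each variable is folded into the batch dimension rather than a fixed channel axis.
        b, v, t = series.shape
        x = self.value_proj(series.reshape(b * v, t, 1))       # (b*v, t, d_model)
        instr = self.instr_embed(instr_ids)                    # (b, n_tokens, d_model)
        instr = instr.repeat_interleave(v, dim=0)              # pair the instruction with every variable
        h = self.encoder(torch.cat([instr, x], dim=1))         # instruction tokens prefix the series
        return self.head(h[:, -1]).reshape(b, v, -1)           # (batch, n_vars, horizon)

# Toy usage: CrossDomainForecaster()(torch.randn(2, 3, 96), torch.randint(0, 32000, (2, 8)))
# returns a (2, 3, 24) forecast regardless of how many variables the domain provides.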
Plum: Prompt Learning using Metaheuristic
Since the emergence of large language models, prompt learning has become a
popular method for optimizing and customizing these models. Special prompts,
such as Chain-of-Thought, have even revealed previously unknown reasoning
capabilities within these models. However, the progress of discovering
effective prompts has been slow, driving a desire for general prompt
optimization methods. Unfortunately, few existing prompt learning methods
satisfy the criteria of being truly "general", i.e., automatic, discrete,
black-box, gradient-free, and interpretable all at once. In this paper, we
introduce metaheuristics, a branch of discrete non-convex optimization methods
with over 100 options, as a promising approach to prompt learning. Within our
paradigm, we test six typical methods: hill climbing, simulated annealing,
genetic algorithms with/without crossover, tabu search, and harmony search,
demonstrating their effectiveness in black-box prompt learning and
Chain-of-Thought prompt tuning. Furthermore, we show that these methods can be
used to discover more human-understandable prompts that were previously
unknown, opening the door to a cornucopia of possibilities in prompt
optimization. We release all the code at
https://github.com/research4pan/Plum
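To make the black-box, gradient-free setting concrete, here is a minimal hill-climbing sketch over discrete prompt edits. The scoring function and edit operations are placeholders you would supply (for example, dev-set accuracy obtained by querying an LLM), not functions from the Plum repository; the other metaheuristics in the list differ mainly in how candidate moves are proposed and accepted.

# Minimal hill-climbing sketch for black-box, gradient-free prompt search.
# `score_fn` and `candidate_edits` are user-supplied placeholders, not a Plum API.
import random

def hill_climb_prompt(seed_prompt, score_fn, candidate_edits, n_steps=50, seed=0):
    rng = random.Random(seed)
    best_prompt, best_score = seed_prompt, score_fn(seed_prompt)
    for _ in range(n_steps):
        # Propose a neighbor by applying one discrete edit (insert / replace a phrase).
        edit = rng.choice(candidate_edits)
        candidate = edit(best_prompt)
        cand_score = score_fn(candidate)
        if cand_score > best_score:      # greedy acceptance; simulated annealing would
            best_prompt, best_score = candidate, cand_score  # sometimes accept worse moves
    return best_prompt, best_score

# Toy usage with a trivial scorer that rewards prompts containing "step".
edits = [lambda p: p + " Let's think step by step.",
         lambda p: p.replace("Answer:", "Reason first, then answer:")]
print(hill_climb_prompt("Answer: what is 2+2?", lambda p: p.count("step"), edits, n_steps=10))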
Application of diffusion tensor imaging in the diagnosis of post-stroke aphasia: a meta-analysis and systematic review
Introduction: Diffusion tensor imaging (DTI) indicators of different white matter (WM) fibers and brain region lesions for post-stroke aphasia (PSA) are inconsistent across existing studies. Our study examines the consistency of, and differences between, PSA assessments performed with DTI, with the aim of drawing conclusions about DTI in PSA assessment that are consistent and independent across studies.

Methods: To gather relevant studies using DTI for diagnosing PSA, we searched the Web of Science, PubMed, Embase, and CNKI databases. Based on the screening and evaluation of the included studies, a meta-analysis was used to conduct a quantitative analysis. Narrative descriptions were provided for studies that met the inclusion criteria but lacked usable data.

Results: First, we reported on the left hemisphere. The meta-analysis showed that fractional anisotropy (FA) of the arcuate fasciculus (AF) and superior longitudinal fasciculus (SLF), the inferior frontal-occipital fasciculus (IFOF), the inferior longitudinal fasciculus (ILF), and the uncinate fasciculus (UF) was decreased in the PSA group in comparison with healthy controls (p < 0.00001). However, for axial diffusivity (AD), there was no statistically significant difference in the white matter fiber tracts of the dual-stream language model in the PSA group. Elevated radial diffusivity (RD) was seen only in the IFOF and ILF (p = 0.01 for IFOF; p = 0.05 for ILF). In the classic Broca's area, FA of the PSA group was decreased (p < 0.00001) while the apparent diffusion coefficient was elevated (p = 0.03). Second, we evaluated the white matter fiber tracts of the dual-stream language model in the right hemisphere. FA of the PSA group was decreased only in the IFOF (p = 0.001). AD was elevated in the AF and UF (p < 0.00001 for AF; p = 0.009 for UF). RD was elevated in the AF and UF (p = 0.01 for AF; p = 0.003 for UF). The other fiber tracts did not show similar alterations.

Conclusion: DTI is vital for diagnosing PSA because it detects WM changes effectively, but it still has some limitations. In the absence of relevant language scales and clinical manifestations, using DTI alone to diagnose and differentiate PSA remains challenging.

Systematic review registration: https://www.crd.york.ac.uk/PROSPERO/display_record.php?RecordID=365897
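For background on how pooled group differences of the kind reported above are typically computed (the review's own statistical model and software are not described in this abstract), a minimal fixed-effect inverse-variance pooling sketch follows; all study numbers in the example are fabricated placeholders, not data from the review.

# Minimal fixed-effect inverse-variance pooling of per-study effect sizes.
# Illustrative only: the example effects and standard errors are fabricated.
import math

def pool_fixed_effect(effects, std_errs):
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    z = pooled / pooled_se
    # Two-sided p-value from the normal approximation.
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return pooled, pooled_se, p

# Example: three hypothetical studies reporting lower FA in the PSA group.
print(pool_fixed_effect(effects=[-0.8, -1.1, -0.6], std_errs=[0.25, 0.30, 0.20]))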
Mitigating the Alignment Tax of RLHF
LLMs acquire a wide range of abilities during pre-training, but aligning LLMs
under Reinforcement Learning with Human Feedback (RLHF) can lead to forgetting,
which is also known as the alignment tax. To empirically verify this
hypothesis, we conducted experiments with existing RLHF algorithms using
OpenLLaMA-3B, which revealed a pronounced alignment tax in NLP tasks. On the
other hand, despite various techniques to mitigate forgetting, they are often
at odds with the RLHF performance, leading to a trade-off between reward
maximization and forgetting mitigation.
In light of the above pressing issue in aligning LLMs, in this paper we
explore model averaging, which interpolates between pre- and post-RLHF model
weights, to achieve a more efficient reward-tax Pareto front. To understand its
effectiveness, we offer theoretical insights into model averaging, revealing
that it enhances the performance Pareto front by increasing feature diversity on
the layers where tasks share overlapped feature spaces. Empirical evidence
corroborates our analysis by showing the benefits of averaging low-level
transformer layers. Building on the analysis and the observation that averaging
different layers of the transformer leads to significantly different reward-tax
trade-offs, we propose Adaptive Model Averaging (AMA) to adaptively find
various combination ratios of model layers. AMA seeks to maximize the alignment
reward while incurring minimal alignment tax. Moreover, we validate AMA's
performance across a range of RLHF algorithms over OpenLLaMA-3B and further
extend our findings to Mistral-7B.

Comment: 28 pages
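To make the model-averaging idea concrete, the sketch below interpolates pre- and post-RLHF checkpoints, optionally with a different ratio per parameter group in the spirit of the layer-wise observation above; the actual AMA search for combination ratios is not reproduced, and all names are illustrative.

# Sketch of weight interpolation between a pre-RLHF and a post-RLHF checkpoint.
# A single global ratio gives plain model averaging; a per-parameter ratio dict
# mimics the layer-wise flavor (the AMA ratio search itself is not implemented).
import torch

def average_checkpoints(pre_state, post_state, ratio=0.5, per_layer_ratio=None):
    """Return a state dict with theta = (1 - r) * theta_pre + r * theta_post."""
    merged = {}
    for name, pre_w in pre_state.items():
        r = per_layer_ratio.get(name, ratio) if per_layer_ratio else ratio
        merged[name] = (1.0 - r) * pre_w + r * post_state[name]
    return merged

# Usage (illustrative): load two checkpoints of the same architecture and blend them.
# pre = torch.load("pre_rlhf.pt");  post = torch.load("post_rlhf.pt")
# model.load_state_dict(average_checkpoints(pre, post, ratio=0.3))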