Correction to the paper: Support functionals and smoothness in Musielak-Orlicz sequence spaces endowed with the Luxemburg norm
Tracking Referential Adaptation To Nonbinary They in A Mouse-Tracking Paradigm
In referential adaptation, people adapt to the relationships between a pronoun and its antecedent (Johnson & Arnold, 2022). Most evidence comes from offline tasks testing how exposure influences the resolution of subject and nonsubject referents. Arnold et al. (2023) used mouse-tracking to examine the processing of singular vs. plural they and found that singular they elicits processing difficulty. Building on these findings, the current study tests 1) the sensitivity of mouse-tracking to the processing of subject and nonsubject references and 2) adaptation to singular they. Participants listened to stories about two characters doing an activity, followed by a pronoun that was disambiguated by a target object placed under one of the characters; participants then clicked on the target object. Mouse movements and reaction times were analyzed. Two pilots tested the processing of subject and nonsubject interpretations but failed to replicate the subject bias. The main experiment exposed participants to singular or plural they and found adaptation effects.
Master of Arts
Semiparametric Analysis for Correlated Recurrent and Terminal Events
In clinical and observational studies, recurrent event data (e.g., hospitalization) with a terminal event (e.g., death) are often encountered. In many instances, the terminal event is strongly correlated with the recurrent event process. In this article, we propose a semiparametric method to jointly model the recurrent and terminal event processes. The dependence is modeled by a shared gamma frailty included in both the recurrent event rate and the terminal event hazard function. Marginal models are used to estimate the regression effects on the terminal and recurrent event processes, and a Poisson model is used to estimate the dispersion of the frailty variable. A sandwich estimator is used to achieve additional robustness. An analysis of hospitalization data for patients in a peritoneal dialysis study is presented to illustrate the proposed method.
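The shared-frailty dependence described above can be illustrated with a small simulation: a minimal sketch (not the authors' estimation procedure) in which a single gamma frailty multiplies both the recurrent event rate and the terminal hazard, so high-frailty subjects both accumulate more events and die sooner. All rates and the censoring time below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_subject(base_rate=0.5, death_rate=0.1, theta=1.0, tau=10.0):
    """One subject: a shared gamma frailty w multiplies both the recurrent
    event rate and the terminal (death) hazard, inducing positive
    correlation between the two processes.  tau is administrative censoring.
    """
    w = rng.gamma(1.0 / theta, theta)               # mean 1, variance theta
    death = rng.exponential(1.0 / (w * death_rate))  # terminal event time
    follow_up = min(death, tau)                      # observed follow-up
    # homogeneous Poisson recurrent process with rate w * base_rate
    n = rng.poisson(w * base_rate * follow_up)
    times = np.sort(rng.uniform(0.0, follow_up, n))
    return times, follow_up, death < tau
```

Averaging over many simulated subjects, those who die early tend to have higher observed event rates, which is exactly the correlation the joint model targets.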
Two-dimensional modeling of the tearing-mode-governed magnetic reconnection in the large-scale current sheet above the two-ribbon flare
We model magnetic reconnection during a two-ribbon flare in the gravitationally stratified solar atmosphere using 2D simulations at a prescribed Lundquist number. We find that the tearing mode instability leads to inhomogeneous turbulence inside the reconnecting current sheet (CS) and invokes the fast phase of reconnection. Fast reconnection introduces additional dissipation of the magnetic field, which appreciably enhances the reconnection rate. The energy spectrum in the CS shows a power-law pattern, and the dynamics of plasmoids governs the associated spectral index. The energy dissipation occurs at a scale of 100-200 km, and the associated CS thickness ranges from 1500 to 2500 km, which follows the Taylor scale. The termination shock (TS) appears in the turbulent region above the flare loops and is an important contributor to heating them. Substantial magnetic energy is converted into both kinetic and thermal energy via the TS, and the cumulative heating rate is greater than the rate of kinetic energy transfer. In addition, the turbulence is amplified by the TS, with an amplitude related to the local geometry of the TS.
Comment: 22 pages, 10 figures; accepted for publication in Research in Astronomy and Astrophysics
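For reference, the Lundquist number that controls the onset of the tearing (plasmoid) instability in such a current sheet is defined as

```latex
S = \frac{L\, v_A}{\eta}, \qquad v_A = \frac{B}{\sqrt{\mu_0 \rho}}
```

where L is the current-sheet length, v_A the Alfvén speed, η the magnetic diffusivity, B the reconnecting field strength, and ρ the mass density; the specific value adopted in the simulations is given in the paper.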
Qilin-Med: Multi-stage Knowledge Injection Advanced Medical Large Language Model
Integrating large language models (LLMs) into healthcare presents great potential but also faces challenges. Directly pre-training LLMs for domains like medicine is resource-heavy and sometimes infeasible, while sole reliance on Supervised Fine-tuning (SFT) can result in overconfident predictions and may not tap into domain-specific insights. To address these challenges, we present a multi-stage training method combining Domain-specific Continued Pre-training (DCPT), SFT, and Direct Preference Optimization (DPO). A notable contribution of our study is the introduction of a 3 GB Chinese Medicine (ChiMed) dataset, encompassing medical question answering, plain texts, knowledge graphs, and dialogues, segmented into three training stages. The medical LLM trained with our pipeline, Qilin-Med, exhibits significant performance boosts. In the DCPT and SFT phases, it achieves 38.4% and 40.0% accuracy on CMExam, surpassing Baichuan-7B's 33.5%. In the DPO phase, on the Huatuo-26M test set, it scores 16.66 in BLEU-1 and 27.44 in ROUGE-1, outperforming the SFT-only model's 12.69 and 24.21. These results highlight the strength of our training approach in refining LLMs for medical applications.
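The DPO stage optimizes a preference objective over chosen/rejected response pairs. A minimal sketch of the standard DPO loss (Rafailov et al.) in scalar form, assuming the paper uses the usual formulation:

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Direct Preference Optimization loss for one preference pair.

    Inputs are summed log-probabilities of the chosen / rejected responses
    under the policy being trained (pi_*) and the frozen reference model
    (ref_*).  The loss pushes the policy to widen the chosen-vs-rejected
    margin relative to the reference model; beta scales that margin.
    """
    margin = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))
```

When the policy matches the reference model, the margin is zero and the loss equals log 2; any improvement of the chosen response over the rejected one (beyond the reference) lowers it.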
Neural Multi-Objective Combinatorial Optimization with Diversity Enhancement
Most existing neural methods for multi-objective combinatorial optimization (MOCO) problems rely solely on decomposition, which often leads to repetitive solutions for the respective subproblems and thus a limited Pareto set. Going beyond decomposition, we propose a novel neural heuristic with diversity enhancement (NHDE) to produce more Pareto solutions from two perspectives. On the one hand, to avoid duplicated solutions across different subproblems, we propose an indicator-enhanced deep reinforcement learning method to guide the model, and design a heterogeneous graph attention mechanism to capture the relations between the instance graph and the Pareto front graph. On the other hand, to uncover more solutions in the neighborhood of each subproblem, we present a multiple Pareto optima strategy to sample and preserve desirable solutions. Experimental results on classic MOCO problems show that NHDE generates a Pareto front with higher diversity, thereby achieving superior overall performance. Moreover, NHDE is generic and can be applied to different neural methods for MOCO.
Comment: Accepted at NeurIPS 202
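Indicator-enhanced rewards score a solution set by how much of the objective space it dominates, so duplicated solutions add nothing. The abstract does not name the specific indicator; as one common choice, the 2D hypervolume for a minimization problem can be computed as:

```python
def hypervolume_2d(points, ref):
    """Hypervolume (dominated area) of a 2D point set under minimization,
    measured against a reference point `ref` that is worse in both
    objectives.  Dominated points contribute nothing."""
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in sorted(points):       # sweep by first objective
        if f2 < prev_f2:                # non-dominated step of the front
            hv += (ref[0] - f1) * (prev_f2 - f2)
            prev_f2 = f2
    return hv
```

For the front {(1, 3), (2, 2), (3, 1)} with reference (4, 4), the dominated area is 6.0; adding a duplicate or dominated point leaves it unchanged, which is what makes such indicators diversity-aware.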
Rethinking Multi-Interest Learning for Candidate Matching in Recommender Systems
Existing research efforts for multi-interest candidate matching in
recommender systems mainly focus on improving model architecture or
incorporating additional information, neglecting the importance of training
schemes. This work revisits the training framework and uncovers two major
problems hindering the expressiveness of learned multi-interest
representations. First, the current training objective (i.e., uniformly sampled
softmax) fails to effectively train discriminative representations in a
multi-interest learning scenario due to the severe increase in easy negative
samples. Second, a routing collapse problem is observed where each learned
interest may collapse to express information only from a single item, resulting
in information loss. To address these issues, we propose the REMI framework,
consisting of an Interest-aware Hard Negative mining strategy (IHN) and a
Routing Regularization (RR) method. IHN emphasizes interest-aware hard
negatives by proposing an ideal sampling distribution and developing a
Monte-Carlo strategy for efficient approximation. RR prevents routing collapse
by introducing a novel regularization term on the item-to-interest routing
matrices. These two components enhance the learned multi-interest
representations from both the optimization objective and the composition
information. REMI is a general framework that can be readily applied to various
existing multi-interest candidate matching methods. Experiments on three
real-world datasets show our method can significantly improve state-of-the-art
methods with easy implementation and negligible computational overhead. The
source code will be released.
Comment: RecSys 202
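The IHN idea of emphasizing interest-aware hard negatives can be sketched as importance-weighting uniformly drawn negatives by their score under the current interest vector; the function name and the inverse temperature `beta` below are illustrative, and the exact REMI sampling distribution may differ:

```python
import numpy as np

def hard_negative_weights(user_interest, item_embs, neg_idx, beta=0.5):
    """Reweight uniformly sampled negatives toward 'hard' ones.

    Monte-Carlo approximation: sample negative items uniformly, then
    weight each by exp(beta * score) so high-scoring (hard) negatives
    dominate the softmax denominator.  beta interpolates between uniform
    sampling (beta=0) and hardest-only sampling (large beta).
    """
    scores = item_embs[neg_idx] @ user_interest          # (n_neg,)
    w = np.exp(beta * (scores - scores.max()))           # numerically stable
    return w / w.sum()
```

Because the weights are computed from already-drawn uniform samples, this adds only a dot product and a softmax per batch, consistent with the negligible overhead claimed above.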
Information Bottleneck Revisited: Posterior Probability Perspective with Optimal Transport
The information bottleneck (IB) is a paradigm for extracting the information about one target random variable contained in another relevant random variable; it has aroused great interest due to its potential to explain deep neural networks in terms of information compression and prediction. Despite its importance, finding the optimal bottleneck variable involves a difficult nonconvex optimization problem due to the nonconvexity of the mutual information constraint. The Blahut-Arimoto (BA) algorithm and its variants provide an approach by considering the Lagrangian with a fixed Lagrange multiplier. However, only a strictly concave IB curve can be fully obtained by the BA algorithm, which strongly limits its application in machine learning and related fields, where strict concavity cannot be guaranteed. To overcome this difficulty, we derive an entropy-regularized optimal transport (OT) model for the IB problem from a posterior probability perspective. Correspondingly, we use an alternating optimization procedure and generalize the Sinkhorn algorithm to solve the OT model. The effectiveness and efficiency of our approach are demonstrated via numerical experiments.
Comment: ISIT 202
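The generalized Sinkhorn procedure is not spelled out in the abstract, but the classic entropy-regularized OT iteration it builds on can be sketched as follows (a standard textbook version, not the authors' generalization):

```python
import numpy as np

def sinkhorn(C, a, b, eps=0.1, n_iter=200):
    """Entropy-regularized optimal transport via Sinkhorn iterations.

    C: (m, n) cost matrix; a, b: source / target marginal distributions.
    Alternately rescales rows and columns of the Gibbs kernel until the
    transport plan P matches both marginals.
    """
    K = np.exp(-C / eps)              # Gibbs kernel exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)             # enforce column marginals
        u = a / (K @ v)               # enforce row marginals
    return u[:, None] * K * v[None, :]

# Toy example: transport between two 3-point distributions.
C = np.array([[0.0, 1.0, 2.0],
              [1.0, 0.0, 1.0],
              [2.0, 1.0, 0.0]])
a = np.array([0.5, 0.3, 0.2])
b = np.array([0.2, 0.3, 0.5])
P = sinkhorn(C, a, b)
```

Each iteration is two matrix-vector products, and the regularization parameter `eps` plays a role analogous to the entropy term in the OT model above: smaller `eps` approaches unregularized OT at the cost of slower, less stable convergence.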