289 research outputs found
Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition
Incorporating large-scale pre-trained models with the prototypical neural
networks is a de-facto paradigm in few-shot named entity recognition. Existing
methods, unfortunately, are not aware of the fact that embeddings from
pre-trained models contain a prominently large amount of information regarding
word frequencies, biasing prototypical neural networks against learning word
entities. This discrepancy constrains the two models' synergy. Thus, we propose
a one-line-code normalization method to reconcile such a mismatch with
empirical and theoretical grounds. Our experiments based on nine benchmark
datasets show the superiority of our method over the counterpart models and are
comparable to the state-of-the-art methods. In addition to the model
enhancement, our work also provides an analytical viewpoint for addressing the
general problems in few-shot named entity recognition or other tasks that rely
on pre-trained models or prototypical neural networks. Comment: Findings of EMNLP 202
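A minimal sketch of how a normalization step could sit inside a prototypical classifier. The abstract does not say which normalization the authors use, so the L2 normalization here is an assumption, and the names l2_normalize and prototype_predict are hypothetical.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    # Assumed normalization: scale each embedding to unit length.
    norms = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norms, eps)

def prototype_predict(support, support_labels, query, normalize=True):
    # support: (n_support, dim) array, query: (n_query, dim) array.
    if normalize:
        support, query = l2_normalize(support), l2_normalize(query)
    classes = np.unique(support_labels)
    # One prototype per class: the mean of that class's support embeddings.
    prototypes = np.stack(
        [support[support_labels == c].mean(axis=0) for c in classes]
    )
    # Squared Euclidean distance from every query to every prototype.
    dists = ((query[:, None, :] - prototypes[None, :, :]) ** 2).sum(axis=-1)
    return classes[dists.argmin(axis=1)]
```

With unnormalized embeddings, a class whose vectors have large norms can sit far from a query that points in the same direction; normalizing first makes the comparison depend on direction only.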
Towards Equipping Transformer with the Ability of Systematic Compositionality
One of the key factors in language productivity and human cognition is the
ability of systematic compositionality, which refers to understanding composed
unseen examples of seen primitives. However, recent evidence reveals that the
Transformers have difficulty generalizing the composed context based on the
seen primitives. To this end, we take the first step to propose a
compositionality-aware Transformer called CAT and two novel pre-training tasks
to facilitate systematic compositionality. We tentatively provide a successful
implementation of a multi-layer CAT on the basis of the especially popular
BERT. The experimental results demonstrate that CAT outperforms baselines on
compositionality-aware tasks with minimal impact on the effectiveness on
standardized language understanding tasks. Comment: Accepted to AAAI 2024. Paper with appendi
DREditor: An Time-efficient Approach for Building a Domain-specific Dense Retrieval Model
Deploying dense retrieval models efficiently is becoming increasingly
important across various industries. This is especially true for enterprise
search services, where customizing search engines to meet the time demands of
different enterprises in different domains is crucial. Motivated by this, we
develop a time-efficient approach called DREditor to edit the matching rule of
an off-the-shelf dense retrieval model to suit a specific domain. This is
achieved by directly calibrating the output embeddings of the model using an
efficient and effective linear mapping. This mapping is powered by an edit
operator that is obtained by solving a specially constructed least squares
problem. Compared to implicit rule modification via long-time finetuning, our
experimental results show that DREditor provides significant advantages on
different domain-specific datasets, dataset sources, retrieval models, and
computing devices. It consistently enhances time efficiency by 100-300 times
while maintaining comparable or even superior retrieval performance. In a
broader context, we take the first step to introduce a novel embedding
calibration approach for the retrieval task, filling the technical blank in the
current field of embedding calibration. This approach also paves the way for
building domain-specific dense retrieval models efficiently and inexpensively. Comment: 15 pages, 6 figures, Codes are available at
https://github.com/huangzichun/DREdito
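A minimal sketch of calibrating embeddings with a linear map fitted by least squares. The abstract does not give the exact construction of the edit operator, so the ridge-regularized objective, the pairing of query and document embeddings, and the names fit_linear_edit and calibrate are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_edit(q, d, reg=1e-3):
    # Closed-form solution of min_W ||q W - d||_F^2 + reg * ||W||_F^2,
    # where q and d are (n_pairs, dim) arrays of paired embeddings.
    dim = q.shape[1]
    return np.linalg.solve(q.T @ q + reg * np.eye(dim), q.T @ d)

def calibrate(embeddings, w):
    # Apply the fitted linear map to the model's output embeddings.
    return embeddings @ w
```

Because the map is obtained from a single linear solve rather than from gradient-based fine-tuning, fitting it requires no backpropagation through the retrieval model, which is one plausible source of the kind of time savings the abstract reports.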
Airlines Content Recommendations Based on Passengers' Choice Using Bayesian Belief Networks
With increasingly fierce competition in the aviation market, consumer choice strategy has gained significance in both academia and practice. Given ever-increasing travel choices and growing consumer heterogeneity, how do airline companies satisfy passengers' needs? With a vast amount of data, how do airline managers combine information to uncover the relationships between independent variables, gain insight into passengers' choices and value systems, and determine the best personalized content for them? Using the real case of China Southern Airlines, this paper illustrates how a Bayesian belief network (BBN) can enable airlines to dynamically recommend relevant content based on predicted passenger choices in order to optimize loyalty. The findings of this study provide airline companies with useful insights to better understand passengers' choices and develop effective strategies for growing customer relationship
Herbal Medicine Cordyceps sinensis
Moderate-to-severe asthma has a substantial impact on the health-related quality of life (HR-QOL) of patients. Cordyceps sinensis is a traditional Chinese medicine that has been evaluated clinically for the treatment of many diseases, such as chronic allograft nephropathy, diabetic kidney disease, and lung fibrosis. To investigate the effects of Cordyceps sinensis on patients with moderate-to-severe persistent asthma, 120 subjects were randomized to receive Corbin capsule containing Cordyceps sinensis for 3 months (treatment group, n=60) or no Corbin capsule (control group, n=60). Inhaled corticosteroid and as-needed β-agonists were used in the treatment of both groups. HR-QOL was measured with Juniper’s Asthma Quality of Life Questionnaire (AQLQ). The incidence of asthma exacerbation, pulmonary function testing, and serum measurements of inflammatory mediators were also evaluated. The treatment group showed a significant increase in AQLQ scores and lung function compared with the control group. Serum levels of the inflammation markers IgE, ICAM-1, IL-4, and MMP-9 were decreased and IgG was increased in the treatment group compared with the control group. The authors therefore conclude that a formulation of Cordyceps sinensis improved the HR-QOL, asthma symptoms, lung function, and inflammatory profile of patients with moderate-to-severe asthma. This trial is registered with ChiCTR-IPC-16008730
Concept -- An Evaluation Protocol on Conversational Recommender Systems with System-centric and User-centric Factors
The conversational recommendation system (CRS) has been criticized regarding
its user experience in real-world scenarios, despite recent significant
progress achieved in academia. Existing evaluation protocols for CRS may
prioritize system-centric factors such as effectiveness and fluency in
conversation while neglecting user-centric aspects. Thus, we propose a new and
inclusive evaluation protocol, Concept, which integrates both system- and
user-centric factors. We conceptualise three key characteristics in
representing such factors and further divide them into six primary abilities.
To implement Concept, we adopt an LLM-based user simulator and evaluator with
scoring rubrics that are tailored for each primary ability. Our protocol,
Concept, serves a dual purpose. First, it provides an overview of the pros and
cons in current CRS models. Second, it pinpoints the problem of low usability
in the "omnipotent" ChatGPT and offers a comprehensive reference guide for
evaluating CRS, thereby setting the foundation for CRS improvement. Comment: 33 pages, 18 tables, and 10 figures. Our code is available at
https://github.com/huangzichun/Concept4CR
Carbon Footprint Assessment of Large-scale Pig Production System in Northern China: a Case Study
China raises 50% of the global live pigs. However, few studies on carbon footprint (CF) of large-scale pig production based on China’s actual production conditions have been carried out. In this study, life cycle assessment (LCA) method and actual production data of a typical large-scale pig farm in Northern China were used to assess greenhouse gas (GHG) emissions or CF associated with the whole process of pig production, including feed production (crop planting, feed processing, and transportation), enteric fermentation, manure management and energy consumption. The results showed a CF of 3.39 kg CO2-eq per kg of live market pig, and relative contributions of 55%, 28%, 13%, and 4% to the total CF by feed production, manure management, farm energy consumption, and enteric fermentation, respectively. Crop planting accounted for 66% of the feed production CF, while feed processing and transportation accounted for the remaining 34%. Long-distance transport of semi-raw feed materials caused by planting-feeding separation and over-fertilization in feed crop planting were two main reasons for the largest contribution of GHG emissions from feed production for the total CF. CF from nitrogen fertilizer application accounted for 33%-44% of crop planting, and contributed to 16% of the total CF. CF from transportation of feed ingredients accounted for 17% of the total CF. If the amount of nitrogen fertilizer used for producing the main feed ingredients is reduced from 209 kg/hm2 (for corn) and 216 kg/hm2 (for wheat) to 140 kg/hm2 (corn) and 180 kg/hm2 (wheat), respectively, the total CF would be reduced by 7%. If transportation distance for feed materials decreased from 325-493 km to 30 km, along with reducing the number of empty vehicles for the transport, total CF would be reduced by 18%. The combined CF mitigation potential for over-fertilization and transportation distance is 26%. 
In addition, use of the pit storage – anaerobic digestion – lagoon practice can reduce GHG emissions from manure management by 76% compared with the traditional pit storage – lagoon manure treatment method. This case study reveals the impact of planting-feeding separation and over-fertilization on the CF of the pig supply chain in China. The pit storage – anaerobic digestion – lagoon manure management practice is much more conducive to reducing CF than the traditional pit storage – lagoon method
Rényi Divergence Deep Mutual Learning
This paper revisits Deep Mutual Learning (DML), a simple yet effective
computing paradigm. We propose using Rényi divergence instead of the KL
divergence, which is more flexible and tunable, to improve vanilla DML. This
modification is able to consistently improve performance over vanilla DML with
limited additional complexity. The convergence properties of the proposed
paradigm are analyzed theoretically, and Stochastic Gradient Descent with a
constant learning rate is shown to converge with -bias in the
worst case scenario for nonconvex optimization tasks. That is, learning will
reach nearby local optima but continue searching within a bounded scope, which
may help mitigate overfitting. Finally, our extensive empirical results
demonstrate the advantage of combining DML and Rényi divergence, which
further improves generalized models
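A minimal sketch of the standard Rényi divergence of order alpha between two discrete distributions, which reduces to the KL divergence as alpha approaches 1. This illustrates only the divergence itself, not the authors' training procedure; the function names are hypothetical, and both inputs are assumed to be strictly positive probability vectors.

```python
import numpy as np

def kl_divergence(p, q):
    # KL(p || q) for strictly positive discrete distributions.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return float(np.sum(p * np.log(p / q)))

def renyi_divergence(p, q, alpha):
    # D_alpha(p || q) = log(sum_i p_i^alpha * q_i^(1 - alpha)) / (alpha - 1),
    # defined for alpha > 0; alpha = 1 is handled as the KL limit.
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    if np.isclose(alpha, 1.0):
        return kl_divergence(p, q)
    return float(np.log(np.sum(p ** alpha * q ** (1.0 - alpha))) / (alpha - 1.0))
```

The order alpha is the tunable knob the abstract refers to: larger values penalize mismatches on high-probability outcomes more heavily, and the divergence is non-decreasing in alpha.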