289 research outputs found

    Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition

    Incorporating large-scale pre-trained models with the prototypical neural networks is a de-facto paradigm in few-shot named entity recognition. Existing methods, unfortunately, are not aware of the fact that embeddings from pre-trained models contain a prominently large amount of information regarding word frequencies, biasing prototypical neural networks against learning word entities. This discrepancy constrains the two models' synergy. Thus, we propose a one-line-code normalization method to reconcile such a mismatch with empirical and theoretical grounds. Our experiments based on nine benchmark datasets show that our method outperforms the counterpart models and is comparable to the state-of-the-art methods. In addition to the model enhancement, our work also provides an analytical viewpoint for addressing the general problems in few-shot named entity recognition or other tasks that rely on pre-trained models or prototypical neural networks. Comment: Findings of EMNLP 202

    Towards Equipping Transformer with the Ability of Systematic Compositionality

    One of the key factors in language productivity and human cognition is the ability of systematic compositionality, which refers to understanding composed unseen examples of seen primitives. However, recent evidence reveals that the Transformers have difficulty generalizing the composed context based on the seen primitives. To this end, we take the first step to propose a compositionality-aware Transformer called CAT and two novel pre-training tasks to facilitate systematic compositionality. We tentatively provide a successful implementation of a multi-layer CAT on the basis of the especially popular BERT. The experimental results demonstrate that CAT outperforms baselines on compositionality-aware tasks with minimal impact on the effectiveness on standardized language understanding tasks. Comment: Accepted to AAAI 2024. Paper with appendi

    DREditor: An Time-efficient Approach for Building a Domain-specific Dense Retrieval Model

    Deploying dense retrieval models efficiently is becoming increasingly important across various industries. This is especially true for enterprise search services, where customizing search engines to meet the time demands of different enterprises in different domains is crucial. Motivated by this, we develop a time-efficient approach called DREditor to edit the matching rule of an off-the-shelf dense retrieval model to suit a specific domain. This is achieved by directly calibrating the output embeddings of the model using an efficient and effective linear mapping. This mapping is powered by an edit operator that is obtained by solving a specially constructed least squares problem. Compared to implicit rule modification via long-time finetuning, our experimental results show that DREditor provides significant advantages on different domain-specific datasets, dataset sources, retrieval models, and computing devices. It consistently enhances time efficiency by 100-300 times while maintaining comparable or even superior retrieval performance. In a broader context, we take the first step to introduce a novel embedding calibration approach for the retrieval task, filling the technical blank in the current field of embedding calibration. This approach also paves the way for building domain-specific dense retrieval models efficiently and inexpensively. Comment: 15 pages, 6 figures, Codes are available at https://github.com/huangzichun/DREdito

    Airlines Content Recommendations Based on Passengers' Choice Using Bayesian Belief Networks

    Faced with the increasingly fierce competition in the aviation market, the strategy of consumer choice has gained increasing significance in both academia and practice. With ever-increasing travel choices and growing consumer heterogeneity, how do airline companies satisfy passengers' needs? With a vast amount of data, how do airline managers combine information to uncover the relationships between independent variables, gain insight into passengers' choices and value systems, and determine the best personalized content for them? Using the real case of China Southern Airlines, this paper illustrates how a Bayesian belief network (BBN) can enable airlines to dynamically recommend relevant content based on predicted passenger choices in order to optimize loyalty. The findings of this study provide airline companies with useful insights to better understand passengers' choices and develop effective strategies for growing customer relationships

    Herbal Medicine Cordyceps sinensis

    Moderate-to-severe asthma has a substantial impact on the health-related quality of life (HR-QOL) of the patients. Cordyceps sinensis is a traditional Chinese medicine that is evaluated clinically for the treatment of many diseases, such as chronic allograft nephropathy, diabetic kidney disease, and lung fibrosis. In order to investigate the effects of Cordyceps sinensis on patients with moderate-to-severe persistent asthma, 120 subjects were randomized to receive Corbin capsule containing Cordyceps sinensis for 3 months (treatment group, n=60), whereas the control group (n=60) did not receive treatment with Corbin capsule. Inhaled corticosteroid and as-needed β-agonists were used in the treatment of both groups. HR-QOL was measured with the Juniper’s Asthma Quality of Life Questionnaire (AQLQ). The incidence of asthma exacerbation, pulmonary function testing, and serum measurements of inflammatory mediators were also evaluated. The results showed that the treatment group indicated a significant increase in AQLQ scores and lung function compared with the control group. The expression levels of the inflammation markers IgE, ICAM-1, IL-4, and MMP-9 in the serum were decreased and IgG increased in the treatment group compared with the control group. Therefore, the conclusion was reached that a formulation of Cordyceps sinensis improved the HR-QOL, asthma symptoms, lung function, and inflammatory profile of the patients with moderate-to-severe asthma. This trial is registered with ChiCTR-IPC-16008730

    Concept -- An Evaluation Protocol on Conversational Recommender Systems with System-centric and User-centric Factors

    The conversational recommendation system (CRS) has been criticized regarding its user experience in real-world scenarios, despite recent significant progress achieved in academia. Existing evaluation protocols for CRS may prioritize system-centric factors such as effectiveness and fluency in conversation while neglecting user-centric aspects. Thus, we propose a new and inclusive evaluation protocol, Concept, which integrates both system- and user-centric factors. We conceptualise three key characteristics in representing such factors and further divide them into six primary abilities. To implement Concept, we adopt an LLM-based user simulator and evaluator with scoring rubrics that are tailored for each primary ability. Our protocol, Concept, serves a dual purpose. First, it provides an overview of the pros and cons in current CRS models. Second, it pinpoints the problem of low usability in the "omnipotent" ChatGPT and offers a comprehensive reference guide for evaluating CRS, thereby setting the foundation for CRS improvement. Comment: 33 pages, 18 tables, and 10 figures. Our code is available at https://github.com/huangzichun/Concept4CR

    Carbon Footprint Assessment of Large-scale Pig Production System in Northern China: a Case Study

    China raises 50% of the global live pigs. However, few studies on carbon footprint (CF) of large-scale pig production based on China’s actual production conditions have been carried out. In this study, life cycle assessment (LCA) method and actual production data of a typical large-scale pig farm in Northern China were used to assess greenhouse gas (GHG) emissions or CF associated with the whole process of pig production, including feed production (crop planting, feed processing, and transportation), enteric fermentation, manure management and energy consumption. The results showed a CF of 3.39 kg CO2-eq per kg of live market pig, and relative contributions of 55%, 28%, 13%, and 4% to the total CF by feed production, manure management, farm energy consumption, and enteric fermentation, respectively. Crop planting accounted for 66% of the feed production CF, while feed processing and transportation accounted for the remaining 34%. Long-distance transport of semi-raw feed materials caused by planting-feeding separation and over-fertilization in feed crop planting were two main reasons for the largest contribution of GHG emissions from feed production for the total CF. CF from nitrogen fertilizer application accounted for 33%-44% of crop planting, and contributed to 16% of the total CF. CF from transportation of feed ingredients accounted for 17% of the total CF. If the amount of nitrogen fertilizer used for producing the main feed ingredients is reduced from 209 kg/hm2 (for corn) and 216 kg/hm2 (for wheat) to 140 kg/hm2 (corn) and 180 kg/hm2 (wheat), respectively, the total CF would be reduced by 7%. If transportation distance for feed materials decreased from 325-493 km to 30 km, along with reducing the number of empty vehicles for the transport, total CF would be reduced by 18%. The combined CF mitigation potential for over-fertilization and transportation distance is 26%. 
    In addition, use of pit storage – anaerobic digestion – lagoon practice can reduce GHG emissions from manure management by 76% as compared to the traditional pit storage – lagoon manure treatment method. This case study reveals the impact of planting-feeding separation and over-fertilization on CF of pig supply chain in China. Manure management practice of pit storage – anaerobic digestion – lagoon is much more conducive to reducing CF as compared to the traditional method of pit storage – lagoon
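
The percentages reported in the abstract above can be turned into absolute figures with simple arithmetic. The sketch below is only an illustration of that arithmetic: the variable and function names are hypothetical, and the only inputs are the numbers quoted in the abstract (3.39 kg CO2-eq per kg, the 55/28/13/4% split, the 66% crop-planting share of feed production, and the stated 26% combined mitigation potential).

```python
# Figures quoted in the abstract; all names here are illustrative only.
TOTAL_CF = 3.39  # kg CO2-eq per kg of live market pig

SHARES_PERCENT = {
    "feed production": 55,
    "manure management": 28,
    "farm energy consumption": 13,
    "enteric fermentation": 4,
}


def absolute_contributions(total, shares_percent):
    """Convert percentage shares of the total CF into kg CO2-eq per kg."""
    return {name: total * pct / 100.0 for name, pct in shares_percent.items()}


contributions = absolute_contributions(TOTAL_CF, SHARES_PERCENT)

# Crop planting is reported as 66% of the feed production CF.
crop_planting_cf = contributions["feed production"] * 0.66

# The abstract states a combined mitigation potential of 26% of the total CF.
mitigated_cf = TOTAL_CF * (1 - 0.26)

for name, value in contributions.items():
    print(f"{name}: {value:.3f} kg CO2-eq/kg")
print(f"crop planting (within feed production): {crop_planting_cf:.3f}")
print(f"total CF after combined mitigation: {mitigated_cf:.3f}")
```

The abstract gives no further breakdown, so the sketch does not attempt to combine the 76% manure-management reduction with the 26% figure.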

    Rényi Divergence Deep Mutual Learning

    This paper revisits Deep Mutual Learning (DML), a simple yet effective computing paradigm. We propose using Rényi divergence instead of the KL divergence, which is more flexible and tunable, to improve vanilla DML. This modification is able to consistently improve performance over vanilla DML with limited additional complexity. The convergence properties of the proposed paradigm are analyzed theoretically, and Stochastic Gradient Descent with a constant learning rate is shown to converge with O(1)-bias in the worst case scenario for nonconvex optimization tasks. That is, learning will reach nearby local optima but continue searching within a bounded scope, which may help mitigate overfitting. Finally, our extensive empirical results demonstrate the advantage of combining DML and Rényi divergence, which further improves generalized models
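
To make the "more flexible and tunable" remark concrete, here is a minimal sketch of the standard Rényi divergence of order alpha between two discrete distributions, which reduces to the KL divergence as alpha approaches 1. It is not the paper's training code: the function names are hypothetical, the inputs are assumed to be probability vectors with strictly positive entries in `q`, and how the divergence enters the mutual-learning loss is not shown.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for discrete distributions; assumes q > 0 wherever p > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def renyi_divergence(p, q, alpha):
    """Renyi divergence D_alpha(p || q) = log(sum p^alpha * q^(1-alpha)) / (alpha - 1).

    Assumes alpha > 0 and strictly positive q. At alpha == 1 the formula is
    undefined, so the KL divergence (its limit) is returned instead.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if abs(alpha - 1.0) < 1e-12:
        return kl_divergence(p, q)
    total = sum(
        (pi ** alpha) * (qi ** (1.0 - alpha)) for pi, qi in zip(p, q) if pi > 0
    )
    return math.log(total) / (alpha - 1.0)


# Hypothetical class-probability outputs of two peer networks.
peer_a = [0.7, 0.2, 0.1]
peer_b = [0.5, 0.3, 0.2]

for order in (0.5, 1.0, 2.0):
    print(order, renyi_divergence(peer_a, peer_b, order))
```

The order alpha is the tunable knob: for fixed distributions the divergence is non-decreasing in alpha, so larger orders penalize disagreement between the peers more heavily.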