72 research outputs found

    Reinforced Multi-Teacher Selection for Knowledge Distillation

    In natural language processing (NLP) tasks, slow inference speed and a large GPU footprint remain the bottlenecks to applying pre-trained deep models in production. As a popular method for model compression, knowledge distillation transfers knowledge from one or multiple large (teacher) models to a small (student) model. When multiple teacher models are available in distillation, the state-of-the-art methods assign a fixed weight to each teacher model throughout the distillation. Furthermore, most of the existing methods allocate an equal weight to every teacher model. In this paper, we observe that, due to the complexity of training examples and the differences in student model capability, learning differentially from teacher models can lead to better performance of the distilled student models. We systematically develop a reinforced method to dynamically assign weights to teacher models for different training instances and optimize the performance of the student model. Our extensive experimental results on several NLP tasks clearly verify the feasibility and effectiveness of our approach. Comment: AAAI 202
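The idea of weighting teachers per training instance can be illustrated with a minimal sketch. This is not the paper's reinforced selection method: the per-instance weights are simply passed in by the caller, and the names `softmax`, `multi_teacher_kd_loss`, and `temperature` are hypothetical choices made for illustration. The loss shown is a weighted sum of cross-entropies between each teacher's softened distribution and the student's.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_kd_loss(student_logits, teacher_logits_list,
                          teacher_weights, temperature=2.0):
    """Weighted distillation loss for ONE training instance.

    teacher_weights holds one non-negative weight per teacher. How the
    weights are chosen is left to the caller (an assumption of this
    sketch; the abstract describes learning them with reinforcement).
    """
    if len(teacher_logits_list) != len(teacher_weights):
        raise ValueError("need exactly one weight per teacher")
    total_weight = sum(teacher_weights)
    if total_weight <= 0:
        raise ValueError("at least one teacher weight must be positive")

    student_probs = softmax(student_logits, temperature)
    loss = 0.0
    for logits, weight in zip(teacher_logits_list, teacher_weights):
        teacher_probs = softmax(logits, temperature)
        # Cross-entropy between this teacher's soft targets and the student.
        cross_entropy = -sum(
            p * math.log(q) for p, q in zip(teacher_probs, student_probs)
        )
        loss += (weight / total_weight) * cross_entropy
    return loss


# Toy logits for a 3-class instance (illustrative values only).
student = [1.0, 0.5, -0.5]
teachers = [[2.0, 0.1, -1.0], [0.2, 1.5, 0.0]]

equal_loss = multi_teacher_kd_loss(student, teachers, [0.5, 0.5])
first_only = multi_teacher_kd_loss(student, teachers, [1.0, 0.0])
second_only = multi_teacher_kd_loss(student, teachers, [0.0, 1.0])
```

Setting every weight equal reproduces the fixed, equal-weight baseline that the abstract contrasts against; varying the weights per instance is the degree of freedom a learned policy would control.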

    Characterization of Akirin 2 gene in Langshan chicken

    Akirin plays an important role not only in the innate immune response but also in skeletal myogenesis. The chicken Akirin gene family contains only Akirin 2. We examined the coding sequences of the Akirin 2 gene in a population of a Chinese indigenous chicken breed (Langshan chicken) to investigate the possibility of using this gene for chicken marker-assisted selection. The results of PCR-SSCP and DNA sequencing showed that there were no polymorphisms in the six amplified regions of the Akirin 2 gene. In addition, a search of NCBI found no SNP records for the chicken Akirin 2 gene. Thus, the Akirin 2 gene may not be suitable as a gene marker for marker-assisted selection in chicken.

    BeamSearchQA: Large Language Models are Strong Zero-Shot QA Solver

    Open-domain question answering is a crucial task that often requires accessing external information. Existing methods typically adopt a single-turn retrieve-then-read approach, where relevant documents are first retrieved, and questions are then answered based on the retrieved information. However, there are cases where answering a question requires implicit knowledge that is not directly retrievable from the question itself. In this work, we propose a novel question-answering pipeline called BeamSearchQA. Our approach leverages large language models to iteratively generate new questions about the original question, enabling an iterative reasoning process. By iteratively refining and expanding the scope of the question, our method aims to capture and utilize hidden knowledge that may not be directly obtainable through retrieval. We evaluate our approach on the widely used open-domain NQ and WebQ datasets. The experimental results demonstrate that BeamSearchQA significantly outperforms other zero-shot baselines, indicating its effectiveness in tackling the challenges of open-domain question answering. Comment: Work in progress

    LEAD: Liberal Feature-based Distillation for Dense Retrieval

    Knowledge distillation is often used to transfer knowledge from a strong teacher model to a relatively weak student model. Traditional knowledge distillation methods include response-based methods and feature-based methods. Response-based methods are the most widely used but suffer from a lower upper limit on model performance, while feature-based methods impose constraints on vocabularies and tokenizers. In this paper, we propose a tokenizer-free method, liberal feature-based distillation (LEAD). LEAD aligns the distribution between the teacher model and the student model; it is effective, extendable, and portable, and has no requirements on vocabularies, tokenizers, or model architecture. Extensive experiments show the effectiveness of LEAD on several widely used benchmarks, including MS MARCO Passage, TREC Passage 19, TREC Passage 20, MS MARCO Document, TREC Document 19 and TREC Document 20. Comment: Work in progress

    Direct-Current Generator Based on Dynamic Water-Semiconductor Junction with Polarized Water as Moving Dielectric Medium

    There is growing interest in harvesting energy from water droplets, as microscale energy is required for the distributed sensors of an interconnected human society. However, a sustainable direct-current generator driven by water flow has rarely been reported, and the quantum polarization principle of water molecules has yet to be uncovered. Herein, we propose a dynamic water-semiconductor junction in which moving water, sandwiched between two semiconductors, acts as a moving dielectric medium. The junction outputs a sustainable direct-current voltage of 0.3 V and a current of 0.64 µA with a low internal resistance of 390 kΩ. The sustainable direct-current electricity originates from the dynamic water polarization process in the water-semiconductor junction, in which water molecules are continuously polarized and depolarized, driven by the mechanical force and the Fermi level difference, as the water moves on silicon. We further demonstrate an encapsulated portable power-generating device with a simple structure and continuous direct-current voltage, which shows promising potential for application in wearable electronic generators.

    TaskMatrix.AI: Completing Tasks by Connecting Foundation Models with Millions of APIs

    Artificial Intelligence (AI) has made incredible progress recently. On the one hand, advanced foundation models like ChatGPT can offer powerful conversation, in-context learning and code generation abilities on a broad range of open-domain tasks. They can also generate high-level solution outlines for domain-specific tasks based on the common sense knowledge they have acquired. However, they still face difficulties with some specialized tasks, either because they lack enough domain-specific data during pre-training or because their neural network computations often produce errors on tasks that require accurate execution. On the other hand, there are also many existing models and systems (symbolic-based or neural-based) that can do some domain-specific tasks very well. However, due to their different implementations or working mechanisms, they are not easily accessible to or compatible with foundation models. Therefore, there is a clear and pressing need for a mechanism that can leverage foundation models to propose task solution outlines and then automatically match some of the sub-tasks in the outlines to off-the-shelf models and systems with special functionalities to complete them. Inspired by this, we introduce TaskMatrix.AI as a new AI ecosystem that connects foundation models with millions of APIs for task completion. Unlike most previous work that aimed to improve a single AI model, TaskMatrix.AI focuses more on using existing foundation models (as a brain-like central system) and APIs of other AI models and systems (as sub-task solvers) to achieve diversified tasks in both digital and physical domains. As a position paper, we will present our vision of how to build such an ecosystem, explain each key component, and use case studies to illustrate both the feasibility of this vision and the main challenges we need to address next.

    Comparison of Different Risk-Stratification Systems for the Diagnosis of Benign and Malignant Thyroid Nodules

    Introduction: To compare the efficacy of four different ultrasound-based risk-stratification systems in assessing the malignancy risk of thyroid nodules in the Chinese population. Methods: We retrospectively reviewed the digital ultrasound images of 1,568 patients (1,612 thyroid nodules) who underwent surgery in our hospital between January 2012 and December 2017. All thyroid nodules were pathologically identified as malignant or benign. We evaluated the following ultrasound characteristics: size, location, composition, echogenicity, shape, margins, calcification or echogenic foci, and extrathyroidal extension. Each nodule was categorized using four risk-stratification systems: the American Thyroid Association (ATA) classification, the Thyroid Imaging, Reporting, and Data System (TIRADS) of the American College of Radiology (ACR-TIRADS), the European Thyroid Association TIRADS (EU-TIRADS), and the TIRADS developed by Kwak et al. (Kwak-TIRADS). The diagnostic performance of each risk-stratification system relative to the pathological results was analyzed. We used receiver operating characteristic curves to identify cutoff values that yielded optimal sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and accuracy (ACC). Results: Of the 1,612 nodules, 839 (52.0%) were benign, and 773 (48.0%) were malignant. The AUCs of the ACR-TIRADS, EU-TIRADS, Kwak-TIRADS, and ATA classification were 0.879, 0.872, 0.896, and 0.869, respectively. The Kwak-TIRADS had the best SEN, NPV, ACC, and AUC, while the ACR-TIRADS had the best SPE and PPV. Conclusion: All four risk-stratification systems had good diagnostic performance (AUCs > 0.86). Considering its high SEN, NPV, ACC, and AUC, we believe that the Kwak-TIRADS may be the more effective risk-stratification system in the Chinese population.
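For readers less familiar with the five metrics named in this abstract, the sketch below shows how each is derived from a confusion matrix at a single cutoff. The function name `diagnostic_metrics` and the counts are hypothetical toy values chosen for illustration; they are not results from the study.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic metrics from confusion-matrix counts.

    tp/fn: malignant nodules classified as malignant/benign.
    tn/fp: benign nodules classified as benign/malignant.
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every metric needs a non-zero denominator")
    return {
        "SEN": tp / (tp + fn),                   # sensitivity
        "SPE": tn / (tn + fp),                   # specificity
        "PPV": tp / (tp + fp),                   # positive predictive value
        "NPV": tn / (tn + fn),                   # negative predictive value
        "ACC": (tp + tn) / (tp + fp + tn + fn),  # accuracy
    }


# Hypothetical toy counts, NOT taken from the study.
metrics = diagnostic_metrics(tp=80, fp=10, tn=90, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Moving the cutoff trades sensitivity against specificity, which is why the abstract reports a different system as best depending on the metric.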