41 research outputs found

    Efficiently Enhancing Zero-Shot Performance of Instruction Following Model via Retrieval of Soft Prompt

    Enhancing the zero-shot performance of instruction-following models requires heavy computation, either by scaling the total number of training datasets or the model size. In this work, we explore how retrieval of soft prompts obtained through prompt tuning can efficiently assist hard prompts in zero-shot task generalization. Specifically, we train soft prompt embeddings for each prompt through prompt tuning, store the samples of the training instances mapped with the prompt embeddings, and retrieve the corresponding prompt embedding of the training instance closest to the query instance during inference. While adding only 0.007% additional parameters, retrieval of soft prompts enhances the performance of T0 on unseen tasks, outperforming it on 10 out of 11 datasets and improving the mean accuracy of T0 on the BIG-bench benchmark by 2.39 percentage points. We also report an interesting finding: retrieving source embeddings trained on similar answer choice formats is more important than retrieving those trained on similar task types. Comment: EMNLP 2023 Findings
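
    The retrieval step described in this abstract can be sketched roughly as a nearest-neighbor lookup. This is a minimal illustration, not the authors' implementation: the cosine similarity measure, the function names, the toy vectors, and the prompt identifiers are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def retrieve_soft_prompt(query_vec, store):
    """Return the prompt id mapped to the stored instance closest to the query.

    `store` is a hypothetical list of (instance_vector, prompt_id) pairs,
    standing in for training instances mapped to soft prompt embeddings.
    """
    best_vec, best_prompt = max(store, key=lambda item: cosine(query_vec, item[0]))
    return best_prompt

# Toy store: two training instances, each mapped to a (hypothetical) soft prompt.
store = [([1.0, 0.0], "prompt_a"), ([0.0, 1.0], "prompt_b")]
retrieved = retrieve_soft_prompt([0.9, 0.1], store)
```

    In this sketch the query is closest to the first stored instance, so the soft prompt mapped to that instance is the one retrieved.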

    How Well Do Large Language Models Truly Ground?

    Reliance on the inherent knowledge of Large Language Models (LLMs) can cause issues such as hallucinations, lack of control, and difficulties in integrating variable knowledge. To mitigate this, LLMs can be probed to generate responses by grounding on external context, often given as input (knowledge-augmented models). Yet previous research is often confined to a narrow view of the term "grounding", focusing only on whether the response contains the correct answer, which does not ensure the reliability of the entire response. To address this limitation, we introduce a strict definition of grounding: a model is considered truly grounded when its responses (1) fully utilize necessary knowledge from the provided context, and (2) don't exceed the knowledge within the contexts. We introduce a new dataset and a grounding metric to assess this new definition and perform experiments across 13 LLMs of different sizes and training methods to provide insights into the factors that influence grounding performance. Our findings contribute to a better understanding of how to improve grounding capabilities and suggest an area of improvement toward more reliable and controllable LLM applications.
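
    The two-part definition of grounding above can be illustrated with a small set-based check. This is only a sketch under a strong assumption, namely that the knowledge in a response and in a context can be represented as sets of discrete fact strings; it is not the dataset or metric introduced in the paper, and the function name is hypothetical.

```python
def is_truly_grounded(response_facts, context_facts, necessary_facts):
    """Check both conditions of the strict grounding definition.

    All three arguments are iterables of fact strings (an assumed representation).
    """
    response = set(response_facts)
    context = set(context_facts)
    necessary = set(necessary_facts)
    uses_all_necessary = necessary <= response  # condition (1): nothing necessary is missing
    stays_within_context = response <= context  # condition (2): nothing beyond the context
    return uses_all_necessary and stays_within_context

context = {"a", "b", "c"}
necessary = {"a", "b"}
complete = is_truly_grounded({"a", "b"}, context, necessary)         # satisfies both conditions
incomplete = is_truly_grounded({"a"}, context, necessary)            # omits a necessary fact
overreaching = is_truly_grounded({"a", "b", "d"}, context, necessary)  # adds a fact not in context
```

    A response that contains the correct answer can still fail either condition, which is the gap the strict definition is meant to close.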

    The CoT Collection: Improving Zero-shot and Few-shot Learning of Language Models via Chain-of-Thought Fine-Tuning

    Language models (LMs) with less than 100B parameters are known to perform poorly on chain-of-thought (CoT) reasoning in contrast to large LMs when solving unseen tasks. In this work, we aim to equip smaller LMs with the step-by-step reasoning capability by instruction tuning with CoT rationales. To achieve this goal, we first introduce a new instruction-tuning dataset called the CoT Collection, which augments the existing Flan Collection (including only 9 CoT tasks) with an additional 1.84 million rationales across 1,060 tasks. We show that CoT fine-tuning Flan-T5 (3B & 11B) with the CoT Collection enables smaller LMs to have better CoT capabilities on unseen tasks. On the BIG-Bench-Hard (BBH) benchmark, we report an average improvement of +4.34% (Flan-T5 3B) and +2.60% (Flan-T5 11B) in terms of zero-shot task accuracy. Furthermore, we show that instruction tuning with the CoT Collection allows LMs to possess stronger few-shot learning capabilities on 4 domain-specific tasks, resulting in an improvement of +2.24% (Flan-T5 3B) and +2.37% (Flan-T5 11B), even outperforming ChatGPT utilizing demonstrations until the max length by a +13.98% margin. Our code, the CoT Collection data, and model checkpoints are publicly available. Comment: EMNLP 2023 (Main Conference)

    Low-frequency noise in junctionless multigate transistors

    Low-frequency noise in n-type junctionless multigate transistors was investigated. It can be well understood with the carrier number fluctuations whereas the conduction is mainly limited by the bulk expecting Hooge mobility fluctuations. The trapping/release of charge carriers is related not only to the oxide-semiconductor interface but also to the depleted channel. The volume trap density is in the range of 6-30 × 10¹⁶ cm⁻³ eV⁻¹, which is similar to Si-SiO2 bulk transistors and remarkably lower than in high-k transistors. These results show that the noise in nanowire devices might be affected by additional trapping centers. © 2011 American Institute of Physics. (doi:10.1063/1.3569724)

    Growth of vertically aligned arrays of carbon nanotubes for high field emission

    Vertically aligned multi-walled carbon nanotubes have been grown on Ni-coated silicon substrates by using either direct current diode or triode plasma-enhanced chemical vapor deposition at low temperature (around 620 °C). Acetylene gas has been used as the carbon source, while ammonia and hydrogen have been used for etching. However, densely packed (∼10⁹ cm⁻²) CNTs were obtained when the pressure was ∼100 Pa. The alignment of nanotubes is a necessary, but not a sufficient, condition for efficient electron emission: the growth of nanotubes should be controlled along regular arrays in order to minimize the electrostatic interactions between them. A three-dimensional numerical simulation has therefore been developed to calculate the local electric field in the vicinity of the tips for a finite square array of nanotubes, and thus to calculate the maximum of the electron emission current density as a function of the spacing between nanotubes. Finally, the triode plasma-enhanced process combined with pre-patterned catalyst films (using different lithography techniques) has been chosen in order to grow regular arrays of aligned CNTs with different pitches in the micrometer range. The comparison between the experimental and the simulation data makes it possible to identify the most efficient CNT-based electron field emitter.

    Tactile Avatar: Tactile Sensing System Mimicking Human Tactile Cognition

    As a surrogate for human tactile cognition, an artificial tactile perception and cognition system, named the “tactile avatar”, is proposed to produce smooth/soft and rough tactile sensations based on its user's tactile feeling. A piezoelectric tactile sensor is developed to dynamically record various physical information such as pressure, temperature, hardness, sliding velocity, and surface topography. For artificial tactile cognition, the tactile feelings of humans toward various tactile materials ranging from smooth/soft to rough are assessed and found to vary among participants. Because tactile responses vary among humans, a deep learning structure is designed to allow personalization through training based on individualized histograms of human tactile cognition and recorded physical tactile information. The decision error in each avatar system is less than 2% when 42 materials are used to measure the tactile data, with 100 trials for each material under 1.2 N of contact force and 4 cm s⁻¹ of sliding velocity. As a tactile avatar, the machine categorizes newly experienced materials based on the tactile knowledge obtained from training data. The tactile sensation showed a high correlation with the specific user's tendency. This approach can be applied to electronic devices with tactile emotional exchange capabilities, as well as advanced digital experiences. © 2021 The Authors. Advanced Science published by Wiley-VCH GmbH.

    Prediction of Atmospheric Duct Conditions from a Clutter Power Spectrum Using Deep Learning

    This paper presents a method for predicting atmospheric duct conditions from a clutter power spectrum using deep learning. To accurately predict the duct conditions, deep learning with a binary classification is applied to the proposed refractivity from the clutter (RFC) method. The input data set is the artificial clutter data that are generated via the Advanced Refractive Prediction System (AREPS) simulation software Ver. 3.6 in conjunction with random atmospheric refractive indices. The output of the RFC method is then predicted via binary classification, indicating whether the atmospheric conditions are duct or non-duct. For the cross-validation, the clutter power spectrum data are generated based on real atmospheric refractivity data. The results show that the deep neural network (DNN) trained with 5600 pieces of data (validation accuracy of 95.99%) exhibits a binary classification accuracy of 98.36%. The DNN trained with 28,000 pieces of data (validation accuracy of 98.20%) achieves a binary classification accuracy of 99.06% with an F1-score of 0.9921.
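
    For readers unfamiliar with the two reported metrics, the sketch below shows how accuracy and F1-score are computed from a binary confusion matrix, treating "duct" as the positive class. The counts are hypothetical placeholders chosen for easy arithmetic; they are not the paper's data, and the function name is an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Return (accuracy, f1) from binary confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, f1

# Hypothetical counts: duct = positive class, non-duct = negative class.
accuracy, f1 = binary_metrics(tp=90, fp=10, fn=10, tn=90)
```

    Accuracy counts both classes equally, whereas F1 ignores true negatives, so the two values can differ when the duct and non-duct classes are imbalanced.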

    Design of a Stacked Dual-Patch Antenna with 3D Printed Thick Quasi-Air Substrates and a Cavity Wall for Wideband Applications

    In this paper, we propose a stacked dual-patch antenna with 3D printed thick quasi-air substrates and a cavity wall for wideband applications. To achieve the theoretical maximum bandwidth of the patch antenna, the quality factor of the system needs to be minimized. This requires enlarging the area of the conductive radiator while reducing the permittivity of the substrate within the patch to a value close to 1. To realize a patch antenna with this maximum bandwidth, the stacked dual-patch configuration is employed to obtain an extended conductive radiator area. In addition, square-pipe resin frames manufactured using a 3D printing method are applied to the proposed antenna to implement a quasi-air substrate structure that has a low permittivity value close to 1. The proposed stacked dual-patch antenna with a quasi-air substrate has a broad bandwidth of 20.7%. The results demonstrate that by using the proposed antenna structure, broadband characteristics close to the fundamental bandwidth limit of the patch antenna can be achieved.

    Design of a Shared-Aperture Dual-Loop Antenna Using a Mutual Complementary Shape to Improve an Electromagnetic Transparent Characteristics Between S/X-Band Elements

    In this paper, we propose an S/X-band shared-aperture array antenna with a mutual complementary design to improve the electromagnetic (EM) transparent characteristics. A unit-cell of the proposed antenna includes one dual-loop element for the S-band and 3×3 dual-loop elements for the X-band. To configure the shared-aperture structure in a limited space, the S-band element is stacked on top of the X-band elements. To solve the practical engineering problems of the shared-aperture antennas, novel design techniques such as using a mutual complementary structure, a coupling compensation array, an interface layer, and an antenna modularization are employed. To verify the antenna feasibility, the fabricated unit-cell extends into a 4×4 unit-cell array. The fractional bandwidths of the reflection coefficients for the proposed array are 14.7% and 15% in the S- and X-bands, respectively. In the S-band, as the steering direction of the main beam increases from 0° to 45°, the maximum gain decreases from 14.6 dBi to 11.8 dBi. In the X-band under the same conditions, the maximum gain varies from 26.6 dBi to 25.3 dBi.

    Statistical Indoor Exclusion Zone Analysis by Investigating Electromagnetic Fields inside a Nuclear Power Plant

    This article investigates a statistical indoor exclusion zone (EZ) that can be efficiently applied to a nuclear power plant (NPP) by examining electromagnetic fields inside the actual NPP. To obtain the statistical indoor EZ, the indoor environment of the Korea Institute of Nuclear Safety (KINS) simulator room is modeled using the Wireless InSite commercial electromagnetic simulation software. The indoor space around the transmitting antenna is divided into multiple observation regions, and the EZ boundary of each region is defined independently. The EZ boundaries are then obtained using a margined regression model, which makes it possible to determine a reasonable boundary of the statistical indoor EZ. To validate the statistical indoor EZ, the received power inside the KINS simulator room is then measured, and it agrees well with the simulated results. The results demonstrate that the proposed statistical indoor EZ can be properly obtained not only from the simulation data but also from the measurement data.