
    Superiority of Multi-Head Attention in In-Context Linear Regression

    We present a theoretical analysis of the performance of transformers with softmax attention in in-context learning on linear regression tasks. While the existing literature predominantly focuses on the convergence of transformers with single-/multi-head attention, our research centers on comparing their performance. We conduct an exact theoretical analysis to demonstrate that multi-head attention with a substantial embedding dimension performs better than single-head attention. As the number of in-context examples D increases, the prediction loss using single-/multi-head attention is in O(1/D), and the loss for multi-head attention has a smaller multiplicative constant. In addition to the simplest data distribution setting, we consider more scenarios, e.g., noisy labels, local examples, correlated features, and prior knowledge. We observe that, in general, multi-head attention is preferred over single-head attention. Our results verify the effectiveness of the design of multi-head attention in the transformer architecture.

    Effects of surface-functionalized aluminum nitride on thermal, electrical, and mechanical behaviors of polyarylene ether nitrile-based composites

    Aluminum nitride (AlN) with high thermal conductivity was blended into polyarylene ether nitrile (PEN) to obtain a composite system. A ball-milling process provided smaller AlN particles with higher surface silylation, allowing a homogeneous particle distribution in the polymeric matrix. The thermal, electrical, and mechanical behaviors of the resulting composites were characterized to investigate the effects of the particles on the performance of PEN-based composites with functionalized AlN. The composite exhibited a thermal conductivity of 0.779 W m−1 K−1, a dielectric constant of 7.7, a dielectric loss of 0.032, an electrical resistivity of 1.39 GΩ·cm, and a break strength of 36 N when the fraction of functionalized AlN increased to 42.3 vol%. A fitted equation based on the improved Russell's model could effectively predict the trend in thermal conductivity of the composite systems when the interfacial resistance between AlN and the surrounding PEN is taken into account.

    Metformin alleviates hepatic iron overload and ferroptosis through AMPK-ferroportin pathway in HFD-induced NAFLD

    Highlights: Metformin alleviates hepatic iron overload (HIO) and ferroptosis in HFD-induced NAFLD. Ferroportin (FPN) is involved in the molecular mechanism of metformin on HIO in HFD-induced NAFLD. Metformin upregulates FPN expression by reducing lysosomal ubiquitination degradation.
    Summary: Metformin prevents progression of non-alcoholic fatty liver disease (NAFLD). However, the potential mechanism is not entirely understood. Ferroptosis, a recently recognized nonapoptotic form of regulated cell death, has been reported to be involved in the pathogenesis of NAFLD. Here, we investigated the effects of metformin on ferroptosis and its potential mechanism in NAFLD. We found that metformin prevented the progression of NAFLD, alleviated HIO and ferroptosis, and upregulated FPN expression in vivo and in vitro. Mechanistically, metformin reduced the lysosomal degradation of FPN through activation of AMPK, thereby upregulating the expression of FPN protein, alleviating HIO and ferroptosis, and preventing progression of NAFLD. These findings reveal a mechanism of metformin and suggest that targeting FPN may have therapeutic potential for treating NAFLD and related disorders.

    Exploring Memorization in Fine-tuned Language Models

    LLMs have shown great capabilities in various tasks but have also exhibited memorization of training data, raising serious privacy and copyright concerns. While prior work has studied memorization during pre-training, the exploration of memorization during fine-tuning is rather limited. Compared with pre-training, fine-tuning typically involves sensitive data and diverse objectives, and thus may bring unique memorization behaviors and distinct privacy risks. In this work, we conduct the first comprehensive analysis to explore LMs' memorization during fine-tuning across tasks. Our studies with open-sourced and our own fine-tuned LMs across various tasks indicate that fine-tuned memorization presents a strong disparity among tasks. We provide an understanding of this task disparity via sparse coding theory and unveil a strong correlation between memorization and attention score distribution. Our investigation of its memorization behavior indicates that multi-task fine-tuning is a potential strategy for mitigating fine-tuned memorization.

    Rapid Detection of SARS-CoV-2 Nucleocapsid Protein by a Label-Free Biosensor Based on Optical Fiber Cylindrical Micro-Resonator

    The current global outbreak of coronavirus (COVID-19) continues to be a severe threat to human health. Rapid, low-cost, and accurate antigen detection methods are very important for disease diagnosis. The severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) nucleocapsid protein (N-Protein) is often used as a diagnostic and screening target for coronavirus detection. To this end, we propose and experimentally validate a highly sensitive whispering gallery mode (WGM) optical cylindrical micro-resonator (CMR) for bio-immunoassay detection. To study the biokinetic process of the immunoassay, the surface of the WGM micro-resonator is functionalized with N-Protein monoclonal antibody (N-Protein-mAb), which enables the specific detection of N-Proteins. The spectral characteristics of the WGM resonance dip were investigated, and it was found that the transmission spectrum of the WGM shows a monotonically increasing red-shift as a function of recording time. The WGM red-shift is due to the antibody-antigen reaction and can be used to analyze the immunoassay process. The wavelength shift is shown to be proportional to the concentration of N-Protein over the range of 0.1 to 100 μg/mL. Finally, different types of samples (each with an N-Protein concentration of 10 μg/mL) were prepared and tested to assess the specificity of the sensor under conditions simulating a practical application environment. This method has the merits of a rapid assay, lower expense, easy preparation, and miniaturization, which give the sensor potential for broad applications in the field of biochemistry and biomedical detection.