165 research outputs found

    DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning

    Full text link
    Contrastive-learning-based methods have dominated sentence representation learning. These methods regularize the representation space by pulling similar sentence representations closer and pushing dissimilar ones apart, and have proven effective in various NLP tasks, e.g., semantic textual similarity (STS) tasks. However, it is challenging for these methods to learn fine-grained semantics, as they learn only from the inter-sentence perspective, i.e., their supervision signal comes from the relationship between data samples. In this work, we propose a novel denoising objective that works from another perspective, the intra-sentence perspective. By introducing both discrete and continuous noise, we generate noisy sentences and then train our model to restore them to their original form. Our empirical evaluations demonstrate that this approach delivers competitive results on both semantic textual similarity (STS) and a wide range of transfer tasks, standing up well against contrastive-learning-based methods. Notably, the proposed intra-sentence denoising objective complements existing inter-sentence contrastive methodologies and can be integrated with them to further enhance performance. Our code is available at https://github.com/xinghaow99/DenoSent.
    Comment: AAAI 202
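The intra-sentence objective above (corrupt a sentence with discrete noise, then train a model to restore it) can be sketched as follows; the function name and parameters are illustrative, not the DenoSent implementation:

```python
import random

def add_discrete_noise(tokens, delete_p=0.15, shuffle_window=2.0, rng=None):
    """Corrupt a token sequence with discrete noise: random token deletion
    plus a bounded local shuffle. A denoising model would be trained to
    map the corrupted sequence back to the original (illustrative only)."""
    rng = rng or random.Random(0)
    # Randomly drop tokens, but never delete the whole sentence.
    kept = [t for t in tokens if rng.random() > delete_p] or [tokens[0]]
    # Lightly permute: each token moves at most ~shuffle_window positions.
    keys = [i + rng.uniform(0, shuffle_window) for i in range(len(kept))]
    return [t for _, t in sorted(zip(keys, kept), key=lambda p: p[0])]

sent = "contrastive methods regularize the sentence representation space".split()
noisy = add_discrete_noise(sent)
```

The continuous-noise counterpart in the paper perturbs embeddings rather than tokens; the same train-to-restore setup applies.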

    Turn Waste into Worth: Rectifying Top-k Router of MoE

    Full text link
    Sparse Mixture of Experts (MoE) models are popular for training large language models due to their computational efficiency. However, the commonly used top-k routing mechanism incurs redundant computation and memory costs due to unbalanced routing: some experts overflow, so their excess tokens are dropped, while other experts are vacant and are padded with zeros, negatively impacting model performance. To address the dropped tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU Rectification and the Fill-in Rectification. The Intra-GPU Rectification handles dropped tokens, efficiently routing them to experts within the GPU where they are located to avoid inter-GPU communication. The Fill-in Rectification addresses padding by replacing padding tokens with the tokens that have high routing scores. Our experimental results demonstrate that the Intra-GPU Rectification and the Fill-in Rectification effectively handle dropped tokens and padding, respectively. Furthermore, combining them achieves superior performance, surpassing the accuracy of the vanilla top-1 router by 4.7%.
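A toy version of the routing problem and the rectification idea can be sketched as follows; this is a schematic stand-in (a least-loaded fallback in place of the Intra-GPU Rectification), not the paper's GPU-aware implementation:

```python
def rectified_top1_route(scores, capacity):
    """Toy top-1 router with the rectification idea from the abstract:
    tokens that overflow an expert's capacity are re-routed (here, to the
    least-loaded expert) instead of being dropped. scores[t][e] is token
    t's routing score for expert e. Schematic only: the paper's Intra-GPU
    Rectification restricts re-routing to experts on the same GPU, and
    its Fill-in Rectification also replaces padding with high-score tokens."""
    n_experts = len(scores[0])
    assignment, load, overflow = {}, [0] * n_experts, []
    # Vanilla top-1 assignment with a per-expert capacity limit.
    for t, row in enumerate(scores):
        e = max(range(n_experts), key=lambda j: row[j])
        if load[e] < capacity:
            assignment[t] = e
            load[e] += 1
        else:
            overflow.append(t)      # a vanilla router would drop these
    # Rectification stand-in: give dropped tokens to the least-loaded expert.
    for t in overflow:
        e = min(range(n_experts), key=lambda j: load[j])
        assignment[t] = e
        load[e] += 1
    return assignment, load

# All four tokens prefer expert 0; two overflow and are re-routed to expert 1.
assignment, load = rectified_top1_route([[0.9, 0.1]] * 4, capacity=2)
```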

    A flexible and accurate total variation and cascaded denoisers-based image reconstruction algorithm for hyperspectrally compressed ultrafast photography

    Full text link
    Hyperspectrally compressed ultrafast photography (HCUP) based on compressed sensing and the time- and spectrum-to-space mappings can simultaneously realize the temporal and spectral imaging of non-repeatable or difficult-to-repeat transient events passively in a single exposure. It possesses an incredibly high frame rate of tens of trillions of frames per second and a sequence depth of several hundred, and plays a revolutionary role in single-shot ultrafast optical imaging. However, due to the ultra-high data compression ratio induced by the extremely large sequence depth, as well as the limited fidelity of traditional reconstruction algorithms, HCUP suffers from poor image reconstruction quality and fails to capture fine structures in complex transient scenes. To overcome these restrictions, we propose a flexible image reconstruction algorithm for HCUP based on total variation (TV) and cascaded denoisers (CD), named the TV-CD algorithm. It applies the TV denoising model cascaded with several advanced deep-learning-based denoising models in an iterative plug-and-play alternating direction method of multipliers framework, which preserves image smoothness while exploiting the deep denoising networks to obtain richer priors, thereby addressing the common sparse-representation problems in local similarity and motion compensation. Both simulation and experimental results show that the proposed TV-CD algorithm can effectively improve the image reconstruction accuracy and quality of HCUP, and further promote the practical applications of HCUP in capturing high-dimensional complex physical, chemical and biological ultrafast optical scenes.
    Comment: 25 pages, 5 figures and 1 table
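The plug-and-play iteration described above, alternating a data-fidelity step with a cascade of denoisers, can be sketched as follows; the gradient-step formulation and the toy denoiser are simplifications of the paper's ADMM framework:

```python
def pnp_reconstruct(y, forward, adjoint, denoisers, n_iter=10, step=0.5):
    """Schematic plug-and-play loop in the spirit of the TV-CD algorithm:
    alternate a gradient step on the data-fidelity term ||A x - y||^2
    with a cascade of denoisers (TV first, then learned denoisers in the
    paper; here, any callables). A simplification of the ADMM framework."""
    x = adjoint(y)                                  # crude init: A^T y
    for _ in range(n_iter):
        # Data-fidelity step: x <- x - step * A^T (A x - y)
        residual = [a - b for a, b in zip(forward(x), y)]
        grad = adjoint(residual)
        x = [xi - step * gi for xi, gi in zip(x, grad)]
        for denoise in denoisers:                   # cascaded denoising
            x = denoise(x)
    return x

def identity(v):
    return list(v)

def smooth(v):
    # Toy stand-in for a denoiser: 3-point moving average with edge clamping.
    return [(v[max(i - 1, 0)] + v[i] + v[min(i + 1, len(v) - 1)]) / 3
            for i in range(len(v))]

y = [0.0, 0.0, 5.0, 0.0, 0.0]       # impulsive "noisy" measurement
x_hat = pnp_reconstruct(y, identity, identity, [smooth], n_iter=5)
```

In HCUP the forward operator encodes the compressive time- and spectrum-to-space mappings rather than the identity used in this toy example.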

    The Effect of UV-Irradiation (under Short-Circuit Condition) on Dye-Sensitized Solar Cells Sensitized with a Ru-Complex Dye Functionalized with a (diphenylamino)Styryl-Thiophen Group

    Get PDF
    A new ruthenium complex, cis-di(thiocyanato)(2,2′-bipyridine-4,4′-dicarboxylic acid)(4,4′-bis(2-(5-(2-(4-diphenylaminophenyl)ethenyl)-thiophen-2-yl)ethenyl)-2,2′-bipyridine)ruthenium(II) (named E322), has been synthesized for use in dye-sensitized solar cells (DSCs), aiming at a higher extinction coefficient and broader absorption than the standard Ru dye, N719. DSCs fabricated with E322 had an initial efficiency of 0.12% (4.06% for the N719 reference). The efficiency was enhanced to 1.83% by exposing the cell to simulated sunlight containing UV irradiation under short-circuit conditions. The reasons for this enhancement are (1) enhanced electron injection from the sensitizer to TiO2, following a shift toward positive potentials of the TiO2 conduction band caused by the adsorption of protons or cations from the sensitizer or the redox electrolyte, and (2) improved regeneration of the oxidized dye by the redox electrolyte, following the dissolution of aggregated dye from the TiO2 surface during the treatment.

    AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling

    Full text link
    We introduce AnyGPT, an any-to-any multimodal language model that utilizes discrete representations for the unified processing of various modalities, including speech, text, images, and music. AnyGPT can be trained stably without any alterations to the current large language model (LLM) architecture or training paradigms. Instead, it relies exclusively on data-level preprocessing, facilitating the seamless integration of new modalities into LLMs, akin to the incorporation of new languages. We build a multimodal text-centric dataset for multimodal alignment pre-training. Utilizing generative models, we synthesize the first large-scale any-to-any multimodal instruction dataset. It consists of 108k samples of multi-turn conversations that intricately interweave various modalities, thus equipping the model to handle arbitrary combinations of multimodal inputs and outputs. Experimental results demonstrate that AnyGPT is capable of facilitating any-to-any multimodal conversation while achieving performance comparable to specialized models across all modalities, proving that discrete representations can effectively and conveniently unify multiple modalities within a language model. Demos are shown at https://junzhan2000.github.io/AnyGPT.github.io/
    Comment: 28 pages, 16 figures, under review, work in progress
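The data-level unification described above, discretizing each modality and packing the codes into one shared vocabulary, can be sketched as follows; the offsets and helper name are hypothetical, not AnyGPT's actual tokenizer layout:

```python
def to_unified_sequence(segments, vocab_offsets):
    """Sketch of data-level multimodal unification: each modality is
    discretized by its own tokenizer into integer codes, then shifted
    into a disjoint range of one shared vocabulary so a single LLM can
    model the interleaved stream. Offsets and names are hypothetical."""
    stream = []
    for modality, codes in segments:
        offset = vocab_offsets[modality]
        stream += [offset + c for c in codes]   # shift into modality's range
    return stream

offsets = {"text": 0, "image": 10000, "speech": 20000}
seq = to_unified_sequence([("text", [5, 7]), ("image", [3])], offsets)
# -> [5, 7, 10003]
```

Because every modality ends up as plain integer tokens in one vocabulary, the LLM itself needs no architectural changes, which is the point the abstract makes.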

    Effect of Emodin on Preventing Postoperative Intra-Abdominal Adhesion Formation

    Get PDF
    Background. Postoperative intra-abdominal adhesions are a major complication after abdominal surgery. Although various methods have been used to prevent and treat adhesions, the effects have not been satisfactory. Emodin, a naturally occurring anthraquinone derivative and an active ingredient in traditional Chinese herbs, exhibits a variety of pharmacological effects. In our study, we demonstrated the effect of emodin treatment on preventing postoperative adhesion formation. Materials and Methods. A total of 48 rats were divided into six groups. Abdominal adhesions were created by abrasion of the cecum and its opposite abdominal wall. In the experimental groups, the rats were administered daily oral doses of emodin. On the seventh day after operation, the rats were euthanized, and blood and pathological specimens were collected. Abdominal adhesion formation was evaluated by necropsy, pathology, immunohistochemistry, Western blot, and enzyme-linked immunosorbent assay analyses. Results. Abdominal adhesions were markedly reduced by emodin treatment. Compared with the control group, collagen deposition was reduced and the peritoneal mesothelial completeness rate was higher in the emodin-treated groups. Emodin had anti-inflammatory effects, reduced oxidative stress, and promoted the movement of the intestinal tract (P<0.05). Conclusion. Emodin significantly reduced intra-abdominal adhesion formation in a rat model.

    Check on the features of potted 20-inch PMTs with 1F3 electronics prototype at Pan-Asia

    Full text link
    The Jiangmen underground neutrino observatory (JUNO) is a neutrino project with a 20-kton liquid scintillator detector located 700 m underground. The large 20-inch PMTs are one of the crucial components of the JUNO experiment, which aims at precision neutrino measurements with better than 3% energy resolution at 1 MeV. The excellent energy resolution and large fiducial volume provide many exciting opportunities for addressing important topics in neutrino and astro-particle physics. With container #D at the JUNO Pan-Asia PMT testing and potting station, the features of waterproof potted 20-inch PMTs were measured with the JUNO 1F3 electronics prototype in waveform and charge, which is valuable for a better understanding of the performance of the waterproof potted PMTs and the JUNO 1F3 electronics. In this paper, the basic features of the JUNO 1F3 electronics prototype run at Pan-Asia are introduced, followed by an analysis of the waterproof potted 20-inch PMTs and a comparison with the results from the commercial electronics used in containers #A and #B.