165 research outputs found
DenoSent: A Denoising Objective for Self-Supervised Sentence Representation Learning
Contrastive-learning-based methods have dominated sentence representation
learning. These methods regularize the representation space by pulling similar
sentence representations closer and pushing away the dissimilar ones and have
been proven effective in various NLP tasks, e.g., semantic textual similarity
(STS) tasks. However, it is challenging for these methods to learn fine-grained
semantics as they only learn from the inter-sentence perspective, i.e., their
supervision signal comes from the relationship between data samples. In this
work, we propose a novel denoising objective that takes a complementary
perspective, i.e., the intra-sentence perspective. By introducing both discrete
and continuous noise, we generate noisy sentences and then train our model to
restore them to their original form. Our empirical evaluations demonstrate that
this approach delivers competitive results on both semantic textual similarity
(STS) and a wide range of transfer tasks, standing up well in comparison to
contrastive-learning-based methods. Notably, the proposed intra-sentence
denoising objective complements existing inter-sentence contrastive
methodologies and can be integrated with them to further enhance performance.
Our code is available at https://github.com/xinghaow99/DenoSent.
Comment: AAAI 202
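The intra-sentence denoising idea above can be sketched in a few lines. This is a toy illustration with hypothetical helper names, not the authors' code: corrupt a sentence with discrete noise (random token deletion), then measure how well a restored version recovers the original. A real setup would also inject continuous (e.g., Gaussian) noise into token embeddings and train a decoder to reconstruct the input.

```python
import random

def add_discrete_noise(tokens, drop_prob=0.3, rng=None):
    """Randomly delete tokens (discrete noise); always keep at least one."""
    rng = rng or random.Random(0)
    kept = [t for t in tokens if rng.random() > drop_prob]
    return kept or tokens[:1]

def reconstruction_accuracy(original, restored):
    """Toy, order-insensitive metric: fraction of original tokens recovered."""
    return len(set(original) & set(restored)) / len(set(original))

sentence = "the quick brown fox jumps over the lazy dog".split()
noisy = add_discrete_noise(sentence)
# A trained denoiser would map `noisy` back toward `sentence`; here we feed
# the original back in just to exercise the metric.
print(reconstruction_accuracy(sentence, sentence))  # 1.0
```

The training signal here is purely intra-sentence: it compares a sentence to its own corrupted copy, with no reference to other samples, which is what makes it complementary to inter-sentence contrastive objectives.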
Turn Waste into Worth: Rectifying Top-k Router of MoE
Sparse Mixture of Experts (MoE) models are popular for training large
language models due to their computational efficiency. However, the commonly
used top-k routing mechanism suffers from redundant computation and memory
costs caused by unbalanced routing: some experts overflow, and the tokens
exceeding their capacity are dropped, while other experts are vacant and are
padded with zeros, which negatively impacts model performance. To address the dropped
tokens and padding, we propose the Rectify-Router, comprising the Intra-GPU
Rectification and the Fill-in Rectification. The Intra-GPU Rectification
handles dropped tokens, efficiently routing them to experts within the GPU
where they are located to avoid inter-GPU communication. The Fill-in
Rectification addresses padding by replacing padding tokens with the tokens
that have high routing scores. Our experimental results demonstrate that the
Intra-GPU Rectification and the Fill-in Rectification effectively handle
dropped tokens and padding, respectively. Furthermore, the combination of them
achieves superior performance, surpassing the accuracy of the vanilla top-1
router by 4.7%
A flexible and accurate total variation and cascaded denoisers-based image reconstruction algorithm for hyperspectrally compressed ultrafast photography
Hyperspectrally compressed ultrafast photography (HCUP) based on compressed
sensing and the time- and spectrum-to-space mappings can simultaneously realize
the temporal and spectral imaging of non-repeatable or difficult-to-repeat
transient events passively in a single exposure. It possesses an incredibly
high frame rate of tens of trillions of frames per second and a sequence depth
of several hundred, and plays a revolutionary role in single-shot ultrafast
optical imaging. However, due to the ultra-high data compression ratio induced
by the extremely large sequence depth as well as the limited fidelities of
traditional reconstruction algorithms over the reconstruction process, HCUP
suffers from a poor image reconstruction quality and fails to capture fine
structures in complex transient scenes. To overcome these restrictions, we
propose a flexible image reconstruction algorithm based on the total variation
(TV) and cascaded denoisers (CD) for HCUP, named the TV-CD algorithm. It
applies the TV denoising model cascaded with several advanced deep
learning-based denoising models in the iterative plug-and-play alternating
direction method of multipliers framework, which can preserve the image
smoothness while utilizing the deep denoising networks to obtain richer prior
information, thereby addressing the common sparse-representation problem in
local similarity and motion compensation. Both simulation and experimental
results show that the
proposed TV-CD algorithm can effectively improve the image reconstruction
accuracy and quality of HCUP, and further promote the practical applications of
HCUP in capturing high-dimensional complex physical, chemical and biological
ultrafast optical scenes.
Comment: 25 pages, 5 figures and 1 table
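The plug-and-play alternation behind TV-CD can be sketched in one dimension. This is a hedged toy, with made-up operators rather than the paper's forward model or networks: a data-fidelity gradient step alternates with a cascade of denoisers (a crude TV-style smoother, to which a deep denoiser would be appended), mimicking how ADMM interleaves proximal steps.

```python
def grad_step(x, y, step=0.5):
    """Data-fidelity gradient step for an identity forward model (toy)."""
    return [xi - step * (xi - yi) for xi, yi in zip(x, y)]

def tv_smooth(x):
    """Crude TV-like denoiser: 3-tap moving average, edges copied."""
    out = list(x)
    for i in range(1, len(x) - 1):
        out[i] = (x[i - 1] + x[i] + x[i + 1]) / 3
    return out

def cascade(x, denoisers):
    """Apply a cascade of denoisers in sequence, as TV-CD chains TV with CNNs."""
    for d in denoisers:
        x = d(x)
    return x

y = [0.0, 1.2, 0.9, 1.1, 0.0]      # noisy measurement
x = [0.0] * len(y)                  # initial estimate
for _ in range(20):                 # plug-and-play style alternation
    x = grad_step(x, y)
    x = cascade(x, [tv_smooth])     # a learned denoiser would be appended here
```

In the real algorithm the forward model is the HCUP compressive measurement operator rather than the identity, and the cascade mixes the TV prior with several pretrained deep denoisers inside the ADMM framework.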
The Effect of UV-Irradiation (under Short-Circuit Condition) on Dye-Sensitized Solar Cells Sensitized with a Ru-Complex Dye Functionalized with a (diphenylamino)Styryl-Thiophen Group
A new ruthenium complex, cis-di(thiocyanato)(2,2′-bipyridine-4,4′-dicarboxylic acid)(4,4′-bis(2-(5-(2-(4-diphenylaminophenyl)ethenyl)-thiophen-2-yl)ethenyl)-2,2′-bipyridine)ruthenium(II) (named E322), has been synthesized for use in dye-sensitized solar cells (DSCs). The aim was a higher extinction coefficient and broader absorption than the standard Ru-dye, N719. DSCs fabricated with E322 showed an initial efficiency of 0.12% (4.06% for the N719 reference). The efficiency was enhanced to 1.83% by exposing the cell to simulated sunlight containing UV irradiation under short-circuit conditions. The reasons for this enhancement are (1) enhanced electron injection from the sensitizer to TiO2, following a shift of the TiO2 conduction band toward positive potentials caused by the adsorption of protons or cations from the sensitizer or the redox electrolyte, and (2) improved regeneration of the oxidized dye by the redox electrolyte, following the dissolution of aggregated dye from the TiO2 surface after the treatment
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
We introduce AnyGPT, an any-to-any multimodal language model that utilizes
discrete representations for the unified processing of various modalities,
including speech, text, images, and music. AnyGPT can be trained stably without
any alterations to the current large language model (LLM) architecture or
training paradigms. Instead, it relies exclusively on data-level preprocessing,
facilitating the seamless integration of new modalities into LLMs, akin to the
incorporation of new languages. We build a multimodal text-centric dataset for
multimodal alignment pre-training. Utilizing generative models, we synthesize
the first large-scale any-to-any multimodal instruction dataset. It consists of
108k samples of multi-turn conversations that intricately interweave various
modalities, thus equipping the model to handle arbitrary combinations of
multimodal inputs and outputs. Experimental results demonstrate that AnyGPT is
capable of facilitating any-to-any multimodal conversation while achieving
performance comparable to specialized models across all modalities, proving
that discrete representations can effectively and conveniently unify multiple
modalities within a language model. Demos are shown in
https://junzhan2000.github.io/AnyGPT.github.io/
Comment: 28 pages, 16 figures, under review, work in progress
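The data-level trick the abstract describes, unifying modalities as discrete sequences, can be sketched with a shared vocabulary. The vocabularies and offsets below are hypothetical, not AnyGPT's actual tokenizers: each modality's tokenizer produces local discrete codes, and the codes are offset into disjoint ranges of one id space so a single LLM can model any interleaving.

```python
# Hypothetical vocabulary sizes for illustration only.
TEXT_VOCAB, SPEECH_CODES, IMAGE_CODES = 50_000, 1_024, 8_192

OFFSETS = {
    "text": 0,
    "speech": TEXT_VOCAB,
    "image": TEXT_VOCAB + SPEECH_CODES,
}

def to_unified(modality, local_ids):
    """Shift modality-local token ids into the shared id space."""
    return [OFFSETS[modality] + i for i in local_ids]

def from_unified(uid):
    """Recover (modality, local_id) from a shared-space id."""
    for modality in ("image", "speech", "text"):  # check highest offsets first
        if uid >= OFFSETS[modality]:
            return modality, uid - OFFSETS[modality]

seq = (to_unified("text", [17, 42])
       + to_unified("speech", [3])
       + to_unified("image", [7]))
print([from_unified(u) for u in seq])
# [('text', 17), ('text', 42), ('speech', 3), ('image', 7)]
```

Because the mapping is purely data-level, the language model itself needs no architectural change, which is the point the abstract makes about integrating new modalities like new languages.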
Effect of Emodin on Preventing Postoperative Intra-Abdominal Adhesion Formation
Background. Postoperative intra-abdominal adhesions are a major complication after abdominal surgery. Although various methods have been used to prevent and treat adhesions, the effects have not been satisfactory. Emodin, a naturally occurring anthraquinone derivative and an active ingredient in traditional Chinese herbs, exhibits a variety of pharmacological effects. In our study, we demonstrated the effect of emodin treatment on preventing postoperative adhesion formation. Materials and Methods. A total of 48 rats were divided into six groups. Abdominal adhesions were created by abrasion of the cecum and its opposite abdominal wall. In the experimental groups, the rats were administered daily oral doses of emodin. On the seventh day after operation, the rats were euthanized, and blood and pathological specimens were collected. Abdominal adhesion formation was evaluated by necropsy, pathology, immunohistochemistry, Western blot, and enzyme-linked immunosorbent assay analyses. Results. Abdominal adhesions were markedly reduced by emodin treatment. Compared with the control group, collagen deposition was reduced and the peritoneal mesothelial completeness rate was higher in the emodin-treated groups. Emodin had anti-inflammatory effects, reduced oxidative stress, and promoted the movement of the intestinal tract (P<0.05). Conclusion. Emodin significantly reduced intra-abdominal adhesion formation in a rat model
Check on the features of potted 20-inch PMTs with 1F3 electronics prototype at Pan-Asia
The Jiangmen underground neutrino observatory (JUNO) is a neutrino project
with a 20-kton liquid scintillator detector located 700 m underground. The
large 20-inch PMTs are one of the crucial components of the JUNO experiment,
which aims at precision neutrino measurements with an energy resolution better
than 3% at 1 MeV. The excellent energy resolution and large fiducial volume provide
many exciting opportunities for addressing important topics in neutrino and
astro-particle physics. With the container #D at JUNO Pan-Asia PMT testing and
potting station, the features of waterproof potted 20-inch PMTs were measured
with the JUNO 1F3 electronics prototype in both waveform and charge, which is
valuable for a better understanding of the performance of the waterproof
potted PMTs and the JUNO 1F3 electronics. In this paper, the basic features of
the JUNO 1F3 electronics
prototype run at Pan-Asia will be introduced, followed by an analysis of the
waterproof potted 20-inch PMTs and a comparison with the results from
commercial electronics used by containers #A and #B