Structure Similarity Preservation Learning for Asymmetric Image Retrieval
Asymmetric image retrieval is a task that seeks to balance retrieval accuracy
and efficiency by leveraging lightweight and large models for the query and
gallery sides, respectively. The key to asymmetric image retrieval is realizing
feature compatibility between different models. Despite the great progress,
most existing approaches either rely on classifiers inherited from gallery
models or simply impose constraints at the instance level, ignoring the
structure of embedding space. In this work, we propose a simple yet effective
structure similarity preserving method to achieve feature compatibility between
query and gallery models. Specifically, we first train a product quantizer
offline with the image features embedded by the gallery model. The centroid
vectors in the quantizer serve as anchor points in the embedding space of the
gallery model to characterize its structure. During the training of the query
model, anchor points are shared by the query and gallery models. The
relationships between image features and centroid vectors are considered as
structure similarities and constrained to be consistent. Moreover, our approach
makes no assumption about the existence of any labeled training data and thus
can be extended to an unlimited amount of data. Comprehensive experiments on
large-scale landmark retrieval demonstrate the effectiveness of our approach.
Our code is released at: https://github.com/MCC-WH/SSP
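The structure-similarity constraint described in this abstract can be sketched as follows. This is a minimal illustration, not the released SSP code: it assumes a single flat codebook of anchors standing in for the product quantizer's centroids, cosine similarity followed by a softmax with a hypothetical `temperature`, and a KL term as the consistency measure. All function names are illustrative.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def structure_similarity(features, anchors, temperature=0.1):
    # Relationship of each feature to every anchor (centroid), expressed
    # as a probability distribution over anchors.
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return softmax(f @ a.T / temperature, axis=1)

def consistency_loss(query_feats, gallery_feats, anchors, temperature=0.1):
    # KL(gallery || query) between the two structure similarities,
    # averaged over the batch. No labels are needed.
    p_gallery = structure_similarity(gallery_feats, anchors, temperature)
    p_query = structure_similarity(query_feats, anchors, temperature)
    eps = 1e-12
    kl = np.sum(p_gallery * (np.log(p_gallery + eps) - np.log(p_query + eps)), axis=1)
    return float(kl.mean())

rng = np.random.default_rng(0)
anchors = rng.normal(size=(16, 8))        # assumed: centroids from the gallery model's space
gallery = rng.normal(size=(4, 8))         # gallery-model embeddings of 4 images
query_close = gallery + 0.01 * rng.normal(size=(4, 8))  # nearly compatible query model
query_far = rng.normal(size=(4, 8))       # incompatible query model

loss_same = consistency_loss(gallery, gallery, anchors)
loss_close = consistency_loss(query_close, gallery, anchors)
loss_far = consistency_loss(query_far, gallery, anchors)
```

In this toy setup the loss is zero when the query features equal the gallery features and grows as the query embeddings drift away from the gallery model's structure.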
Asymmetric Feature Fusion for Image Retrieval
In asymmetric retrieval systems, models with different capacities are
deployed on platforms with different computational and storage resources.
Despite the great progress, existing approaches still suffer from a dilemma
between retrieval efficiency and asymmetric accuracy due to the limited
capacity of the lightweight query model. In this work, we propose an Asymmetric
Feature Fusion (AFF) paradigm, which advances existing asymmetric retrieval
systems by exploiting the complementarity among different features solely on the
gallery side. Specifically, it first embeds each gallery image into various
features, e.g., local features and global features. Then, a dynamic mixer is
introduced to aggregate these features into a compact embedding for efficient
search. On the query side, only a single lightweight model is deployed for
feature extraction. The query model and dynamic mixer are jointly trained by
sharing a momentum-updated classifier. Notably, the proposed paradigm boosts
the accuracy of asymmetric retrieval without introducing any extra overhead to
the query side. Exhaustive experiments on various landmark retrieval datasets
demonstrate the superiority of our paradigm.
Nonvesicular Inhibitory Neurotransmission via Reversal of the GABA Transporter GAT-1
GABA transporters play an important but poorly understood role in neuronal inhibition. They can reverse, but this is widely thought to occur only under pathological conditions. Here we use a heterologous expression system to show that the reversal potential of GAT-1 under physiologically relevant conditions is near the normal resting potential of neurons and that reversal can occur rapidly enough to release GABA during simulated action potentials. We then use paired recordings from cultured hippocampal neurons and show that GABAergic transmission is not prevented by four methods widely used to block vesicular release. This nonvesicular neurotransmission was potently blocked by GAT-1 antagonists and was enhanced by agents that increase cytosolic [GABA] or [Na+] (which would increase GAT-1 reversal). We conclude that GAT-1 regulates tonic inhibition by clamping ambient [GABA] at a level high enough to activate high-affinity GABA-A receptors and that transporter-mediated GABA release can contribute to phasic inhibition.
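The reversal potential mentioned in this abstract can be written out under one assumption that the abstract itself does not state: the commonly cited GAT-1 stoichiometry of 2 Na+ : 1 Cl- : 1 GABA per cycle, with GABA carrying no net charge. Under that assumption each inward cycle moves one net positive charge, and the thermodynamic equilibrium condition gives:

```latex
E_{\mathrm{rev}}
  = \frac{RT}{F}\,
    \ln\!\left[
      \left(\frac{[\mathrm{Na}^+]_o}{[\mathrm{Na}^+]_i}\right)^{2}
      \frac{[\mathrm{Cl}^-]_o}{[\mathrm{Cl}^-]_i}\,
      \frac{[\mathrm{GABA}]_o}{[\mathrm{GABA}]_i}
    \right]
```

Here the subscripts o and i denote extracellular and cytosolic concentrations. Raising cytosolic [GABA] or [Na+] shrinks the argument of the logarithm and shifts E_rev to more negative values, so the membrane potential exceeds E_rev more readily and outward (reversed) transport is favored, in line with the enhancement the abstract reports.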
State Sequences Prediction via Fourier Transform for Representation Learning
While deep reinforcement learning (RL) has been shown to be effective in
solving complex control tasks, sample efficiency remains a key challenge due to
the large amounts of data required for remarkable performance. Existing
research explores the application of representation learning for data-efficient
RL, e.g., learning predictive representations by predicting long-term future
states. However, many existing methods do not fully exploit the structural
information inherent in sequential state signals, which can potentially improve
the quality of long-term decision-making but is difficult to discern in the
time domain. To tackle this problem, we propose State Sequences Prediction via
Fourier Transform (SPF), a novel method that exploits the frequency domain of
state sequences to extract the underlying patterns in time series data for
learning expressive representations efficiently. Specifically, we theoretically
analyze the existence of structural information in state sequences, which is
closely related to policy performance and signal regularity, and then propose
to predict the Fourier transform of infinite-step future state sequences to
extract such information. One of the appealing features of SPF is that it is
simple to implement while not requiring storage of infinite-step future states
as prediction targets. Experiments demonstrate that the proposed method
outperforms several state-of-the-art algorithms in terms of both sample
efficiency and performance.
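One way a Fourier-domain target over a long future horizon can avoid storing that horizon is a bootstrapped (TD-style) recursion. The sketch below is an illustration under stated assumptions, not the SPF implementation: it assumes a discounted discrete-time Fourier transform with a hypothetical discount `gamma` and frequency grid `omegas`, and it uses a finite trajectory in place of an infinite one.

```python
import numpy as np

def discounted_dtft(future_states, gamma, omegas):
    # F[k] = sum_n gamma^n * exp(-1j * omega_k * n) * future_states[n]
    # future_states: (N, d) real array; returns a (K, d) complex array.
    n = np.arange(len(future_states))
    weights = (gamma ** n)[None, :] * np.exp(-1j * omegas[:, None] * n[None, :])
    return weights @ future_states

rng = np.random.default_rng(0)
states = rng.normal(size=(50, 3))        # toy trajectory s_0 .. s_49
gamma = 0.9                              # assumed discount factor
omegas = 2 * np.pi * np.arange(8) / 8    # assumed frequency grid

t = 10
# Direct target: needs every state after time t.
target_t = discounted_dtft(states[t + 1:], gamma, omegas)
# Bootstrapped target: needs only s_{t+1} and the transform one step later.
target_next = discounted_dtft(states[t + 2:], gamma, omegas)
bootstrapped = states[t + 1][None, :] + gamma * np.exp(-1j * omegas)[:, None] * target_next
```

The two targets agree exactly, so a learned predictor of `target_next` could supply the regression target for step t from a single transition, without keeping the full future sequence.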
Unified 2D and 3D Pre-Training of Molecular Representations
Molecular representation learning has attracted much attention recently. A
molecule can be viewed as a 2D graph with nodes/atoms connected by edges/bonds,
and can also be represented by a 3D conformation with 3-dimensional coordinates
of all atoms. We note that most previous work handles 2D and 3D information
separately, while jointly leveraging these two sources may foster a more
informative representation. In this work, we explore this appealing idea and
propose a new representation learning method based on a unified 2D and 3D
pre-training. Atom coordinates and interatomic distances are encoded and then
fused with atomic representations through graph neural networks. The model is
pre-trained on three tasks: reconstruction of masked atoms and coordinates, 3D
conformation generation conditioned on 2D graph, and 2D graph generation
conditioned on 3D conformation. We evaluate our method on 11 downstream
molecular property prediction tasks: 7 with 2D information only and 4 with both
2D and 3D information. Our method achieves state-of-the-art results on 10
tasks, and the average improvement on 2D-only tasks is 8.3%. Our method also
achieves significant improvement on two 3D conformation generation tasks.
Comment: KDD-202
Sinkhorn Distance Minimization for Knowledge Distillation
Knowledge distillation (KD) has been widely adopted to compress large
language models (LLMs). Existing KD methods investigate various divergence
measures including the Kullback-Leibler (KL), reverse Kullback-Leibler (RKL),
and Jensen-Shannon (JS) divergences. However, due to limitations inherent in
their assumptions and definitions, these measures fail to deliver effective
supervision when little distribution overlap exists between the teacher and the
student. In this paper, we show that the aforementioned KL, RKL, and JS
divergences respectively suffer from issues of mode-averaging, mode-collapsing,
and mode-underestimation, which degrade logits-based KD for diverse NLP
tasks. We propose the Sinkhorn Knowledge Distillation (SinKD) that exploits the
Sinkhorn distance to ensure a nuanced and precise assessment of the disparity
between teacher and student distributions. Moreover, by exploiting properties
of the Sinkhorn metric, we can dispense with sample-wise KD, which restricts the
perception of divergence to each teacher-student sample pair. Instead, we propose a
batch-wise reformulation to capture geometric intricacies of distributions
across samples in the high-dimensional space. Comprehensive evaluation on GLUE
and SuperGLUE, in terms of comparability, validity, and generalizability,
highlights the superiority of our method over state-of-the-art methods on all
kinds of LLMs with encoder-only, encoder-decoder, and decoder-only architectures.
Comment: Accepted by COLING 202
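For readers unfamiliar with the Sinkhorn distance named in this abstract, the sketch below shows the standard entropy-regularized Sinkhorn iteration between two discrete distributions. It is a generic sample-wise illustration, not the SinKD loss or its batch-wise reformulation: the cost matrix `|i - j|` over class indices, the regularization strength `reg`, and the toy teacher and student distributions are all assumptions made for the example.

```python
import numpy as np

def sinkhorn_distance(p, q, cost, reg=0.5, n_iters=200):
    # Entropy-regularized optimal transport via Sinkhorn scaling.
    # p, q: probability vectors; cost: (len(p), len(q)) ground cost.
    K = np.exp(-cost / reg)              # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(n_iters):
        v = q / (K.T @ u)                # match column marginals to q
        u = p / (K @ v)                  # match row marginals to p
    plan = u[:, None] * K * v[None, :]   # transport plan
    return float(np.sum(plan * cost)), plan

idx = np.arange(4)
cost = np.abs(idx[:, None] - idx[None, :]).astype(float)  # assumed ground cost

teacher = np.array([0.5, 0.5, 0.0, 0.0])
student_far = np.array([0.0, 0.0, 0.5, 0.5])   # no overlap with the teacher
student_near = np.array([0.0, 0.5, 0.5, 0.0])  # partial overlap

d_far, plan_far = sinkhorn_distance(teacher, student_far, cost)
d_near, plan_near = sinkhorn_distance(teacher, student_near, cost)
```

With disjoint supports, KL(teacher || student_far) is infinite and gives no graded signal, whereas the transport cost stays finite and is smaller for the student that sits closer to the teacher.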
A Large Portal Vein: A Rare Finding of Recent Portal Vein Thrombosis
Acute portal vein thrombosis (PVT) is rarely encountered by clinicians. The most common manifestation of acute PVT is sudden onset of abdominal pain. A computed tomography scan without contrast often shows high-density material in the portal vein. After injection of contrast agents, absence of luminal enhancement and enlargement of the obstructed portal vein are shown. In this case report, we describe a rare computed tomography finding in which the diameter of the main portal vein was enormously distended to 3-fold that of the aorta in a patient with recent PVT. Although thrombolysis and anticoagulation were given immediately, portal venous recanalization was not achieved in the patient. After 5 years, variceal bleeding and ascites occurred and liver function had persistently deteriorated. Finally, he died of progressive liver failure. Considering this case, we suggest that an early decision for invasive interventional treatment might be necessary to both increase the rate of portal venous recanalization and improve prognosis, as anticoagulation and thrombolysis therapy failed to recanalize recent PVT.
Strong enhancement of photoresponsivity with shrinking the electrodes spacing in few layer GaSe photodetectors
A critical challenge for the integration of optoelectronics is that
photodetectors have relatively poor sensitivities at the nanometer scale. It is
generally believed that a large electrode spacing in photodetectors is
required to absorb sufficient light to maintain high photoresponsivity and
reduce the dark current. However, this limits the optoelectronic
integration density. Through spatially resolved photocurrent investigation, we
find that the photocurrent in metal-semiconductor-metal (MSM) photodetectors
based on layered GaSe is mainly generated from the photoexcited carriers close
to the metal-GaSe interface and the photocurrent active region is always close
to the Schottky barrier with higher electrical potential. The photoresponsivity
increases monotonically as the spacing shrinks, until direct tunneling occurs,
and reaches up to 5,000 A/W for the bottom-contacted device at a bias voltage
of 8 V and a wavelength of 410 nm. This is a more than 1,700-fold improvement
over previously reported results. Besides a systematic experimental
investigation of the dependence of the photoresponsivity on the spacing
distance for both bottom- and top-contacted MSM photodetectors, a theoretical
model has been developed that explains the photoresponsivity of these two
device configurations well. Our findings show that the spacing distance of 2D
semiconductor based MSM photodetectors can be shrunk while their performance
improves, which could pave the way for future high-density integration of
high-performance 2D semiconductor optoelectronics.
Comment: 25 pages, 4 figure