63 research outputs found
Spin-dependent Andreev reflection tunneling through a quantum dot with intradot spin-flip scattering
We study Andreev reflection (AR) tunneling through a quantum dot (QD)
connected to a ferromagnet and a superconductor, in which the intradot
spin-flip interaction is included. By using the nonequilibrium-Green-function
method, the formula of the linear AR conductance is derived at zero
temperature. It is found that the competition between the intradot spin-flip
scattering and the tunneling coupling to the leads dominates the resonant
behaviour of the AR conductance versus the gate voltage. A weak spin-flip
scattering leads to a single-peak resonance. However, as the spin-flip
scattering strength increases, the AR conductance develops into a double-peak
resonance, implying a novel structure in the tunneling spectrum of the AR
conductance. Besides, the effects of the spin-dependent tunneling couplings, the
matching of the Fermi velocities, and the spin polarization of the ferromagnet
on the AR conductance are examined in detail.

Comment: 14 pages, 4 figures
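For orientation, in ferromagnet–quantum-dot–superconductor setups the zero-temperature linear Andreev conductance obtained from nonequilibrium Green functions is commonly written as an Andreev transmission evaluated at the Fermi energy (a generic textbook form, not necessarily the exact expression derived in this paper):

```latex
G_A \;=\; \frac{4e^2}{h}\,
\mathrm{Tr}\!\left[\Gamma^{e}_{F}\, G^{r}_{eh}(0)\,
\Gamma^{h}_{F}\, G^{a}_{he}(0)\right]
```

Here \(G^{r}_{eh}\) is the electron–hole (anomalous) block of the retarded dot Green function in Nambu space, \(\Gamma^{e(h)}_{F}\) are the electron (hole) linewidth matrices of the ferromagnetic lead, and the prefactor \(4e^2/h\) reflects the charge \(2e\) transferred per Andreev event.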
Hybrid Distillation: Connecting Masked Autoencoders with Contrastive Learners
Representation learning has been evolving from traditional supervised
training to Contrastive Learning (CL) and Masked Image Modeling (MIM). Previous
works have demonstrated their pros and cons in specific scenarios, i.e., CL and
supervised pre-training excel at capturing longer-range global patterns and
enabling better feature discrimination, while MIM can introduce more local and
diverse attention across all transformer layers. In this paper, we explore how
to obtain a model that combines their strengths. We start by examining previous
feature distillation and mask feature reconstruction methods and identify their
limitations. We find that their increasing diversity mainly derives from the
asymmetric designs, but these designs may in turn compromise the discrimination
ability. In order to better obtain both discrimination and diversity, we
propose a simple but effective Hybrid Distillation strategy, which utilizes
both the supervised/CL teacher and the MIM teacher to jointly guide the student
model. Hybrid Distill imitates the token relations of the MIM teacher to
alleviate attention collapse, as well as distills the feature maps of the
supervised/CL teacher to enable discrimination. Furthermore, a progressive
redundant token masking strategy is also utilized to reduce the distilling
costs and avoid falling into local optima. Experimental results show that Hybrid
Distill achieves superior performance on different benchmarks.
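The two-teacher objective described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the weighting scheme, the cosine-similarity token-relation matrix, and the function names are all assumptions made for the example.

```python
import numpy as np

def token_relations(feats: np.ndarray) -> np.ndarray:
    """Cosine-similarity token-relation matrix (N_tokens x N_tokens)."""
    normed = feats / (np.linalg.norm(feats, axis=-1, keepdims=True) + 1e-8)
    return normed @ normed.T

def hybrid_distill_loss(student: np.ndarray,
                        cl_teacher: np.ndarray,
                        mim_teacher: np.ndarray,
                        alpha: float = 1.0,
                        beta: float = 1.0) -> float:
    """Sketch of a two-teacher distillation objective:
    - feature-map distillation against the supervised/CL teacher
      (preserves discrimination),
    - token-relation distillation against the MIM teacher
      (preserves attention diversity, alleviating collapse).
    All inputs are (N_tokens, dim) feature matrices for one image.
    """
    feat_loss = np.mean((student - cl_teacher) ** 2)
    rel_loss = np.mean((token_relations(student)
                        - token_relations(mim_teacher)) ** 2)
    return alpha * feat_loss + beta * rel_loss
```

When the student matches both teachers exactly, the loss vanishes; otherwise each term pulls the student toward a different teacher, which is the intended tension between discrimination and diversity.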
AiluRus: A Scalable ViT Framework for Dense Prediction
Vision transformers (ViTs) have emerged as a prevalent architecture for
vision tasks owing to their impressive performance. However, when it comes to
handling long token sequences, especially in dense prediction tasks that
require high-resolution input, the complexity of ViTs increases significantly.
Notably, dense prediction tasks, such as semantic segmentation or object
detection, emphasize more on the contours or shapes of objects, while the
texture inside objects is less informative. Motivated by this observation, we
propose to apply adaptive resolution for different regions in the image
according to their importance. Specifically, at the intermediate layer of the
ViT, we utilize a spatial-aware density-based clustering algorithm to select
representative tokens from the token sequence. Once the representative tokens
are determined, we proceed to merge other tokens into their closest
representative token. Consequently, semantically similar tokens are merged
together to form low-resolution regions, while semantically distinct tokens are
preserved
independently as high-resolution regions. This strategy effectively reduces the
number of tokens, allowing subsequent layers to handle a reduced token sequence
and achieve acceleration. We evaluate our proposed method on three different
datasets and observe promising performance. For example, the "Segmenter ViT-L"
model achieves a 48% higher FPS without fine-tuning, while maintaining its
performance. Additionally, our method can be applied to accelerate fine-tuning
as well. Experimental results demonstrate that we can save 52% of the training
time and reach a 2.46 times higher FPS with only a 0.09% performance drop. The
code is available at https://github.com/caddyless/ailurus/tree/main.

Comment: Accepted by NeurIPS 202
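The select-then-merge step described in the abstract can be sketched as below. This is a toy stand-in, not the released AiluRus code: the greedy density-and-distance selection rule is a crude proxy for the paper's spatial-aware density-based clustering, and all names are invented for the example.

```python
import numpy as np

def merge_tokens(tokens: np.ndarray, num_reps: int):
    """Toy token reduction:
    1) score each token by local density (count of close neighbours),
    2) greedily pick `num_reps` high-density tokens that are far apart,
    3) merge every token into its nearest representative by averaging.
    Returns (merged_tokens, assignment) where `assignment[i]` is the
    representative index token i was merged into.
    """
    dist = np.linalg.norm(tokens[:, None, :] - tokens[None, :, :], axis=-1)
    density = (dist < np.median(dist)).sum(axis=1)
    reps = [int(np.argmax(density))]
    while len(reps) < num_reps:
        # next representative: dense token far from those already chosen
        score = density * dist[:, reps].min(axis=1)
        score[reps] = -1.0
        reps.append(int(np.argmax(score)))
    assignment = np.argmin(dist[:, reps], axis=1)  # nearest representative
    merged = np.stack([tokens[assignment == k].mean(axis=0)
                       for k in range(num_reps)])
    return merged, assignment
```

Subsequent transformer layers would then operate on the `num_reps` merged tokens instead of the full sequence, which is where the acceleration comes from.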