
    Shot noise of spin current and spin transfer torque

    We report a theoretical investigation of the noise spectra of the spin current and the spin-transfer torque for non-collinear spin-polarized transport in a spin-valve device, which consists of a normal scattering region connected to two ferromagnetic electrodes. Our theory is developed within the non-equilibrium Green's function formalism, and general non-linear $S^\sigma$-$V$ and $S^\tau$-$V$ relations are derived as functions of the angle $\theta$ between the magnetizations of the two leads. We apply the theory to a quantum dot with a resonant level coupled to two ferromagnetic electrodes. We find that for the MNM (magnetic-normal-magnetic) system, the auto-correlation of the spin current suffices to characterize its fluctuations. For a system with three ferromagnetic layers, however, both the auto-correlation and the cross-correlation of the spin current are needed to characterize its noise spectrum. We also study the spin-transfer torque and the torque noise for the MNM system. For a quantum dot with a resonant level, the derivative of the spin torque with respect to the bias voltage is proportional to $\sin\theta$ when the system is far from resonance; near resonance, the spin-transfer torque becomes a non-sinusoidal function of $\theta$. The derivative of the torque noise spectrum with respect to the bias voltage, $N_\tau$, also behaves differently in the two regimes: $N_\tau$ is a concave function of $\theta$ near resonance and a convex function of $\theta$ far from it. For certain bias voltages, the period of $N_\tau(\theta)$ becomes $\pi$ instead of $2\pi$. For small $\theta$, the differential shot noise of the spin-transfer torque is very sensitive to the bias voltage and to the other system parameters. Comment: 15 pages, 6 figures
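    As an illustrative aside (an assumed form for exposition, not a result quoted from the paper), the reported halving of the period of $N_\tau(\theta)$ from $2\pi$ to $\pi$ at certain bias voltages is what one would expect if a second harmonic comes to dominate a Fourier expansion of the angular dependence; the coefficients $a_n(V)$ below are hypothetical:

```latex
% Hypothetical harmonic expansion of the differential torque noise in \theta.
% When a_1(V) dominates, N_\tau(\theta) has period 2\pi; at bias voltages
% where a_2(V) dominates instead, the period halves to \pi.
N_\tau(\theta, V) = a_0(V) + a_1(V)\cos\theta + a_2(V)\cos 2\theta + \dots
```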

    In Defense of Softmax Parametrization for Calibrated and Consistent Learning to Defer

    Enabling machine learning classifiers to defer their decisions to a downstream expert when the expert is more accurate will ensure improved safety and performance. This objective can be achieved with the learning-to-defer framework, which aims to jointly learn how to classify and how to defer to the expert. Recent studies have theoretically shown that popular estimators for learning to defer parameterized with softmax provide unbounded estimates for the likelihood of deferring, which makes them uncalibrated. However, it remains unknown whether this is due to the widely used softmax parameterization itself, and whether a softmax-based estimator exists that is both statistically consistent and a valid probability estimator. In this work, we first show that the miscalibration and unboundedness of the estimators in prior literature stem from the symmetric nature of the surrogate losses used, not from the softmax. We then propose a novel, statistically consistent, asymmetric softmax-based surrogate loss that produces valid estimates without the issue of unboundedness. We further analyze the non-asymptotic properties of our method and empirically validate its performance and calibration on benchmark datasets. Comment: NeurIPS 202
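    A minimal sketch of the contrast the abstract draws, assuming a classifier with $K$ class scores and one deferral score. The symmetric variant is the shared softmax parameterization; the asymmetric variant (a sigmoid gate times a softmax) is one plausible bounded construction, not necessarily the paper's exact loss:

```python
import numpy as np

def softmax(z):
    z = z - z.max()                     # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def symmetric_defer_prob(f, g):
    """Standard (symmetric) softmax over K class scores plus a defer score g.

    The abstract reports that estimators built on symmetric surrogate losses
    yield unbounded estimates of the deferral likelihood; this function only
    illustrates the shared parameterization, not the surrogate inversion.
    """
    probs = softmax(np.append(f, g))
    return probs[-1]                    # P(defer)

def asymmetric_defer_prob(f, g):
    """One plausible *asymmetric* parameterization (an assumption, not the
    paper's estimator): a sigmoid gate for deferral, softmax for classes.
    The deferral estimate is bounded in (0, 1) by construction."""
    p_defer = 1.0 / (1.0 + np.exp(-g))  # sigmoid gate, always in (0, 1)
    p_classes = (1.0 - p_defer) * softmax(f)
    return p_defer, p_classes

# Toy usage with hypothetical scores for K = 3 classes.
f, g = np.array([2.0, -1.0, 0.5]), 0.3
print(symmetric_defer_prob(f, g))
print(asymmetric_defer_prob(f, g))
```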

    On the Importance of Feature Separability in Predicting Out-Of-Distribution Error

    Estimating generalization performance on out-of-distribution (OOD) data without ground-truth labels is practically challenging. While previous methods emphasize the connection between distribution difference and OOD accuracy, we show that a large domain gap does not necessarily lead to low test accuracy. In this paper, we investigate this problem from the perspective of feature separability and propose a dataset-level score based upon feature dispersion to estimate test accuracy under distribution shift. Our method is inspired by desirable properties of features in representation learning: high inter-class dispersion and high intra-class compactness. Our analysis shows that inter-class dispersion is strongly correlated with model accuracy, while intra-class compactness does not reflect generalization performance on OOD data. Extensive experiments demonstrate the superiority of our method in both prediction performance and computational efficiency.
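    A minimal sketch of a dispersion-style score under stated assumptions: features come from the penultimate layer, unlabeled OOD samples are grouped by the model's own predicted labels, and inter-class dispersion is taken as the mean pairwise distance between class centroids. The paper's exact estimator may differ:

```python
import numpy as np

def inter_class_dispersion(features, pred_labels):
    """Dataset-level dispersion score (a sketch, not the paper's exact formula).

    features:    (N, D) penultimate-layer features of unlabeled test data
    pred_labels: (N,)   the model's own predicted classes (no ground truth)
    Returns the mean pairwise L2 distance between per-class centroids; the
    abstract reports this kind of inter-class dispersion correlates strongly
    with OOD accuracy.
    """
    classes = np.unique(pred_labels)
    centroids = np.stack([features[pred_labels == c].mean(axis=0) for c in classes])
    dists = [np.linalg.norm(centroids[i] - centroids[j])
             for i in range(len(classes)) for j in range(i + 1, len(classes))]
    return float(np.mean(dists))

# Toy usage with random features and hypothetical predictions.
rng = np.random.default_rng(0)
feats = rng.normal(size=(100, 16))
preds = rng.integers(0, 5, size=100)
print(inter_class_dispersion(feats, preds))
```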

    DOS: Diverse Outlier Sampling for Out-of-Distribution Detection

    Modern neural networks are known to give overconfident predictions for out-of-distribution inputs when deployed in the open world. It is common practice to leverage a surrogate outlier dataset to regularize the model during training, and recent studies emphasize the role of uncertainty in designing the sampling strategy for the outlier dataset. However, OOD samples selected solely on the basis of predictive uncertainty can be biased towards certain types, which may fail to capture the full outlier distribution. In this work, we empirically show that diversity is critical in sampling outliers for OOD detection performance. Motivated by this observation, we propose a straightforward and novel sampling strategy named DOS (Diverse Outlier Sampling) to select diverse and informative outliers. Specifically, we cluster the normalized features at each iteration, and the most informative outlier from each cluster is selected for model training with an absent-category loss. With DOS, the sampled outliers efficiently shape a globally compact decision boundary between ID and OOD data. Extensive experiments demonstrate the superiority of DOS, reducing the average FPR95 by up to 25.79% on CIFAR-100 with TI-300K.
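    A minimal sketch of the per-iteration sampling loop described above, assuming k-means for the clustering step and a generic precomputed informativeness score per outlier; the function name and the informativeness criterion are assumptions, not details from the paper:

```python
import numpy as np
from sklearn.cluster import KMeans

def dos_sample(outlier_feats, outlier_scores, k):
    """Diverse Outlier Sampling (sketch under stated assumptions).

    outlier_feats:  (N, D) features of candidate auxiliary outliers
    outlier_scores: (N,)   informativeness per outlier; here we assume
                           higher means closer to the decision boundary
    k:              number of outliers to sample this iteration
    Clusters L2-normalized features, then picks the most informative
    outlier from each cluster so the selection stays diverse.
    """
    normed = outlier_feats / np.linalg.norm(outlier_feats, axis=1, keepdims=True)
    assign = KMeans(n_clusters=k, n_init="auto").fit_predict(normed)
    picks = []
    for c in range(k):
        idx = np.flatnonzero(assign == c)
        picks.append(idx[np.argmax(outlier_scores[idx])])
    return np.array(picks)

# Toy usage: 200 hypothetical outlier candidates, pick k = 8 diverse ones.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 32))
scores = rng.random(200)
print(dos_sample(feats, scores, k=8))
```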

    Optimization-Free Test-Time Adaptation for Cross-Person Activity Recognition

    Human Activity Recognition (HAR) models often suffer from performance degradation in real-world applications due to distribution shifts in activity patterns across individuals. Test-Time Adaptation (TTA) is an emerging learning paradigm that aims to use the test stream to adjust predictions during real-time inference, and it has not been explored in HAR before. However, the high computational cost of optimization-based TTA algorithms makes them intractable on resource-constrained edge devices. In this paper, we propose an Optimization-Free Test-Time Adaptation (OFTTA) framework for sensor-based HAR. OFTTA adjusts the feature extractor and the linear classifier simultaneously in an optimization-free manner. For the feature extractor, we propose Exponential Decay Test-time Normalization (EDTN) to replace conventional batch normalization (CBN) layers. EDTN combines CBN and Test-time Batch Normalization (TBN) to extract features that remain reliable under domain shift, with TBN's influence decreasing exponentially in deeper layers. For the classifier, we adjust the prediction by computing the distance between the feature and the prototypes, which are computed from a maintained support set. The support set, in turn, is updated using pseudo labels, which benefit from the reliable features extracted by EDTN. Extensive experiments on three public cross-person HAR datasets and two different TTA settings demonstrate that OFTTA outperforms state-of-the-art TTA approaches in both classification performance and computational efficiency. Finally, we verify the superiority of OFTTA on edge devices, indicating that deployment in real applications is feasible. Our code is available at https://github.com/Claydon-Wang/OFTTA. Comment: To be presented at UbiComp 2024; accepted by Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (IMWUT)
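    A minimal sketch of the EDTN idea under stated assumptions: each layer's normalization statistics are a convex mix of frozen training-time (CBN) statistics and current test-batch (TBN) statistics, with the TBN weight decaying exponentially in layer depth. The decay base and exact mixing formula are assumptions, not the paper's constants:

```python
import numpy as np

def edtn_normalize(x, running_mean, running_var, layer_idx, base=0.5, eps=1e-5):
    """Exponential-decay test-time normalization (sketch, assumed constants).

    x:            (B, C) activations of one layer for the current test batch
    running_mean: (C,)   CBN statistics frozen at training time
    running_var:  (C,)
    layer_idx:    0-based depth of this layer; deeper layers trust TBN less
    The TBN weight lam = base ** layer_idx decays exponentially with depth,
    so early layers adapt to the test distribution while deep layers keep
    the training statistics.
    """
    lam = base ** layer_idx
    batch_mean, batch_var = x.mean(axis=0), x.var(axis=0)
    mean = lam * batch_mean + (1.0 - lam) * running_mean
    var = lam * batch_var + (1.0 - lam) * running_var
    return (x - mean) / np.sqrt(var + eps)

# Toy usage: one test batch of 16 samples, 8 channels, at layer depth 3.
rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, size=(16, 8))
out = edtn_normalize(x, running_mean=np.zeros(8), running_var=np.ones(8), layer_idx=3)
print(out.mean(), out.std())
```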