
    Trainable Projected Gradient Method for Robust Fine-tuning

    Recent studies on transfer learning have shown that selectively fine-tuning a subset of layers or customizing different learning rates for each layer can greatly improve robustness to out-of-distribution (OOD) data and retain generalization capability in the pre-trained models. However, most of these methods employ manually crafted heuristics or expensive hyper-parameter searches, which prevent them from scaling up to large datasets and neural networks. To solve this problem, we propose the Trainable Projected Gradient Method (TPGM), which automatically learns the constraint imposed on each layer for fine-grained fine-tuning regularization. This is motivated by formulating fine-tuning as a bi-level constrained optimization problem. Specifically, TPGM maintains a set of projection radii, i.e., distance constraints between the fine-tuned model and the pre-trained model, for each layer, and enforces them through weight projections. To learn the constraints, we propose a bi-level optimization to automatically learn the best set of projection radii in an end-to-end manner. Theoretically, we show that the bi-level optimization formulation could explain the regularization capability of TPGM. Empirically, with little hyper-parameter search cost, TPGM outperforms existing fine-tuning methods in OOD performance while matching the best in-distribution (ID) performance. For example, when fine-tuned on DomainNet-Real and ImageNet, compared to vanilla fine-tuning, TPGM shows 22% and 10% relative OOD improvement respectively on their sketch counterparts. Code is available at \url{https://github.com/PotatoTian/TPGM}. Comment: Accepted to CVPR202
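    The per-layer weight projection the abstract describes can be sketched as follows. This is a hedged illustration, not TPGM's actual implementation: `project_layer` and its arguments are hypothetical names, and the sketch assumes an L2 distance constraint per layer, projecting the fine-tuned weights back onto a ball of the learned radius centered at the pre-trained weights.

```python
import numpy as np

def project_layer(w_finetuned, w_pretrained, radius):
    """Project fine-tuned weights onto an L2 ball of the given radius
    centered at the pre-trained weights (one layer's constraint).

    If the fine-tuned weights already lie within the radius, they are
    returned unchanged; otherwise they are pulled back to the boundary.
    """
    delta = w_finetuned - w_pretrained
    norm = np.linalg.norm(delta)
    if norm <= radius:
        return w_finetuned
    # Rescale the update so its distance from the pre-trained
    # weights equals the (learned) projection radius.
    return w_pretrained + delta * (radius / norm)
```

    In TPGM the radii themselves are trainable and updated via the bi-level optimization; here the radius is taken as a given scalar for illustration.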

    A CRY-BIC negative-feedback circuitry regulating blue light sensitivity of Arabidopsis.

    Cryptochromes are blue light receptors that regulate various light responses in plants. Arabidopsis cryptochrome 1 (CRY1) and cryptochrome 2 (CRY2) mediate blue light inhibition of hypocotyl elongation and long-day (LD) promotion of floral initiation. It has been reported recently that two negative regulators of Arabidopsis cryptochromes, Blue light Inhibitors of Cryptochromes 1 and 2 (BIC1 and BIC2), inhibit cryptochrome function by blocking blue light-dependent cryptochrome dimerization. However, it remained unclear how cryptochromes regulate BIC gene activity. Here we show that cryptochromes mediate light activation of transcription of the BIC genes by suppressing the activity of CONSTITUTIVE PHOTOMORPHOGENIC 1 (COP1), resulting in activation of the transcription activator ELONGATED HYPOCOTYL 5 (HY5), which associates with the chromatin of the BIC promoters. These results demonstrate a CRY-BIC negative-feedback circuit in which each component regulates the activity of the other. Surprisingly, phytochromes also mediate light activation of BIC transcription, suggesting a novel photoreceptor co-action mechanism that sustains blue light sensitivity of plants under the broad spectrum of solar radiation in nature.

    Security Meets Deep Learning

    Recent years have witnessed the rapid development of deep learning in many domains. These successes inspire using deep learning in the area of security. However, there are at least two main challenges when deep learning meets security. First, the availability of attack data is a problem. It is challenging to construct a model that works well with limited attack data. Second, deep learning systems themselves are vulnerable to various attacks, raising new concerns when using deep learning to improve security in computer systems. To address the first challenge, this dissertation shows how to use deep learning techniques to improve the security of computer systems with limited or no attack data. To address the second challenge, we show how to protect the security and privacy of deep learning systems. Specifically, in the first part of this dissertation, we consider a practical scenario where no attack data are available, i.e., anomaly detection. We propose a new methodology, Reconstruction Error Distribution (RED), for real-time anomaly detection. Our key insight is that the normal behavior of a computer system can be captured through temporal deep learning models; deviation from normal behavior indicates anomalies. We show that the proposed methodology can detect attacks with high accuracy in real time in power-grid controller systems and general-purpose cloud computing servers. The second part of this dissertation focuses on protecting the security and privacy of deep learning. Specifically, we first show that in a Machine Learning as a Service (MLaaS) system, the integrity of a deep learning model in the cloud can be dynamically checked through a type of carefully designed input, i.e., Sensitive-Samples. In another scenario, e.g., distributed learning in edge-cloud systems, we demonstrate that an attacker in the cloud can reconstruct an edge device's input data with high fidelity under successively weaker attacker capabilities. We also propose a new defense to address these attacks. In summary, we hope the work in this dissertation can shed light on using deep learning to improve security and help harden deep learning systems against attacks.
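    The core idea behind reconstruction-error-based anomaly detection can be sketched as follows. This is a hedged illustration, not the RED system itself: `fit_threshold` and `detect` are hypothetical helper names, and the sketch assumes reconstruction errors from some model trained only on normal data, with a simple mean-plus-k-sigma threshold standing in for whatever distributional test the dissertation actually uses.

```python
import numpy as np

def fit_threshold(errors_normal, k=3.0):
    """Fit a detection threshold from reconstruction errors observed
    on normal (attack-free) data: mean plus k standard deviations."""
    mu = errors_normal.mean()
    sigma = errors_normal.std()
    return mu + k * sigma

def detect(errors, threshold):
    """Flag samples whose reconstruction error exceeds the threshold.

    Large errors mean the model trained on normal behavior cannot
    reproduce the sample well, i.e., it deviates from normal behavior.
    """
    return errors > threshold
```

    The appeal of this scheme is that it requires no attack data at all: the threshold is calibrated entirely on normal operation, matching the no-attack-data scenario the abstract describes.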