237 research outputs found

    A-JEPA: Joint-Embedding Predictive Architecture Can Listen

    Full text link
    This paper presents that the masked-modeling principle driving the success of large foundational vision models can be effectively applied to audio by making predictions in a latent space. We introduce Audio-based Joint-Embedding Predictive Architecture (A-JEPA), a simple extension method for self-supervised learning from the audio spectrum. Following the design of I-JEPA, our A-JEPA encodes visible audio spectrogram patches with a curriculum masking strategy via context encoder, and predicts the representations of regions sampled at well-designed locations. The target representations of those regions are extracted by the exponential moving average of context encoder, \emph{i.e.}, target encoder, on the whole spectrogram. We find it beneficial to transfer random block masking into time-frequency aware masking in a curriculum manner, considering the complexity of highly correlated in local time and frequency in audio spectrograms. To enhance contextual semantic understanding and robustness, we fine-tune the encoder with a regularized masking on target datasets, instead of input dropping or zero. Empirically, when built with Vision Transformers structure, we find A-JEPA to be highly scalable and sets new state-of-the-art performance on multiple audio and speech classification tasks, outperforming other recent models that use externally supervised pre-training.Comment: arXiv admin note: text overlap with arXiv:2207.06405 by other author

    Electronic-Mechanical Coupling in Graphene from in situ Nanoindentation Experiments and Multiscale Atomistic Simulations

    Get PDF
    We present the in situ nanoindentation experiments performed on suspended graphene devices to introduce homogeneous tensile strain, while simultaneously carrying out electrical measurements. We find that the electrical resistance shows only a marginal change even under severe strain, and the electronic transport measurement confirms that there is no band gap opening for graphene under moderate uniform strain, which is consistent with our results from the first-principles informed molecular dynamics simulation

    On the Trustworthiness Landscape of State-of-the-art Generative Models: A Comprehensive Survey

    Full text link
    Diffusion models and large language models have emerged as leading-edge generative models and have sparked a revolutionary impact on various aspects of human life. However, the practical implementation of these models has also exposed inherent risks, highlighting their dual nature and raising concerns regarding their trustworthiness. Despite the abundance of literature on this subject, a comprehensive survey specifically delving into the intersection of large-scale generative models and their trustworthiness remains largely absent. To bridge this gap, This paper investigates both the long-standing and emerging threats associated with these models across four fundamental dimensions: privacy, security, fairness, and responsibility. In this way, we construct an extensive map outlining the trustworthiness of these models, while also providing practical recommendations and identifying future directions. These efforts are crucial for promoting the trustworthy deployment of these models, ultimately benefiting society as a whole.Comment: draft versio

    Interfacial energy release rates of SiN/GaAs film/substrate systems determined using a cyclic loading dual-indentation method

    Get PDF
    Our previous study developed a dual-indentation method for testing the interfacial energy release rate, Gin, of the SiN/GaAs film/substrate systems. However, for the film/substrate systems with relatively high interfacial toughness, the dual-indentation method was unable to generate interfacial delamination. In this study, a cyclic loading dual-indentation method was proposed, in which the first monotonic loading in the dual-indentation method was replaced by cyclic loading. It was demonstrated that cyclic loading was effective at inducing delamination in relatively "tough" SiN/GaAs interfaces that were unable to be delaminated by dual-indentation method. The Gin values obtained from the cyclic loading indentation were in good agreement with those obtained from the dual-indentation method for the less tough interfaces. The delamination mechanism in the cyclic loading indentation was attributed to the hardening effect on the films induced by cyclic loading, permitting sufficient elastic strain energy to be accumulated to initiate the delamination

    To Healthier Ethereum: A Comprehensive and Iterative Smart Contract Weakness Enumeration

    Full text link
    With the increasing popularity of cryptocurrencies and blockchain technology, smart contracts have become a prominent feature in developing decentralized applications. However, these smart contracts are susceptible to vulnerabilities that hackers can exploit, resulting in significant financial losses. In response to this growing concern, various initiatives have emerged. Notably, the SWC vulnerability list played an important role in raising awareness and understanding of smart contract weaknesses. However, the SWC list lacks maintenance and has not been updated with new vulnerabilities since 2020. To address this gap, this paper introduces the Smart Contract Weakness Enumeration (SWE), a comprehensive and practical vulnerability list up until 2023. We collect 273 vulnerability descriptions from 86 top conference papers and journal papers, employing open card sorting techniques to deduplicate and categorize these descriptions. This process results in the identification of 40 common contract weaknesses, which are further classified into 20 sub-research fields through thorough discussion and analysis. SWE provides a systematic and comprehensive list of smart contract vulnerabilities, covering existing and emerging vulnerabilities in the last few years. Moreover, SWE is a scalable, continuously iterative program. We propose two update mechanisms for the maintenance of SWE. Regular updates involve the inclusion of new vulnerabilities from future top papers, while irregular updates enable individuals to report new weaknesses for review and potential addition to SWE

    On the Robustness of Split Learning against Adversarial Attacks

    Full text link
    Split learning enables collaborative deep learning model training while preserving data privacy and model security by avoiding direct sharing of raw data and model details (i.e., sever and clients only hold partial sub-networks and exchange intermediate computations). However, existing research has mainly focused on examining its reliability for privacy protection, with little investigation into model security. Specifically, by exploring full models, attackers can launch adversarial attacks, and split learning can mitigate this severe threat by only disclosing part of models to untrusted servers.This paper aims to evaluate the robustness of split learning against adversarial attacks, particularly in the most challenging setting where untrusted servers only have access to the intermediate layers of the model.Existing adversarial attacks mostly focus on the centralized setting instead of the collaborative setting, thus, to better evaluate the robustness of split learning, we develop a tailored attack called SPADV, which comprises two stages: 1) shadow model training that addresses the issue of lacking part of the model and 2) local adversarial attack that produces adversarial examples to evaluate.The first stage only requires a few unlabeled non-IID data, and, in the second stage, SPADV perturbs the intermediate output of natural samples to craft the adversarial ones. The overall cost of the proposed attack process is relatively low, yet the empirical attack effectiveness is significantly high, demonstrating the surprising vulnerability of split learning to adversarial attacks.Comment: accepted by ECAI 2023, camera-ready versio

    Progressive Denoising Model for Fine-Grained Text-to-Image Generation

    Full text link
    Recently, vector quantized autoregressive (VQ-AR) models have shown remarkable results in text-to-image synthesis by equally predicting discrete image tokens from the top left to bottom right in the latent space. Although the simple generative process surprisingly works well, is this the best way to generate the image? For instance, human creation is more inclined to the outline-to-fine of an image, while VQ-AR models themselves do not consider any relative importance of each component. In this paper, we present a progressive denoising model for high-fidelity text-to-image image generation. The proposed method takes effect by creating new image tokens from coarse to fine based on the existing context in a parallel manner and this procedure is recursively applied until an image sequence is completed. The resulting coarse-to-fine hierarchy makes the image generation process intuitive and interpretable. Extensive experiments demonstrate that the progressive model produces significantly better results when compared with the previous VQ-AR method in FID score across a wide variety of categories and aspects. Moreover, the text-to-image generation time of traditional AR increases linearly with the output image resolution and hence is quite time-consuming even for normal-size images. In contrast, our approach allows achieving a better trade-off between generation quality and speed.Comment: Technique report. arXiv admin note: text overlap with arXiv:2206.10789 by other author

    Refiner: Data Refining against Gradient Leakage Attacks in Federated Learning

    Full text link
    Recent works have brought attention to the vulnerability of Federated Learning (FL) systems to gradient leakage attacks. Such attacks exploit clients' uploaded gradients to reconstruct their sensitive data, thereby compromising the privacy protection capability of FL. In response, various defense mechanisms have been proposed to mitigate this threat by manipulating the uploaded gradients. Unfortunately, empirical evaluations have demonstrated limited resilience of these defenses against sophisticated attacks, indicating an urgent need for more effective defenses. In this paper, we explore a novel defensive paradigm that departs from conventional gradient perturbation approaches and instead focuses on the construction of robust data. Intuitively, if robust data exhibits low semantic similarity with clients' raw data, the gradients associated with robust data can effectively obfuscate attackers. To this end, we design Refiner that jointly optimizes two metrics for privacy protection and performance maintenance. The utility metric is designed to promote consistency between the gradients of key parameters associated with robust data and those derived from clients' data, thus maintaining model performance. Furthermore, the privacy metric guides the generation of robust data towards enlarging the semantic gap with clients' data. Theoretical analysis supports the effectiveness of Refiner, and empirical evaluations on multiple benchmark datasets demonstrate the superior defense effectiveness of Refiner at defending against state-of-the-art attacks.Comment: under revie
    • …
    corecore