1,041 research outputs found
AWEncoder: Adversarial Watermarking Pre-trained Encoders in Contrastive Learning
As a self-supervised learning paradigm, contrastive learning has been widely
used to pre-train a powerful encoder as an effective feature extractor for
various downstream tasks. This process requires large amounts of unlabeled training
data and computational resources, which makes the pre-trained encoder
valuable intellectual property of the owner. However, the lack of a priori
knowledge of downstream tasks makes it non-trivial to protect the intellectual
property of the pre-trained encoder by applying conventional watermarking
methods. To deal with this problem, in this paper, we introduce AWEncoder, an
adversarial method for watermarking the pre-trained encoder in contrastive
learning. First, as an adversarial perturbation, the watermark is generated by
forcing the training samples to be marked to deviate from their respective locations and
surround a randomly selected key image in the embedding space. Then, the
watermark is embedded into the pre-trained encoder by further optimizing a
joint loss function. As a result, the watermarked encoder not only performs
very well for downstream tasks, but also enables us to verify its ownership by
analyzing the discrepancy of output provided using the encoder as the backbone
under both white-box and black-box conditions. Extensive experiments
demonstrate that the proposed work achieves good effectiveness and
robustness on different contrastive learning algorithms and downstream tasks,
which has verified the superiority and applicability of the proposed work.
Audio Signal Processing Using Time-Frequency Approaches: Coding, Classification, Fingerprinting, and Watermarking
Audio signals are information-rich non-stationary signals that play an important role in our day-to-day communication, perception of environment, and entertainment. Due to their non-stationary nature, time- or frequency-only approaches are inadequate in analyzing these signals. A joint time-frequency (TF) approach would be a better choice to efficiently process these signals. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are a few of the areas that encapsulate a majority of the audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of the above-mentioned areas. A TF-based audio coding scheme with a novel psychoacoustics model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking will be presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.
Towards Possibilities & Impossibilities of AI-generated Text Detection: A Survey
Large Language Models (LLMs) have revolutionized the domain of natural
language processing (NLP) with remarkable capabilities of generating human-like
text responses. However, despite these advancements, several works in the
existing literature have raised serious concerns about the potential misuse of
LLMs such as spreading misinformation, generating fake news, plagiarism in
academia, and contaminating the web. To address these concerns, a consensus
among the research community is to develop algorithmic solutions to detect
AI-generated text. The basic idea is that if we can tell whether the given
text was written by a human or an AI, we can utilize this information to
address the above-mentioned concerns. To that end, a plethora of detection
frameworks have been proposed, highlighting the possibilities of AI-generated
text detection. But in parallel to the development of detection frameworks,
researchers have also concentrated on designing strategies to elude detection,
i.e., focusing on the impossibilities of AI-generated text detection. This is a
crucial step in order to make sure the detection frameworks are robust enough
and it is not too easy to fool a detector. Despite the huge interest and the
flurry of research in this domain, the community currently lacks a
comprehensive analysis of recent developments. In this survey, we aim to
provide a concise categorization and overview of current work encompassing both
the prospects and the limitations of AI-generated text detection. To enrich the
collective knowledge, we engage in an exhaustive discussion on critical and
challenging open questions related to ongoing research on AI-generated text
detection.
Cyber Security
This open access book constitutes the refereed proceedings of the 17th International Annual Conference on Cyber Security, CNCERT 2021, held in Beijing, China, in July 2021. The 14 papers presented were carefully reviewed and selected from 51 submissions. The papers are organized according to the following topical sections: data security; privacy protection; anomaly detection; traffic analysis; social network security; vulnerability detection; text classification.