
    The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023

    This paper introduces our system for Track 2 of the second Audio Deepfake Detection Challenge (ADD 2023), which focuses on locating manipulated regions. Our approach combines multiple detection systems to identify splicing regions and determine their authenticity. Specifically, we train and integrate two frame-level systems: one for boundary detection and the other for deepfake detection. Additionally, we employ a third system, a VAE model trained exclusively on genuine data, to judge the authenticity of a given audio clip. Through the fusion of these three systems, our top-performing solution achieves 82.23% sentence accuracy and an F1 score of 60.66%, yielding a final ADD score of 0.6713 and first place in Track 2 of ADD 2023.
    Comment: The DKU-DukeECE system description for Task 2 of the Audio Deepfake Detection Challenge (ADD 2023).
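    A minimal sketch of how such a three-way fusion might work, assuming frame-level score sequences from the boundary and deepfake detectors plus a single utterance-level VAE score; all thresholds, weights, and function names here are illustrative assumptions, not the authors' actual pipeline:

```python
def locate_manipulated_regions(boundary_scores, fake_scores, vae_score,
                               boundary_thr=0.5, fake_thr=0.5, vae_thr=0.5):
    # If the VAE (trained only on genuine audio) scores the clip as typical
    # of real speech, report it as fully genuine.
    if vae_score < vae_thr:
        return []
    # Otherwise flag frames that either frame-level detector fires on,
    # then merge consecutive flagged frames into contiguous regions.
    flagged = [b > boundary_thr or f > fake_thr
               for b, f in zip(boundary_scores, fake_scores)]
    regions, start = [], None
    for i, hit in enumerate(flagged):
        if hit and start is None:
            start = i
        elif not hit and start is not None:
            regions.append((start, i - 1))
            start = None
    if start is not None:
        regions.append((start, len(flagged) - 1))
    return regions

# Example: frames 2-4 are flagged as manipulated.
print(locate_manipulated_regions(
    [0.1, 0.2, 0.9, 0.8, 0.7, 0.1],
    [0.0, 0.1, 0.6, 0.9, 0.8, 0.2],
    vae_score=0.9))
# -> [(2, 4)]
```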

    DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature

    The increasing fluency and widespread use of large language models (LLMs) underscore the need for tools that can detect LLM-generated text. In this paper, we identify a property of the structure of an LLM's probability function that is useful for such detection. Specifically, we demonstrate that text sampled from an LLM tends to occupy negative-curvature regions of the model's log-probability function. Leveraging this observation, we define a new curvature-based criterion for judging whether a passage was generated by a given LLM. This approach, which we call DetectGPT, does not require training a separate classifier, collecting a dataset of real or generated passages, or explicitly watermarking generated text. It uses only log probabilities computed by the model of interest, together with random perturbations of the passage produced by another generic pre-trained language model (e.g., T5). We find DetectGPT is more discriminative than existing zero-shot methods for detecting model samples, notably improving detection of fake news articles generated by the 20B-parameter GPT-NeoX from 0.81 AUROC for the strongest zero-shot baseline to 0.95 AUROC for DetectGPT. See https://ericmitchell.ai/detectgpt for code, data, and other project information.
    Comment: ICML 2023.
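    A compact sketch of the curvature criterion just described, assuming placeholder scoring and perturbation functions (`log_prob`, `perturb`) that the reader would back with a real scoring model and a T5-style mask-filling model; this follows the paper's idea but is not the authors' reference code:

```python
import math

def log_prob(model, text):
    """Placeholder: sum of token log probabilities of `text` under `model`."""
    raise NotImplementedError

def perturb(mask_fill_model, text):
    """Placeholder: mask random spans of `text` and refill them (e.g., with T5)."""
    raise NotImplementedError

def detectgpt_score(model, mask_fill_model, text, n_perturbations=100):
    # Curvature proxy: how far the passage's log probability sits above the
    # average log probability of nearby perturbed versions of the passage.
    original = log_prob(model, text)
    perturbed = [log_prob(model, perturb(mask_fill_model, text))
                 for _ in range(n_perturbations)]
    mean_p = sum(perturbed) / len(perturbed)
    # Normalize by the standard deviation of the perturbed log probabilities.
    var = sum((p - mean_p) ** 2 for p in perturbed) / max(len(perturbed) - 1, 1)
    return (original - mean_p) / math.sqrt(var + 1e-8)

# A large positive score suggests the passage sits in a negative-curvature
# region of the model's log-probability surface, i.e., is likely model-generated.
```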

    Using Four Learning Algorithms for Evaluating Questionable Uniform Resource Locators (URLs)

    Malicious Uniform Resource Locators (URLs) are a common and serious threat to cyber security. Malicious URLs host unsolicited content (spam, phishing, drive-by exploits, etc.) and lure unsuspecting internet users into scams such as monetary loss, theft of private information, and unexpected malware installation. This phenomenon has led to an increase in cybercrime on social media through the sharing of malicious URLs. Efficient and reliable classification of a web page based on the information contained in its URL is therefore needed to understand the nature and status of a site before it is accessed, and URLs shared on social media platforms must be detected and acted upon in a timely manner. Although researchers have carried out similar studies in the past, their conclusions conflict. Against this backdrop, four machine learning algorithms were selected for classifying fake and vulnerable URLs: the Naïve Bayes, K-means, Decision Tree, and Logistic Regression algorithms. The algorithms were implemented in Java. Statistical analysis and comparison of the four algorithms show that Naïve Bayes is the most efficient and effective on the metrics used.
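    A minimal sketch, assuming scikit-learn in place of the paper's Java implementation, of how the four algorithms could be compared on lexical URL features; the toy URLs, labels, and feature choice are invented for illustration:

```python
# Toy comparison of the four algorithms on character-n-gram URL features.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

urls = [
    "http://example.com/index.html",
    "https://university.edu/courses/list",
    "http://free-prizes.win-now.biz/claim?id=1",
    "http://paypal.secure-verify-login.ru/update",
]
labels = [0, 0, 1, 1]  # 0 = benign, 1 = malicious

# Character n-grams capture lexical URL patterns without fetching the page.
X = TfidfVectorizer(analyzer="char", ngram_range=(2, 4)).fit_transform(urls)

for clf in (MultinomialNB(), DecisionTreeClassifier(),
            LogisticRegression(max_iter=1000)):
    clf.fit(X, labels)  # train/test split omitted for brevity
    print(type(clf).__name__, accuracy_score(labels, clf.predict(X)))

# K-means is unsupervised: it groups URLs into clusters rather than
# predicting labels, so clusters must be mapped to classes afterwards.
print("KMeans clusters:", KMeans(n_clusters=2, n_init=10).fit(X).labels_)
```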

    Can faking be measured with dedicated validity scales? Within Subject Trifactor Mixture Modeling applied to BIDR responses

    A sample of 516 participants responded to the Balanced Inventory of Desirable Responding (BIDR) under answer-honest and instructed-faking conditions in a within-subjects design. We analyse these data with a novel application of trifactor modeling that models the two substantive factors measured by the BIDR (Self-Deceptive Enhancement, SDE, and Impression Management, IM), condition-related common factors, and item-specific factors. The model permits examination of invariance and change within subjects across conditions. Participants were able to significantly increase their SDE and IM in the instructed-faking condition relative to the honest-response condition. Mixture modeling confirmed a theoretically expected two-class solution comprising approximately two thirds ‘compliers’ and one third ‘non-compliers’. Factor scores had good determinacy, and their correlations with observed scores were near unity for continuous scoring, supporting observed-score interpretations of BIDR scales in high-stakes settings; correlations were somewhat lower for the dichotomous scoring protocol. Overall, the results show that the BIDR scales function similarly as measures of socially desirable responding in low- and high-stakes conditions. We discuss the conditions under which we expect these results will and will not generalise to other validity scales.
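    A hedged sketch of the trifactor decomposition described above, written for the response of person j to item i in condition c; the notation is illustrative, not the authors' own:

```latex
% Illustrative trifactor decomposition (notation assumed, not the paper's):
% y_{jic}: response of person j to item i in condition c
y_{jic} = \nu_i
        + \lambda^{S}_{i}\, S_{j}    % substantive factor (SDE or IM)
        + \lambda^{C}_{i}\, C_{jc}   % condition-related common factor
        + \lambda^{I}_{i}\, I_{ji}   % item-specific factor
        + \varepsilon_{jic}
```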

    Toward Automatic Fake News Classification

    The interaction of technology with humans has many adverse effects. The rapid growth and reach of social media and the Web have led to the dissemination of questionable and untrusted content among a wide audience, negatively influencing their lives and judgment. Election campaigns around the world have highlighted how "fake news" - misinformation that looks genuine - can be targeted at specific communities to manipulate and confuse them. Ever since, automatic fake news detection has gained widespread attention from the scientific community, and many research studies have been conducted to tackle the detection and spread of fake news. While the first step of such tasks is to classify claims based on their credibility, the next steps involve identifying hidden patterns in the style, syntax, and content of such news claims. We provide a comprehensive overview of what has already been done in this domain and similar fields, and then propose a generalized method based on deep neural networks to identify whether a given claim is fake or genuine. Using features such as the authenticity of the source, perceived cognitive authority, style- and content-based factors, and natural language features, it is possible to predict fake news accurately. We take a modular approach, combining techniques from information retrieval, natural language processing, and deep learning. Our classifier comprises two main sub-modules. The first sub-module uses the claim to retrieve relevant articles from a knowledge base, which can then be used to verify the truth of the claim; it also uses word-level features for prediction. The second sub-module uses a deep neural network to learn the underlying style of fake content. Our experiments on benchmark datasets show that for the given classification task we can obtain up to 82.4% accuracy by combining the two models; the first model alone was up to 72% accurate, and the second around 81% accurate. Our detection model has the potential to automatically detect and prevent the spread of fake news, thus limiting the caustic influence of technology on human lives.
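    A minimal sketch of the two-sub-module fusion described above, assuming placeholder retrieval and style components (`retrieve_support`, `style_score`) and an invented fusion weight; none of these names come from the paper:

```python
def retrieve_support(claim, knowledge_base):
    """Placeholder: retrieve articles relevant to `claim` from `knowledge_base`
    and return a support score in [0, 1] (1 = strongly supported)."""
    raise NotImplementedError

def style_score(style_model, claim):
    """Placeholder: deep network's probability in [0, 1] that `claim`
    reads stylistically like fake content."""
    raise NotImplementedError

def classify_claim(claim, knowledge_base, style_model, weight=0.5):
    # Sub-module 1 (external evidence): is the claim supported by retrieval?
    support = retrieve_support(claim, knowledge_base)
    # Sub-module 2 (internal evidence): does its style resemble fake news?
    style = style_score(style_model, claim)
    # Late fusion of the two signals; higher means more likely fake.
    fake_prob = weight * (1.0 - support) + (1.0 - weight) * style
    return "fake" if fake_prob >= 0.5 else "genuine"
```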

    Publicly Detectable Watermarking for Language Models

    We construct the first provable watermarking scheme for language models with public detectability or verifiability: we use a private key for watermarking and a public key for watermark detection. Our protocol is the first watermarking scheme that does not embed a statistical signal in generated text. Rather, we directly embed a publicly verifiable cryptographic signature using a form of rejection sampling. We show that our construction meets strong formal security guarantees and preserves many desirable properties of schemes in the private-key watermarking setting. In particular, our watermarking scheme retains distortion-freeness and model agnosticity. We implement our scheme and make empirical measurements over open models in the 7B-parameter range. Our experiments suggest that our watermarking scheme meets our formal claims while preserving text quality.
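    A toy sketch of the rejection-sampling idea described in the abstract: each generated token is resampled until a public hash of it matches the next bit of a cryptographic signature, so anyone holding the public key can recover and verify the bits. The hash choice, bit extraction, and function names are assumptions for illustration; a faithful construction needs more care to preserve distortion-freeness:

```python
import hashlib

def token_bit(token):
    # One pseudorandom bit per token, recomputable by any verifier.
    return hashlib.sha256(token.encode()).digest()[0] & 1

def sample_token(model_state):
    """Placeholder: draw the next token from the language model."""
    raise NotImplementedError

def generate_watermarked(model_state, signature_bits, max_tries=100):
    tokens = []
    for bit in signature_bits:
        # Rejection sampling: redraw until the token's public bit matches the
        # signature bit to embed; each draw matches with probability ~1/2.
        for _ in range(max_tries):
            tok = sample_token(model_state)
            if token_bit(tok) == bit:
                tokens.append(tok)
                break
    return tokens

def detect(tokens, verify_signature):
    # Public detection: recover the embedded bits from the text alone and
    # check the signature with the public key; no model access is needed.
    bits = [token_bit(t) for t in tokens]
    return verify_signature(bits)
```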