797 research outputs found

    The iNaturalist Species Classification and Detection Dataset

    Existing image classification datasets used in computer vision tend to have a uniform distribution of images across object categories. In contrast, the natural world is heavily imbalanced, as some species are more abundant and easier to photograph than others. To encourage further progress in challenging real-world conditions, we present the iNaturalist species classification and detection dataset, consisting of 859,000 images from over 5,000 different species of plants and animals. It features visually similar species, captured in a wide variety of situations, from all over the world. Images were collected with different camera types, have varying image quality, feature a large class imbalance, and have been verified by multiple citizen scientists. We discuss the collection of the dataset and present extensive baseline experiments using state-of-the-art computer vision classification and detection models. Results show that current non-ensemble-based methods achieve only 67% top-one classification accuracy, illustrating the difficulty of the dataset. Specifically, we observe poor results for classes with small numbers of training examples, suggesting more attention is needed in low-shot learning. Comment: CVPR 201

    Traceable and Authenticable Image Tagging for Fake News Detection

    To prevent fake news images from misleading the public, it is desirable not only to verify the authenticity of news images but also to trace the source of fake news, so as to provide a complete forensic chain for reliable fake news detection. To simultaneously achieve the goals of authenticity verification and source tracing, we propose a traceable and authenticable image tagging approach based on a Decoupled Invertible Neural Network (DINN). The designed DINN can simultaneously embed the dual tags, i.e., an authenticable tag and a traceable tag, into each news image before publishing, and then separately extract them for authenticity verification and source tracing. Moreover, to improve the accuracy of dual-tag extraction, we design a parallel Feature Aware Projection Model (FAPM) to help the DINN preserve essential tag information. In addition, we define a Distance Metric-Guided Module (DMGM) that learns asymmetric one-class representations so that the two tags achieve different levels of robustness under malicious manipulations. Extensive experiments on diverse datasets and unseen manipulations demonstrate that the proposed tagging approach achieves excellent performance in both authenticity verification and source tracing for reliable fake news detection and outperforms prior work.

    Large-Signal Stability Criteria in DC Power Grids with Distributed-Controlled Converters and Constant Power Loads

    The increasing adoption of power electronic devices may lead to large disturbances and destabilization of future power systems. However, stability criteria are still an unsolved puzzle, since traditional small-signal stability analysis is not applicable to power electronics-enabled power systems when a large disturbance occurs, such as a fault, a pulse power load, or load switching. To address this issue, this paper presents for the first time the rigorous derivation of sufficient criteria for large-signal stability in DC microgrids with distributed-controlled DC-DC power converters. A novel type of closed-loop converter controller is designed and considered. Moreover, this paper is the first to prove that the well-known and frequently cited Brayton-Moser mixed potential theory (published in 1964) is incomplete. Case studies are carried out to illustrate the defects of Brayton-Moser mixed potential theory and verify the effectiveness of the proposed stability criteria.

    Hyperbolic Face Anti-Spoofing

    Learning generalized face anti-spoofing (FAS) models against presentation attacks is essential for the security of face recognition systems. Previous FAS methods usually encourage models to extract discriminative features, for which distances within the same class (bonafide or attack) are pulled close while those between bonafide and attack are pushed apart. However, these methods are designed around Euclidean distance, which generalizes poorly to unseen attacks because of its weak ability to embed hierarchies. Based on evidence that different spoofing attacks are intrinsically hierarchical, we propose to learn richer hierarchical and discriminative spoofing cues in hyperbolic space. Specifically, for unimodal FAS learning, the feature embeddings are projected into the Poincaré ball, and a hyperbolic binary logistic regression layer is then cascaded for classification. To further improve generalization, we conduct hyperbolic contrastive learning for the bonafide class only while relaxing the constraints on diverse spoofing attacks. To alleviate the vanishing gradient problem in hyperbolic space, a new feature clipping method is proposed to enhance the training stability of hyperbolic models. In addition, we design a multimodal FAS framework with Euclidean multimodal feature decomposition and hyperbolic multimodal feature fusion and classification. Extensive experiments on three benchmark datasets (i.e., WMCA, PADISI-Face, and SiW-M) with diverse attack types demonstrate that the proposed method brings significant improvement over the Euclidean baselines on unseen attack detection. The proposed framework also generalizes well on four benchmark datasets (i.e., MSU-MFSD, IDIAP REPLAY-ATTACK, CASIA-FASD, and OULU-NPU) with a limited number of attack types.
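    As an illustration of what projecting features into the Poincaré ball involves, the following is a minimal sketch, not the authors' implementation. It assumes unit curvature by default, uses the standard exponential map at the origin, and substitutes a plain norm clip for the paper's own feature clipping method, whose details are not given in the abstract. All function names are hypothetical.

```python
import math


def clip_norm(v, max_norm):
    """Rescale v so its Euclidean norm is at most max_norm.

    Generic norm clipping, used here as a stand-in (assumption) for the
    paper's feature clipping method.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0 or norm <= max_norm:
        return list(v)
    scale = max_norm / norm
    return [x * scale for x in v]


def exp_map_origin(v, c=1.0):
    """Map a Euclidean vector into the Poincare ball of curvature -c.

    Standard exponential map at the origin:
    exp_0(v) = tanh(sqrt(c) * ||v||) * v / (sqrt(c) * ||v||).
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        return [0.0] * len(v)
    sqrt_c = math.sqrt(c)
    scale = math.tanh(sqrt_c * norm) / (sqrt_c * norm)
    return [x * scale for x in v]


def poincare_distance(x, y):
    """Geodesic distance between two points of the unit Poincare ball (c = 1)."""
    sq = lambda u: sum(a * a for a in u)
    diff = [a - b for a, b in zip(x, y)]
    arg = 1.0 + 2.0 * sq(diff) / ((1.0 - sq(x)) * (1.0 - sq(y)))
    return math.acosh(arg)


# Clip a raw feature to norm 1, then project it into the ball.
feature = [3.0, 4.0]
embedded = exp_map_origin(clip_norm(feature, max_norm=1.0))
```

    Clipping before the projection keeps the embedded point away from the boundary of the ball, where tanh saturates and gradients become very small. This is the general motivation for clipping in hyperbolic models; the clipping radius of 1.0 above is an arbitrary illustrative choice.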

    Characterization of the DNA-unwinding activity of human RECQ1, a helicase specifically stimulated by human replication protein A.

    The RecQ helicases are involved in several aspects of DNA metabolism. Five members of the RecQ family have been found in humans, but only two of them, BLM and WRN, have been carefully characterized. In this work, we describe the enzymatic characterization of RECQ1. The helicase has 3' to 5' polarity, cannot start unwinding from a blunt-ended terminus, and needs a 3'-single-stranded DNA tail longer than 10 nucleotides to open the substrate. However, it was also able to unwind a blunt-ended duplex DNA with a "bubble" of 25 nucleotides in the middle, as previously observed for WRN and BLM. We show that only short DNA duplexes (30 bp) can be unwound by RECQ1 alone, but the addition of human replication protein A (hRPA) increases the processivity of the enzyme (100 bp). Our studies with Escherichia coli single-strand binding protein (SSB) indicate that the helicase activity of RECQ1 is specifically stimulated by hRPA. This finding suggests that RECQ1 and hRPA may also interact in vivo and function together in DNA metabolism. Comparison of the present results with previous studies on WRN and BLM provides novel insight into the role of the N- and C-terminal domains of these helicases in determining their substrate specificity and in their interaction with hRPA.

    Paint by Word

    We investigate the problem of zero-shot semantic image painting. Instead of painting modifications into an image using only concrete colors or a finite set of semantic concepts, we ask how to create semantic paint based on open full-text descriptions: our goal is to be able to point to a location in a synthesized image and apply an arbitrary new concept such as "rustic" or "opulent" or "happy dog." To do this, our method combines a state-of-the-art generative model of realistic images with a state-of-the-art text-image semantic similarity network. We find that, to make large changes, it is important to use non-gradient methods to explore latent space, and it is important to relax the computations of the GAN to target changes to a specific region. We conduct user studies to compare our methods to several baselines. Comment: 10 pages, 9 figures

    Zero-shot sampling of adversarial entities in biomedical question answering

    The increasing depth of parametric domain knowledge in large language models (LLMs) is fueling their rapid deployment in real-world applications. In high-stakes and knowledge-intensive tasks, understanding model vulnerabilities is essential for quantifying the trustworthiness of model predictions and regulating their use. The recent discovery of named entities as adversarial examples in natural language processing tasks raises questions about their potential guises in other settings. Here, we propose a power-scaled distance-weighted sampling scheme in embedding space to discover diverse adversarial entities as distractors. We demonstrate its advantage over random sampling in adversarial question answering on biomedical topics. Our approach enables the exploration of different regions on the attack surface, which reveals two regimes of adversarial entities that markedly differ in their characteristics. Moreover, we show that the attacks successfully manipulate token-wise Shapley value explanations, which become deceptive in the adversarial setting. Our investigations illustrate the brittleness of domain knowledge in LLMs and reveal a shortcoming of standard evaluations for high-capacity models. Comment: 20 pages incl. appendix, under review
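    To make the idea of power-scaled distance-weighted sampling concrete, here is a minimal sketch under stated assumptions, not the authors' scheme. It assumes each candidate entity is drawn with probability proportional to its Euclidean distance from an anchor embedding raised to a tunable power; the abstract does not specify the exact weighting, metric, or parameters. All names and the `eps` guard are hypothetical.

```python
import math
import random


def power_scaled_weights(anchor, candidates, power, eps=1e-12):
    """Return sampling probabilities proportional to distance ** power.

    Assumption: Euclidean distance in embedding space. A positive power
    favors far candidates, a negative power favors near ones, and
    power = 0 reduces to uniform random sampling. eps avoids division
    by zero when a candidate coincides with the anchor.
    """
    raw = [(math.dist(anchor, c) + eps) ** power for c in candidates]
    total = sum(raw)
    return [w / total for w in raw]


def sample_distractors(anchor, candidates, power, k, seed=None):
    """Draw k candidate indices (with replacement) using the weights above."""
    weights = power_scaled_weights(anchor, candidates, power)
    rng = random.Random(seed)
    return rng.choices(range(len(candidates)), weights=weights, k=k)


# Toy embeddings: one candidate at distance 1 and one at distance 2.
anchor = [0.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 2.0]]
near_biased = power_scaled_weights(anchor, candidates, power=-1.0)
far_biased = power_scaled_weights(anchor, candidates, power=1.0)
```

    Varying the power shifts the sampling mass between nearby and distant entities, which is one plausible way to explore different regions of the attack surface as the abstract describes.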