15,318 research outputs found

    DreamArtist: Towards Controllable One-Shot Text-to-Image Generation via Contrastive Prompt-Tuning

    Large-scale text-to-image generation models have evolved rapidly and can now synthesize high-resolution, feature-rich, high-quality images from text guidance. However, they are often overwhelmed by words for new concepts, styles, or object entities, which emerge constantly. Although recent work uses fine-tuning or prompt-tuning to teach the model a new concept as a pseudo-word from a given set of reference images, these methods still struggle to synthesize diverse, high-quality images free of distortion and artifacts, and they also suffer from low controllability. To address these problems, we propose DreamArtist, a method that employs a contrastive prompt-tuning learning strategy, which introduces both positive and negative embeddings as pseudo-words and trains them jointly. The positive embedding aggressively learns characteristics of the reference image to drive diversified generation, while the negative embedding introspects in a self-supervised manner to rectify the mistakes and inadequacies of the positive embedding. The model thus learns not only what is correct but also what should be avoided. Extensive experiments on image quality and diversity, controllability, model learning, and task expansion demonstrate that our model learns not only concept but also form, content, and context. Pseudo-words of DreamArtist have properties similar to those of true words for generating high-quality images.
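    The abstract does not say how the positive and negative embeddings are combined when generating an image. One plausible reading, sketched below, is the negative-prompt form of classifier-free guidance used by diffusion models: the denoiser is run once conditioned on each embedding, and the final noise estimate is pushed away from the negative prediction and towards the positive one. The function name, the toy values, and the guidance rule itself are illustrative assumptions, not details taken from the abstract.

```python
def contrastive_guidance(eps_positive, eps_negative, guidance_scale):
    """Combine two noise predictions, steering away from the negative one.

    eps_positive: noise predicted when conditioning on the positive embedding
    eps_negative: noise predicted when conditioning on the negative embedding
    guidance_scale: how strongly to move from the negative towards the positive
    (All three names are hypothetical; real predictions would be tensors.)
    """
    if len(eps_positive) != len(eps_negative):
        raise ValueError("noise predictions must have the same length")
    # Start at the negative prediction and move along (positive - negative).
    return [
        neg + guidance_scale * (pos - neg)
        for pos, neg in zip(eps_positive, eps_negative)
    ]


# Toy values standing in for the outputs of a denoising network.
eps_pos = [1.0, 2.0]
eps_neg = [0.0, 1.0]
guided = contrastive_guidance(eps_pos, eps_neg, guidance_scale=3.0)
```

    With a guidance scale of 1 the function returns the positive prediction unchanged. Larger scales amplify whatever distinguishes the positive prediction from the negative one.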

    Adversarially-Aware Robust Object Detector

    Object detection, a fundamental computer vision task, has achieved remarkable progress with the emergence of deep neural networks. Nevertheless, few works explore the robustness of object detectors against adversarial attacks for practical applications in real-world scenarios. Detectors are greatly challenged by unnoticeable perturbations, showing a sharp performance drop on clean images and extremely poor performance on adversarial images. In this work, we empirically explore model training for adversarial robustness in object detection and find that the difficulty is largely attributable to the conflict between learning clean images and adversarial images. To mitigate this issue, we propose a Robust Detector (RobustDet) based on adversarially-aware convolution to disentangle gradients for model learning on clean and adversarial images. RobustDet also employs the Adversarial Image Discriminator (AID) and Consistent Features with Reconstruction (CFR) to ensure reliable robustness. Extensive experiments on PASCAL VOC and MS-COCO demonstrate that our model effectively disentangles gradients and significantly enhances detection robustness while maintaining detection ability on clean images. Comment: ECCV2022 oral paper

    Physiological Responses in a Variable Environment: Relationships between Metabolism, Hsp and Thermotolerance in an Intertidal-Subtidal Species

    Physiological responses to temperature reflect the evolutionary adaptations of organisms to their thermal environment and the capability of animals to tolerate thermal stress. Contrary to conventional metabolism theory, increasing environmental temperatures have been shown to reduce metabolic rate in rocky–eulittoral-fringe species inhabiting highly variable environments, possibly as a strategy for energy conservation. To study the physiological adaptations of an intertidal-subtidal species to the extreme and unpredictable heat stress of the intertidal zone, oxygen consumption rate and heat shock protein expression were quantified in the sea cucumber Apostichopus japonicus. Using simulated natural temperatures, the relationship between temperature, physiological performance (oxygen consumption and heat shock proteins), and thermotolerance was assessed. Depression of oxygen consumption rate and upregulation of heat shock protein genes (hsps) occurred in sequence when ambient temperature was increased from 24 to 30°C. Large-scale mortality of the sea cucumber occurred when temperatures rose beyond 30°C, suggesting that the upregulation of heat shock proteins and mortality are closely related to the depression of aerobic metabolism, a phenomenon that is in line with the concept of oxygen- and capacity-limited thermal tolerance (OCLTT). The physiologically related thermotolerance of this sea cucumber is likely an adaptation to its local environment.

    Sherlock: A Semi-Automatic Framework for Quiz Generation Using a Hybrid Semantic Similarity Measure

    Acknowledgments: This work is supported by the BBC Connected Studio programme (http://www.bbc.co.uk/partnersandsuppliers/connectedstudio/), the award made by the RCUK Digital Economy theme to the dot.rural Digital Economy Hub (award reference EP/G066051/1), the award made by the UK Economic & Social Research Council (ESRC; award reference ES/M001628/1), the National Natural Science Foundation of China (NSFC) under Grant No. 61373051, and the China National Science and Technology Pillar Program (Grant No. 2013BAH07F05). The authors would like to thank Ryan Hussey for the work on the user interface design and Tom Cass and James Ruston for their help in developing the Sherlock application. We are also grateful to Herm Baskerville for creating the editorial quizzes and Nava Tintarev for many helpful discussions on the human evaluation. Peer reviewed. Publisher PDF

    Efficient Adaptation of Large Vision Transformer via Adapter Re-Composing

    The advent of high-capacity pre-trained models has revolutionized problem-solving in computer vision, shifting the focus from training task-specific models to adapting pre-trained models. Consequently, effectively adapting large pre-trained models to downstream tasks in an efficient manner has become a prominent research area. Existing solutions primarily concentrate on designing lightweight adapters and their interaction with pre-trained models, with the goal of minimizing the number of parameters requiring updates. In this study, we propose a novel Adapter Re-Composing (ARC) strategy that addresses efficient pre-trained model adaptation from a fresh perspective. Our approach considers the reusability of adaptation parameters and introduces a parameter-sharing scheme. Specifically, we leverage symmetric down-/up-projections to construct bottleneck operations, which are shared across layers. By learning low-dimensional re-scaling coefficients, we can effectively re-compose layer-adaptive adapters. This parameter-sharing strategy in adapter design allows us to significantly reduce the number of new parameters while maintaining satisfactory performance, thereby offering a promising approach to compress the adaptation cost. We conduct experiments on 24 downstream image classification tasks using various Vision Transformer variants to evaluate our method. The results demonstrate that our approach achieves compelling transfer learning performance with a reduced parameter count. Our code is available at https://github.com/DavidYanAnDe/ARC. Comment: Paper is accepted to NeurIPS 202
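    The parameter-sharing idea in this abstract can be illustrated with a minimal pure-Python sketch. It assumes one down-projection shared by all layers, an up-projection taken as its transpose (one reading of "symmetric"), and a small vector of re-scaling coefficients per layer applied in the bottleneck. The class name, the residual connection, the zero initialisation, and the absence of biases or a non-linearity are all illustrative assumptions, not details confirmed by the abstract or taken from the linked repository.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def transpose(matrix):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


class SharedBottleneckAdapter:
    """Hypothetical adapter: one shared projection, per-layer re-scaling."""

    def __init__(self, dim, rank, num_layers, seed=0):
        rng = random.Random(seed)
        # Shared down-projection of shape (rank, dim), reused by every layer.
        self.down = [[rng.gauss(0.0, 0.02) for _ in range(dim)]
                     for _ in range(rank)]
        # Per-layer re-scaling coefficients, zero-initialised so that every
        # layer starts out as the identity mapping (an assumption).
        self.scales = [[0.0] * rank for _ in range(num_layers)]

    def forward(self, x, layer):
        z = matvec(self.down, x)                             # project down
        z = [s * v for s, v in zip(self.scales[layer], z)]   # layer-specific re-scaling
        delta = matvec(transpose(self.down), z)              # project up with the transpose
        return [xi + di for xi, di in zip(x, delta)]         # residual connection

    def num_parameters(self):
        rank, dim = len(self.down), len(self.down[0])
        return rank * dim + len(self.scales) * rank


def independent_adapter_parameters(dim, rank, num_layers):
    """Parameters if each layer had its own down- and up-projection."""
    return num_layers * 2 * dim * rank


adapter = SharedBottleneckAdapter(dim=8, rank=2, num_layers=3)
shared_count = adapter.num_parameters()
independent_count = independent_adapter_parameters(dim=8, rank=2, num_layers=3)
```

    Under these assumptions the shared design needs `rank * dim + num_layers * rank` new parameters, against `num_layers * 2 * dim * rank` for independent bottleneck adapters, so the saving grows with the number of layers.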