2,203 research outputs found
How to Backdoor Diffusion Models?
Diffusion models are state-of-the-art deep learning empowered generative
models that are trained based on the principle of learning forward and reverse
diffusion processes via progressive noise-addition and denoising. To gain a
better understanding of the limitations and potential risks, this paper
presents the first study on the robustness of diffusion models against backdoor
attacks. Specifically, we propose BadDiffusion, a novel attack framework that
engineers compromised diffusion processes during model training for backdoor
implantation. At the inference stage, the backdoored diffusion model will
behave just like an untampered generator for regular data inputs, while falsely
generating some targeted outcome designed by the bad actor upon receiving the
implanted trigger signal. Such a critical risk can be dreadful for downstream
tasks and applications built upon the problematic model. Our extensive
experiments on various backdoor attack settings show that BadDiffusion can
consistently lead to compromised diffusion models with high utility and target
specificity. Even worse, BadDiffusion can be made cost-effective by simply
finetuning a clean pre-trained diffusion model to implant backdoors. We also
explore some possible countermeasures for risk mitigation. Our results call
attention to potential risks and possible misuse of diffusion models.
Spectroscopic applications and frequency locking of THz photomixing with distributed-Bragg-reflector diode lasers in low-temperature-grown GaAs
A compact, narrow-linewidth, tunable source of THz radiation has been developed for spectroscopy and other high-resolution applications. Distributed-Bragg-reflector (DBR) diode lasers at 850 nm are used to pump a low-temperature-grown GaAs photomixer. Resonant optical feedback is employed to stabilize the center frequencies and narrow the linewidths of the DBR lasers. The heterodyne linewidth full-width at half-maximum of two optically locked DBR lasers is 50 kHz on the 20 ms time scale and 2 MHz over 10 s; free-running DBR lasers have linewidths of 40 and 90 MHz on such time scales. This instrument has been used to obtain rotational spectra of acetonitrile (CH3CN) at 313 GHz. Detection limits of 1 × 10^–4 Hz^1/2 (noise/total power) have been achieved, with the noise floor dominated by the detector's noise equivalent power.
Lexical Retrieval Hypothesis in Multimodal Context
Multimodal corpora have become an essential language resource for language
science and grounded natural language processing (NLP) systems due to the
growing need to understand and interpret human communication across various
channels. In this paper, we first present our efforts in building the first
Multimodal Corpus for Languages in Taiwan (MultiMoco). Based on the corpus, we
conduct a case study investigating the Lexical Retrieval Hypothesis (LRH),
specifically examining whether the hand gestures co-occurring with speech
constants facilitate lexical retrieval or serve other discourse functions. With
detailed annotations on eight parliamentary interpellations in Taiwan Mandarin,
we explore the co-occurrence between speech constants and non-verbal features
(i.e., head movement, face movement, hand gesture, and function of hand
gesture). Our findings suggest that while hand gestures do serve as
facilitators for lexical retrieval in some cases, they also serve the purpose
of information emphasis. This study highlights the potential of the MultiMoco
Corpus to provide an important resource for in-depth analysis and further
research in multimodal communication studies.
Exploring Affordance and Situated Meaning in Image Captions: A Multimodal Analysis
This paper explores the grounding issue regarding multimodal semantic
representation from a computational cognitive-linguistic view. We annotate
images from the Flickr30k dataset with five perceptual properties: Affordance,
Perceptual Salience, Object Number, Gaze Cueing, and Ecological Niche
Association (ENA), and examine their association with textual elements in the
image captions. Our findings reveal that images with Gibsonian affordance show
a higher frequency of captions containing 'holding-verbs' and 'container-nouns'
compared to images displaying telic affordance. Perceptual Salience, Object
Number, and ENA are also associated with the choice of linguistic expressions.
Our study demonstrates that comprehensive understanding of objects or events
requires cognitive attention, semantic nuances in language, and integration
across multiple modalities. We highlight the vital importance of situated
meaning and affordance grounding in natural language understanding, with the
potential to advance human-like interpretation in various scenarios. Comment: 10 pages, 9 figures
A dynamic model of auctions with buy-it-now: theory and evidence
In the ascending-price auctions with Yahoo!-type buy-it-now (BIN), we characterize and
derive the closed-form solution for the optimal bidding strategy of the bidder and the optimal
BIN price of the seller when they are both risk-averse. The seller is shown to be strictly
better off with the BIN option, while the bidders are better off only when their valuation is
greater than a threshold value. The theory also implies that the expected transaction price
is higher in an auction with an optimal BIN price than one without a BIN. This prediction
is confirmed by our data collected from Taiwan's Yahoo! auctions of Nikon digital cameras.
Rethinking Backdoor Attacks on Dataset Distillation: A Kernel Method Perspective
Dataset distillation offers a potential means to enhance data efficiency in
deep learning. Recent studies have shown its ability to counteract backdoor
risks present in original training samples. In this study, we delve into the
theoretical aspects of backdoor attacks and dataset distillation based on
kernel methods. We introduce two new theory-driven trigger pattern generation
methods specialized for dataset distillation. Following a comprehensive set of
analyses and experiments, we show that our optimization-based trigger design
framework informs effective backdoor attacks on dataset distillation. Notably,
datasets poisoned by our designed trigger prove resilient against conventional
backdoor attack detection and mitigation methods. Our empirical results
validate that the triggers developed using our approaches are proficient at
executing resilient backdoor attacks. Comment: 19 pages, 4 figures
Metrology Camera System of Prime Focus Spectrograph for Subaru Telescope
The Prime Focus Spectrograph (PFS) is a new optical/near-infrared multi-fiber
spectrograph designed for the prime focus of the 8.2m Subaru telescope. PFS
will cover a 1.3 degree diameter field with 2394 fibers to complement the
imaging capabilities of Hyper SuprimeCam. To retain high throughput, the final
positioning accuracy between the fibers and observing targets of PFS is
required to be less than 10um. The metrology camera system (MCS) serves as the
optical encoder of the fiber motors for the configuring of fibers. MCS provides
the fiber positions within a 5um error over the 45 cm focal plane. The
information from MCS will be fed into the fiber positioner control system for
the closed loop control. MCS will be located at the Cassegrain focus of Subaru
telescope in order to cover the whole focal plane with one 50M pixel Canon
CMOS camera. It is a 380mm Schmidt type telescope which generates a uniform
spot size with a 10 micron FWHM across the field for reasonable sampling of
PSF. Carbon fiber tubes are used to provide a stable structure over the
operating conditions without focus adjustments. The CMOS sensor can be read in
0.8s to reduce the overhead for the fiber configuration. The positions of all
fibers can be obtained within 0.5s after the readout of the frame. This enables
the overall fiber configuration to take less than 2 minutes. MCS will be
installed inside a standard Subaru Cassegrain Box. All components that generate
heat are located inside a glycol cooled cabinet to reduce the possible image
motion due to heat. The optics and camera for MCS have been delivered and
tested. The mechanical parts and supporting structure are ready as of spring
2016. The integration of MCS will start in the summer of 2016. Comment: 11 pages, 15 figures. SPIE proceeding. arXiv admin note: text overlap
with arXiv:1408.287
Tensed Ontology Based on Simple Partial Logic
Simple partial logic (SPL) is, broadly speaking, an extensional logic that allows for truth-value gaps. First, I give a system of propositional SPL by partializing classical logic and extending it with several non-classical truth-functional operators. Second, I show an SPL-based way to construct a system of tensed ontology, by representing tensed statements as two kinds of necessary statements in a linear model that consists of the present and future worlds. Finally, I compare this approach with two others, based on Łukasiewicz’s three-valued logic and branching temporal logic.