Optimizing Two Sided Promotion for IS Enabled Transportation Network: A Conditional Bayesian Learning Model
This paper investigates whether taxi apps provide attribute value for taxi drivers, and how two-sided sales promotion interacts with consumer learning about attribute value to influence taxi drivers’ decisions to adopt a taxi app. We propose a conditional Bayesian learning model that allows learning about multiple attributes. We find evidence that taxi drivers learn about the attributes of the app, the transaction success rate, and the probability of earning cash back from the app provider. We also find measurable evidence that sales promotion during product introduction has an indirect effect through learning.
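As a rough illustration of Bayesian learning about a single attribute such as the transaction success rate, the sketch below uses a textbook Beta-Bernoulli update. It is not the conditional multi-attribute model proposed in the paper, whose details the abstract does not give. The class `BetaBelief`, the function `learn`, and the uniform Beta(1, 1) prior are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BetaBelief:
    """Beta(alpha, beta) belief about an unknown success probability."""
    alpha: float = 1.0  # assumed prior pseudo-count of successes
    beta: float = 1.0   # assumed prior pseudo-count of failures

    def update(self, success: bool) -> "BetaBelief":
        # Conjugate update: each observed outcome adds one pseudo-count.
        if success:
            return BetaBelief(self.alpha + 1.0, self.beta)
        return BetaBelief(self.alpha, self.beta + 1.0)

    @property
    def mean(self) -> float:
        # Posterior mean of the success probability.
        return self.alpha / (self.alpha + self.beta)


def learn(outcomes, prior=BetaBelief()):
    """Fold a sequence of observed outcomes (True = success) into the prior."""
    belief = prior
    for outcome in outcomes:
        belief = belief.update(outcome)
    return belief


if __name__ == "__main__":
    # Hypothetical driver experience: three successful orders, one failed.
    posterior = learn([True, True, False, True])
    print(posterior, round(posterior.mean, 3))
```

In this toy setting, every observed transaction shifts the driver's belief, so any promotion that increases usage also speeds up learning. That is one simple way to picture an indirect effect through learning.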
LayoutMask: Enhance Text-Layout Interaction in Multi-modal Pre-training for Document Understanding
Visually-rich Document Understanding (VrDU) has attracted much research
attention over the past years. Pre-trained models on a large number of document
images with transformer-based backbones have led to significant performance
gains in this field. The major challenge is how to fuse the different
modalities (text, layout, and image) of the documents in a unified model with
different pre-training tasks. This paper focuses on improving text-layout
interactions and proposes a novel multi-modal pre-training model, LayoutMask.
LayoutMask uses local 1D position, instead of global 1D position, as layout
input and has two pre-training objectives: (1) Masked Language Modeling:
predicting masked tokens with two novel masking strategies; (2) Masked Position
Modeling: predicting masked 2D positions to improve layout representation
learning. LayoutMask can enhance the interactions between text and layout
modalities in a unified model and produce adaptive and robust multi-modal
representations for downstream tasks. Experimental results show that our
proposed method can achieve state-of-the-art results on a wide variety of VrDU
problems, including form understanding, receipt understanding, and document
image classification. Comment: Accepted by ACL 2023 main conference.
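The difference between global and local 1D positions mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the LayoutMask implementation. It assumes that a document is already split into text segments (for example OCR lines), that local positions restart in every segment, and that indices start at 0. The function names are hypothetical.

```python
def global_1d_positions(segments):
    """Token indices that run continuously across the whole document."""
    positions, next_index = [], 0
    for segment in segments:
        positions.append(list(range(next_index, next_index + len(segment))))
        next_index += len(segment)
    return positions


def local_1d_positions(segments):
    """Token indices that restart at the beginning of every segment."""
    return [list(range(len(segment))) for segment in segments]


if __name__ == "__main__":
    # Hypothetical receipt with two segments of tokens.
    segments = [["Total", ":"], ["$", "12", ".", "50"]]
    print(global_1d_positions(segments))
    print(local_1d_positions(segments))
```

With local positions, the reading order between segments is no longer encoded in the 1D input, so under these assumptions a model would have to infer it from the 2D layout.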
Photonic Modes Prediction via Multi-Modal Diffusion Model
The concept of photonic modes is a cornerstone of optics and photonics,
describing how light propagates. Maxwell's equations are used to
calculate the mode field from the structure information,
but this process requires a great deal of computation, especially when
handling a three-dimensional model. To overcome this obstacle, we introduce the
Multi-Modal Diffusion model to predict the photonic modes in a given
structure. The Contrastive Language-Image Pre-training (CLIP) model is used to
build the connections between photonic structures and the corresponding modes.
Then we take the Stable Diffusion (SD) model as an example to realize
optical field generation from structure information. Our work introduces
Multi-Modal deep learning to construct a complex mapping between structural
information and light fields as high-dimensional vectors, and generates light
field images based on this mapping.
A Novel Noise Injection-based Training Scheme for Better Model Robustness
Noise injection-based methods have been shown in previous work to improve the
robustness of artificial neural networks. In this work, we
propose a novel noise injection-based training scheme for better model
robustness. Specifically, we first develop a likelihood ratio method to
estimate the gradient with respect to both synaptic weights and noise levels
for stochastic gradient descent training. Then, we design an approximation for
the vanilla noise injection-based training method to reduce memory and improve
computational efficiency. Next, we apply our proposed scheme to spiking neural
networks and evaluate classification accuracy and robustness
on the MNIST and Fashion-MNIST datasets. Experimental results show that our proposed
method achieves much better adversarial robustness and
slightly better original accuracy, compared with the
conventional gradient-based training method.
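The idea of a likelihood ratio estimate of gradients with respect to both a weight and a noise level can be illustrated on a single scalar weight. The sketch below is a generic score-function estimator, not the paper's training scheme or its memory-saving approximation, which the abstract does not detail. It assumes additive Gaussian noise, so the perturbed weight is z = w + sigma * eps with eps ~ N(0, 1). The function `lr_gradients` and the quadratic test loss are assumptions made for the example.

```python
import random


def lr_gradients(loss, w, sigma, n_samples=200_000, seed=0):
    """Monte Carlo likelihood ratio estimate of dE[loss(z)]/dw and /dsigma.

    z ~ N(w, sigma^2). Only loss values are needed, never loss derivatives.
    """
    rng = random.Random(seed)
    grad_w = 0.0
    grad_sigma = 0.0
    for _ in range(n_samples):
        eps = rng.gauss(0.0, 1.0)
        value = loss(w + sigma * eps)
        # d/dw log N(z; w, sigma^2) = (z - w) / sigma^2 = eps / sigma
        grad_w += value * eps / sigma
        # d/dsigma log N(z; w, sigma^2) = (eps^2 - 1) / sigma
        grad_sigma += value * (eps * eps - 1.0) / sigma
    return grad_w / n_samples, grad_sigma / n_samples


if __name__ == "__main__":
    # For loss(z) = z^2, E[loss] = w^2 + sigma^2, so the exact gradients
    # are 2*w with respect to w and 2*sigma with respect to sigma.
    gw, gs = lr_gradients(lambda z: z * z, w=1.0, sigma=0.5)
    print(gw, gs)
```

Because the estimator only evaluates the loss, it also applies when the loss is not differentiable in the perturbed weight. That may be one reason such estimators are attractive for spiking networks, although this plain version has high variance.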