Learning to Guide Decoding for Image Captioning
Recently, much progress has been made in image captioning, and an
encoder-decoder framework has achieved outstanding performance for this task.
In this paper, we propose an extension of the encoder-decoder framework by
adding a component called guiding network. The guiding network models the
attribute properties of input images, and its output is leveraged to compose
the input of the decoder at each time step. The guiding network can be plugged
into the current encoder-decoder framework and trained in an end-to-end manner.
Hence, the guiding vector can be adaptively learned according to the signal
from the decoder, allowing it to embed information from both image and
language. Additionally, discriminative supervision can be employed to further
improve the quality of guidance. The advantages of our proposed approach are
verified by experiments carried out on the MS COCO dataset. Comment: AAAI-1
Adaptive Local Steps Federated Learning with Differential Privacy Driven by Convergence Analysis
Federated Learning (FL) is a distributed machine learning technique that
allows model training among multiple devices or organizations without sharing
data. However, while FL ensures that the raw data is not directly accessible to
external adversaries, adversaries can still obtain some statistical information
about the data through differential attacks. Differential Privacy (DP) has been
proposed, which adds noise to the model or gradients to prevent adversaries
from inferring private information from the transmitted parameters. We
reconsider the framework of differential privacy federated learning in
resource-constrained scenarios (privacy budget and communication resources). We
analyze the convergence of federated learning with differential privacy (DPFL)
in resource-constrained scenarios and propose an Adaptive Local Steps
Differential Privacy Federated Learning (ALS-DPFL) algorithm. We evaluate our
algorithm on the FashionMNIST and CIFAR-10 datasets and achieve good
performance relative to previous work.
De-Emphasize Direct Presence
The following paper reveals some aspects of my thoughts about art. The works discussed are featured in my M.F.A. exhibition. All works are mainly based on the ideas of absence, self-reference and utilization in art practice, even though each piece approaches the subject from a different angle. My dissatisfaction with preconceived notions in contemporary art, rooted in art history, has shifted my focus from concerns of the direct, physical presence of artworks to the indirect or indecisive elements of their context. From this position I have felt free to explore the paradox of self-reference that is involved in performance. In addition, by transferring art works to functional objects, I have found a way to infuse everyday life with my art, and vice versa. Because interpreting artworks with language is ambiguous, I present this paper with photographic documentation of my artwork. Combined, these will give a clear indication of the thrust of my graduate studies and the current theatrical direction of my art.
A bearing fault detection method with low-dimensional compressed measurements of vibration signal
Traditional bearing fault detection is usually achieved by sampling the bearing vibration data according to the Shannon sampling theorem. Information about the bearing state is then extracted from the vibration data and used for fault detection. Long-term, continuous monitoring requires sampling and storing large amounts of raw vibration signals, which places a heavy burden on data storage and transmission. To address this problem, a new bearing fault detection method based on compressed sensing is presented, which only needs to sample and store a small amount of compressed observation data and uses these data directly to achieve fault detection. First, an over-complete dictionary is trained, on which the vibration signals corresponding to the normal state can be decomposed sparsely. Then, bearing fault detection is achieved based on the difference in sparse representation error on this dictionary between the compressed signals in the normal state and those in the fault state. The fault detection results of the proposed method with different parameters are analyzed. The effectiveness of the method is validated by experimental tests.
AU-Supervised Convolutional Vision Transformers for Synthetic Facial Expression Recognition
The paper describes our proposed methodology for the six basic expression
classification track of Affective Behavior Analysis in-the-wild (ABAW)
Competition 2022. In the Learning from Synthetic Data (LSD) task, facial
expression recognition (FER) methods aim to learn the representation of
expression from artificially generated data and generalise to real data.
Because of the ambiguity of the synthetic data and the objectivity of the
facial Action Unit (AU), we resort to AU information to boost performance, and
make the following contributions. First, to adapt the model to synthetic
scenarios, we use knowledge from a model pre-trained on large-scale face
recognition data. Second, we propose a conceptually new framework, termed
AU-Supervised Convolutional Vision Transformers (AU-CVT), which clearly
improves the performance of FER by jointly training on auxiliary datasets with
AU or pseudo AU labels. Our AU-CVT
achieved F1 score as , accuracy as on the validation set. The
source code of our work is publicly available online:
https://github.com/msy1412/ABAW