A Quantum Probability Driven Framework for Joint Multi-Modal Sarcasm, Sentiment and Emotion Analysis
Sarcasm, sentiment, and emotion are three typical kinds of spontaneous
affective responses of humans to external events and they are tightly
intertwined with each other. Such events may be expressed in multiple
modalities (e.g., linguistic, visual and acoustic), as in multi-modal
conversations. Joint analysis of humans' multi-modal sarcasm, sentiment, and
emotion is an important yet challenging topic, as it is a complex cognitive
process involving both cross-modality interaction and cross-affection
correlation. From the probability theory perspective, cross-affection
correlation also means that the judgments on sarcasm, sentiment, and emotion
are incompatible. However, this phenomenon cannot be adequately modelled by
classical probability theory, because that theory assumes compatibility, and
existing approaches do not take it into consideration either.
In view of the recent success of quantum probability (QP) in modeling human
cognition, particularly contextual incompatible decision making, we take the
first step towards introducing QP into joint multi-modal sarcasm, sentiment,
and emotion analysis. Specifically, we propose a QUantum probabIlity driven
multi-modal sarcasm, sEntiment and emoTion analysis framework, termed QUIET.
Extensive experiments on two datasets show the effectiveness and advantages
of QUIET in comparison with a wide range of state-of-the-art baselines. We
also show the great potential of QP in multi-affect analysis.
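
The notion of incompatible judgments mentioned in this abstract can be illustrated with a toy quantum-probability calculation. The sketch below is not the QUIET model. It assumes a two-dimensional real state space, and the names `sarcasm_yes` and `sentiment_yes` and the chosen angles are hypothetical. Each "yes" answer is modelled as projection onto a unit vector, and the probability of a sequence of answers is the squared length of the state after projecting in that order.

```python
import math

def project(vec, axis):
    """Project a real 2-D vector onto the unit vector `axis`."""
    dot = vec[0] * axis[0] + vec[1] * axis[1]
    return (dot * axis[0], dot * axis[1])

def norm_sq(vec):
    """Squared Euclidean length of a 2-D vector."""
    return vec[0] ** 2 + vec[1] ** 2

def sequential_prob(state, first, second):
    """Probability of answering 'yes' to `first` and then 'yes' to `second`."""
    return norm_sq(project(project(state, first), second))

def unit(angle_deg):
    """Unit vector at the given angle (degrees) from the first axis."""
    rad = math.radians(angle_deg)
    return (math.cos(rad), math.sin(rad))

# Hypothetical toy setup: the angles are illustrative assumptions only.
state = unit(20)           # cognitive state before any judgment
sarcasm_yes = unit(0)      # direction representing "sarcastic: yes"
sentiment_yes = unit(45)   # direction representing "positive: yes"

p_sarcasm_first = sequential_prob(state, sarcasm_yes, sentiment_yes)
p_sentiment_first = sequential_prob(state, sentiment_yes, sarcasm_yes)

print(p_sarcasm_first, p_sentiment_first)
```

Because the two projection directions are neither parallel nor orthogonal, the two orders give different probabilities. A single classical joint distribution over both judgments cannot reproduce such an order effect.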
A deep learning framework for quality assessment and restoration in video endoscopy
Endoscopy is a routine imaging technique used for both diagnosis and
minimally invasive surgical treatment. Artifacts such as motion blur, bubbles,
specular reflections, floating objects and pixel saturation impede the visual
interpretation and the automated analysis of endoscopy videos. Given the
widespread use of endoscopy in different clinical applications, we contend that
the robust and reliable identification of such artifacts and the automated
restoration of corrupted video frames is a fundamental medical imaging problem.
Existing state-of-the-art methods only deal with the detection and restoration
of selected artifacts. However, endoscopy videos typically contain numerous
artifacts, which motivates the development of a comprehensive solution.
We propose a fully automatic framework that can: 1) detect and classify six
different primary artifacts, 2) provide a quality score for each frame and 3)
restore mildly corrupted frames. To detect different artifacts, our framework
exploits a fast, multi-scale, single-stage convolutional neural network detector.
We introduce a quality metric to assess frame quality and predict image
restoration success. Generative adversarial networks with carefully chosen
regularization are finally used to restore corrupted frames.
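
The detect, score and restore flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' quality metric. The `Detection` type, the weights in `CLASS_WEIGHT`, the scoring formula and the thresholds in `triage` are all hypothetical. Only the artifact names are taken from the abstract.

```python
from dataclasses import dataclass

@dataclass
class Detection:
    label: str        # artifact class reported by a detector
    area_frac: float  # fraction of the frame covered by the detection, 0..1

# Hypothetical per-class penalty weights (assumed for illustration only).
CLASS_WEIGHT = {
    "motion blur": 1.0,
    "pixel saturation": 0.8,
    "floating objects": 0.5,
    "specular reflections": 0.3,
    "bubbles": 0.2,
}

def quality_score(detections):
    """Return a score in [0, 1]; 1.0 means no penalised artifact area."""
    penalty = sum(CLASS_WEIGHT.get(d.label, 0.5) * d.area_frac for d in detections)
    return max(0.0, 1.0 - penalty)

def triage(score, keep_at=0.9, restore_at=0.5):
    """Map a score to an action; both thresholds are assumed values."""
    if score >= keep_at:
        return "keep"
    if score >= restore_at:
        return "restore"
    return "discard"

clean = [Detection("specular reflections", 0.1)]
mild = [Detection("motion blur", 0.3)]
severe = [Detection("motion blur", 0.5), Detection("pixel saturation", 0.4)]

for frame in (clean, mild, severe):
    print(triage(quality_score(frame)))
```

In this sketch, only frames in the middle band would be passed to a restoration model, mirroring the idea of restoring mildly corrupted frames and discarding severely corrupted ones.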
Our detector yields the highest mean average precision (mAP at 5% threshold)
of 49.0 and the lowest computational time of 88 ms allowing for accurate
real-time processing. Our restoration models for blind deblurring, saturation
correction and inpainting demonstrate significant improvements over previous
methods. On a set of 10 test videos, we show that our approach preserves an
average of 68.7% of frames, which is 25% more frames than are retained from
the raw videos.
Comment: 14 pages