Looking Beyond a Clever Narrative: Visual Context and Attention are Primary Drivers of Affect in Video Advertisements
Emotion evoked by an advertisement plays a key role in influencing brand
recall and eventual consumer choices. Automatic ad affect recognition has
several useful applications. However, the use of content-based feature
representations does not give insights into how affect is modulated by aspects
such as the ad scene setting, salient object attributes and their interactions.
Neither do such approaches inform us on how humans prioritize visual
information for ad understanding. Our work addresses these lacunae by
decomposing video content into detected objects, coarse scene structure, object
statistics and actively attended objects identified via eye-gaze. We measure
the importance of each of these information channels by systematically
incorporating related information into ad affect prediction models. Contrary to
the popular notion that ad affect hinges on the narrative and the clever use of
linguistic and social cues, we find that actively attended objects and the
coarse scene structure better encode affective information as compared to
individual scene objects or conspicuous background elements.
Comment: Accepted for publication in the Proceedings of the 20th ACM International Conference on Multimodal Interaction, Boulder, CO, US
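The channel-importance measurement described in the abstract could be sketched as an ablation over per-channel feature sets. This is a hypothetical illustration, not the authors' pipeline: the channel names, feature dimensions, and classifier choice are stand-in assumptions.

```python
# Hypothetical ablation sketch: train one affect classifier per subset of
# information channels and compare cross-validated accuracies. The feature
# matrices are random stand-ins for the paper's object / scene / gaze channels.
import numpy as np
from itertools import combinations
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 200
channels = {                        # per-channel feature matrices (stand-ins)
    "objects": rng.random((n, 16)),
    "scene": rng.random((n, 8)),
    "gaze": rng.random((n, 8)),
}
labels = rng.integers(0, 2, n)      # binary affect labels (e.g. high/low valence)

scores = {}
for r in range(1, len(channels) + 1):
    for subset in combinations(channels, r):
        feats = np.hstack([channels[c] for c in subset])
        acc = cross_val_score(LogisticRegression(max_iter=1000),
                              feats, labels, cv=5).mean()
        scores[subset] = acc

# The subset with the highest score is the most informative channel combination.
best = max(scores, key=scores.get)
```

Comparing the scores of single-channel versus combined-channel subsets is one simple way to quantify how much affective information each channel carries.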
Responsible and Representative Multimodal Data Acquisition and Analysis: On Auditability, Benchmarking, Confidence, Data-Reliance & Explainability
The ethical decisions behind the acquisition and analysis of audio, video, or
physiological human data, harnessed for (deep) machine learning algorithms, are
an increasing concern for the Artificial Intelligence (AI) community. In this
regard, we herein highlight the growing need for responsible and
representative data collection and analysis through a discussion of modality
diversification. Factors such as Auditability, Benchmarking, Confidence,
Data-reliance, and Explainability (ABCDE), have been touched upon within the
machine learning community, and here we lay out these ABCDE sub-categories in
relation to the acquisition and analysis of multimodal data, weaving through
the high-priority ethical concerns currently under discussion for AI. To this
end, we propose how these five subcategories can be included in the early
planning of such acquisition paradigms.
Comment: 4 pages
Dimensionality Reduction in Deep Learning for Chest X-Ray Analysis of Lung Cancer
The efficiency of several dimensionality reduction techniques, such as lung
segmentation, bone shadow exclusion, and t-distributed stochastic neighbor
embedding (t-SNE) for exclusion of outliers, is estimated for the analysis of
2D chest X-ray (CXR) images by a deep learning approach, to help radiologists
identify marks of lung cancer in CXR. Training and validation of a simple
convolutional neural network (CNN) were performed on the open JSRT dataset
(dataset #01), the JSRT after bone shadow exclusion - BSE-JSRT (dataset #02),
JSRT after lung segmentation (dataset #03), BSE-JSRT after lung segmentation
(dataset #04), and segmented BSE-JSRT after exclusion of outliers by t-SNE
method (dataset #05). The results show that the pre-processed dataset
obtained after lung segmentation, bone shadow exclusion, and filtering out
outliers by t-SNE (dataset #05) achieves the highest training rate and the
best accuracy in comparison to the other pre-processed datasets.
Comment: 6 pages, 14 figures
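The t-SNE outlier-exclusion step mentioned in the abstract could be sketched as follows. This is a minimal illustration under loose assumptions: random arrays stand in for flattened CXR images, and the distance threshold is an arbitrary choice, not the paper's actual setting.

```python
# Hypothetical sketch of t-SNE-based outlier exclusion: embed flattened images
# in 2D with t-SNE, then drop samples whose embedding lies far from the bulk
# of the data. Data and threshold are illustrative stand-ins.
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
images = rng.random((100, 32 * 32))   # stand-in for flattened CXR images

emb = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(images)

# Distance of each sample's embedding from the embedding centroid.
dist = np.linalg.norm(emb - emb.mean(axis=0), axis=1)

# Keep samples within two standard deviations of the mean distance.
keep = dist < dist.mean() + 2 * dist.std()
filtered = images[keep]
```

The filtered set would then replace the original training data, analogous to how dataset #05 is derived from the segmented BSE-JSRT images.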