Multimodal Deep Features Fusion For Video Memorability Prediction
This paper describes a multimodal feature fusion approach for predicting short-term and long-term video memorability, where the goal is to design a system that automatically predicts scores reflecting the probability of a video being remembered. The approach performs early fusion of text, image, and video features. Text features are extracted using a Convolutional Neural Network (CNN), an FBResNet152 pre-trained on ImageNet is used to extract image features, and video features are extracted using a 3DResNet152 pre-trained on Kinetics 400. We use Fisher Vectors to obtain a single fixed-length vector for each video, which removes the need for a variable-length global representation when handling temporal information. The fusion approach demonstrates good predictive performance and outperforms standard features in terms of regression correlation.
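The early-fusion step can be sketched as follows. This is a minimal illustration, not the paper's exact pipeline: the feature dimensions, the synthetic data, and the SVR regressor are all assumptions standing in for pre-extracted per-modality features and the paper's actual regression model.

```python
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins for pre-extracted features (dimensions are illustrative):
n_videos = 200
text_feats = rng.normal(size=(n_videos, 100))    # e.g. CNN text embeddings
image_feats = rng.normal(size=(n_videos, 2048))  # e.g. FBResNet152 image features
video_feats = rng.normal(size=(n_videos, 512))   # e.g. 3DResNet features, FV-pooled

# Early fusion: concatenate the modality vectors into one vector per video.
fused = np.concatenate([text_feats, image_feats, video_feats], axis=1)

# Memorability scores in [0, 1] (synthetic here).
scores = rng.uniform(0.4, 1.0, size=n_videos)

X_train, X_test, y_train, y_test = train_test_split(fused, scores, random_state=0)
model = SVR(kernel="rbf").fit(X_train, y_train)
preds = model.predict(X_test)  # one predicted memorability score per held-out video
```

The key property of early fusion shown here is that all modalities share a single joint representation before any learning happens, so the regressor can exploit cross-modal correlations.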
Gaze-Based Human-Robot Interaction by the Brunswick Model
We present a new paradigm for human-robot interaction based on social signal processing, and in particular on the Brunswick model. Originally, the Brunswick model deals with face-to-face dyadic interaction, assuming that the interactants communicate through a continuous exchange of non-verbal social signals in addition to the spoken messages. Social signals have to be interpreted through a proper recognition phase that considers visual and audio information. The Brunswick model allows one to quantitatively evaluate the quality of the interaction using statistical tools that measure how effective the recognition phase is. In this paper we recast this theory for the case in which one of the interactants is a robot; here, the recognition phases performed by the robot and by the human have to be revised with respect to the original model. The model is applied to Berrick, a recent open-source, low-cost robotic head platform, where gaze is the social signal under consideration.
Human-Machine Interfaces for Service Robotics
The abstract is in the attachment.
Mapping (Dis-)Information Flow about the MH17 Plane Crash
Digital media enables fast sharing not only of information but also of disinformation. One prominent case of an event leading to the circulation of disinformation on social media is the MH17 plane crash. Studies analysing the spread of information about this event on Twitter have focused on small, manually annotated datasets, or used proxies for data annotation. In this work, we examine to what extent text classifiers can be used to label data for subsequent content analysis; in particular, we focus on predicting pro-Russian and pro-Ukrainian Twitter content related to the MH17 plane crash. Even though we find that a neural classifier improves over a hashtag-based baseline, labeling pro-Russian and pro-Ukrainian content with high precision remains a challenging problem. We provide an error analysis underlining the difficulty of the task and identify factors that might help improve classification in future work. Finally, we show how the classifier can ease the annotation task for human annotators.
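The contrast between a hashtag-based baseline and a trained text classifier can be sketched as follows. The hashtag lists, example tweets, and labels are hypothetical placeholders, and logistic regression over TF-IDF n-grams stands in for the paper's neural classifier; the point is only that a trained model can label content that carries no partisan hashtag at all.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical partisan hashtag lists (illustrative only, not from the paper).
PRO_RUSSIAN_TAGS = {"#hypothetical_pro_ru"}
PRO_UKRAINIAN_TAGS = {"#hypothetical_pro_ua"}

def hashtag_baseline(tweet: str) -> str:
    """Label a tweet by which partisan hashtags it contains."""
    tokens = set(tweet.lower().split())
    if tokens & PRO_RUSSIAN_TAGS:
        return "pro-russian"
    if tokens & PRO_UKRAINIAN_TAGS:
        return "pro-ukrainian"
    return "neutral"

# A trained classifier generalises beyond literal hashtag matches.
tweets = ["#hypothetical_pro_ru the crash was staged",
          "#hypothetical_pro_ua the evidence points elsewhere",
          "the crash was staged",
          "the evidence points elsewhere"]
labels = ["pro-russian", "pro-ukrainian", "pro-russian", "pro-ukrainian"]

clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
clf.fit(tweets, labels)

# The baseline returns "neutral" for hashtag-free text; the classifier
# can still assign a partisan label based on the wording.
baseline_label = hashtag_baseline("the crash was staged")
clf_label = clf.predict(["the crash was staged"])[0]
```

This also hints at why high precision is hard: once the hashtag signal is gone, the model must rely on subtle lexical cues that pro-Russian and pro-Ukrainian content often share.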
Cross-Modality Feature Learning for Three-Dimensional Brain Image Synthesis
Multi-modality medical imaging is increasingly used for the comprehensive assessment of complex diseases, in either diagnostic examinations or as part of medical research trials. Different imaging modalities provide complementary information about living tissues. However, multi-modal examinations are not always possible due to adverse factors such as patient discomfort, increased cost, prolonged scanning time, and scanner unavailability. In addition, in large imaging studies, incomplete records are not uncommon owing to image artifacts, data corruption, or data loss, which compromise the potential of multi-modal acquisitions. Moreover, regardless of how good an imaging system is, the performance of the equipment is ultimately limited by its physical components. Additional interferences arise, particularly for medical imaging systems, such as limited acquisition times, sophisticated and costly equipment, and patients with severe medical conditions, which also cause image degradation. The acquisitions can thus be considered degraded versions of the original high-quality images.
In this dissertation, we explore the problems of image super-resolution and cross-modality synthesis for one Magnetic Resonance Imaging (MRI) modality from an image of another MRI modality of the same subject, using an image synthesis framework to reconstruct the missing/complex modality data. We develop models and techniques that allow us to connect the domain of source modality data and the domain of target modality data, enabling transformation between elements of the two domains. In particular, we first introduce models that project both source modality data and target modality data into a common multi-modality feature space in a supervised setting. This common space then allows us to connect cross-modality features that depict a relationship between each other, and we can impose the learned association function to synthesize any target modality image. Moreover, we develop a weakly-supervised method that takes a few registered multi-modality image pairs as training data and generates the desired modality data, without being constrained by a large collection of well-processed (e.g., skull-stripped and strictly registered) multi-modality brain data. Finally, we propose an approach that provides a generic way of learning a dual mapping between source and target domains while considering both visually high-fidelity synthesis and task-practicability. We demonstrate that this model can take an arbitrary modality and efficiently synthesize the desired modality data in an unsupervised manner.
We show that these proposed models advance the state of the art on image super-resolution and cross-modality synthesis tasks that require joint processing of multi-modality images, and that the algorithms can be designed to generate data that is practically beneficial to medical image analysis.
Computational Methods for Medical and Cyber Security
Over the past decade, computational methods, including machine learning (ML) and deep learning (DL), have grown exponentially in their development of solutions across various domains, especially medicine, cybersecurity, finance, and education. While these applications of machine learning algorithms have proven beneficial in various fields, many shortcomings have also been highlighted, such as the lack of benchmark datasets, the inability to learn from small datasets, the cost of architectures, adversarial attacks, and imbalanced datasets. On the other hand, new and emerging algorithms, such as deep learning, one-shot learning, continuous learning, and generative adversarial networks, have successfully solved various tasks in these fields. It is therefore crucial to apply these new methods to life-critical missions, and to measure the success of these less-traditional algorithms when used in such fields.
Evaluating the fairness of identification parades with measures of facial similarity
Bibliography: pages 239-248. This thesis addresses a practical problem: the evaluation of 'identification parades', or 'lineups', which are frequently used by police to secure evidence of identification. It is well recognised that this evidence is frequently unreliable, and it has on occasion led to tragic miscarriages of justice. A review of South African law is conducted and reported in the thesis, and shows that the legal treatment of identification parades centres on the requirement that parades should be composed of people of similar appearance to the suspect. I argue that it is not possible, in practice, to assess whether this requirement has been met, and that this is a significant failing. Psychological work on identification parades includes the development of measures of parade fairness and the investigation of alternative lineup structures. Measures of parade fairness suggested in the literature are indirectly derived, however, and I argue that they fail to address the question of physical similarity. In addition, I develop ways of reasoning inferentially (statistically) with measures of parade fairness, and suggest a new measure of parade fairness. The absence of a direct measure of similarity constitutes the rationale for the empirical component of the thesis. I propose a measure of facial similarity, in which the similarity of two faces is defined as the Euclidean distance between them in a principal component space, or representational basis. (The space is determined by treating a set of digitized faces as numerical vectors, and by submitting these to principal component analysis.) A similar definition is provided for 'facial distinctiveness', namely the distance of a face from the origin or centroid of the space. The validity of the proposed similarity measure is investigated in several ways, in a total of seven studies involving approximately 700 subjects.
350 frontal face images and 280 profile face images were collected for use as experimental materials, and as the source for the component space underlying the similarity measure. The weight of the evidence, particularly from a set of similarity rating tasks, suggests that the measure corresponds reasonably well to perceptions of facial similarity. Results from a mock witness experiment showed that it is also strongly and monotonically related to standard measures of lineup fairness. Evidence from several investigations of the distinctiveness measure, on the other hand, showed that it does not appear to be related to perceptions of facial distinctiveness. An additional empirical investigation examined the relation between target-foil similarity and identification performance. Performance was greater for lineups of low similarity, both when the perpetrator was present and when the perpetrator was absent. The consequences of this for the understanding of lineup construction and evaluation are discussed.
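The similarity and distinctiveness measures defined above lend themselves to a short sketch. This is a toy illustration: random vectors stand in for the digitized face images, and the image dimension and number of principal components are assumptions, not the thesis's actual values.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(1)

# Each digitized face is treated as a numerical vector (random stand-ins here).
faces = rng.normal(size=(350, 1024))  # 350 face images, flattened

# The representational basis: the principal component space of the face set.
pca = PCA(n_components=50).fit(faces)
coords = pca.transform(faces)  # each face as a point in component space

def similarity_distance(i: int, j: int) -> float:
    """Facial similarity of faces i and j: Euclidean distance in the
    principal component space (smaller distance = more similar)."""
    return float(np.linalg.norm(coords[i] - coords[j]))

def distinctiveness(i: int) -> float:
    """Facial distinctiveness of face i: distance from the centroid of the
    space. PCA centres the data, so the centroid maps to the origin."""
    return float(np.linalg.norm(coords[i]))
```

A lineup evaluator could then, for instance, compare `similarity_distance(suspect, foil)` across foils to check whether a parade satisfies the similar-appearance requirement quantitatively.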
Can Upward Brand Extensions be an Opportunity for Marketing Managers During the Covid-19 Pandemic and Beyond?
Early COVID-19 research has guided current managerial practice by introducing more products across different product categories as consumers tried to avoid perceived health risks from food shortages, i.e. horizontal brand extensions. For example, Leon, a fast-food restaurant in the UK, introduced a new range of ready-meal products. However, once the food supply stabilised, availability may no longer be a concern for consumers. Instead, job losses could drive higher perceived financial risks. Meanwhile, it remains unknown whether perceived health risks or perceived financial risks play the more significant role in consumers' consumption. Our preliminary survey shows that perceived health risks outweigh perceived financial risks in positively influencing purchase intention during COVID-19. We suggest this result indicates an opportunity for marketers to consider introducing premium-priced products, i.e. upward brand extensions. The risk-as-feelings and signalling theories are used to explain that consumers choosing under risk may adopt affective heuristic processing, using minimal cognitive effort to evaluate products. On this basis, consumers are likely to be influenced by the salient high-quality and reliability cue of an upward extension, signalled by its premium price level, which may attract consumers to purchase when they perceive high health risks associated with COVID-19. Addressing this, a series of experimental studies confirms that upward brand extensions (versus normal new product introductions) positively moderate the positive effect of perceived health risks associated with COVID-19 on purchase intention. This effect is mediated by affective heuristic information processing. The results contribute to the emergent COVID-19 literature and to managerial practice during the pandemic, but could also inform post-pandemic thinking around vertical brand extensions.