638 research outputs found
Speaker diarization of multi-party conversations using participants role information: political debates and professional meetings
Speaker diarization aims at inferring who spoke when in an audio stream and involves two simultaneous unsupervised tasks: (1) the estimation of the number of speakers, and (2) the association of speech segments to each speaker. Most of the recent efforts in the domain have addressed the problem using machine learning techniques or statistical methods (for a review see [11]), ignoring the fact that the data consists of instances of human conversations
Personality in Computational Advertising: A Benchmark
In the last decade, new ways of shopping online have increased the
possibility of buying products and services more easily and faster
than ever. In this new context, personality is a key determinant
in the decision making of the consumer when shopping. A person's
buying choices are influenced by psychological factors like
impulsiveness; indeed some consumers may be more susceptible
to making impulse purchases than others. Since affective metadata
are more closely related to the user's experience than generic
parameters, accurate predictions reveal important aspects of users'
attitudes and social life, including the attitudes of others and social identity.
This work proposes highly innovative research that uses a personality
perspective to determine the unique associations between the
consumer's buying tendency and advert recommendations. In fact,
the lack of a publicly available benchmark for computational advertising
prevents both the exploration of this intriguing research
direction and the evaluation of recent algorithms. We present the
ADS Dataset, a publicly available benchmark consisting of 300 real
advertisements (i.e., Rich Media Ads, Image Ads, Text Ads) rated
by 120 unacquainted individuals, enriched with the users' Big-Five
personality factors and 1,200 of the users' personal pictures
HMM-based Offline Recognition of Handwritten Words Crossed Out with Different Kinds of Strokes
In this work, we investigate the recognition of words that have been crossed out by the writers and are thus degraded. The degradation consists of one or more ink strokes that span the whole word length and simulate the signs that writers use to cross out the words. The simulated strokes are superimposed on the original clean word images. We considered two types of strokes: wave-trajectory strokes created with spline curves and line-trajectory strokes generated with the delta-lognormal model of rapid line movements. The experiments have been performed using a recognition system based on hidden Markov models and the results show that the performance decrease is moderate for single writer data and light strokes, but severe for multiple writer data
Negotiating over mobile phones: calling or being called can make the difference
Mobile phones pervade our everyday life like no other technology, but the effects they have on one-to-one conversations are still relatively unknown. This paper focuses on how mobile phones influence negotiations, i.e., on discussions where two parties try to reach an agreement starting from opposing preferences. The experiments involve 60 pairs of unacquainted individuals (120 subjects). They must make a "yes" or "no" decision on whether several objects increase the chances of survival in a polar environment or not. When the participants disagree about a given object (one says "yes" and the other says "no"), they must try to convince one another and reach a common decision. Since the subjects discuss via phone, one of them (selected randomly) calls while the other is called. The results show that the caller convinces the receiver in 70% of the cases (p-value = 0.005 according to a two-tailed binomial test). Gender, age, personality and conflict handling style, measured during the experiment, fail to explain such a persuasiveness difference. Calling or being called appears to be the most important factor behind the observed result
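The two-tailed binomial test mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not report how many disagreements were observed, so the counts in the usage line are purely hypothetical and the function name `binomial_two_tailed_p` is an assumption made for this example. The sketch uses the common exact definition: sum the probabilities of every outcome that is no more likely than the observed one under the null hypothesis that caller and receiver are equally persuasive.

```python
from math import comb


def binomial_two_tailed_p(successes: int, trials: int, p_null: float = 0.5) -> float:
    """Exact two-tailed binomial p-value under the null probability p_null.

    Sums the probability of every outcome whose likelihood under the null
    is no greater than that of the observed outcome.
    """
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    # Probability mass of each possible number of successes under the null.
    pmf = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # The small relative tolerance guards against floating-point ties.
    tail = sum(p for p in pmf if p <= observed * (1 + 1e-9))
    return min(1.0, tail)


# Hypothetical counts (NOT taken from the paper): 14 wins out of 20 disagreements.
p_value = binomial_two_tailed_p(14, 20)
```

For a fixed win rate, the p-value shrinks as the number of trials grows, which is why the same 70% rate can be significant in a larger sample and not in a small one.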
When the words are not everything: the use of laughter, fillers, back-channel, silence, and overlapping speech in phone calls
This article presents an observational study on how some common conversational cues (laughter, fillers, back-channel, silence, and overlapping speech) are used during mobile phone conversations. The observations are performed over the SSPNet Mobile Corpus, a collection of 60 calls between pairs of unacquainted individuals (120 subjects for roughly 12 h of material in total). The results show that the temporal distribution of the social signals above is not uniform, but rather reflects the social meaning they carry and convey. In particular, the results show significant differences in use depending on factors such as gender, role (caller or receiver), topic, mode of interaction (agreement or disagreement), personality traits, and conflict handling style
Perceptual Control Theory for Engagement and Disengagement of Users in Public Spaces
This paper presents Perceptual Control Theory, a model that explains behaviour as an attempt to keep sensory inputs in a desired range, and demonstrates that it can be used to develop an approach designed to make robots capable of human interaction. In particular, we present an approach that embodies the most salient features of the theory through a feedback loop. This approach has been implemented on a Pepper robot, and a preliminary experiment has been performed by deploying the robot in the entrance hall of a university building. The results show that the robot effectively engages and disengages the attention of people in 43% and 39% of cases, respectively. This result has been obtained in a fully natural setting where people were unaware of being involved in an experiment and therefore behaved spontaneously
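The idea of keeping a sensory input in a desired range through a feedback loop can be sketched as a toy proportional controller. This is not the authors' implementation on Pepper: the function names, the one-dimensional "perception", the gain, and the trivial environment model (the perception simply moves by the action) are all assumptions made for illustration.

```python
from typing import List


def control_action(perception: float, low: float, high: float, gain: float) -> float:
    """Return an action proportional to how far the perception is outside [low, high].

    Inside the desired range there is no error, so no action is taken.
    """
    if perception < low:
        return gain * (low - perception)
    if perception > high:
        return gain * (high - perception)
    return 0.0


def simulate(perception: float = 0.0, low: float = 0.8, high: float = 1.2,
             gain: float = 0.5, steps: int = 50) -> List[float]:
    """Run the loop in a toy environment where the action shifts the perception directly."""
    history = []
    for _ in range(steps):
        action = control_action(perception, low, high, gain)
        perception += action  # hypothetical environment response
        history.append(perception)
    return history


# Starting below the range, the perception is driven towards the lower bound.
trace = simulate(perception=0.0)
```

With a gain between 0 and 1 the error shrinks geometrically at each step, so the perception approaches the nearest bound of the range and then stays put.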
Predicting continuous conflict perception with Bayesian Gaussian processes
Conflict is one of the most important phenomena of social life, but it is still largely neglected by the computing community. This work proposes an approach
that detects common conversational social signals (loudness, overlapping speech,
etc.) and predicts the conflict level perceived by human observers in continuous,
non-categorical terms. The proposed regression approach is fully Bayesian and it
adopts Automatic Relevance Determination to identify the social signals that most influence the outcome of the prediction. The experiments are performed over the SSPNet Conflict Corpus, a publicly available collection of 1430 clips extracted from televised political debates (roughly 12 hours of material for 138 subjects in total). The results show that it is possible to achieve a correlation close to 0.8 between actual and predicted conflict perception
Infinite Latent Feature Selection: A Probabilistic Latent Graph-Based Ranking Approach
Feature selection is playing an increasingly significant role with respect to
many computer vision applications spanning from object recognition to visual
object tracking. However, most of the recent solutions in feature selection are
not robust across different and heterogeneous sets of data. In this paper, we
address this issue by proposing a robust probabilistic latent graph-based feature
selection algorithm that performs the ranking step while considering all the
possible subsets of features, as paths on a graph, bypassing the combinatorial
problem analytically. An appealing characteristic of the approach is that it
aims to discover an abstraction behind low-level sensory data, that is,
relevancy. Relevancy is modelled as a latent variable in a PLSA-inspired
generative process that allows the investigation of the importance of a feature
when injected into an arbitrary set of cues. The proposed method has been
tested on ten diverse benchmarks, and compared against eleven state of the art
feature selection methods. Results show that the proposed approach attains the
highest performance levels across many different scenarios and difficulties,
thereby confirming its strong robustness while setting a new state of the art
in the feature selection domain.
Comment: Accepted at the IEEE International Conference on Computer Vision
(ICCV), 2017, Venice. Preprint cop
The pictures we like are our image: continuous mapping of favorite pictures into self-assessed and attributed personality traits
Flickr allows its users to tag the pictures they like as "favorite". As a result, many users of the popular photo-sharing platform produce galleries of favorite pictures. This article proposes new approaches, based on Computational Aesthetics, capable of inferring the personality traits of Flickr users from the galleries above. In particular, the approaches map low-level features extracted from the pictures into numerical scores corresponding to the Big-Five Traits, both self-assessed and attributed. The experiments were performed over 60,000 pictures tagged as favorite by 300 users (the PsychoFlickr Corpus). The results show that it is possible to predict beyond chance both self-assessed and attributed traits. In line with the state of the art of Personality Computing, the latter are predicted with higher effectiveness (correlation up to 0.68 between actual and predicted traits)
Modulating the Non-Verbal Social Signals of a Humanoid Robot
In this demonstration we present a repertoire of social signals generated by the humanoid robot Pepper in the context of the EU-funded project MuMMER. The aim of this research is to provide the robot with the expressive capabilities required to interact with people in real-world public spaces such as shopping malls. Being able to control the non-verbal behaviour of such a robot is key to engaging with humans in an effective way. We propose an approach to modulating the non-verbal social signals of the robot based on systematically varying the amplitude and speed of the joint motions and gathering user evaluations of the resulting gestures. We anticipate that the humans' perception of the robot behaviour will be influenced by these modulations
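Varying the amplitude and speed of a joint motion can be sketched on a gesture stored as keyframes. This is an illustration only, not the MuMMER implementation: the keyframe representation, the `modulate` function, and the `neutral` pose parameter are assumptions, and a real robot would also need joint and velocity limits to be enforced.

```python
from typing import List, Tuple

# A keyframe is (time in seconds, joint angle in radians); hypothetical format.
Keyframe = Tuple[float, float]


def modulate(keyframes: List[Keyframe], amplitude: float = 1.0,
             speed: float = 1.0, neutral: float = 0.0) -> List[Keyframe]:
    """Scale a single-joint gesture around a neutral pose and retime it.

    amplitude > 1 exaggerates the motion, amplitude < 1 attenuates it;
    speed > 1 plays the gesture faster by compressing the timestamps.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    return [
        (t / speed, neutral + amplitude * (angle - neutral))
        for t, angle in keyframes
    ]


# A small wave-like gesture, played twice as large and twice as fast.
wave = [(0.0, 0.0), (1.0, 0.5), (2.0, 0.0)]
big_fast_wave = modulate(wave, amplitude=2.0, speed=2.0)
```

Generating several variants of the same gesture this way gives a controlled set of stimuli whose amplitude and speed differ while the overall shape of the motion stays the same.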