Auditory dialog analysis and understanding by generative modelling of interactional dynamics
In the last few years, interest in the analysis of human behavioral schemes has grown dramatically, in particular for the interpretation of the communication modalities called social signals. They represent well-defined interaction patterns, possibly unconscious, characterizing different conversational situations and behaviors in general. In this paper, we illustrate an automatic system based on a generative structure able to analyze conversational scenarios. The generative model is built by integrating a Gaussian mixture model and the (observed) influence model, and it is fed with a novel kind of simple low-level auditory social signals, termed steady conversational periods (SCPs). These are built on the duration of continuous slots of silence or speech, also taking conversational turn-taking into account. The interactional dynamics built upon the transitions among SCPs provide a behavioral blueprint of conversational settings without relying on segmental or continuous phonetic features. Our contribution here is to show the effectiveness of our model when applied to dialog classification and clustering tasks, considering dialogs between adults and between children and adults, in both flat and arguing discussions, and showing excellent performance also in comparison with state-of-the-art frameworks.
Embodied creativity: a process continuum from artistic creation to creative participation
This thesis breaks new ground by attending to two contemporary developments in art and science. In art, computer-mediated interactive artworks comprise creative engagement between collaborating practitioners and a creatively participating audience, erasing all notions of a dividing line between them. The procedural character of this type of communicative real-time interaction replaces the concept of a finished artwork with a ‘field of artistic communication’. In science, the field of creativity research investigates creative thought as mental operations that combine and reorganise extant knowledge structures. A recent paradigm shift in cognition research acknowledges that cognition is embodied. Neither embodiment in cognition nor the ‘field of artistic communication’ in interactive art have been assimilated by creativity research.
This thesis takes an interdisciplinary approach to examine the embodied cognitive processes in a ‘field of artistic communication’ using a media artwork called Sim-Suite as a case study research strategy. This interactive installation, created and exhibited in an authentic real-world context, engages three people to play on wobble-boards. The thesis argues that creative processes related to Sim-Suite operate within a continuum, encompassing collaborative artistic creation and cooperative creative participation. This continuum is investigated via mixed methods, conducting studies with qualitative and quantitative analysis. These are interpreted through a theoretical lens of embodied cognition principles, the 4E approaches.
The results obtained demonstrate that embodied cognitive processes in Sim-Suite’s ‘field of artistic communication’ function on a continuum. We give an account of the creative process continuum relating our findings to the ‘embedded-extended-enactive lens’, empirical studies in embodied cognition and creativity research. Within this context a number of topics and sub-themes are identified. We discuss embodied communication, aspects of agency, forms of coordination, levels of evaluative processes and empathetic foundation. The thesis makes conceptual, empirical and methodological contributions to creativity research
Towards a complete multiple-mechanism account of predictive language processing [Commentary on Pickering & Garrod]
Although we agree with Pickering & Garrod (P&G) that prediction-by-simulation and prediction-by-association are important mechanisms of anticipatory language processing, this commentary suggests that they: (1) overlook other potential mechanisms that might underlie prediction in language processing, (2) overestimate the importance of prediction-by-association in early childhood, and (3) underestimate the complexity and significance of several factors that might mediate prediction during language processing
An integrated theory of language production and comprehension
Currently, production and comprehension are regarded as quite distinct in accounts of language processing. In rejecting this dichotomy, we instead assert that producing and understanding are interwoven, and that this interweaving is what enables people to predict themselves and each other. We start by noting that production and comprehension are forms of action and action perception. We then consider the evidence for interweaving in action, action perception, and joint action, and explain such evidence in terms of prediction. Specifically, we assume that actors construct forward models of their actions before they execute those actions, and that perceivers of others' actions covertly imitate those actions, then construct forward models of those actions. We use these accounts of action, action perception, and joint action to develop accounts of production, comprehension, and interactive language. Importantly, they incorporate well-defined levels of linguistic representation (such as semantics, syntax, and phonology). We show (a) how speakers and comprehenders use covert imitation and forward modeling to make predictions at these levels of representation, (b) how they interweave production and comprehension processes, and (c) how they use these predictions to monitor the upcoming utterances. We show how these accounts explain a range of behavioral and neuroscientific data on language processing and discuss some of the implications of our proposal
The osteopath-parent-child triad in osteopathic care in the first 2 years of life: a qualitative study
Background: Enactivism and active inference are two important concepts in the field of osteopathy. While enactivism emphasizes the role of the body and the environment in shaping our experiences and understanding of the world, active inference emphasizes the role of action and perception in shaping our experiences and understanding of the world. Together, these frameworks provide a unique perspective on the practice of osteopathy, and how it can be used to facilitate positive change in patients. Since the neonatal period is a crucial time for development, osteopaths should aim to create a therapeutic relationship. Arguably, through participatory sense-making, osteopaths can help the baby build a generative model (with positive priors) to deal with stress and needs throughout their life.
Aim: Since the literature considers that interactions with the environment, which enact the patients’ experiences, depend on contextual factors and communication between patient and caregiver, this research explored whether there is a correspondence between the indications in the literature and clinical practice in the management of the mother/parent–child dyad during osteopathic care of children aged 0 to 2 years.
Methods: Semi-structured interviews were conducted with a purposive sample of nine osteopaths with experience in the field of pediatrics. Interviews were transcribed verbatim, and constructivist grounded theory was used to conceptualize, collect and analyze data. Codes and categories were actively constructed through an interpretive/constructionist paradigm.
Results: The core category was the idea of the pediatric osteopath as a support for the family, not only for the child. Four additional categories were identified: (1) Preparing a safe environment for both children and parents, (2) Communication, (3) Attachment and synchrony, and (4) Synchronization.
Conclusion: Through participatory sense-making, osteopaths manage contextual factors to establish an effective therapeutic alliance through the osteopath-parent–child triad to facilitate the construction of the child’s internal generative model to promote healthy development. The therapeutic encounter is considered an encounter between embodied subjects, occurring within a field of affordances (ecological niche) that allows the interlocutors to actively participate in creating new meanings through interpersonal synchronization. Participatory sense-making and the establishment of a therapeutic alliance through the osteopath-parent–child triad are crucial to promote healthy development in the child.
Accessing spoken interaction through dialogue processing [online]
Summary
Our lives, our achievements, and our environment are all currently
documented through written language. The rapid advance of technical
capabilities for recording, storing, and playing back audio, images,
and video can be used to support, supplement, or even replace the
written documentation of human communication, for example meetings.
These new technologies can enable us to capture information that
would otherwise be lost, to lower the cost of documentation, and to
enrich high-quality documents with audiovisual material. Indexing
such recordings is the core technology for realizing this potential.
This work presents effective alternatives to keyword-based indices
that restrict the search space and can in part be computed with
simple means.
Speech documents can be indexed at several levels: stylistically, a
document belongs to a particular database, which can be determined
automatically with high accuracy using very simple features. This
kind of classification can reduce the search space by a factor on
the order of 410. Applying topical features for text classification
to a news database results in a reduction by a factor of 18. Because
speech documents can be very long, they must be divided into topical
segments. A new probabilistic approach and new features (speaker
initiative and style) deliver results comparable to or better than
traditional keyword-based approaches. These topical segments can be
characterized by the predominant activity (storytelling, discussing,
planning, ...), which can be detected by a neural network. The
detection rates are limited, however, since even humans determine
these activities only imprecisely. A maximum search space reduction
by a factor of 6 is theoretically possible on the data used. A
topical classification of these segments was also carried out on one
database, but the detection rates for this index are low.
At the level of individual utterances, dialogue acts such as
statements, questions, backchannels (aha, oh yes, really?, ...), etc.
can be recognized with a discriminatively trained hidden Markov
model. This procedure can be extended to recognize short sequences
such as question/answer games (dialogue games).
Dialogue acts and games can be used to build classifiers for global
speaking styles. Likewise, a user might remember a particular
dialogue act sequence and try to find it again in a graphical
representation.
In a study with very pessimistic assumptions, users were able to
identify one of four similar and equiprobable conversations with an
accuracy of ~43% using a graphical representation of activity.
Dialogue acts could be equally useful in this scenario, but the user
study could not give a conclusive answer because of the small amount
of data. However, the study showed no effect for detailed basic
features such as formality and speaker identity.
Abstract
Written language is one of our primary means for documenting our
lives, achievements, and environment. Our capabilities to
record, store and retrieve audio, still pictures, and video are
undergoing a revolution and may support, supplement or even
replace written documentation. This technology enables us to
record information that would otherwise be lost, lower the cost
of documentation and enhance high-quality documents with
original audiovisual material.
The indexing of the audio material is the key technology to
realize those benefits. This work presents effective
alternatives to keyword-based indices which restrict the search
space and may in part be calculated with very limited resources.
Indexing speech documents can be done at various levels:
Stylistically a document belongs to a certain database which can
be determined automatically with high accuracy using very simple
features. The resulting factor in search space reduction is in
the order of 410 while topic classification yielded a factor
of 18 in a news domain.
Since documents can be very long they need to be segmented into
topical regions. A new probabilistic segmentation framework as
well as new features (speaker initiative and style) prove to be
very effective compared to traditional keyword-based methods. At
the topical segment level activities (storytelling, discussing,
planning, ...) can be detected using a machine learning approach
with limited accuracy; however even human annotators do not
annotate them very reliably. A maximum search space reduction
factor of 6 is theoretically possible on the databases used. A
topical classification of these regions has been attempted
on one database, the detection accuracy for that index, however,
was very low.
At the utterance level dialogue acts such as statements,
questions, backchannels (aha, yeah, ...), etc. are being
recognized using a novel discriminatively trained HMM procedure.
The procedure can be extended to recognize short sequences such
as question/answer pairs, so-called dialogue games.
Dialogue acts and games are useful for building classifiers for
speaking style. Similarly, a user may remember a certain dialogue
act sequence and may search for it in a graphical
representation.
In a study with very pessimistic assumptions, users are able to
pick one out of four similar and equiprobable meetings correctly
with an accuracy of ~43% using graphical activity information.
Dialogue acts may be useful in this situation as well, but the
sample size did not allow final conclusions to be drawn. However, the
user study fails to show any effect for detailed basic features
such as formality or speaker identity.
Nonlinear Dynamics In Musical Interactions
This thesis examines nonlinear dynamical processes in musical tools, identifying certain roles that they play in creative interactions with existing tools, and investigates the roles they might play in digital tools. Nonlinear dynamical processes are fundamental in the everyday physical world. They lie at the core of many acoustic instruments, playing a particularly significant role in bowed and blown instruments.
Two major studies are presented that approach these issues from different perspectives. Firstly, a set of comparative studies explores the ways in which musicians engage with systems that do and do not incorporate nonlinear dynamical processes. Secondly, interviews with a range of musicians engaged in contemporary musical practices, particularly free improvisation, are used to investigate the role of nonlinear dynamical processes in instrumental interactions in relation to unpredictability and creative exploration.
Evidence is presented demonstrating that nonlinear dynamical processes can be drawn on as resources for exploration over long time periods. An approach to creative interaction that explicitly draws on the properties of nonlinear dynamical processes is uncovered and connected to material-oriented notions of creative processes. Nonlinear dynamics are shown to facilitate a productive ‘sweet spot’ between unpredictability and complexity on the one hand, and detailed, sensitive, deterministic control, coupled with the potential to repeat and develop particular actions on the other. The importance of timing in interactions with nonlinear dynamical processes is highlighted as being significant in creating explorable interactions, particularly close to critical thresholds.
A distinction is raised between instantaneous unpredictabilities that emerge from the interaction with the tool (interactional), and unpredictabilities that result from the unexpected implications of the conjunction of otherwise anticipated elements (combinatorial). While the usefulness of the latter in creative interactions is frequently acknowledged in HCI research, the former is often excluded, or seen as a hindrance or obstruction. Engagements with nonlinear dynamical processes in existing musical instruments and practices provide clear evidence of the utility of both nonlinear dynamics and interactional surprises more generally, suggesting that they can be of use in other domains where creative exploration is a concern.
Dialogue and the machine: an interactional perspective on computer dialogue models, mediation and artifacts
The topic of this thesis is the notion of dialogue and how machines have not only influenced
the development of our understanding of this fundamental human social activity but also the
possibilities for engaging in mediated dialogue. In particular, the concern is with its adoption
and distortion from a computational point of view. An interactional perspective is developed
that provides insight into the problems and limitations of computer dialogue models, motivates
the investigation of the achievement of dialogue mediated 'through' machines, and informs
the conception and design of computer systems (or artifacts) that support the metaphor of
dialogue 'with' machines.
To motivate a reconstruction of the notion of dialogue and a different understanding of the
status of machines in terms of action, a critical analysis of computer models of dialogue,
concerning theory, data and implementation, is given. In general, computer models lack a
consideration of interaction as a constitutive domain, assume the interchange model of
dialogue, promote a sanitised view of data, and are a poor foundation for the design of
machines that are to engage in dialogue-like behaviour with a user. An alternative
interactional perspective is derived from hermeneutics and ethnomethodology in which it is
argued that the machine is an intelligible - not intelligent - artifact, and communicative activity
is circumstantial, situated and interactively constituted. Instead of reifying dialogue as the
repeated exchange of discrete messages between isolated cognitive processors (the
interchange model), dialogue is understood here to be the collection of practices in which
parties are mutually engaged in coordinating communicative actions and achieving shared
understanding out of the materials at hand. The empirical methodology of the thesis comes
from conversation analysis and forms the basis for the investigation of the achievement of
dialogue 'through' machines.
A detailed audio-visual study of a particular computer-mediated communication modality is
presented. Parties engaged in cooperatively constructing mutual orientation in dialogue (in
a virtual dialogue space) were recorded and features of their conduct were rendered for
analysis with the aid of a notation system specially developed for this study. The findings
are that the computer-mediated dialogue activity is a skilled, interactive accomplishment in
which dialogic presence, monitoring and participation are contingently created and
maintained. An emergent transformation of the dialogue activity demonstrates the situated
work of constructing participation, a process that is shaped by the dynamics of that activity.
A brief study of copresent collaboration documents two further features: the embodiment of
actions and their complementarity. The consequences of the interactional perspective and
the empirical study for computer models and dialogue 'with' machines are discussed.
Suggestions are also made about an alternative use of computer modelling for dialogue
'between' machines, and about the future of dialogue mediation and artifacts