    A best view selection in meetings through attention analysis using a multi-camera network

    Human activity analysis is an essential task in ambient intelligence and computer vision. The main focus lies in the automatic analysis of ongoing activities from a multi-camera network. One possible application is meeting analysis, which explores the dynamics in meetings by using low-level data to infer high-level activities. However, the detection of such activities is still very challenging because the low-level data are often corrupted or imprecise. In this paper, we present an approach to understanding the dynamics in meetings using a multi-camera network consisting of fixed ambient and portable close-up cameras. As a particular application, we aim to find the most informative video stream, for example as a representative view for a remote participant. Our contribution is threefold. First, we estimate the extrinsic parameters of the portable close-up cameras based on head positions. Second, we find common overlapping areas based on the consensus of people’s orientation. Third, we estimate the most informative view for a remote participant using the common overlapping areas. We evaluated our proposed approach and compared it to a motion estimation method. Experimental results show that we can reach an accuracy of 74% compared to manually selected views.

    Refining personal and social presence in virtual meetings

    Virtual worlds show promise for conducting meetings and conferences without the need for physical travel. Current experience suggests that the major limitation to more widespread adoption and acceptance of virtual conferences is the failure of existing environments to provide a sense of immersion and engagement, or of ‘being there’. These limitations are largely related to the appearance and control of avatars, and to the absence of means to convey the non-verbal cues of facial expression and body language. This paper reports on a study involving the use of a mass-market motion sensor (Kinect™) and the mapping of participant action in the real world to avatar behaviour in the virtual world. This is coupled with full-motion video representation of participants’ faces on their avatars to resolve both identity and facial expression issues. The outcomes of a small-group trial meeting based on this technology show a very positive reaction from participants, and the potential for further exploration of these concepts.

    First impressions: A survey on vision-based apparent personality trait analysis

    © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. Personality analysis has been widely studied in psychology, neuropsychology, and signal processing, among other fields. In the past few years, it has also become an attractive research area in visual computing. From the computational point of view, speech and text have been by far the most commonly considered cues for analyzing personality. However, there has recently been increasing interest from the computer vision community in analyzing personality from visual data. Recent computer vision approaches are able to accurately analyze human faces, body postures and behaviors, and use this information to infer apparent personality traits. Because of the overwhelming research interest in this topic, and the potential impact that such methods could have on society, we present in this paper an up-to-date review of existing vision-based approaches for apparent personality trait recognition. We describe seminal and cutting-edge works on the subject, discussing and comparing their distinctive features and limitations. Future avenues of research in the field are identified and discussed. Furthermore, aspects of subjectivity in data labeling and evaluation, as well as current datasets and challenges organized to push research in the field, are reviewed. Peer Reviewed. Postprint (author's final draft).

    Tracking and modeling focus of attention in meetings [online]

    Abstract: This thesis addresses the problem of tracking the focus of attention of people. In particular, a system to track the focus of attention of participants in meetings is developed. Obtaining knowledge about a person's focus of attention is an important step towards a better understanding of what people do, how and with what or whom they interact, or to what they refer. In meetings, focus of attention can be used to disambiguate the addressees of speech acts, to analyze interaction, and to index meeting transcripts. Tracking a user's focus of attention also greatly contributes to the improvement of human-computer interfaces, since it can be used to build interfaces and environments that become aware of what the user is paying attention to or with what or whom he or she is interacting. The direction in which people look, i.e., their gaze, is closely related to their focus of attention. In this thesis, we estimate a subject's focus of attention based on his or her head orientation. While the direction in which someone looks is determined by head orientation and eye gaze, relevant literature suggests that head orientation alone is a sufficient cue for detecting someone's direction of attention during social interaction. We present experimental results from a user study and from several recorded meetings that support this hypothesis. We have developed a Bayesian approach to model at whom or what someone is looking based on his or her head orientation. To estimate head orientations in meetings, the participants' faces are automatically tracked in the view of a panoramic camera, and neural networks are used to estimate their head orientations from pre-processed images of their faces. Using this approach, the focus of attention target of subjects could be correctly identified 73% of the time in a number of evaluation meetings with four participants.
In addition, we have investigated whether a person's focus of attention can be predicted from other cues. Our results show that focus of attention is correlated with who is speaking in a meeting and that it is possible to predict a person's focus of attention based on information about who is talking or was talking before a given moment. We have trained neural networks to predict at whom a person is looking, based on information about who was speaking. Using this approach, we were able to predict who is looking at whom with 63% accuracy on the evaluation meetings using only information about who was speaking. We show that by using both head orientation and speaker information to estimate a person's focus, the accuracy of focus detection can be improved compared to using just one of the modalities. To demonstrate the generality of our approach, we have built a prototype system for focus-aware interaction with a household robot and other smart appliances in a room, using the components developed for focus of attention tracking. In the demonstration environment, a subject could interact with a simulated household robot, a speech-enabled VCR, or other people in the room, and the recipient of the subject's speech was disambiguated based on the user's direction of attention.
Summary: This thesis deals with automatically determining and tracking the focus of attention of people in meetings. Determining people's focus of attention is very important for understanding and automatically analyzing meeting records. It can be used, for example, to find out who addressed whom at a given point in time, or who was listening to whom. Automatic determination of the focus of attention can furthermore be used to improve human-machine interfaces.
An important cue to the direction in which a person directs their attention is the person's head orientation. A method for determining people's head orientations was therefore developed. Artificial neural networks were used that receive pre-processed images of a person's head as input and compute an estimate of the head orientation as output. On image data of new persons, that is, persons whose images were not contained in the training set, the trained networks achieved a mean error of nine to ten degrees for the horizontal and vertical head orientation. In addition, a probabilistic approach for determining attention targets is presented. A Bayesian approach is used to determine the a posteriori probabilities of different attention targets given a person's observed head orientations. The developed approaches were evaluated on several meetings with four to five participants. A further contribution of this thesis is an investigation of the extent to which the gaze direction of meeting participants can be predicted based on who is currently speaking. A method was developed that uses neural networks to estimate a person's focus based on a short history of speaker constellations. We show that combining the image-based and the speaker-based estimates of the focus of attention yields a markedly improved estimate. Overall, this thesis presents, for the first time, a system for automatically tracking the attention of people in a meeting room. The developed approaches and methods can also be used to determine the attention of people in other domains, in particular for controlling computerized, interactive environments. This is shown using an example application.
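
The Bayesian step described in the thesis abstract above (posterior probabilities of attention targets given an observed head orientation) can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes, purely for illustration, a single horizontal head angle, one Gaussian per target, and hand-picked parameters. The function names, target names, means, standard deviations and priors are all hypothetical.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a 1-D normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def focus_posterior(pan_deg, targets, priors):
    """Return P(target | observed head pan angle) for each candidate target.

    targets: dict name -> (mean_pan_deg, std_deg); hypothetical
             class-conditional Gaussians over the head pan angle.
    priors:  dict name -> P(target), covering the same names.
    """
    # Bayes' rule: posterior is proportional to likelihood times prior.
    joint = {
        name: gaussian_pdf(pan_deg, mean, std) * priors[name]
        for name, (mean, std) in targets.items()
    }
    evidence = sum(joint.values())
    return {name: value / evidence for name, value in joint.items()}


# Hypothetical seating: three other participants as seen from one seat.
targets = {"left": (-45.0, 15.0), "center": (0.0, 15.0), "right": (45.0, 15.0)}
priors = {"left": 1 / 3, "center": 1 / 3, "right": 1 / 3}

# Observed head pan of -30 degrees (an assumed measurement).
posterior = focus_posterior(-30.0, targets, priors)
most_likely = max(posterior, key=posterior.get)
print(most_likely, posterior)
```

In this sketch the priors are uniform; replacing them with probabilities derived from who is speaking would be one simple way to combine the two cues the abstract mentions, although the thesis itself reports using neural networks for the speaker-based prediction.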

    Alcohol, assault and licensed premises in inner-city areas

    This report contains eight linked feasibility studies conducted in Cairns during 2010. These exploratory studies examine the complex challenges of compiling and sharing information about incidents of person-to-person violence in a late night entertainment precinct (LNEP). The challenges were methodological as well as logistical and ethical. The studies look at how information can be usefully shared while preserving the confidentiality of those involved. They also examine how information can be compiled from routinely collected sources with little or no additional resources, and then shared by the agencies that are providing and using the information. Although the studies are linked, they are also stand-alone and so can be published in peer-reviewed literature. Some have already been published, are ‘in press’, or have been submitted for review. Others require the NDLERF board’s permission to be published, as they include data related more directly to policing or information provided by police. The studies are incorporated into the document under section headings. In each section, they are introduced and then presented in their final draft form. The final published form of each paper, however, is likely to differ from the draft because of journal and reviewer requirements. The content, results and implications of each study are discussed in summaries included in each section. Funded by the National Drug Law Enforcement Research Fund, an initiative of the National Drug Strategy. Alan R Clough (PhD), School of Public Health, Tropical Medicine and Rehabilitation Sciences, James Cook University. Charmaine S Hayes-Jonkers (BPsy, BSocSci (Hon1)), James Cook University, Cairns. Edward S Pointing (BPsych), James Cook University, Cairns.

    Tracking Gaze and Visual Focus of Attention of People Involved in Social Interaction

    The visual focus of attention (VFOA) has been recognized as a prominent conversational cue. We are interested in estimating and tracking the VFOAs associated with multi-party social interactions. We note that in this type of situation the participants look either at each other or at an object of interest; therefore their eyes are not always visible. Consequently, neither gaze nor VFOA estimation can be based on eye detection and tracking. We propose a method that exploits the correlation between eye gaze and head movements. Both VFOA and gaze are modeled as latent variables in a Bayesian switching state-space model. The proposed formulation leads to a tractable learning procedure and to an efficient algorithm that simultaneously tracks gaze and visual focus. The method is tested and benchmarked using two publicly available datasets that contain typical multi-party human-robot and human-human interactions. Comment: 15 pages, 8 figures, 6 tables.

    Multi-agency training and the artist (Sharing our experience, Practitioner-led research 2008-2009; PLR0809/032)

    The Multi-Agency Team Project approached issues of multi-agency training indirectly by using an artist as a catalyst in a group exercise examining movement and sound in relation to early childhood. The aim of the research was to run an experiential, non-traditional training programme that used an artist as a catalyst to promote inter-agency dialogue in one setting, Woodlands Park Nursery and Children’s Centre, and to analyse the findings. Eleven participants used this common experiential focus to frame collective research, both as a focus group and as individual fieldworkers. The research demonstrated shared professional discourse and also collected judgements relevant to policy issues, based on collaborative professional reflection triggered by the exercise. The findings are presented theoretically in terms of critical discourse analysis using the interpretation-supporting software ATLAS.ti. We next take a further look at the role-play exercise in which the group constituted itself as a ‘House of Commons Select Committee’, before summarizing what theoretical insights might be brought to bear and attempting to draw some provisional conclusions. Some evidence is presented suggesting that there is a degree of tension and ambiguity between alternative models of multi-agency working.

    Digital Dissemination Platform of Transportation Engineering Education Materials Founded in Adoption Research

    INE/AUTC 14.0

    F-formation Detection: Individuating Free-standing Conversational Groups in Images

    Detecting groups of interacting people is a useful task in many modern technologies, with application fields ranging from video surveillance to social robotics. In this paper we first give a rigorous definition of a group, drawing on the social sciences; this allows us to specify many kinds of group that have so far been neglected in the computer vision literature. Building on this taxonomy, we present a detailed review of the state of the art in group detection algorithms. Then, as our main contribution, we present a new method for the automatic detection of groups in still images, based on a graph-cuts framework for clustering individuals. In particular, we are able to codify in a computational sense the sociological definition of an F-formation, which is very useful for encoding a group when only proxemic information is available: the position and orientation of people. We call the proposed method Graph-Cuts for F-formation (GCFF). We show that GCFF outperforms all state-of-the-art methods on several accuracy measures (some of them new), and that it is strongly robust to noise and versatile in recognizing groups of various cardinality. Comment: 32 pages, submitted to PLOS One.