
    Robust Modeling of Epistemic Mental States and Their Applications in Assistive Technology

    This dissertation presents the design and implementation of EmoAssist: Emotion-Enabled Assistive Tool to Enhance Dyadic Conversation for the Blind. The key functionalities of the system are to recognize behavioral expressions, to predict 3-D affective dimensions from visual cues, and to provide audio feedback to the visually impaired in a natural environment. Before describing EmoAssist, this dissertation identifies and advances research challenges in the analysis of facial features and their temporal dynamics with Epistemic Mental States in dyadic conversation. A number of statistical analyses and simulations were performed to answer important research questions about the complex interplay between facial features and mental states. It was found that non-linear relations are more prevalent than linear ones. Based on this analysis, a portable prototype of assistive technology was designed to help a blind individual understand his or her interlocutor's mental states. A number of challenges related to the system, its communication protocols, error-free face tracking, and robust modeling of behavioral expressions and affective dimensions were addressed to make EmoAssist effective in real-world scenarios. In addition, orientation-sensor information from the phone was used to correct image alignment, improving robustness in real-life deployment. It was observed that EmoAssist can predict affective dimensions with acceptable accuracy in natural conversation (maximum correlation coefficients of 0.76 for valence, 0.78 for arousal, and 0.76 for dominance). The overall minimum and maximum response times are 64.61 milliseconds and 128.22 milliseconds, respectively. Integrating sensor information to correct orientation yielded a significant improvement (16% on average) in the accuracy of recognizing behavioral expressions. A user study with ten blind people shows that EmoAssist is highly acceptable to them in social interaction (average acceptability rating of 6.0 on a 7-point Likert scale, where 1 and 7 are the lowest and highest possible ratings).
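    The orientation-correction step lends itself to a short illustration. Below is a minimal sketch assuming OpenCV and a roll angle read from the phone's orientation sensor; the function name and calling convention are illustrative assumptions, not the dissertation's actual code.

    import cv2
    import numpy as np

    def orientation_corrected(frame: np.ndarray, roll_deg: float) -> np.ndarray:
        """Rotate the camera frame by the negative of the phone's roll
        angle so the face appears upright to the feature extractor."""
        h, w = frame.shape[:2]
        M = cv2.getRotationMatrix2D((w / 2, h / 2), -roll_deg, 1.0)
        return cv2.warpAffine(frame, M, (w, h))

    # e.g. a frame captured while the phone was tilted 15 degrees:
    # upright = orientation_corrected(frame, roll_deg=15.0)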

    Probabilistic modelling and inference of human behaviour from mobile phone time series

    With an estimated 4.1 billion subscribers around the world, the mobile phone offers a unique opportunity to sense and understand human behaviour from location, co-presence and communication data. While the benefit of modelling this unprecedented amount of data is widely recognised, a number of challenges impede the development of accurate behaviour models. In this thesis, we identify and address two modelling problems and show that their consideration improves the accuracy of behaviour inference. We first examine the modelling of long-range dependencies in human behaviour. Existing human behaviour models take into account only short-range dependencies in mobile phone time series. Using information theory, we quantify long-range dependencies in mobile phone time series for the first time, demonstrate that they exhibit periodic oscillations and introduce novel tools to analyse them. We further show that considering what the user did 24 hours earlier improves accuracy when predicting user behaviour five hours or longer in advance. The second problem that we address is the modelling of temporal variations in human behaviour. The time spent by a user on an activity varies from one day to the next. In order to recognise behaviour patterns despite temporal variations, we establish a methodological connection between human behaviour modelling and biological sequence alignment. This connection allows us to compare, cluster and model behaviour sequences, and to introduce novel features which improve the accuracy of behaviour recognition. The experiments presented in this thesis were conducted on the largest publicly available mobile phone dataset labelled in an unsupervised fashion and are entirely repeatable. Furthermore, our techniques require only cellular data, which can easily be recorded by today's mobile phones and could benefit a wide range of applications including life logging, health monitoring, customer profiling and large-scale surveillance.
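    The long-range dependency analysis can be illustrated with a toy version of the information-theoretic quantity involved: the mutual information between a discrete behaviour label at time t and at time t + lag. This is a minimal sketch assuming hourly behaviour labels; the thesis's actual estimators and data differ.

    from collections import Counter
    import math

    def lagged_mutual_information(seq, lag):
        """I(X_t; X_{t+lag}) in bits for a sequence of discrete labels."""
        pairs = list(zip(seq, seq[lag:]))
        n = len(pairs)
        joint = Counter(pairs)
        px = Counter(x for x, _ in pairs)
        py = Counter(y for _, y in pairs)
        mi = 0.0
        for (x, y), c in joint.items():
            pxy = c / n  # joint probability of the pair (x, y)
            mi += pxy * math.log2(c * n / (px[x] * py[y]))
        return mi

    # A 24-hour rhythm in hourly labels shows up as a peak at lag 24:
    # mis = [lagged_mutual_information(hourly_labels, lag) for lag in range(1, 49)]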

    Soiveillance: Self-Consciousness and the Social Network in Hideaki Anno’s Love & Pop

    This article analyses the surveillance aesthetic of Hideaki Anno’s 1998 film Love & Pop. It is proposed that the film communicates the concept of “soiveillance”—a watching (veillance) of one’s self (soi). What underpins soiveillance is the paranoia associated with social surveillance (Marwick 2012), specifically the self-consciousness involved in the image sharing that constructs the virtual self of the social media user (Willett 2009). With its theme of enjo kosai—or “paid dates” between adult males and female teenagers—Love & Pop’s communication of soiveillance further illuminates the impact of one’s gender status within the social network, and the manner in which real-world patriarchy and misogyny pass into the virtual construction of selves. The methodology used to argue these points rests on a reconfigured take on the term “scopophilia” within the study of visual media. Scopophilia, rethought as a love of vision itself, aligns with Murakami’s (2000) theory of the superflat on three key points: the acknowledgment that emerging technologies have created new image-functions and image-structures that require a broadening of our theoretical vocabulary; an atemporal approach to the reading of images, such that a late-nineties film like Anno’s can provide important insights into 21st-century concerns; and a recognition of intermedial convergence, which allows us to read the activity of online video sharing as a form of narrative equivalent to the sequencing of shots within a cinematic montage.

    Data Privacy & National Security: A Rubik’s Cube of Challenges and Opportunities That Are Inextricably Linked

    Traditionally, issues relating to information privacy have been viewed in a set of distinct, and not always helpful, stovepipes—or, as my former government colleagues often said, tongue-in-cheek, in other contexts—separate “cylinders of excellence.” Thanks to the convergence of technologies and information, the once-separate realms of personal data privacy, consumer protection, and national security are increasingly interconnected. As Congress and national policymakers consider proposals for federal data privacy legislation, regulation of social media platforms, and how to prevent abuses of foreign intelligence and homeland security powers, they should be examining each of these challenges in light of the others, actively looking for synergies and overlap in the protections they may be considering for personal data, individual privacy, and civil liberties.

    Multimodal Silent Speech Interfaces for European Portuguese Based on Articulation

    Joint MAPi Doctorate in Informatics. The concept of silent speech, when applied to Human-Computer Interaction (HCI), describes a system which allows for speech communication in the absence of an acoustic signal. By analyzing data gathered during different parts of the human speech production process, Silent Speech Interfaces (SSI) allow users with speech impairments to communicate with a system. SSI can also be used in the presence of environmental noise, and in situations in which privacy, confidentiality, or non-disturbance are important. Nonetheless, despite recent advances, the performance and usability of silent speech systems still have much room for improvement. Better performance would enable their application in relevant areas, such as Ambient Assisted Living. It is therefore necessary to extend our understanding of the capabilities and limitations of silent speech modalities and to enhance their joint exploration. Thus, in this thesis, we established several goals: (1) expand SSI language support to European Portuguese (EP); (2) overcome identified limitations of current SSI techniques in detecting EP nasality; (3) develop a multimodal HCI approach to SSI based on non-invasive modalities; and (4) explore more direct measures, acquired from more invasive/obtrusive modalities, to serve as ground truth for articulation processes and enhance our comprehension of the other modalities. In order to achieve these goals and to support our research in this area, we created a multimodal SSI framework that fosters leveraging modalities and combining information, supporting research in multimodal SSI. The proposed framework goes beyond the data acquisition process itself, including methods for online and offline synchronization, multimodal data processing, feature extraction, feature selection, analysis, classification and prototyping. Examples of applicability are provided for each stage of the framework. These include articulatory studies for HCI, the development of a multimodal SSI based on less invasive modalities, and the use of ground-truth information from more invasive/obtrusive modalities to overcome the limitations of other modalities. In the work presented here, we also apply existing SSI methods to EP for the first time, noting that nasal sounds may cause inferior performance in some modalities. In this context, we propose a non-invasive solution for the detection of nasality based on a single Surface Electromyography (EMG) sensor, which could conceivably be included in a multimodal SSI.
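    As a rough illustration of the single-sensor idea, the sketch below extracts windowed amplitude features from one surface EMG channel and feeds them to a binary nasal/oral classifier. The window length, feature set and classifier choice are assumptions made for illustration, not the thesis's exact configuration.

    import numpy as np
    from sklearn.svm import SVC

    def emg_window_features(signal: np.ndarray, win: int = 200) -> np.ndarray:
        """Per-window RMS and mean absolute value of a 1-D EMG signal."""
        n = len(signal) // win
        frames = signal[: n * win].reshape(n, win)
        rms = np.sqrt((frames ** 2).mean(axis=1))
        mav = np.abs(frames).mean(axis=1)
        return np.stack([rms, mav], axis=1)

    # y: nasal (1) / oral (0) labels for annotated training windows
    # clf = SVC(kernel="rbf").fit(emg_window_features(train_signal), y)
    # pred = clf.predict(emg_window_features(test_signal))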

    Understanding egocentric human actions with temporal decision forests

    Understanding human actions is a fundamental task in computer vision with a wide range of applications including pervasive health-care, robotics and game control. This thesis focuses on the problem of egocentric action recognition from RGB-D data, wherein the world is viewed through the eyes of the actor whose hands describe the actions. The main contributions of this work are its findings regarding egocentric actions as described by hands in two application scenarios and its proposal of a new technique based on temporal decision forests. The thesis first introduces a novel framework to recognise fingertip writing in mid-air in the context of human-computer interaction. This framework detects whether the user is writing and tracks the fingertip over time to generate spatio-temporal trajectories that are recognised using a Hough forest variant which encourages temporal consistency in prediction. A problem with using such a forest approach for action recognition is that the learning of temporal dynamics is limited to hand-crafted temporal features and temporal regression, which may break temporal continuity and lead to inconsistent predictions. To overcome this limitation, the thesis proposes transition forests. Beyond any temporal information encoded in the feature space, the forest automatically learns the temporal dynamics during training, and this is exploited at inference in an online and efficient manner, achieving state-of-the-art results. The last contribution of this thesis is its introduction of the first RGB-D benchmark that allows the study of egocentric hand-object actions with both hand and object pose annotations. This study conducts an extensive evaluation of different baselines, state-of-the-art approaches and temporal decision forest models using colour, depth and hand pose features. Furthermore, it extends the transition forest model to incorporate data from different modalities and demonstrates the benefit of using hand pose features to recognise egocentric human actions. The thesis concludes by discussing and analysing the contributions and proposing a few ideas for future work.
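    To make the value of learned temporal dynamics concrete, here is a deliberately simplified stand-in (not the thesis's transition forest): per-frame classifier log-posteriors combined with a log-transition matrix over actions via Viterbi decoding, which enforces the kind of temporal consistency that frame-wise prediction alone lacks.

    import numpy as np

    def viterbi_decode(log_post: np.ndarray, log_trans: np.ndarray) -> np.ndarray:
        """log_post: (T, K) per-frame log-posteriors, e.g. from a random
        forest; log_trans: (K, K) log-probabilities between the K actions."""
        T, K = log_post.shape
        score = log_post[0].copy()
        back = np.zeros((T, K), dtype=int)
        for t in range(1, T):
            cand = score[:, None] + log_trans  # cand[i, j]: prev i -> next j
            back[t] = cand.argmax(axis=0)
            score = cand.max(axis=0) + log_post[t]
        path = np.zeros(T, dtype=int)
        path[-1] = score.argmax()
        for t in range(T - 2, -1, -1):  # backtrack the best action sequence
            path[t] = back[t + 1, path[t + 1]]
        return path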