
    Learning midlevel image features for natural scene and texture classification

    This paper deals with coding of natural scenes in order to extract semantic information. We present a new scheme to project natural scenes onto a basis in which each dimension encodes statistically independent information. Basis extraction is performed by independent component analysis (ICA) applied to image patches culled from natural scenes. The study of the resulting coding units (coding filters) extracted from well-chosen categories of images shows that they adapt and respond selectively to discriminant features in natural scenes. Given this basis, we define global and local image signatures relying on the maximal activity of filters on the input image. Locally, the construction of the signature takes into account the spatial distribution of the maximal responses within the image. We propose a criterion to reduce the size of the representation space for faster computation. The proposed approach is tested in the context of texture classification (111 classes) as well as natural scene classification (11 categories, 2,037 images). Using a common protocol, other commonly used descriptors achieve at most 47.7% accuracy on average, while our method reaches up to 63.8%. We show that this advantage does not depend on the size of the signature, and we demonstrate the efficiency of the proposed criterion to select ICA filters and reduce the dimensionality of the representation.
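    The pipeline the abstract describes (ICA filters learned from image patches, then a global signature from maximal filter activity) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: symmetric FastICA with a tanh nonlinearity stands in for whatever ICA variant the authors used, and the random patches, patch size, and component count are placeholders.

```python
import numpy as np

def fast_ica_filters(patches, n_components, n_iter=200, seed=0):
    """Learn ICA coding filters from flattened image patches
    (minimal symmetric FastICA with a tanh nonlinearity)."""
    rng = np.random.default_rng(seed)
    X = patches - patches.mean(axis=0)
    # Whiten: project onto the leading covariance eigenvectors, unit variance.
    vals, vecs = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(vals)[::-1][:n_components]
    K = vecs[:, order] / np.sqrt(vals[order])       # whitening matrix (d, k)
    Z = X @ K                                       # whitened patches (n, k)
    W = rng.normal(size=(n_components, n_components))
    for _ in range(n_iter):
        G = np.tanh(Z @ W.T)
        W_new = G.T @ Z / len(Z) - np.diag((1 - G**2).mean(axis=0)) @ W
        U, _, Vt = np.linalg.svd(W_new)             # symmetric decorrelation
        W = U @ Vt
    return W @ K.T                                  # filters in pixel space (k, d)

def global_signature(image_patches, filters):
    """Global signature: maximal absolute filter response over an image's patches."""
    return np.abs(image_patches @ filters.T).max(axis=0)

# Placeholder data: 500 random 8x8 patches standing in for natural-scene patches.
patches = np.random.default_rng(1).normal(size=(500, 64))
filters = fast_ica_filters(patches, n_components=16)
sig = global_signature(patches[:50], filters)       # 16-dim signature
```

    On real natural-scene patches (rather than this Gaussian placeholder) the learned rows of `filters` would be the localized, oriented coding filters the abstract refers to.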

    Multi-sensor human action recognition with particular application to tennis event-based indexing

    The ability to automatically classify human actions and activities using visual sensors or by analysing body-worn sensor data has been an active research area for many years. Only recently, with advancements in both fields and the ubiquitous nature of low-cost sensors in our everyday lives, has automatic human action recognition become a reality. While traditional sports coaching systems rely on manual indexing of events from a single modality, such as visual or inertial sensors, this thesis investigates the possibility of capturing and automatically indexing events from multimodal sensor streams. In this work, we detail a novel approach to infer human actions by fusing multimodal sensors to improve recognition accuracy. State-of-the-art visual action recognition approaches are also investigated. Firstly, we apply these action recognition detectors to basic human actions in a non-sporting context. We then perform action recognition to infer tennis events in a tennis court instrumented with cameras and inertial sensing infrastructure. The system proposed in this thesis can use either visual or inertial sensors to automatically recognise the main tennis events during play. A complete event retrieval system is also presented to allow coaches to build advanced queries, which existing sports coaching solutions cannot facilitate, without an inordinate amount of manual indexing. The event retrieval interface is evaluated against a leading commercial sports coaching tool in terms of both usability and efficiency.

    Using visual lifelogs to automatically characterise everyday activities

    Visual lifelogging is the term used to describe recording our everyday lives using wearable cameras, for applications which are personal to us and do not involve sharing our recorded data. Current applications of visual lifelogging are built around remembrance or searching for specific events from the past. The purpose of the work reported here is to extend this to allow us to characterise and measure the occurrence of everyday activities of the wearer, and in so doing to gain insights into the wearer's everyday behaviour. Our method is to capture everyday activities using a wearable camera called SenseCam, and to use an algorithm we have developed which indexes lifelog images by the occurrence of basic semantic concepts. We then use data reduction techniques to automatically generate a profile of the wearer's everyday behaviour and activities. Our algorithm has been evaluated on a large set of concepts investigated in a user experiment with 13 users, and for a group of 16 popular everyday activities we achieve an average F-score of 0.90. We conclude that the technique we have presented for unobtrusively and ambiently characterising everyday behaviour and activities across individuals is of sufficient accuracy to be usable in a range of applications.
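    As a rough sketch of the profiling step described above, a behaviour profile can be computed as the relative frequency of each detected semantic concept across a lifelog. The concept names and detections below are invented for illustration; they are not SenseCam's actual concept vocabulary or the authors' data-reduction technique.

```python
from collections import Counter

# Hypothetical per-image concept detections from one lifelog day.
detections = [
    {"indoor", "screen", "hands"},
    {"indoor", "food", "people"},
    {"outdoor", "road", "vehicle"},
    {"indoor", "screen", "hands"},
]

def behaviour_profile(image_concepts):
    """Profile = fraction of lifelog images in which each concept occurs."""
    counts = Counter(c for img in image_concepts for c in img)
    total = len(image_concepts)
    return {c: n / total for c, n in counts.items()}

profile = behaviour_profile(detections)  # e.g. profile["indoor"] == 0.75
```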

    Multimodal Affect Recognition: Current Approaches and Challenges

    Many factors render multimodal affect recognition approaches appealing. First, humans employ a multimodal approach in emotion recognition. It is only fitting that machines, which attempt to reproduce elements of human emotional intelligence, employ the same approach. Second, the combination of multiple affective signals not only provides a richer collection of data but also helps alleviate the effects of uncertainty in the raw signals. Lastly, the multimodal approach potentially affords us the flexibility to classify emotions even when one or more source signals cannot be retrieved. However, the multimodal approach presents challenges pertaining to the fusion of individual signals, the dimensionality of the feature space, and the incompatibility of the collected signals in terms of time resolution and format. In this chapter, we explore the aforementioned challenges while presenting the latest scholarship on the topic. Hence, we first discuss the various modalities used in affect classification. Second, we explore the fusion of modalities. Third, we present publicly accessible multimodal datasets designed to expedite work on the topic by eliminating the laborious task of dataset collection. Fourth, we analyze representative works on the topic. Finally, we summarize the current challenges in the field and provide ideas for future research directions.
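    One common fusion strategy consistent with the points above is decision-level (late) fusion, which also handles the missing-modality case the chapter mentions. This is a generic sketch, not a specific system from the survey; the modality names, probability values, and weights are invented.

```python
import numpy as np

def late_fusion(probs, weights=None):
    """Decision-level fusion: weighted average of per-modality class
    probabilities. Entries of `probs` may be None when that modality's
    signal could not be retrieved."""
    n = len(probs)
    weights = np.ones(n) if weights is None else np.asarray(weights, float)
    acc, total = None, 0.0
    for p, w in zip(probs, weights):
        if p is None:          # modality unavailable: skip it entirely
            continue
        p = np.asarray(p, float)
        acc = w * p if acc is None else acc + w * p
        total += w
    fused = acc / total        # renormalize over available modalities
    return fused, int(np.argmax(fused))

# Hypothetical classes: [joy, anger, neutral]. Voice signal is missing.
face    = [0.7, 0.2, 0.1]
voice   = None
posture = [0.3, 0.2, 0.5]
fused, label = late_fusion([face, voice, posture], weights=[2.0, 1.0, 1.0])
```

    Because the weights are renormalized over the modalities actually present, the classifier still produces a valid probability vector when a signal drops out.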

    ZATLAB: recognizing gestures for artistic performance interaction

    Most artistic performances rely on human gestures, ultimately resulting in an elaborate interaction between the performer and the audience. Humans, even without any formal background in music, dance or gesture analysis, are typically able to extract, almost unconsciously, a great amount of relevant information from a gesture. In fact, a gesture contains so much information; why not use it to further enhance a performance? Gestures and expressive communication are intrinsically connected, and, being intimately attached to our own daily existence, both have a central position in our technological society. However, the use of technology to understand gestures is still only vaguely explored: it has moved beyond its first steps, but the way towards systems fully capable of analyzing gestures is still long and difficult (Volpe, 2005). This is probably because, if on one hand the recognition of gestures is a somewhat trivial task for humans, on the other hand the endeavor of translating gestures to the virtual world with a digital encoding is a difficult and ill-defined task. It is necessary to bridge this gap, stimulating a constructive interaction between gestures and technology, culture and science, performance and communication, thus opening new and unexplored frontiers in the design of a novel generation of multimodal interactive systems. This work proposes an interactive, real-time gesture recognition framework called the Zatlab System (ZtS). This framework is flexible and extensible; thus, it is in permanent evolution, keeping up with the different technologies and algorithms that emerge at a fast pace nowadays. The basis of the proposed approach is to partition a temporal stream of captured movement into perceptually motivated descriptive features and transmit them for further processing by machine learning algorithms. The framework takes the view that perception primarily depends on previous knowledge or learning.
Just like humans do, the framework has to learn gestures and their main features so that it can later identify them. It is, however, designed to be flexible enough to allow learning gestures on the fly. This dissertation also presents a qualitative and quantitative experimental validation of the framework. The qualitative analysis provides results concerning users' acceptance of the framework; the quantitative validation provides results on the gesture recognition algorithms. The use of machine learning algorithms in these tasks achieves final results that match or outperform typical and state-of-the-art systems. In addition, two artistic implementations of the framework are presented, assessing its usability within the artistic performance domain. Although a specific implementation of the proposed framework is presented in this dissertation and made available as open-source software, the proposed approach is flexible enough to be used in other scenarios, paving the way to applications that can benefit not only the performative arts domain but also, probably in the near future, other types of communication, such as the sign language used by the hearing impaired.
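The partitioning step the abstract describes (a temporal movement stream cut into windows of perceptually motivated descriptive features, ready for machine learning) can be sketched as follows. The window size, hop, and the particular feature set (mean speed, peak speed, jerkiness) are illustrative assumptions, not the Zatlab System's actual features.

```python
import numpy as np

def movement_features(stream, win=30, hop=15):
    """Partition a 1-D movement stream into overlapping windows and
    describe each window with simple descriptive features:
    mean speed, peak speed, and mean absolute acceleration."""
    feats = []
    for start in range(0, len(stream) - win + 1, hop):
        seg = stream[start:start + win]
        vel = np.diff(seg)          # first difference ~ velocity
        acc = np.diff(vel)          # second difference ~ acceleration
        feats.append([np.abs(vel).mean(), np.abs(vel).max(), np.abs(acc).mean()])
    return np.array(feats)

# Placeholder capture: a smooth gesture trajectory with a little sensor noise.
rng = np.random.default_rng(0)
stream = np.sin(np.linspace(0, 4 * np.pi, 300)) + 0.01 * rng.normal(size=300)
X = movement_features(stream)       # one feature vector per window
```

Each row of `X` is one window's feature vector, which a classifier can then be trained on and later used to identify gestures.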

    FRBR, Facets, and Moving Images: A Literature Review

    Annotated bibliography on resources related to FRBR, facets, and moving images.

    ECLAP 2012 Conference on Information Technologies for Performing Arts, Media Access and Entertainment

    Information Technology innovation has a long history within the Cultural Heritage field. The performing arts have likewise been enriched by a number of innovations which unveil a range of synergies and possibilities. Most of the technologies and innovations produced for digital libraries, media entertainment and education can be exploited in the field of performing arts, with adaptation and repurposing. Performing arts offer many interesting challenges and opportunities for research, innovation and the exploitation of cutting-edge research results from interdisciplinary areas. For these reasons, ECLAP 2012 can be regarded as a continuation of past conferences such as AXMEDIS and WEDELMUSIC (both published by IEEE and FUP). ECLAP is a European Commission project to create a social network and media access service for performing arts institutions in Europe and to build the e-library of performing arts, exploiting innovative solutions coming from ICT.

    Moving sounds and sonic moves: exploring interaction quality of embodied music mediation technologies through a user-centered perspective

    This research project deals with the user experience of embodied music mediation technologies. More specifically, it considers the adoption and policy problems surrounding new media (art) which arise from the usability issues that to date pervade new interfaces for musical expression. Since the emergence of new wireless mediators and control devices for musical expression, the creative industries and various research centers have explicitly aspired to embed such technologies into different areas of the cultural industries. The number of applications and their uses has increased exponentially over the last decade. Conversely, many applications to date still suffer from severe usability problems, which not only hinder adoption by the cultural sector, but also make culture participants take a cautious, hesitant, or even downright negative stance towards these technologies. This thesis therefore takes a vantage point that is partly sociological in nature, yet also linked to cultural studies. It combines this with a musicological frame of reference, to which it introduces empirical user-oriented approaches predominantly taken from the field of human-computer interaction. This interdisciplinary strategy is adopted to cope with the complex nature of digital embodied music controlling technologies. Within the Flanders cultural (and creative) industries, opportunities for systems affiliated with embodied interaction are created and examined.
    This constitutes an epistemological jigsaw that looks into 1) which stakeholders require what levels of involvement, which interactive means, and which artistic possibilities; 2) how the artistic aspirations, cultural prerequisites and operational necessities of (prospective) users can be defined; 3) how functional, artistic and aesthetic requirements can be accommodated; and 4) how quality of use and quality of experience can be achieved, quantified, evaluated and, eventually, improved. Within this multifaceted problem, the eventual aim is to assess the applicability of the aforesaid technology on both a theoretically and empirically sound basis, and to facilitate widening and enhancing the adoption of said technologies. Methodologically, this is achieved by 1) applied experimentation, 2) interview techniques, 3) self-reporting and survey research, 4) usability evaluation of existing devices, and 5) human-computer interaction methods applied, and attuned, to the specific case of embodied music mediation technologies. Within that scope, concepts related to usability, flow, presence, goal assessment and game enjoyment are scrutinized and applied, and both task- and experience-oriented heuristics and metrics are developed and tested. The first part of the thesis, covering three chapters, gives its general context. The first chapter offers an introduction to the topic and enumerates the current problems. The second chapter presents a broader theoretical background of the concepts that underpin the project, namely 1) the paradigm of embodiment and its connection to musicology, 2) the state of the art in new interfaces for musical expression, 3) an introduction to HCI usability and its application domain in systematic musicology, 4) an insight into user-centered digital design procedures, and 5) the challenges brought about by e-culture and digitization for the cultural-creative industries.
    The third chapter discusses the state of the art of the methodologies relevant to the thesis' endeavor, enumerates a set of literature-based design guidelines, and from these deduces a conceptual model that is gradually presented throughout the thesis and fully deployed in the "SoundField" project (as described in Chapter 9). The following chapters, contained in the second part of the thesis, give a quasi-chronological overview of how methodological concepts have been applied throughout the empirical case studies, aimed specifically at exploring the various aspects of the complex status quaestionis. In the fourth chapter, a series of application-based tests, predominantly revolving around interface evaluation, illustrate the complex relation between gestural interfaces and meaningful musical expression, advocating a more user-centered development approach. The fifth chapter discusses a multi-purpose questionnaire dubbed "What Moves You", which aimed at surveying the (prospective) end-users of embodied music mediation technologies; it primarily focused on cultural background, musical profile and preferences, views on embodied interaction, literacy of and attitudes towards new technology, and participation in digital culture. The sixth chapter discusses the ethnographic studies that accompanied the exhibition of two interactive art pieces, entitled "Heart as an Ocean" and "Lament"; these studies probe the use of interview and questionnaire methodologies alongside the presentation and reception of interactive art pieces. The seventh chapter presents the development of the collaboratively controlled music game "Sync-In-Team", in which interface evaluation, presence, game enjoyment and goal assessment are the pivotal topics.
    The eighth chapter considers two usability studies conducted on prototype systems/interfaces: a heuristic evaluation of the "Virtual String" and a usability-metrics evaluation of the "Multi-Level Sonification Tool". The findings of these two studies, in conjunction with the exploratory studies performed in association with the interactive art pieces, finally gave rise to the "SoundField" project, which is recounted in full in the ninth chapter. The integrated participatory design and evaluation method presented in the conceptual model is fully applied over the course of the "SoundField" project, in which technological opportunities, ecological validity and applicability are investigated through user-informed development of numerous use cases. The third and last part of the thesis renders the final conclusions of the research project. The tenth chapter sets out with an epilogue giving a brief overview of how the state of the art has evolved since the end of the project (the research ended in 2012, but the field has obviously moved on), and attempts to consolidate the implications of the research studies with some of the realities of the Flemish cultural-creative industries. Chapter eleven continues by discussing the strengths and weaknesses of the conceptual model throughout the various stages of the project. It also evaluates the hypotheses, how the assumptions that were made held up, and how the research questions could eventually be assessed. Finally, the twelfth and last chapter concludes with the most important findings of the project and discusses some of the implications for cultural production and artistic research policy, offering an outlook on future research beyond the scope of the "SoundField" project.

    Toward an Emotional Individual Motor Signature

    Bodily expression of felt emotion has been documented in the literature. However, it is often associated with high motor variability between individuals. This study aimed to identify an individual motor signature (IMS) of emotions. IMS is a new method of motion analysis and visualization able to capture the subtle differences in the way each of us moves, seen as a kinematic fingerprint. We hypothesized that the individual motor signature would differ depending on the induced emotional state and that an emotional motor signature of joy and sadness common to all participants would emerge. For that purpose, we elicited these emotions (joy, sadness, and a neutral control state) in 26 individuals using an autobiographical memory paradigm, before they performed a motor improvisation task (the mirror game). We extracted the individual motor signature under each emotional condition. Participants completed a self-report emotion measure before and after each trial. Comparing the similarity indexes of intra- and inter-emotional-condition signatures, we confirmed our hypothesis and showed the existence of a specific motor signature for joy and sadness, allowing us to introduce the notion of an emotional individual motor signature (EIMS). Our study indicates that EIMS can reinforce emotion discrimination and constitutes a first step in modeling emotional behavior during individual task performances or social interactions.
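    The IMS approach builds on velocity statistics of improvised motion. As a loose illustration only (the speed-histogram signature, cosine similarity index, and synthetic trajectories below are simplifying assumptions, not the study's actual pipeline), a toy signature and similarity comparison might look like:

```python
import numpy as np

def velocity_signature(positions, dt=0.01, bins=20, vmax=4.0):
    """Toy motor signature: normalized histogram of instantaneous speeds."""
    speeds = np.abs(np.diff(positions)) / dt
    hist, _ = np.histogram(speeds, bins=bins, range=(0.0, vmax))
    return hist / hist.sum()

def similarity_index(sig_a, sig_b):
    """Similarity between two signatures as cosine similarity (1.0 = identical)."""
    return float(sig_a @ sig_b /
                 (np.linalg.norm(sig_a) * np.linalg.norm(sig_b)))

t = np.linspace(0.0, 2 * np.pi, 500)
slow = np.sin(t)        # smooth, low-velocity trajectory ("sadness-like")
fast = np.sin(3 * t)    # brisk, high-velocity trajectory ("joy-like")

sig_slow = velocity_signature(slow)
sig_fast = velocity_signature(fast)
```

    Comparing intra- versus inter-condition similarity indexes, as the study does, then amounts to checking whether signatures recorded under the same emotion score consistently higher than those recorded under different emotions.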