305 research outputs found

    Detecting violent excerpts in movies using audio

    This thesis addresses the problem of automatically detecting violence in movie excerpts, based on audio and video features. A solution to this problem is relevant to a number of applications, including preventing or monitoring children's exposure to violence in existing media, which may help prevent the development of violent behavior. We analyzed and extracted audio and video features directly from each movie excerpt and used them to classify the excerpt as violent or non-violent. To find the best feature set and achieve the best performance, our experiments use two machine learning classifiers: Support Vector Machines (SVM) and Neural Networks (NN). We used a balanced subset of the existing ACCEDE database containing 880 movie excerpts manually tagged as violent or non-violent. During an early experimental stage, using the features originally included in the ACCEDE database, we tested audio features alone, video features alone, and combinations of audio and video features. These results provided our baseline for further experiments using alternate audio features, extracted using available toolkits, and alternate video features, extracted using our own methods. Our most relevant conclusions are as follows: 1) audio features can be easily extracted using existing tools and have a strong impact on system performance; 2) among video features, those related to motion and shot transitions in a scene seem to have a greater impact than those related to color or luminance; 3) the best results are achieved by combining audio and video features. In general, the SVM classifier seems to work better for this problem, although the two classifiers perform similarly on the best feature set.
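The comparison of audio-only, video-only and combined feature sets described in this abstract can be illustrated with a toy Python sketch. Everything here is assumed for illustration: the feature values are invented, accuracy is measured on the same four samples used for fitting, and a nearest-centroid rule stands in for the SVM and NN classifiers used in the thesis so that the example needs no external libraries.

```python
from math import dist

def centroid(vectors):
    """Component-wise mean of equal-length feature vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def fit(samples):
    """samples: list of (feature_vector, label). Returns one centroid per label."""
    by_label = {}
    for features, label in samples:
        by_label.setdefault(label, []).append(features)
    return {label: centroid(vs) for label, vs in by_label.items()}

def predict(model, features):
    """Label of the closest centroid (Euclidean distance)."""
    return min(model, key=lambda label: dist(model[label], features))

def accuracy(model, samples):
    """Fraction of samples whose predicted label matches the true label."""
    return sum(predict(model, f) == y for f, y in samples) / len(samples)

# Hypothetical excerpts: (audio features, video features, label).
excerpts = [
    ([0.9], [0.2], "violent"),
    ([0.2], [0.9], "violent"),
    ([0.1], [0.1], "non-violent"),
    ([0.2], [0.2], "non-violent"),
]

def build(select):
    """Build a dataset whose feature vector is chosen by select(audio, video)."""
    return [(select(a, v), y) for a, v, y in excerpts]

audio_only = build(lambda a, v: a)
video_only = build(lambda a, v: v)
fused = build(lambda a, v: a + v)  # early fusion: concatenate the two vectors

scores = {name: accuracy(fit(data), data)
          for name, data in [("audio", audio_only),
                             ("video", video_only),
                             ("fused", fused)]}
```

In this invented data each modality alone misses one violent excerpt, while the concatenated vector separates both classes. A real experiment would use held-out data and the classifiers named in the abstract.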

    Recent developments in openSMILE, the Munich open-source multimedia feature extractor

    We present recent developments in the openSMILE feature extraction toolkit. Version 2.0 now unites feature extraction paradigms from speech, music, and general sound events with basic video features for multi-modal processing. Descriptors from audio and video can be processed jointly in a single framework allowing for time synchronization of parameters, on-line incremental processing as well as off-line and batch processing, and the extraction of statistical functionals (feature summaries), such as moments, peaks, regression parameters, etc. Postprocessing of the features includes statistical classifiers such as support vector machine models or file export for popular toolkits such as Weka or HTK. Available low-level descriptors include popular speech, music and video features including Mel-frequency and similar cepstral and spectral coefficients, Chroma, CENS, auditory model based loudness, voice quality, local binary pattern, color, and optical flow histograms. Besides, voice activity detection, pitch tracking and face detection are supported. openSMILE is implemented in C++, using standard open source libraries for on-line audio and video input. It is fast, runs on Unix and Windows platforms, and has a modular, component based architecture which makes extensions via plug-ins easy. openSMILE 2.0 is distributed under a research license and can be downloaded from …
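The statistical functionals mentioned in this abstract (moments, peaks, regression parameters) summarise a frame-level descriptor contour as a fixed-length set of values. A minimal Python sketch of the idea follows. It is not openSMILE's own API or configuration, and the function name `functionals` and the sample loudness contour are hypothetical.

```python
from statistics import mean, pstdev

def functionals(contour):
    """Summarise a frame-level contour (one value per frame) as a dict of statistics."""
    n = len(contour)
    if n < 2:
        raise ValueError("need at least two frames")
    mu = mean(contour)
    sigma = pstdev(contour)
    # Interior local maxima: strictly greater than both neighbours.
    peaks = [i for i in range(1, n - 1)
             if contour[i] > contour[i - 1] and contour[i] > contour[i + 1]]
    # Least-squares line fit of value against frame index.
    t_mean = (n - 1) / 2
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (x - mu) for t, x in enumerate(contour))
    slope = sxy / sxx
    return {
        "mean": mu,
        "stddev": sigma,
        "num_peaks": len(peaks),
        "slope": slope,
        "intercept": mu - slope * t_mean,
    }

# Hypothetical loudness values for six consecutive frames.
loudness = [0.1, 0.4, 0.2, 0.5, 0.3, 0.6]
summary = functionals(loudness)
```

Whatever the number of frames, the summary has the same five entries, which is what lets variable-length recordings feed a fixed-input classifier.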

    Virtual Reality Games for Motor Rehabilitation

    This paper presents a fuzzy logic based method to track user satisfaction without the need for devices to monitor users' physiological conditions. User satisfaction is the key to any product’s acceptance; computer applications and video games offer a unique opportunity to provide a tailored environment for each user to better suit their needs. We have implemented a non-adaptive fuzzy logic model of emotion, based on the emotional component of the Fuzzy Logic Adaptive Model of Emotion (FLAME) proposed by El-Nasr, to estimate player emotion in Unreal Tournament 2004. In this paper we describe the implementation of this system and present the results of one of several play tests. Our research contradicts the current literature that suggests physiological measurements are needed. We show that it is possible to use a software-only method to estimate user emotion.

    Experimenting on difference: women, violence, and narrative in Zola's naturalism

    This dissertation examines the role of women in four of Émile Zola's novels, in particular their privileged position as the conduits through which he exerted his "experimental" literary method. Zola has long been recognized as subjecting his female characters to extreme violence, but scholars have not yet thoroughly explored how the ways in which he represents this violence provide insight into the nature of his narrative practice. For Zola, literary fiction offers access to a scientific truth, and the female body and its capacity for procreation is the source material for his investigation. By subjecting his female characters to analysis and ultimately dissection, Zola violently exploits the creative potential of their bodies and builds a literary empire upon them. In La Curée, Zola presents one of his first experimental heroines, a bored and pampered wife whose identity is constructed through reflections and refractions via a series of mirrors, both visually and narratively. This multiplicity of interferences effaces the female voice and subjectivity while exploiting the visual appeal of the female body. Nana offers a counterpoint on the same theme, featuring a woman who, through the desirability of her body, reverses the paradigm and exerts control over those around her with masterful manipulation of optics and language. Nana's body inscrutably defies analysis and playfully disrupts gender constructs by assuming contradictory sexual characteristics that are only indirectly observable. Zola shifts his narrative focus from the women themselves to the broader notion of sexual difference in La Bête humaine, in which the female body signifies the difference that drives male desire and destabilizes civilized society. The representability of sex becomes increasingly problematic as female speech, filtered through the body, puts the reliability of language into question. 
The problematics of the legible body that Zola develops in these texts can be traced all the way back to Thérèse Raquin, in which he conducts a literary investigation into the relationship between bodies and texts. This short novel, Zola's first of the genre, is particularly interested in the different (pro)creative capacities of male and female bodies and the representational possibilities inherent in them

    Beyond “Brutality”: understanding the Italian Filone’s violent excesses

    “Brutality” has long been held up by critics to be one of the defining features of the Italian filoni: a body of popular genre film cycles (peplum mythological epics, horror films, giallo thrillers, poliziotteschi crime dramas, westerns and others) released during a frenzied period of film production between the late 1950s and mid 1980s. A disproportionate emphasis on scenes of often extreme violence and spectacle can be traced across all of the cycles, resulting in a habitual “weakening” of narrative and disruption of the filmic continuities fundamental to mainstream cinema. This emphasis and the uneasy pleasures that it provides have led to a distinct ghettoisation of the filoni within English-language film criticism, with historical accounts of Italian cinema ignoring the films completely, dismissing them as “trash” or portraying them as parasitic counterfeits of “authentic” Hollywood genre films. Furthermore, such accounts typically fail to address the question of what it is that makes these films so violent, limiting their descriptions to blanket terms such as “brutal”, “exploitative” and “sadistic”, in the process reaffirming the idea that the filoni are simply not worthy of further study. As a result, the suggestion that the films could provide pleasures which are distinctly different from those established by mainstream cinema remains largely unaddressed. This thesis seeks to reconcile the gap between my own personal engagement with the films and the lack of attention that has been devoted to them within critical Anglo-American discourses. Drawing on the “paracinematic” approach highlighted by Sconce (1995), I seek to demonstrate that it is precisely in the filoni’s often violent deviations from mainstream cinema’s established continuities where their most remarkable features lie, using Thompson’s (1986) concept of “cinematic excess” to illustrate the films’ overwhelming prioritisation of formal elements that exceed the limits of narrative motivation. 
Using narrative and close textual analysis of a representative body of filoni to identify patterns of violence, spectacle and excess across the films’ structures, I shall also illustrate the benefits of using film theories outwith their original context to shed light on non-mainstream films like the filoni, drawing in particular on the work of musical theorists Altman (1978) and Mellencamp (1977) to identify a “dual focus” in the films between scenes of narrative and more excessive violent “numbers”. Combining my analysis of specific filoni with an examination of representative mainstream films and Anglo-American genre theory, I shall demonstrate that while the regulation of cinematic excess is vital to the narrative pleasures engendered by the latter (suspense, characterisation, drama), in the filoni such pleasures are typically debunked in favour of the more immediate pleasures and curiosities provoked by viewing (and listening to) spectacular and violent acts that threaten the continuities surrounding them. As my analysis chapters will indicate, the filoni are far more productively analysed using theories derived from early cinema: by drawing on Gunning’s (1986) concept of cinematic “attractions” – non-narrative spectacles which exhibit a similar emphasis on the primacy of the image and the pleasures that it provides – I shall illustrate how a central viewing pleasure prioritised by the filoni arises from the frequent revelation of the filmic apparatus during scenes of spectacle and violence, where spatio-temporal continuities are frequently abandoned. 
By going beyond the blanket generalisations of “brutality” that have resulted in the filoni’s habitual marginalisation within film studies, this thesis shall exemplify a long-overdue “closer” approach to the films that seeks to highlight their distinctive features, study their structures and investigate the specific (dis)continuities and (dis)pleasures that they provide, at the same time exploring the possibilities of exactly what is meant by “violence” in cinema

    Doctor of Philosophy

    Influenced by the continued growth of the interdisciplinary field of sound studies, my dissertation examines sounds and soundscapes in several prose works of Western American literature. Literary Soundscapes of the American West examines literary sounds (the collective, but varied, representations of sound, silence, and voice in literature) that represent intimate, affective, and always-changing relationships between people and places in the contemporary American West. I argue that Sherman Alexie, Cormac McCarthy, Terry Tempest Williams, and Charles Bowden use literary sounds to encourage, and potentially activate, what I call an audile mode of attention, which underscores sound as fundamental to people's understanding of place as well as their relationship to space generally. My analysis examines literary sounds that resonate in representations of specific Western locales: a Northwestern metropolis, the Southwestern redrock desert, and the U.S.-Mexican borderlands. Literary sounds do not operate identically in each of my primary texts. In fiction, such as Alexie's Indian Killer and McCarthy's The Crossing, representations of sound occupy an understated and subordinate position in the text. In contrast to these fictional works, Williams' Red and Bowden's Murder City demand that readers attend to sound because it represents local knowledge about pressing ethical concerns. In my analysis of contemporary Western literature, I employ critical regionalism, sound studies, and affect theory and argue that Alexie, McCarthy, Williams, and Bowden produce literary sounds that represent the tensions between various spatial scales (the personal, the local, the regional, and the global) in twentieth- and twenty-first-century Western places. 
By combining the overlapping concerns of these three critical paradigms with my interest in representations of place in contemporary Western American literature, my dissertation evaluates the productive potential of excess in a selected body of literature. The particular excess that I consider here is made up of a relatively immaterial and transient form, sound and, to be more specific, sounds produced in literature. To say that sound, in everyday life or in literature, constitutes excess is not to suggest that it is not necessary to or always already resonant in our interpretations of and experiences with place and space. Rather, I argue that sounds produce excess by activating untapped potential and calling upon readers and listeners to identify in place those contingent truths and realities that escape our notice when we view place as a closed and contained form

    Motion and emotion: Semantic knowledge for Hollywood film indexing

    Ph.D., Doctor of Philosophy

    Effects of errorless learning on the acquisition of velopharyngeal movement control

    Session 1pSC - Speech Communication: Cross-Linguistic Studies of Speech Sound Learning of the Languages of Hong Kong (Poster Session). The implicit motor learning literature suggests a benefit for learning if errors are minimized during practice. This study investigated whether the same principle holds for learning velopharyngeal movement control. Normal-speaking participants learned to produce hypernasal speech in either an errorless learning condition (in which the possibility for errors was limited) or an errorful learning condition (in which the possibility for errors was not limited). The nasality level of the participants’ speech was measured with a nasometer and expressed as nasalance scores (in %). Errorless learners practiced producing hypernasal speech with a threshold nasalance score of 10% at the beginning, which gradually increased to a threshold of 50% at the end. The same set of threshold targets was presented to errorful learners but in reverse order. Errors were defined by the proportion of speech with a nasalance score below the threshold. The results showed that, relative to errorful learners, errorless learners displayed fewer errors (17.7% vs. 50.7%) and a higher mean nasalance score (46.7% vs. 31.3%) during the acquisition phase. Furthermore, errorless learners outperformed errorful learners in both retention and novel transfer tests. Acknowledgment: Supported by The University of Hong Kong Strategic Research Theme for Sciences of Learning. © 2012 Acoustical Society of America
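The error measure described in this abstract, the proportion of speech with a nasalance score below the current threshold, can be sketched in Python. The per-sample scores, the function names and the five-step linear schedule are illustrative assumptions. The abstract states only that the threshold rose from 10% to 50% for errorless learners and was presented in reverse order to errorful learners.

```python
def error_proportion(nasalance_scores, threshold):
    """Fraction of samples whose nasalance score (in %) falls below the threshold."""
    if not nasalance_scores:
        raise ValueError("no nasalance samples")
    below = sum(1 for s in nasalance_scores if s < threshold)
    return below / len(nasalance_scores)

def threshold_schedule(start=10.0, end=50.0, steps=5):
    """Evenly spaced thresholds from start to end (the step count is an assumption)."""
    if steps < 2:
        raise ValueError("need at least two steps")
    width = (end - start) / (steps - 1)
    return [start + i * width for i in range(steps)]

# Errorless learners meet the easiest target first; errorful learners the hardest.
errorless_order = threshold_schedule()
errorful_order = list(reversed(errorless_order))

# Hypothetical nasalance scores (in %) for one practice block.
samples = [8.0, 12.0, 25.0, 31.0, 47.0]
block_error = error_proportion(samples, 30.0)
```

With these invented samples, three of the five scores fall below a 30% threshold, so the block's error proportion is 0.6.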

    Signals and Images in Sea Technologies

    Life below water is the 14th Sustainable Development Goal (SDG) envisaged by the United Nations and is aimed at conserving and sustainably using the oceans, seas, and marine resources for sustainable development. It is not difficult to argue that signals and image technologies may play an essential role in achieving the foreseen targets linked to SDG 14. Besides increasing the general knowledge of ocean health by means of data analysis, methodologies based on signal and image processing can be helpful in environmental monitoring, in protecting and restoring ecosystems, in finding new sensor technologies for green routing and eco-friendly ships, in providing tools for implementing best practices for sustainable fishing, as well as in defining frameworks and intelligent systems for enforcing sea law and making the sea a safer and more secure place. Imaging is also a key element for the exploration of the underwater world for various scopes, ranging from the predictive maintenance of sub-sea pipelines and other infrastructure projects, to the discovery, documentation, and protection of sunken cultural heritage. The scope of this Special Issue encompasses investigations into techniques and ICT approaches and, in particular, the study and application of signal- and image-based methods and, in turn, exploration of the advantages of their application in the previously mentioned areas