City-Identification of Flickr Videos Using Semantic Acoustic Features
City-identification of videos aims to determine the likelihood of a video
belonging to a set of cities. In this paper, we present an approach using only
audio; thus, we do not use any additional modality such as images, user-tags or
geo-tags. In this manner, we show to what extent the city-location of videos
correlates to their acoustic information. Success in this task suggests
improvements can be made to complement the other modalities. In particular, we
present a method to compute and use semantic acoustic features to perform
city-identification and the features show semantic evidence of the
identification. The semantic evidence is given by a taxonomy of urban sounds
and expresses the potential presence of these sounds in the city-soundtracks.
We used the MediaEval Placing Task set, which contains Flickr videos labeled by
city. In addition, we used the UrbanSound8K set containing audio clips labeled
by sound-type. Our method improved the state-of-the-art performance and
provides a novel semantic approach to this task.
“Come you spirits unsex me!”: representations of the female executive in recent French film & fiction
This article analyses the representation of female executives in a corpus of French films and novels produced from 2000 on. The corpus includes a mixture of male and female directors and novelists, all of whom adopt broadly centre-left or left-wing positions that are highly critical of contemporary forms of globalised, neo-liberal capitalism. Yet each of these directors and novelists depicts powerful female executives in highly conservative terms, figuring them as ‘unsexed’ beings who have turned their backs on their ‘natural’ destinies as wives and mothers. Further, these films and novels all imply that neo-liberal capitalism could be defeated if women were just to return to their traditional roles as wives and mothers and if the patriarchal nuclear family could once again perform its proper role as the foundation of community and national integrity. The corpus thus offers depictions of a range of powerful women who are, alternately, punished, pitied, or tamed, this being the price that must apparently be paid if French national integrity is to be preserved from what are figured as the inherently foreign forces of globalised capitalism. Having offered an inventory of these deeply conservative tropes, the article concludes by suggesting some possible reasons for their dispiriting recurrence.
Experiments on the DCASE Challenge 2016: Acoustic Scene Classification and Sound Event Detection in Real Life Recording
In this paper we present our work on Task 1 Acoustic Scene Classification
and Task 3 Sound Event Detection in Real Life Recordings. Among our experiments
we have low-level and high-level features, classifier optimization and other
heuristics specific to each task. Our performance for both tasks improved on the
baseline from DCASE: for Task 1 we achieved an overall accuracy of 78.9%
compared to the baseline of 72.6%, and for Task 3 we achieved a Segment-Based
Error Rate of 0.76 compared to the baseline of 0.91.
AudioPairBank: Towards A Large-Scale Tag-Pair-Based Audio Content Analysis
Recently, sound recognition has been used to identify sounds, such as car and
river. However, sounds have nuances that may be better described by
adjective-noun pairs such as slow car, and verb-noun pairs such as flying
insects, which are underexplored. Therefore, in this work we investigate the
relation between audio content and both adjective-noun pairs and verb-noun
pairs. Due to the lack of datasets with these kinds of annotations, we
collected and processed the AudioPairBank corpus consisting of a combined total
of 1,123 pairs and over 33,000 audio files. One contribution is the previously
unavailable documentation of the challenges and implications of collecting
audio recordings with these types of labels. A second contribution is to show
the degree of correlation between the audio content and the labels through
sound recognition experiments, which yielded results of 70% accuracy, hence
also providing a performance benchmark. The results and study in this paper
encourage further exploration of the nuances in audio and are meant to
complement similar research performed on images and text in multimedia
analysis.
Comment: This paper is a revised version of "AudioSentibank: Large-scale
Semantic Ontology of Acoustic Concepts for Audio Content Analysis"
