
    Audio-Visual Materials

    published or submitted for publicatio

    Self-Supervised Audio-Visual Co-Segmentation

    Segmenting objects in images and separating sound sources in audio are challenging tasks, in part because traditional approaches require large amounts of labeled data. In this paper we develop a neural network model for visual object segmentation and sound source separation that learns from natural videos through self-supervision. The model is an extension of recently proposed work that maps image pixels to sounds. Here, we introduce a learning approach to disentangle concepts in the neural networks, and assign semantic categories to network feature channels to enable independent image segmentation and sound source separation after audio-visual training on videos. Our evaluations show that the disentangled model outperforms several baselines in semantic segmentation and sound source separation. Comment: Accepted to ICASSP 201

    A Means to an End, A Means to an End, A Means to an End: Repetition through two filmmakers and four films

    Wong Kar-wai is a Hong Kong auteur whose body of work comprises more than 20 short and feature films. His distinctive audio-visual style enhances and complements his films' themes of loss, love, and memory. Wong's films can be compared with those of other filmmakers and examined through many different critical theory lenses. This essay analyzes Wong Kar-wai's films through the lens of psychoanalysis. The focus is on the unofficial Wong Kar-wai trilogy: Days of Being Wild (1990), In the Mood for Love (2000), and 2046 (2004). Ultimately, this essay accompanies Ellipsis, a short film that seeks to recreate the mood and tones of Wong Kar-wai's work, using his audio-visual style as a jumping-off point to think of ways I can develop my own style.

    Contributions of temporal encodings of voicing, voicelessness, fundamental frequency, and amplitude variation to audiovisual and auditory speech perception

    Auditory and audio-visual speech perception was investigated using auditory signals of invariant spectral envelope that temporally encoded the presence of voiced and voiceless excitation, variations in amplitude envelope, and F0. In experiment 1, the contribution of the timing of voicing was compared in consonant identification to the additional effects of variations in F0 and the amplitude of voiced speech. In audio-visual conditions only, amplitude variation slightly increased accuracy globally and for manner features. F0 variation slightly increased overall accuracy and manner perception in auditory and audio-visual conditions. Experiment 2 examined consonant information derived from the presence and amplitude variation of voiceless speech in addition to that from voicing, F0, and voiced speech amplitude. Binary indication of voiceless excitation improved accuracy overall and for voicing and manner. The amplitude variation of voiceless speech produced only a small increment in place of articulation scores. A final experiment examined audio-visual sentence perception using encodings of voiceless excitation and amplitude variation added to a signal representing voicing and F0. There was a contribution of amplitude variation to sentence perception, but not of voiceless excitation. The timing of voiced and voiceless excitation appears to be the major temporal cue to consonant identity. (C) 1999 Acoustical Society of America. [S0001-4966(99)01410-1]

    DEVELOPMENT OF AUDIO-VISUAL LEARNING MEDIA ON COOLING SYSTEMS AS AN EFFORT TO IMPROVE STUDENT LEARNING ACHIEVEMENT AT SMK PERINDUSTRIAN YOGYAKARTA

    This study aims to: (1) produce learning media in the form of computer software that can improve student learning achievement at SMK Perindustrian Yogyakarta; (2) determine whether the quality of the audio-visual media meets the quality criteria for good automotive learning media for vocational high schools (SMK); (3) produce audio-visual learning media on cooling systems that are suitable for implementation as learning media at SMK Perindustrian Yogyakarta. The study uses a research and development (R&D) approach and was conducted in the Automotive Mechanical Engineering Department at SMK Perindustrian Yogyakarta. It consists of developing audio-visual learning media for the cooling systems subject. Data were collected using questionnaires and tests (pretest and posttest). Data were analyzed using qualitative descriptive analysis, and the media were trialled by comparing the pretest and posttest results of two groups, one that used the audio-visual media and one that did not. The result of this study is a set of audio-visual learning media. The developed media were rated very good for use based on feasibility assessments, with total scores of 81.6% from the learning media expert, 85% from the subject matter expert, 90% from the subject teacher, 87.1% in the small-group trial, and 86.7% in the large-group trial. The audio-visual learning media have been tested for their effectiveness in improving learning achievement. From these results it can be concluded that the audio-visual learning media developed are very good for use as a learning aid for the cooling systems subject and are effective in improving student learning achievement.

    The CHORUS gap analysis on user-centered methodology for design and evaluation of multi-media information access systems

    CHORUS is a Coordination Action, a specific type of project funded by the European Commission under its research programmes and intended to bring together research projects with common goals. It operates in the field of search technologies for digital audio-visual content, one of the strategic objectives of the current research framework programme. CHORUS coordinates a number of research projects in the general area of audio-visual and multi-media information access and management. The most important single contribution of the CHORUS work plan will be to provide a survey of the field and a roadmap with a gap analysis for the realisation of viable audio-visual search engines by European partners. This is done by several means: CHORUS organises Think-Tanks with industrial participation, focussed workshops to treat specific questions, and more general conferences for academic discussions. CHORUS is now in its final phase and is preparing its final report, together with a final conference to mark its publication.