
    BVI-VFI: A Video Quality Database for Video Frame Interpolation

    Video frame interpolation (VFI) is a fundamental research topic in video processing, which is currently attracting increased attention across the research community. While the development of more advanced VFI algorithms has been extensively researched, there remains little understanding of how humans perceive the quality of interpolated content and how well existing objective quality assessment methods perform when measuring the perceived quality. In order to narrow this research gap, we have developed a new video quality database named BVI-VFI, which contains 540 distorted sequences generated by applying five commonly used VFI algorithms to 36 diverse source videos with various spatial resolutions and frame rates. We collected more than 10,800 quality ratings for these videos through a large-scale subjective study involving 189 human subjects. Based on the collected subjective scores, we further analysed the influence of VFI algorithms and frame rates on the perceptual quality of interpolated videos. Moreover, we benchmarked the performance of 33 classic and state-of-the-art objective image/video quality metrics on the new database, and demonstrated the urgent requirement for more accurate bespoke quality assessment methods for VFI. To facilitate further research in this area, we have made BVI-VFI publicly available at https://github.com/danier97/BVI-VFI-database

    Single View Modeling and View Synthesis

    This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is always a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually make use of a camera array, which suffers from tedious setup and calibration processes, as well as lack of portability, limiting its application to lab experiments. In this thesis, I try to produce 3D content using a single camera, making it as simple as shooting pictures. It requires a new front-end capturing device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture the highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences and achieves 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can only capture part of a whole object at any instant, partial surfaces are assembled together to form a complete 3D model by a novel warping algorithm. Inspired by the success of single view 3D modeling, I extended my exploration into 2D-3D video conversion that does not utilize a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos, via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much depth-inferring work as possible from the user to the computer. I developed two new methods that analyze the optical flow in order to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist with user labeling work.
In this thesis, I developed new algorithms to produce 3D content from a single camera. Depending on the input data, my algorithm can build high fidelity 3D models for dynamic and deformable objects if depth maps are provided. Otherwise, it can turn the video clips into stereoscopic video.

    Affect-based indexing and retrieval of multimedia data

    Digital multimedia systems are creating many new opportunities for rapid access to content archives. In order to explore these collections using search, the content must be annotated with significant features. An important and often overlooked aspect of human interpretation of multimedia data is the affective dimension. The hypothesis of this thesis is that affective labels of content can be extracted automatically from within multimedia data streams, and that these can then be used for content-based retrieval and browsing. A novel system is presented for extracting affective features from video content and mapping it onto a set of keywords with predetermined emotional interpretations. These labels are then used to demonstrate affect-based retrieval on a range of feature films. Because of the subjective nature of the words people use to describe emotions, an approach towards an open vocabulary query system utilizing the electronic lexical database WordNet is also presented. This gives flexibility for search queries to be extended to include keywords without predetermined emotional interpretations using a word-similarity measure. The thesis presents the framework and design for the affect-based indexing and retrieval system along with experiments, analysis, and conclusions.

    The Glitzy Glamour Glitter Girls: Drag Queens, Visual Ethnography and the Ciné Photo-essay

    The central argument of the project The Glitzy Glamour Glitter Girls: Drag Queens, Visual Ethnography and the Ciné Photo-essay is that visual and cinematic essays created by artists can operate as a form of visual ethnography. The project, therefore, broadens social understandings of subcultures. The aim is to photograph and film a specific group of drag queens, following their lives and ideas over a 20-year period. The traditional approach of anthropology includes a historical distrust of the visual as scientific data. Within this research project, the practice and boundaries of visual ethnography will be mapped in the friction and fissures created from the intersection of art and social science in practice-led research. The research expands the classifications within the practice and theory of visual ethnography from the distinct genres of photography and film in visual research to include a new hybrid visual form that I designate as ciné moments and the ciné photo-essay. The project uses the archival material of a black-and-white photographic essay captured by the photographer turned filmmaker/researcher as a catalyst to create a new body of work using still and moving images. The research project rememorialises photographs taken in The Laneway, which is actually a series of laneways off Hill Street in Surry Hills in Sydney, Australia, during public street parties after gay and lesbian events such as Mardi Gras and Sleaze Ball. The original photographs depict three drag queens, known as the Glitzy Glamour Glitter Girls, and street culture that has disappeared. The photo-essay is an under-theorised subject area within the trajectory of documentary history, according to theorists Timothy Corrigan in The Essay Film: From Montaigne, After Marker (2011) and Philippe Mather in Stanley Kubrick at Look Magazine: Authorship and Genre in Photojournalism and Film (2013).
The photo-essay, a series of still photographs that creates a narrative or statement, has been historically tied to print media and photojournalism. The demise of traditional print outlets in the media and the proliferation of online slideshows through the internet have created new horizons for the photo-essay to expand. This project will explore the still and moving essayistic in visual media to find the gaps and overlaps between the traditional and experimental aspects of visual ethnography. Lens-based visual artists and anthropologists Sarah Pink, Trinh T. Minh Ha and Anna Grimshaw expand the margins of the visual within the trajectory of ethnography and methodology. The Glitzy Glamour Glitter Girls: Drag Queens, Visual Ethnography and the Ciné Photo-essay weaves still and moving images into a hybrid form that lifts the mask off a subculture of drag queens in Sydney. The PhD thesis comprises an installed exhibition of still and moving images as a film projection (in an art gallery space) and an exegetical document of 50,000 words that will explicate the methodological process and disciplinary context of the research

    Leaf v. Nike
