    Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields

    This work presents a first evaluation of spatio-temporal receptive fields from a recently proposed time-causal spatio-temporal scale-space framework as primitives for video analysis. We propose a new family of video descriptors based on regional statistics of spatio-temporal receptive field responses and evaluate this approach on the problem of dynamic texture recognition. Our approach generalises a previously used method, based on joint histograms of receptive field responses, from the spatial to the spatio-temporal domain and from object recognition to dynamic texture recognition. The time-recursive formulation enables computationally efficient time-causal recognition. The experimental evaluation demonstrates competitive performance compared to the state of the art. In particular, binary versions of our dynamic texture descriptors achieve improved performance compared to a large range of similar methods using different primitives, either handcrafted or learned from data. Further, our qualitative and quantitative investigation into parameter choices and the use of different sets of receptive fields highlights the robustness and flexibility of our approach. Together, these results support the descriptive power of this family of time-causal spatio-temporal receptive fields, validate our approach for dynamic texture recognition, and point towards the possibility of designing a range of video analysis methods based on these new time-causal spatio-temporal primitives.
    Comment: 29 pages, 16 figures
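
    As a rough illustration of the descriptor construction summarised above (not the paper's exact formulation), the sketch below filters a grayscale video with a small bank of Gaussian derivative operators standing in for the time-causal receptive fields, then summarises the responses as a normalised joint histogram. The filter set, bin count and quantile-based quantisation are assumptions for illustration only; the paper's time-causal kernels differ from the non-causal Gaussian derivatives used here.

    ```python
    import numpy as np
    from scipy import ndimage

    def receptive_field_responses(video, sigma=2.0):
        """Apply a small bank of Gaussian derivative filters, a stand-in
        for the paper's time-causal spatio-temporal receptive fields.
        video: (T, H, W) grayscale array."""
        orders = [(0, 1, 0), (0, 0, 1), (1, 0, 0)]  # d/dy, d/dx, d/dt
        responses = [ndimage.gaussian_filter(video, sigma=sigma, order=o)
                     for o in orders]
        return np.stack(responses, axis=-1)  # (T, H, W, n_filters)

    def joint_histogram_descriptor(responses, n_bins=4):
        """Quantise each filter response into n_bins levels and build a
        joint histogram over the combined quantisation codes."""
        flat = responses.reshape(-1, responses.shape[-1])
        codes = np.zeros(len(flat), dtype=np.int64)
        for f in range(flat.shape[1]):
            edges = np.quantile(flat[:, f], np.linspace(0, 1, n_bins + 1)[1:-1])
            codes = codes * n_bins + np.digitize(flat[:, f], edges)
        hist = np.bincount(codes, minlength=n_bins ** flat.shape[1]).astype(float)
        return hist / hist.sum()  # normalised joint histogram descriptor
    ```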

    FastCLIPstyler: Optimisation-free Text-based Image Style Transfer Using Style Representations

    In recent years, language-driven artistic style transfer has emerged as a new type of style transfer technique, eliminating the need for a reference style image by using natural language descriptions of the style. The first model to achieve this, called CLIPstyler, has demonstrated impressive stylisation results. However, its lengthy optimisation procedure at runtime for each query limits its suitability for many practical applications. In this work, we present FastCLIPstyler, a generalised text-based image style transfer model capable of stylising images in a single forward pass for arbitrary text inputs. Furthermore, we introduce EdgeCLIPstyler, a lightweight model designed for compatibility with resource-constrained devices. Through quantitative and qualitative comparisons with state-of-the-art approaches, we demonstrate that our models achieve superior stylisation quality on measurable metrics while offering significantly improved runtime efficiency, particularly on edge devices.
    Comment: Accepted at the 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024)
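
    The abstract gives no architectural details, but the general pattern it describes is conditioning a feed-forward stylisation network on a CLIP text embedding so that no per-query optimisation loop is needed. One might, for example, predict AdaIN-style modulation parameters from the text embedding, as sketched below; the TextToStyle head, its dimensions and the modulation scheme are illustrative assumptions, not the published FastCLIPstyler architecture.

    ```python
    import torch
    import torch.nn as nn
    import clip  # OpenAI CLIP package; assumed installed

    class TextToStyle(nn.Module):
        """Hypothetical style-prediction head: maps a CLIP text embedding
        to per-channel scale/shift parameters that could modulate a
        feed-forward stylisation decoder (AdaIN-style)."""
        def __init__(self, embed_dim=512, n_channels=256):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(embed_dim, 512),
                nn.ReLU(),
                nn.Linear(512, 2 * n_channels),
            )

        def forward(self, text_embedding):
            gamma, beta = self.mlp(text_embedding).chunk(2, dim=-1)
            return gamma, beta

    # Encode the style description once; stylisation then needs only
    # forward passes, with no per-query optimisation loop.
    model, _ = clip.load("ViT-B/32", device="cpu")
    tokens = clip.tokenize(["a watercolour painting"])
    with torch.no_grad():
        text_emb = model.encode_text(tokens).float()  # shape (1, 512)
    gamma, beta = TextToStyle()(text_emb)  # modulation parameters for a decoder
    ```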

    On-Bird Sound Recordings: Automatic Acoustic Recognition of Activities and Contexts

    We introduce a novel approach to studying animal behaviour and the context in which it occurs, through the use of microphone backpacks carried on the backs of individual free-flying birds. These sensors are increasingly used by animal behaviour researchers to study individual vocalisations of freely behaving animals, even in the field. However, such devices may record more than an animal's vocal behaviour, and have the potential to be used for investigating the specific activities (movement) and context (background) within which vocalisations occur. To facilitate this approach, we investigate the automatic annotation of such recordings through two different sound scene analysis paradigms: a scene-classification method using feature learning, and an event-detection method using probabilistic latent component analysis (PLCA). We analyse recordings made with Eurasian jackdaws (Corvus monedula) in both captive and field settings. Results are comparable with the state of the art in sound scene analysis; we find that the current recognition quality enables scalable automatic annotation of audio logger data, given partial annotation, but also that individual differences between animals and/or their backpacks limit generalisation from one individual to another. Finally, we consider the interrelation of 'scenes' and 'events' in this particular task, and issues of temporal resolution.
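
    As a minimal sketch of the scene-classification paradigm mentioned above (not the authors' pipeline, whose features are learned rather than handcrafted), one could summarise each recording with simple log-mel statistics and train an off-the-shelf classifier on activity or context labels; the feature summary and classifier choice here are assumptions.

    ```python
    import numpy as np
    import librosa
    from sklearn.ensemble import RandomForestClassifier

    def scene_features(path, sr=22050, n_mels=40):
        """Summarise a recording as the mean and standard deviation of its
        log-mel spectrogram bands, a simple stand-in for learned features."""
        y, sr = librosa.load(path, sr=sr)
        mel = librosa.power_to_db(
            librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels))
        return np.concatenate([mel.mean(axis=1), mel.std(axis=1)])

    # Hypothetical usage with file paths and activity/context labels:
    # X = np.stack([scene_features(p) for p in train_paths])
    # clf = RandomForestClassifier(n_estimators=200).fit(X, train_labels)
    # prediction = clf.predict(scene_features(test_path)[None, :])
    ```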

    Colour Image Segmentation using Texems


    Efficient vanishing point detection method in unstructured road environments based on dark channel prior

    Vanishing point detection is a key technique in fields such as road detection, camera calibration and visual navigation. This study presents a new vanishing point detection method, which achieves efficiency by combining a dark channel prior-based segmentation method with an adaptive straight-line search mechanism in the road region. First, the dark channel prior information is used to segment the image into a series of regions. Then straight lines are extracted from the region contours, and the straight lines in the road region are selected using a vertical envelope and a perspective quadrilateral constraint. The vertical envelope roughly divides the whole image into a sky region, a vertical region and a road region. The perspective quadrilateral constraint, as the authors define it herein, eliminates interference from vertical lines inside the road region so that the relevant straight lines in the road region can be extracted. Finally, the vanishing point is estimated by mean-shift clustering of line intersections, which are computed using the proposed grouping strategies and intersection principles. Experiments have been conducted with a large number of road images under different environmental conditions, and the results demonstrate that the authors' proposed algorithm can estimate the vanishing point accurately and efficiently in unstructured road scenes.
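
    The dark channel prior itself is simple to compute (per-pixel minimum over the colour channels, followed by a local minimum filter, following He et al.); the sketch below shows that first stage only. The segmentation, line search and clustering stages of the authors' method are not reproduced here, and the patch size is an assumed value.

    ```python
    import numpy as np
    from scipy import ndimage

    def dark_channel(image, patch_size=15):
        """Dark channel prior: per-pixel minimum over the colour channels,
        then a minimum filter over a local patch.
        image: (H, W, 3) float array with values in [0, 1]."""
        min_over_channels = image.min(axis=2)
        return ndimage.minimum_filter(min_over_channels, size=patch_size)

    # Regions with similar dark-channel statistics can then be grouped to
    # separate the sky, vertical structures and the road surface.
    ```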