    Understanding video through the lens of language

    The increasing abundance of video data online necessitates the development of systems capable of understanding such content. However, building these systems poses significant challenges, including the absence of scalable and robust supervision signals, computational complexity, and multimodal modelling. To address these issues, this thesis explores the role of language as a complementary learning signal for video, drawing inspiration from the success of self-supervised Large Language Models (LLMs) and image-language models. First, joint video-language representations are examined under the text-to-video retrieval task. This includes the study of pre-extracted multimodal features, the influence of contextual information, joint end-to-end learning of both image and video representations, and various frame aggregation methods for long-form videos. In doing so, state-of-the-art performance is achieved across a range of established video-text benchmarks. Second, this work explores the automatic generation of audio description (AD) – narrations describing the visual happenings in a video, for the benefit of visually impaired audiences. An LLM, prompted with multimodal information, including past predictions, and pretrained with partial data sources, is employed for the task. In the process, substantial advancements are achieved in the following areas: efficient speech transcription, long-form visual storytelling, referencing character names, and AD time-point prediction. Finally, audiovisual behaviour recognition is applied to the field of wildlife conservation and ethology. The approach is used to analyse vast video archives of wild primates, revealing insights into individual and group behaviour variations, with the potential for monitoring the effects of human pressures on animal habitats.

    Assessment of cognitive development in four to eight year old children by means of drawing tasks

    The present thesis explores the link between children's drawings and cognitive development. The aim of this study is to investigate the intellectual abilities of the child draughtsman with good depiction skills and to evaluate the merit of the drawing technique in the assessment of conceptual maturity. The standardised Goodenough-Harris Drawing Test (GHDT) of intellectual maturity was administered to 115 children between 4 and 8 years of age against criterion ability measures (Wechsler scales). Its psychometric properties are examined with respect to its norms and scales, and its reliability and validity at different age levels and ranges of intelligence. Early theories in the area of pictorial representation were directed towards identifying features characteristic of different developmental periods (Kerschensteiner, 1905; Luquet, 1927/1977). At the same time, Piaget and Inhelder (1948/1967) incorporated these stage theories into their model of spatial intelligence. Yet the recent experimental study of children's drawings has disclosed a number of variables which interfere during the course of production, challenging the view that drawings can be seen as the royal route to access children's concepts. Stage theories are re-evaluated by means of fourteen experimental drawing tasks with varying degrees of difficulty. The tasks, administered to the same children tested with the standardised instruments, are spatial in nature and have been sampled from two widely researched areas related to the pictorial representation of partial occlusion and of spatial axes (horizontal/vertical). The acquisition of the pertinent spatial concepts by means of drawings is examined, considering competence-deficiency and competence-utilisation accounts of children's performance at different ages.
    Finally, overall performance on spatial tasks is compared with performance on conventional (Wechsler scales) and non-verbal (GHDT) measures of intellectual functioning, considering the optimum method to assess children's abilities by means of drawings. In general, drawing performance is reasonably sensitive to children's level of intelligence, yet the significance of drawing varies at different ages and ranges of IQ. Moreover, the establishment of steadfast developmental trajectories falls short in the field of pictorial representation. The variable performance, particularly from the children at intermediate ages, suggests that the stages of intellectual or visual realism should be seen as relative and not as absolute.

    Exploring spatial memory in children with autism and ADHD

    The study investigates spatial memory in neurotypical children and in children with ASD or ADHD. In a reaction-time accuracy task, children (N = 117) were presented with a grid containing twenty-five individual places. In the presentation phase, children saw different categories of objects-in-places, which ranged from technical to social role-play toys. An interference object, either the same object or a different-object exemplar, filled the delay between presentation and test. At test, children were required to recall the location occupied by the object. Among the clinical and matched control groups tested, comparatively better place memory accuracy was evident in ASD children; however, this was accompanied by longer place memory reaction times. Same-object presentation in the delay improved place memory accuracy and sped up children's reaction times, in comparison to a different-object exemplar. Technical objects were better remembered by the mainly male sample than role-play and neutral objects, but this particular category of objects had the slowest reaction times. When the binding strategies as per the Common Region Test (CRT) were included in the analyses, place memory accuracy was higher among systematic coders than among unsystematic coders. Interestingly, place memory accuracy and reaction times of those who adopted systematic binding benefitted more from repetition (same-object delay) than those who coded unsystematically, a pattern found across most object categories. Thus, one could say that the repetition helped to reinforce the object-place binding among systematic coders.