1,695 research outputs found

    The ToCAI DS for audio-visual documents. Structure and concepts

    This document complements the description of the audio-visual (AV) description scheme (DS) called Table of Content-Analytical Index (ToCAI), proposed in response to the MPEG-7 CFP and evaluated in Lancaster (February 1999). This DS provides a hierarchical description of the time-sequential structure of a multimedia document (suitable for browsing) together with an “analytical index” of the AV objects of the document (suitable for retrieval). The ToCAI purposes and general characteristics are explained. The detailed structure of the DS is also presented by means of UML notation, to clarify some issues that were not included in the original proposal. Some examples of XML instantiation are enclosed, followed by an application example. For an indication of how the ToCAI DS matches MPEG-7 requirements and evaluation criteria, refer to the original proposal submission.

    Multimodal Indexing of Presentation Videos

    This thesis presents four novel methods to help users efficiently and effectively retrieve information from unstructured and unsourced multimedia sources, in particular the increasing amount and variety of presentation videos such as those in e-learning, conference recordings, corporate talks, and student presentations. We demonstrate a system to summarize, index and cross-reference such videos, and measure the quality of the produced indexes as perceived by the end users. We introduce four major semantic indexing cues: text, speaker faces, graphics, and mosaics, going beyond standard tag-based searches and simple video playbacks. This work aims at recognizing visual content "in the wild", where the system cannot rely on any additional information besides the video itself. For text, within a scene text detection and recognition framework, we present a novel locally optimal adaptive binarization algorithm, implemented with integral histograms. It determines an optimal threshold that maximizes the between-classes variance within a subwindow, with computational complexity independent of the size of the window itself. We obtain character recognition rates of 74%, as validated against ground truth of 8 presentation videos spanning over 1 hour and 45 minutes, which almost doubles the baseline performance of an open source OCR engine. For speaker faces, we detect, track, match, and finally select a humanly preferred face icon per speaker, based on three quality measures: resolution, amount of skin, and pose. We register an 87% accordance (51 out of 58 speakers) between the face indexes automatically generated from three unstructured presentation videos of approximately 45 minutes each, and human preferences recorded through Mechanical Turk experiments. For diagrams, we locate graphics inside frames showing a projected slide, cluster them according to an on-line algorithm based on a combination of visual and temporal information, and select and color-correct their representatives to match human preferences recorded through Mechanical Turk experiments. We register 71% accuracy (57 out of 81 unique diagrams properly identified, selected and color-corrected) on three hours of videos containing five different presentations. For mosaics, we combine two existing suturing measures to extend video images into an in-the-world coordinate system. A set of frames to be registered into a mosaic is sampled according to the PTZ camera movement, which is computed through least squares estimation starting from the luminance constancy assumption. A local-features-based stitching algorithm is then applied to estimate the homography among a set of video frames, and median blending is used to render pixels in overlapping regions of the mosaic. For two of these indexes, namely faces and diagrams, we present two novel MTurk-derived user data collections to determine viewer preferences, and show that they are matched in selection by our methods. The net result of this thesis allows users to search, inside a video collection as well as within a single video clip, for a segment of presentation by professor X on topic Y, containing graph Z.
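
    The binarization criterion named in the abstract above (choosing the threshold that maximizes the between-class variance inside a window) can be illustrated with a minimal sketch. This is not the thesis's implementation: it computes the histogram of a single window directly rather than through integral histograms, and the names `otsu_threshold`, `binarize` and `window` are hypothetical, chosen only for this illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the threshold t that maximizes the between-class variance.

    Pixels with value <= t form the lower class, the rest the upper class.
    Illustrative sketch only; assumes integer intensities in [0, levels).
    """
    # Histogram of the window (the thesis obtains this from integral
    # histograms; here it is simply counted).
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of intensities in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        mu0 = sum0 / w0
        mu1 = (total_sum - sum0) / w1
        # Between-class variance, up to a constant factor 1 / total**2.
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(pixels, threshold):
    """Map each pixel to 1 if it is brighter than the threshold, else 0."""
    return [1 if p > threshold else 0 for p in pixels]


# Hypothetical window: four dark pixels and four bright pixels.
window = [12, 15, 10, 14, 200, 210, 205, 198]
t = otsu_threshold(window)
mask = binarize(window, t)
```

    In this sketch the cost of the search depends on the number of gray levels, not on the number of pixels, once the histogram is available; that is the property an integral histogram would exploit to keep the per-window cost independent of window size.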

    MOSAiC Implementation Plan

    This document is the second version of the Implementation Plan for the Multidisciplinary drifting Observatory for the Study of Arctic Climate (MOSAiC) initiative and lays out a vision of how associated observational, modeling, synthesis, and programmatic objectives can be manifested. The document was drafted during an international workshop in Potsdam in July 2015, and further developed during two additional workshops at AWI Potsdam in December 2015 and February 2016. Support for this planning activity has been provided by the IASC-ICARPIII process, the Alfred Wegener Institute Helmholtz Centre for Polar and Marine Research, and the University of Colorado/NOAA-ESRL-PSD. This document provides a framework for planning the logistics of the project, developing scientific observing teams, organizing scientific contributions, coordinating the use of resources, and ensuring MOSAiC’s legacy of data and products. A brief overview and summaries of key science questions are provided in Section 1. Section 2 includes an overview of specific observational requirements, while Section 3 describes the coordination and design of specific field assets. Practical logistics plans are outlined in Section 4. Links with current and future satellite programs and model activities are given in Sections 5 and 6. The MOSAiC data management strategy is given in Section 7. Links to other programs are outlined in Section 8. The appendix (Section 9) lists the parameters to be measured and the participating groups.

    The Book of Thesis Books

    89 pages : color illustrations ; 18 x 23 cm. This guide is intended to help future thesis writers understand the range of approaches to and content in RISD thesis books, locate some especially strong examples in the Library's vast thesis collection, and imagine and plan their own theses. - from the introduction. Introduction by Jennifer Liese - Colophon. Contents: Academic thesis -- Monograph -- Project document -- Mosaic essay -- Artist's book. Designed by Elizabeth Leeper (MFA Graphic Design 2017). Set in Parry and Parry Grotesque, by Artur Schmal, and printed by Lulu. --Colophon.

    Vision-based navigation with reality-based 3D maps

    This research is focused on developing a vision-based navigation system for positioning and navigation in GPS-degraded environments. The main research contributions are summarized as follows: a. A new concept of 3D map, which mainly consists of geo-referenced images, has been introduced. In this research, it provides the map-matching function for vision-based positioning. b. A method of vision-based positioning using photogrammetric methodologies has been proposed. It mainly obtains geometric information of the navigation environment from the 3D map through SIFT-based image matching and uses photogrammetric space resection to solve the position in 6 degrees of freedom. The algorithms have been tested in an indoor environment. The accuracy has reached around 10 cm. c. A multi-level outlier detection scheme for the vision-based navigation system has been developed. It mainly combines RANSAC with data snooping. The former deals with a high percentage of mismatches, while data snooping removes outliers from different sources in the least squares adjustment for both the 3D mapping and the positioning solution. d. The deficiency of using RANSAC for outlier detection in image matching and homography estimation has been identified. In this research, a novel method which combines cross correlation with feature-based image matching has been proposed. It is able to evaluate the RANSAC homography estimation and improve the image matching performance. The method has been successfully applied to the vision-based navigation solution to find the corresponding view in the database and improve the final positioning accuracy. e. The positioning performance of the system has been evaluated through analysis of the mathematical model and through experiments. The focus has been on various image matching conditions/methods and their impact on the system performance. The strengths and weaknesses of the system have been revealed and investigated. f. The vision-based navigation system has been extended from indoor to outdoor use with corresponding changes. Besides the camera, it also takes advantage of multiple built-in sensors, including a GPS receiver and a digital compass, to assist visual methods in outdoor environments. Experiments demonstrate that such a system can largely improve the position accuracy in areas where stand-alone GPS is affected and can be easily adopted on mobile devices.

    A systematic review of the subjective wellbeing outcomes of engaging with visual arts for adults (“working-age”, 15-64 years) with diagnosed mental health conditions

    The importance of the visual arts in contributing to the wellbeing of adults with mental health conditions has been little documented beyond some insightful and influential interventions and exploratory studies. Initiatives such as Arts on Prescription projects have, in the UK, provided examples of the positive effects that engagement in artistic and creative activity can have, and some of these have been documented in small-scale studies of interventions. Most of the evidence has been perceived as positive but of limited scale. In this context, this review was carried out to examine in a more focused way the ‘subjective wellbeing’ (SWB) outcomes of engagement with the visual arts for adults with a background history of mental health conditions. SWB embraces both the positive and negative feelings that arise in individuals based on their view of the world, how they think about themselves and others, and what they do in the interactions and practices of everyday life. Adult subjects in the studies included in this review were of ‘working-age’ (15-64 years). The focus of the review and the precise research question were agreed at inception sessions of the research team, and in collaborative engagement with stakeholders in the areas of policy, service-delivery, project and evaluation commissioning, and research and scholarship in the spheres of the visual arts and mental health. Published studies from the past 10 years were studied for the review, and their findings synthesised and integrated into an evaluation of the state of knowledge in the field, in terms of the specifics of the research questions. We found that there is limited high-quality evidence, though case studies from the UK have provided important and consistent findings, corroborated by grey literature that has reported on interventions and projects. The review includes published findings based on data on/from 163 participants across four countries – Australia, Sweden, the UK, and the USA. Overall, female respondents outnumbered male respondents. A wide variety of wellbeing measures were used in some quantitative, statistical studies. In-depth interviews dominated the qualitative studies, giving voice to the experiences of individual subjects. The visual arts practices that featured in the studies included forms of painting or drawing, art appreciation with selected art forms, artmaking culminating in an exhibition, and more general creative and craft activities that included visual artefacts such as ceramics or sculpture. Evidence we include from recent unpublished reports (grey literature) was produced by or for visual arts organisations since 2014. Participants in the evaluations were both male and female and were engaged in UK-based arts interventions, many via community arts or ‘Arts on Prescription’ types of intervention. Overall, the evidence available in this review has shown that engagement in the visual arts for adults with mental health conditions can reduce reported levels of depression and anxiety; increase self-respect, self-worth and self-esteem; encourage and stimulate re-engagement with the wider, everyday social world; and support in participants a potential renegotiation of identity through practice-based forms of making or doing. The most effective ‘working ways to wellbeing’ are also confirmed in processes of implementation that ensure provision of secure safe-space and havens for interventions; that recognise the value of non-stigmatising settings; and that support and sustain collaborative facilitation of programmes and sessions. Some negative dimensions of engagement with the visual arts were also identified, including stress and pressure felt to complete activities or commit to artmaking, and the very real fear that the end of an intervention would mean the return to a world of anxiety, decreasing confidence and social isolation. The review shows that for adults starting visual arts activities or programmes, the subjective wellbeing outcomes are, for the majority of participants, positive. This applies to men and women alike across the studies. The most convincing evidence has emerged from focused qualitative research designs, and makes clear that the most effective work in the field continues to lack the necessary resources and infrastructure that would ensure sustainable practices and interventions. Overall, there is some evidence of benefit in a weak field that could be strengthened by fuller monitoring of cohorts to evaluate the long-term effects of participants’ engagement with the visual arts.

    Content-based video indexing for sports applications using integrated multi-modal approach

    This thesis presents research based on an integrated multi-modal approach for sports video indexing and retrieval. By combining specific features extractable from multiple (audio-visual) modalities, generic structure and specific events can be detected and classified. During browsing and retrieval, users will benefit from the integration of high-level semantics and some descriptive mid-level features such as a whistle and a close-up view of player(s). The main objective is to contribute to the three major components of sports video indexing systems. The first component is a set of powerful techniques to extract audio-visual features and semantic contents automatically. The main purposes are to reduce manual annotations and to summarize the lengthy contents into a compact, meaningful and more enjoyable presentation. The second component is an expressive and flexible indexing technique that supports gradual index construction. An indexing scheme is essential to determine the methods by which users can access a video database. The third and last component is a query language that can generate dynamic video summaries for smart browsing and support user-oriented retrievals.

    An acquisition, curation and management workflow for sustainable, terabyte-scale marine image analysis

    Optical imaging is a common technique in ocean research. Diving robots, towed cameras, drop-cameras and TV-guided sampling gear: all produce image data of the underwater environment. Technological advances like 4K cameras, autonomous robots, high-capacity batteries and LED lighting now allow systematic optical monitoring at large spatial scales and in shorter time, but with increased data volume and velocity. Volume and velocity are further increased by growing fleets and emerging swarms of autonomous vehicles creating big data sets in parallel. This generates a need for automated data processing to harvest maximum information. Systematic data analysis benefits from calibrated, geo-referenced data with clear metadata description, particularly for machine vision and machine learning. Hence, the expensive data acquisition must be documented, and data should be curated as soon as possible, backed up and made publicly available. Here, we present a workflow towards sustainable marine image analysis. We describe guidelines for data acquisition, curation and management and apply them to the use case of a multi-terabyte deep-sea data set acquired by an autonomous underwater vehicle.

    LITEMA REVIVAL OF A DISAPPEARING ART

    Published Thesis. Litema (pronounced di-tee-ma) is a Sesotho word that means ‘to plough’ or ‘cultivate’. It describes an indigenous mural art practised by women in Lesotho and the Free State province of South Africa. By beautifying freshly plastered homestead walls, Basotho mural artists acknowledge their natural and modern environments, whilst also celebrating seasonal and commemorative events such as Good Friday and Christmas, births, initiations, weddings and funerals. Embellishments comprise paintings, engravings, relief sculpting and stone mosaics. Although this century-old art form has managed to survive the impact of, to mention a few, modernization, commercialization, and urbanization, early and current research shows that the tradition is both transitional and in decline. For Litema knowledge to survive, it is imperative that the current design is preserved, the art form revived, and the indigenous knowledge sustained. The objective of this study involves revisiting, conserving, promoting, and reintroducing the art form. A National Lotteries Trust Fund (NLDTF) grant awarded to the Central University of Technology Free State (CUT) in 2005 enabled the implementation of eight Revival of Litema projects, which collectively strived to achieve these goals. The process involved the expansion of Litema knowledge through continued photographic and written documentation. The preservation and reintroduction of this knowledge was presented in the form of a celebration of Litema during Heritage Month, a Litema website, an illustrated book, a design manual, a permanent mural and photographic installation, prototypical Litema products and a Litema DVD. Various ethical and creative considerations guided the assembly, presentation, and dissemination of data. This study contributes towards filling the lacuna in Litema research, with a particular focus on artworks located in the Eastern Free State, whilst building on the discourse around ethically appropriate indigenous knowledge research. It calls for the establishment of a visual and an oral archive devoted to, but not necessarily limited to, Litema, in order to safeguard this fading façade in the landscape of South African art and heritage.

    A low-cost remote sensing system for agricultural applications

    This research develops a low-cost remote sensing system for use in agricultural applications. The important features of the system are that it monitors the near infrared and that it incorporates position and attitude measuring equipment, allowing geo-rectified images to be produced without the use of ground control points. The equipment is designed to be hand held and hence requires no structural modification to the aircraft. The portable remote sensing system consists of an accelerometer-based inertial measurement unit (IMU), a low-cost GPS device and a small-format false colour composite digital camera. The total cost of producing such a system is below GBP 3000, which is far cheaper than equivalent existing systems. The design of the portable remote sensing device has eliminated bore sight misalignment errors from the direct geo-referencing process. A new processing technique has been introduced for the data obtained from these low-cost devices, and it is found that using this technique the image can be matched (overlaid) onto Ordnance Survey Master Maps at an accuracy compatible with precision agriculture requirements. The direct geo-referencing has also been improved by introducing an algorithm capable of correcting oblique images directly. This algorithm alters the pixel values, hence it is advised that image analysis is performed before image georectification. A drawback of this research is that the low-cost GPS device experienced bad checksum errors, which resulted in missing data. The Wide Area Augmentation System (WAAS) correction could not be employed because the satellites could not be locked onto whilst flying. The best GPS data were obtained from the Garmin eTrex (15 m kinematic and 2 m static) instruments, which have a high-sensitivity receiver with good lock-on capability. The limitation of this GPS device is the inability to effectively receive the P-Code wavelength, which is needed to gain the best accuracy when undertaking differential GPS processing. Pairing the carrier phase L1 with the pseudorange C/A-Code received, in order to determine the image coordinates by the differential technique, is still under investigation. To improve the position accuracy, it is recommended that a GPS base station be established near the survey area, instead of using a permanent GPS base station established by the Ordnance Survey.