
    A Non-Visual Photo Collection Browser based on Automatically Generated Text Descriptions

    This study presents a textual photo collection browser that automatically and quickly analyses large personal photo collections and produces textual reports that can be accessed by blind users using either text-to-speech or Braille output devices. The textual photo browser exploits recent advances in image collection analysis, and the strategy does not rely on manual image tagging. The reports produced by the textual image browser give the user a gist of where, when and what the photographer was doing, in the form of a story. Although still crude, the strategy can give blind users a valuable overview of the contents of large image collections and individual images which are otherwise totally inaccessible without vision.
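    The abstract does not spell out the report-generation pipeline, but a minimal Python sketch of the general idea might group photos into events by time-stamp gaps and emit one speech-ready sentence per event; the Photo record, the place_hint field and the six-hour gap threshold are assumptions for illustration, not the paper's actual method:

        # Minimal sketch: group photos into "events" by timestamp gaps and
        # summarise each event as one sentence a screen reader can render.
        from dataclasses import dataclass
        from datetime import datetime, timedelta

        @dataclass
        class Photo:
            taken: datetime
            place_hint: str   # a coarse location guess; assumed available

        def narrate(photos: list[Photo], gap: timedelta = timedelta(hours=6)) -> list[str]:
            if not photos:
                return []
            photos = sorted(photos, key=lambda p: p.taken)
            events, current = [], [photos[0]]
            for p in photos[1:]:
                if p.taken - current[-1].taken > gap:
                    events.append(current)
                    current = [p]
                else:
                    current.append(p)
            events.append(current)
            return [
                f"{len(e)} photos taken around {e[0].place_hint} "
                f"between {e[0].taken:%d %B %Y %H:%M} and {e[-1].taken:%H:%M}."
                for e in events
            ]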

    Where was that photo taken? : deriving geographical information from image collections based on temporal exposure attributes

    This paper demonstrates a novel strategy for inferring approximate geographical information from the exposure information and temporal patterns of outdoor images in image collections. Image exposure is reliant on light, and most photographs are therefore taken in daylight, which in turn depends on the position of the sun. Clearly, the sun results in different lighting conditions at different geographical locations and at different times of the day, and hence the observed intensity patterns can be used to deduce the approximate location of the photographer at the time the photographs were taken. Images taken indoors or at night are temporally connected to the daylight images, and the geographical information can therefore be transferred to related "dark" photographs. The strategy is efficient as it only considers meta-information and not image contents. Large databases can therefore be indexed efficiently. Experimental results demonstrate that the current approach yields a longitudinal error of 15.7 degrees and a latitudinal error of 30.5 degrees for authentic image collections comprising a mixture of outdoor and indoor images. The strategy determined the correct hemisphere in all the tests. Although not as accurate as a GPS receiver, the geographical information is sufficiently detailed to be useful. Applications include improved image retrieval, image browsing and automatic image tagging. The strategy does not require a GPS receiver and can be applied to existing digital image collections.
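    As a rough illustration of the longitude half of such a strategy (a simplified reading, not the paper's actual algorithm): since most outdoor photographs cluster around local solar noon, the mean UTC capture hour of the daylight images hints at the photographer's offset from the prime meridian, at 15 degrees per hour:

        # Sketch: the Earth turns 15 degrees of longitude per hour, so the
        # mean UTC capture hour of daylight photos stands in for local
        # solar noon. A circular mean would be needed near the date line;
        # omitted here for brevity.
        from datetime import datetime
        from statistics import mean

        def estimate_longitude(utc_times: list[datetime]) -> float:
            """Return an approximate longitude in degrees east."""
            hours = [t.hour + t.minute / 60 for t in utc_times]
            offset_hours = 12 - mean(hours)   # distance from UTC noon
            return offset_hours * 15.0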

    A Configurable Photo Browser Framework for Large Image Collections

    Image collections are growing at an exponential rate due to the wide availability of inexpensive digital cameras and storage. Current browsers organize photos mostly chronologically or according to manual tags. For very large collections acquired over several years it can be difficult to locate a particular set of images, even for the owner. Although our visual memory is powerful, it is not always easy to recall all of one's images. Moreover, it can be very time-consuming to find particular images in other people's image collections. This paper presents a prototype image browser and a plug-in pattern that allows classifiers to be implemented and easily integrated with the image browser, such that the user can control the characteristics of the images that are browsed and irrelevant photos are filtered out. The filters can be both content-based and based on meta-information. The current version employs only meta-information, which means that large image collections can be indexed efficiently.
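    A hypothetical sketch of such a plug-in pattern (the class and registry names are invented for illustration, not the prototype's API):

        # Sketch: each filter implements a tiny interface and registers
        # itself, so the browser can chain metadata-based classifiers
        # without knowing their internals.
        from abc import ABC, abstractmethod

        class PhotoFilter(ABC):
            @abstractmethod
            def accept(self, metadata: dict) -> bool:
                """Return True if the photo should be shown."""

        FILTERS: list[PhotoFilter] = []

        def register(f: PhotoFilter) -> None:
            FILTERS.append(f)

        class DaytimeFilter(PhotoFilter):
            # Meta-information only: keep photos taken 06:00-20:00.
            def accept(self, metadata: dict) -> bool:
                return 6 <= metadata["taken"].hour < 20

        def browse(photos: list[dict]) -> list[dict]:
            return [p for p in photos if all(f.accept(p) for f in FILTERS)]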

    Unsupervised and Fast Continent Classification of Digital Image Collections using Time

    Advances in storage capacity mean that digital cameras can store huge collections of digital photographs. Typically such images are given non-descriptive filenames, such as a unique identifier, often an integer. Consequently it is time-consuming and difficult to browse and retrieve images from large collections, especially on small consumer electronics devices. A strategy for classifying images into geographical regions is presented which allows images to be coarsely sorted into the continent where they were taken. The strategy employs patterns in the time-stamps of images to identify events such as holidays and individual days, and to estimate the approximate longitude where the photographs were taken. Experimental evaluations demonstrate that the continent is correctly estimated for 89% of the images in arbitrary collections and that the longitude is estimated with a mean error of 27.5 degrees. The strategy is relatively straightforward to implement, even in hardware, and computationally inexpensive.
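    Assuming a longitude estimate is available from the time-stamp analysis, the final continent assignment could reduce to a coarse band lookup; the bands below are rough illustrative guesses, not the paper's calibrated boundaries:

        # Sketch of the final classification step: map an estimated
        # longitude onto a coarse continental band.
        def continent_from_longitude(lon_deg: float) -> str:
            bands = [
                (-170, -30, "Americas"),
                (-30, 60, "Europe/Africa"),
                (60, 170, "Asia/Oceania"),
            ]
            for lo, hi, name in bands:
                if lo <= lon_deg < hi:
                    return name
            return "Pacific region"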

    Application of Cloud-Based Geospatial Technologies to Flowering Phenology and Environmental Education

    Cloud-based geospatial technologies are rapidly improving the flow of information from the environment to end-users. Cloud-based photo storage websites were used to create and manage species and spatial-temporal metadata in a digital photographic inventory of plant flowering observations collected at Lake Issaqueena, SC from January 2012 to December 2014. Statistical analysis of species and temporal metadata revealed significant (p < 0.05) inter-annual shifts in flowering time among several species during and after extremely high monthly temperatures in March 2012 and extremely high monthly total precipitation in July and August 2013. An interactive ArcGIS Online map with sampling locations of flowering plants was developed and published. The interactive ArcGIS Online map enables web-based knowledge discovery of flowering phenology by allowing users to filter map contents, view plant pictures, navigate to additional plant information in the USDA PLANTS Database, and render spatial-temporal flowering patterns using the heat map view and time settings. The conceptual workflow for managing, integrating, and mapping plant flowering observations has numerous potential applications in species monitoring, allowing higher-volume and higher-quality data to be collected and shared openly. A Cloud-based ESRI Story Map was developed for teaching Soil Forming Factors: Topography in undergraduate soil science education. Student evaluation of the ESRI Story Map was positive, and responses indicate students broadly preferred the ESRI Story Map as a stand-alone teaching module or as a supplement to PowerPoint slides. Teaching with ESRI Story Maps is very different from GIS education, and is well suited for fostering critical and spatial thinking because students do not need to possess prior skills in GIS software, allowing them to spend more time learning the topic at hand in interactive teaching modules. Teaching with ESRI Story Maps has enormous potential in soil science and other environmental disciplines, but more research is needed to develop specific teaching objectives and exercises using ESRI Story Maps.

    Multimedia Annotation Interoperability Framework

    Multimedia systems typically contain digital documents of mixed media types, which are indexed on the basis of strongly divergent metadata standards. This severely hampers the interoperation of such systems. Therefore, machine understanding of metadata coming from different applications is a basic requirement for the interoperation of distributed multimedia systems. In this document, we present how interoperability among metadata, vocabularies/ontologies and services is enhanced using Semantic Web technologies. In addition, we provide guidelines for semantic interoperability, illustrated by use cases. Finally, we present an overview of the most commonly used metadata standards and tools, and provide the general research direction for semantic interoperability using Semantic Web technologies.
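    As a small illustration of the Semantic Web approach advocated here, camera metadata from one application can be lifted into RDF under a shared vocabulary so other systems can consume it; this sketch uses the rdflib package and an assumed, much-simplified EXIF-to-Dublin-Core mapping:

        # Sketch: expose per-image metadata as RDF triples under a shared
        # vocabulary (Dublin Core) so heterogeneous systems can interoperate.
        from rdflib import Graph, Literal, URIRef
        from rdflib.namespace import DC, XSD

        def exif_to_rdf(image_uri: str, exif: dict) -> Graph:
            g = Graph()
            g.bind("dc", DC)
            subject = URIRef(image_uri)
            g.add((subject, DC.date,
                   Literal(exif["DateTimeOriginal"], datatype=XSD.dateTime)))
            g.add((subject, DC.creator, Literal(exif["Artist"])))
            return g

        g = exif_to_rdf("http://example.org/photos/001.jpg",
                        {"DateTimeOriginal": "2012-03-15T10:30:00",
                         "Artist": "A. Photographer"})
        print(g.serialize(format="turtle"))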

    Determining the Geographical Location of Image Scenes based on Object Shadow Lengths

    Many studies have addressed various applications of geo-spatial image tagging such as image retrieval, image organisation and browsing. Geo-spatial image tagging can be done manually or automatically with GPS-enabled cameras that allow the current position of the photographer to be incorporated into the meta-data of an image. However, current GPS equipment needs a certain amount of time to lock onto navigation satellites and is therefore not suitable for spontaneous photography. Moreover, GPS units are still costly, energy-hungry and not common in most digital cameras on sale. This study explores the potential of, and limitations associated with, extracting geo-spatial information from the image contents. The elevation of the sun is estimated indirectly from the contents of image collections by measuring the relative length of objects and their shadows in image scenes. The observed sun elevation and the creation time of the image are input into a celestial model to estimate the approximate geographical location of the photographer. The strategy is demonstrated on a set of manually measured photographs.
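    The geometric core can be sketched as follows: the sun's elevation follows from the height-to-shadow ratio, and at local solar noon the latitude follows from the elevation and the solar declination. The declination would come from the paper's celestial model; here it is simply passed in, and the northern-hemisphere sign branch is an assumption for illustration:

        # Sketch: elevation from shadow geometry, then latitude at solar
        # noon from elevation = 90 - |latitude - declination|.
        import math

        def sun_elevation_deg(object_height: float, shadow_length: float) -> float:
            return math.degrees(math.atan2(object_height, shadow_length))

        def latitude_at_noon_deg(elevation_deg: float, declination_deg: float) -> float:
            # Takes the northern-hemisphere branch of the noon relation.
            return declination_deg + (90.0 - elevation_deg)

        elev = sun_elevation_deg(1.8, 2.4)                # a person and their shadow
        print(round(latitude_at_noon_deg(elev, 0.0), 1))  # declination 0 at equinox: ~53.1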

    A Simple Content-based Strategy for Estimating the Geographical Location of a Webcam

    This study proposes a strategy for determining the approximate geographical location of a webcam based on a sequence of images taken at regular intervals. For a time-stamped image sequence spanning 24 hours, the approximate sunrise and sunset times are determined by classifying images into daytime and nighttime images based on the image intensity. From the sunrise and sunset times, both the latitude and the longitude of the webcam can be determined. Experimental data demonstrate the effectiveness of the strategy.
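    A hedged sketch of the pipeline, under simplifying assumptions (a fixed intensity threshold and a known solar declination): day/night classification yields sunrise and sunset in UTC, their midpoint gives solar noon and hence longitude, and the day length gives latitude via the standard sunrise equation cos(H) = -tan(lat) * tan(decl):

        import math

        def day_night(mean_intensities: list[float], threshold: float = 60.0) -> list[bool]:
            # True = daytime frame; threshold is an illustrative assumption.
            return [v > threshold for v in mean_intensities]

        def locate(sunrise_utc_h: float, sunset_utc_h: float, declination_deg: float):
            day_length = sunset_utc_h - sunrise_utc_h        # hours of daylight
            solar_noon = (sunrise_utc_h + sunset_utc_h) / 2  # UTC hour
            longitude = (12.0 - solar_noon) * 15.0           # degrees east
            if abs(declination_deg) < 0.5:
                latitude = float("nan")  # near an equinox, day length is ~12 h everywhere
            else:
                H = math.radians(day_length * 15.0 / 2.0)    # half-day hour angle
                latitude = math.degrees(math.atan(
                    -math.cos(H) / math.tan(math.radians(declination_deg))))
            return latitude, longitude

        # e.g. sunrise 02:00 UTC, sunset 20:00 UTC at midsummer
        print(locate(2.0, 20.0, 23.4))   # roughly (58.5, 15.0)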

    Multimodalities in Metadata: Gaia Gate

    Metadata is information about objects. Existing metadata standards seldom describe details concerning an object's context within an environment; this thesis proposes a new concept, external contextual metadata (ECM), examining metadata, digital photography, and mobile interface theory as context for a proposed multimodal framework of media that expresses the internal and external qualities of the digital object and how they might be employed in various use cases. The framework is bound to a digital image as a singular object. Information contained in these 'images' can then be processed by a renderer application to reinterpret the context in which the image was captured, including non-visually. Two prototypes are developed through the process of designing a renderer for the new multimodal data framework: a proof-of-concept application and a demonstration of 'figurative' execution (titled 'Gaia Gate'), followed by a critical design analysis of the resulting products.
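    One speculative way to bind such a payload to an image as a single object, not necessarily the thesis's mechanism, is to carry the ECM as JSON inside the image file itself; this sketch uses Pillow PNG text chunks, with invented ECM fields:

        # Sketch: serialise an assumed ECM payload into a PNG text chunk
        # so image and context travel as one object; "scene.png" is a
        # placeholder input file.
        import json
        from PIL import Image, PngImagePlugin

        ecm = {"ambient_sound_db": 42, "weather": "light rain", "crowd": "sparse"}

        info = PngImagePlugin.PngInfo()
        info.add_text("ecm", json.dumps(ecm))
        Image.open("scene.png").save("scene_ecm.png", pnginfo=info)

        # A renderer reads the payload back and can re-present it non-visually.
        print(json.loads(Image.open("scene_ecm.png").text["ecm"]))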

    Detecting semantic concepts in digital photographs: low-level features vs. non-homogeneous data fusion

    Semantic concepts, such as faces, buildings, and other real-world objects, are the most preferred instrument that humans use to navigate through and retrieve visual content from large multimedia databases. Semantic annotation of visual content in large collections is therefore essential if ease of access and use is to be ensured. Classification of images into broad categories such as indoor/outdoor, building/non-building, urban/landscape, people/no-people, etc., allows us to obtain the semantic labels without full knowledge of all objects in the scene. Inferring the presence of high-level semantic concepts from low-level visual features is a research topic that has been attracting a significant amount of interest lately. However, the power of low-level visual features alone has been shown to be limited when faced with the task of semantic scene classification in heterogeneous, unconstrained, broad-topic image collections. Multi-modal fusion, or the combination of information from different modalities, has been identified as one possible way of overcoming the limitations of single-mode approaches. In the field of digital photography, the incorporation of readily available camera metadata, i.e. information about the image capture conditions stored in the EXIF header of each image, along with GPS information, offers a way to move towards a better understanding of the imaged scene. In this thesis we focus on detection of semantic concepts such as artificial text in video and large buildings in digital photographs, and examine how fusion of low-level visual features with selected camera metadata, using a Support Vector Machine as an integration device, affects the performance of the building detector in a genuine personal photo collection. We implemented two approaches to detection of buildings that combine content-based and context-based information, and an approach to indoor/outdoor classification based exclusively on camera metadata. An outdoor detection rate of 85.6% was obtained using camera metadata only. The first approach to building detection, based on simple edge orientation-based features extracted at three different scales, has been tested on a dataset of 1720 outdoor images, with a classification accuracy of 88.22%. The second approach integrates the edge orientation-based features with the camera metadata-based features, both at the feature and at the decision level. The fusion approaches have been evaluated using an unconstrained dataset of 8000 genuine consumer photographs. The experiments demonstrate that the fusion approaches outperform the visual-features-only approach by 2-3% on average regardless of the operating point chosen, while all the performance measures are approximately 4% below the upper limit of performance. The early fusion approach consistently improves all performance measures.
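    The early (feature-level) fusion setup can be sketched as concatenating the visual and metadata descriptors before training the SVM; the random arrays below merely stand in for the real features, and the dimensions are illustrative assumptions:

        # Sketch of early fusion: visual and metadata feature vectors are
        # concatenated into one vector per image before SVM training.
        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        visual = rng.normal(size=(200, 24))      # e.g. edge orientations at 3 scales
        metadata = rng.normal(size=(200, 6))     # e.g. exposure time, flash, ISO
        labels = rng.integers(0, 2, size=200)    # building / non-building

        X = np.hstack([visual, metadata])        # early (feature-level) fusion
        clf = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
        clf.fit(X[:150], labels[:150])
        print("held-out accuracy:", clf.score(X[150:], labels[150:]))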