165 research outputs found
A Non-Visual Photo Collection Browser based on Automatically Generated Text Descriptions
This study presents a textual photo collection browser that automatically and quickly analyses large personal photo collections and produces textual reports that blind users can access through text-to-speech or Braille output devices. The textual photo browser exploits recent advances in image collection analysis, and the strategy does not rely on manual image tagging. The reports give the user the gist of where, when and what the photographer was doing, in the form of a story. Although still crude, the strategy can give blind users a valuable overview of the contents of large image collections and individual images that would otherwise be totally inaccessible without vision.
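The report-generation step could be sketched as follows; the scene labels and the wording of the sentences are hypothetical assumptions for illustration, since the abstract does not specify the actual analysis or phrasing:

```python
from collections import Counter
from datetime import datetime

def narrate(photos):
    """Turn a list of (timestamp, label) pairs into one report
    sentence per day. The labels are assumed to come from some
    upstream scene classifier; this function only does the wording."""
    by_day = {}
    for ts, label in photos:
        by_day.setdefault(ts.date(), []).append(label)
    lines = []
    for day, labels in sorted(by_day.items()):
        # Describe each day by its most frequent scene label.
        top = Counter(labels).most_common(1)[0][0]
        lines.append(f"On {day}, {len(labels)} photos were taken, mostly of {top}.")
    return " ".join(lines)

photos = [(datetime(2011, 7, 4, 10), "beach"),
          (datetime(2011, 7, 4, 12), "beach"),
          (datetime(2011, 7, 4, 15), "food")]
print(narrate(photos))
# On 2011-07-04, 3 photos were taken, mostly of beach.
```

A text-to-speech or Braille device would then read such sentences aloud or render them tactilely, which is what makes the collection non-visually browsable.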
Where was that photo taken?: deriving geographical information from image collections based on temporal exposure attributes
This paper demonstrates a novel strategy for inferring approximate geographical information from the exposure information and temporal patterns of outdoor images in image collections. Image exposure depends on light, and most photographs are therefore taken in daylight, which in turn depends on the position of the sun. Clearly, the sun produces different lighting conditions at different geographical locations and at different times of the day, and hence the observed intensity patterns can be used to deduce the approximate location of the photographer at the time the photographs were taken. Images taken indoors or at night are temporally connected to the daylight images, and the geographical information can therefore be transferred to the related "dark" photographs. The strategy is efficient as it only considers meta-information and not image contents; large databases can therefore be indexed efficiently. Experimental results demonstrate that the current approach yields a longitudinal error of 15.7 degrees and a latitudinal error of 30.5 degrees for authentic image collections comprising a mixture of outdoor and indoor images. The strategy determined the correct hemisphere in all the tests. Although not as accurate as a GPS receiver, the geographical information is sufficiently detailed to be useful. Applications include improved image retrieval, image browsing and automatic image tagging. The strategy does not require a GPS receiver and can be applied to existing digital image collections.
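The core idea can be illustrated with a simplified sketch: if daylight photographs cluster around local solar noon, the offset of their mean capture time from 12:00 UTC gives a rough longitude, at 15 degrees per hour. This is an illustration of the general principle under that assumption, not the paper's actual method:

```python
from datetime import datetime, timezone

def estimate_longitude(daylight_times_utc):
    """Estimate longitude (degrees east, negative = west) from the
    UTC capture times of daylight photos, assuming they cluster
    around local solar noon. One hour of offset from 12:00 UTC
    corresponds to 15 degrees of longitude."""
    hours = [t.hour + t.minute / 60 for t in daylight_times_utc]
    mean_hour = sum(hours) / len(hours)
    lon = (12.0 - mean_hour) * 15.0
    # Wrap the result into [-180, 180).
    return ((lon + 180) % 360) - 180

# Photos clustered around 17:00 UTC suggest local noon occurs there,
# i.e. roughly 75 degrees west.
times = [datetime(2012, 7, 1, h, 0, tzinfo=timezone.utc) for h in (16, 17, 18)]
print(round(estimate_longitude(times)))  # -75
```

A real implementation would also have to handle collections spanning many days and the circular nature of time-of-day averages, which this sketch ignores.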
A Configurable Photo Browser Framework for Large Image Collections
Image collections are growing at an exponential rate due to the wide availability of inexpensive digital cameras and storage. Current browsers organize photos mostly chronologically, or according to manual tags. For very large collections acquired over several years it can be difficult to locate a particular set of images, even for the owner. Although our visual memory is powerful, it is not always easy to recall all of one's images. Moreover, it can be very time consuming to find particular images in other people's image collections. This paper presents a prototype image browser and a plug-in pattern that allows classifiers to be implemented and easily integrated with the image browser, so that the user can control the characteristics of the images that are browsed and irrelevant photos are filtered out. The filters can be both content-based and based on meta-information. The current version only employs meta-information, which means that large image collections can be indexed efficiently.
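Such a plug-in pattern might look roughly like the following; the `PhotoFilter` interface and the exposure-time heuristic are illustrative assumptions, not the prototype's actual API:

```python
from abc import ABC, abstractmethod

class PhotoFilter(ABC):
    """Hypothetical plug-in interface: each classifier decides, from
    an image's metadata, whether the photo should be shown."""
    @abstractmethod
    def matches(self, meta: dict) -> bool: ...

class OutdoorFilter(PhotoFilter):
    # Illustrative heuristic only: bright outdoor scenes tend to have
    # short exposure times.
    def matches(self, meta):
        return meta.get("exposure_time", 1.0) < 1 / 60

class Browser:
    def __init__(self):
        self.filters = []

    def register(self, f: PhotoFilter):
        """Plug a new classifier into the browser."""
        self.filters.append(f)

    def browse(self, photos):
        # A photo is shown only if every active filter accepts it.
        return [p for p in photos if all(f.matches(p) for f in self.filters)]

browser = Browser()
browser.register(OutdoorFilter())
photos = [{"name": "a.jpg", "exposure_time": 1 / 250},
          {"name": "b.jpg", "exposure_time": 1 / 15}]
print([p["name"] for p in browser.browse(photos)])  # ['a.jpg']
```

The point of the pattern is that new classifiers, whether content-based or metadata-based, can be added without changing the browser itself.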
Unsupervised and Fast Continent Classification of Digital Image Collections using Time
Advances in storage capacity mean that digital cameras can store huge collections of digital photographs. Typically, such images are given non-descriptive filenames such as a unique identifier, often an integer. Consequently, it is time-consuming and difficult to browse and retrieve images from large collections, especially on small consumer electronics devices. A strategy for classifying images into geographical regions is presented which allows images to be coarsely sorted into the continent where they were taken. The strategy employs patterns in the time-stamps of images to identify events such as holidays and individual days, and to estimate the approximate longitude where the photographs were taken. Experimental evaluations demonstrate that the continent is correctly estimated for 89% of the images in arbitrary collections and that the longitude is estimated with a mean error of 27.5 degrees. The strategy is relatively straightforward to implement, also in hardware, and is computationally inexpensive.
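Once a longitude estimate is available, coarse continent sorting can be as simple as a lookup over longitude bands. The bands below are illustrative assumptions for the sketch, not the paper's actual decision rule:

```python
def continent_from_longitude(lon):
    """Map an estimated longitude (degrees east) to a coarse
    continental region. The band boundaries are made up for
    illustration; a real classifier would be tuned empirically."""
    bands = [(-180, -30, "Americas"),
             (-30, 60, "Europe/Africa"),
             (60, 180, "Asia/Oceania")]
    for lo, hi, name in bands:
        if lo <= lon < hi:
            return name
    return "unknown"

print(continent_from_longitude(-75))  # Americas
print(continent_from_longitude(10))   # Europe/Africa
```

Because the lookup uses only a single scalar per image, it is cheap enough to run on small devices or in hardware, as the abstract suggests.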
Application of Cloud-Based Geospatial Technologies to Flowering Phenology and Environmental Education
Cloud-based geospatial technologies are rapidly improving the flow of information from the environment to end-users. Cloud-based photo storage websites were used to create and manage species and spatial-temporal metadata in a digital photographic inventory of plant flowering observations collected at Lake Issaqueena, SC from January 2012 to December 2014. Statistical analysis of species and temporal metadata revealed significant (p < 0.05) inter-annual shifts in flowering time among several species during and after extreme high monthly temperature in March 2012 and extreme high monthly total precipitation in July and August 2013. An interactive ArcGIS Online map with sampling locations of flowering plants was developed and published. The interactive ArcGIS Online map enables web-based knowledge discovery of flowering phenology by allowing users to filter map contents, view plant pictures, navigate to additional plant information in the USDA PLANTS Database, and render spatial-temporal flowering patterns using the heat-map view and time settings. The conceptual workflow for managing, integrating, and mapping plant flowering observations has numerous potential applications in species monitoring, allowing higher-volume and higher-quality data to be collected and shared openly. A Cloud-based ESRI Story Map was developed for teaching Soil Forming Factors: Topography in undergraduate soil science education. Student evaluation of the ESRI Story Map was positive, and responses indicate students broadly preferred the ESRI Story Map as a stand-alone teaching module or as a supplement to PowerPoint slides. Teaching with ESRI Story Maps is very different from GIS education, and is well suited for fostering critical and spatial thinking because students do not need prior skills in GIS software, allowing them to spend more time learning the topic at hand in interactive teaching modules. Teaching with ESRI Story Maps has enormous potential in soil science and other environmental disciplines, but more research is needed to develop specific teaching objectives and exercises using ESRI Story Maps.
Multimedia Annotation Interoperability Framework
Multimedia systems typically contain digital documents of mixed media types, which are indexed on the basis of strongly divergent metadata standards. This severely hampers the inter-operation of such systems. Therefore, machine understanding of metadata coming from different applications is a basic requirement for the inter-operation of distributed multimedia systems. In this document, we present how interoperability among metadata, vocabularies/ontologies and services is enhanced using Semantic Web technologies. In addition, we provide guidelines for semantic interoperability, illustrated by use cases. Finally, we present an overview of the most commonly used metadata standards and tools, and provide the general research direction for semantic interoperability using Semantic Web technologies.
Determining the Geographical Location of Image Scenes based on Object Shadow Lengths
Many studies have addressed various applications of geo-spatial image tagging such as image retrieval, image organisation and browsing. Geo-spatial image tagging can be done manually or automatically with GPS-enabled cameras that allow the current position of the photographer to be incorporated into the meta-data of an image. However, current GPS equipment needs a certain amount of time to lock onto navigation satellites and is therefore not suitable for spontaneous photography. Moreover, GPS units are still costly, energy-hungry and not common in most digital cameras on sale. This study explores the potential of, and limitations associated with, extracting geo-spatial information from the image contents. The elevation of the sun is estimated indirectly from the contents of image collections by measuring the relative length of objects and their shadows in image scenes. The observed sun elevation and the creation time of the image are input into a celestial model to estimate the approximate geographical location of the photographer. The strategy is demonstrated on a set of manually measured photographs.
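The core geometric step, recovering the sun's elevation from an object and its shadow, is just a ratio: tan(elevation) = height / shadow length. A minimal sketch:

```python
import math

def sun_elevation_deg(object_height, shadow_length):
    """Sun elevation angle (degrees) from the ratio of an object's
    height to the length of its shadow on flat ground:
    tan(elevation) = height / shadow. Units cancel, so any consistent
    unit of length works."""
    return math.degrees(math.atan2(object_height, shadow_length))

# A 1 m stick casting a 1 m shadow implies the sun is 45 degrees up.
print(round(sun_elevation_deg(1.0, 1.0)))  # 45
```

The abstract's celestial model then combines this elevation with the image's creation time; the details of that model are not given here, so only the shadow geometry is sketched.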
A Simple Content-based Strategy for Estimating the Geographical Location of a Webcam
This study proposes a strategy for determining the approximate geographical location of a webcam based on a sequence of images taken at regular intervals. For a time-stamped image sequence spanning 24 hours, the approximate sunrise and sunset times are determined by classifying images into day and night-time images based on the image intensity. From the sunrise and sunset times, both the latitude and the longitude of the webcam can be determined. Experimental data demonstrate the effectiveness of the strategy.
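The underlying astronomy can be sketched with the standard sunrise equation; the declination approximation below is an assumption for illustration, since the abstract does not name its celestial model:

```python
import math

def solar_declination_deg(day_of_year):
    """Cooper's approximation of the sun's declination in degrees
    (an assumed model; accurate to within about a degree)."""
    return 23.44 * math.sin(math.radians(360 / 365 * (284 + day_of_year)))

def latitude_from_day_length(day_length_hours, day_of_year):
    """Invert the sunrise equation cos(H0) = -tan(lat) * tan(decl),
    where H0 is half the day length expressed as an hour angle
    (15 degrees per hour)."""
    decl = math.radians(solar_declination_deg(day_of_year))
    h0 = math.radians(day_length_hours / 2 * 15)
    return math.degrees(math.atan2(-math.cos(h0), math.tan(decl)))

def longitude_from_solar_noon(sunrise_utc_h, sunset_utc_h):
    """Solar noon lies midway between sunrise and sunset; each hour
    of offset from 12:00 UTC is 15 degrees of longitude."""
    noon = (sunrise_utc_h + sunset_utc_h) / 2
    return (12.0 - noon) * 15.0

# A 16-hour day at the June solstice (day 172) suggests roughly 49 N;
# sunrise 02:00 and sunset 18:00 UTC put solar noon at 10:00 UTC,
# i.e. about 30 degrees east.
print(round(latitude_from_day_length(16, 172)))  # 49
print(longitude_from_solar_noon(2.0, 18.0))      # 30.0
```

Note that near the equinoxes the day is close to 12 hours everywhere, so the latitude estimate becomes ill-conditioned; the longitude estimate does not suffer from this.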
Multimodalities in Metadata: Gaia Gate
Metadata is information about objects. Existing metadata standards seldom describe details concerning an object's context within an environment; this thesis proposes a new concept, external contextual metadata (ECM), examining metadata, digital photography, and mobile interface theory as context for a proposed multimodal framework of media that expresses the internal and external qualities of the digital object and how they might be employed in various use cases. The framework is bound to a digital image as a singular object. Information contained in these 'images' can then be processed by a renderer application to reinterpret the context in which the image was captured, including non-visually. Two prototypes are developed through the process of designing a renderer for the new multimodal data framework: a proof-of-concept application and a demonstration of 'figurative' execution (titled 'Gaia Gate'), followed by a critical design analysis of the resulting products.
Detecting semantic concepts in digital photographs: low-level features vs. non-homogeneous data fusion
Semantic concepts, such as faces, buildings, and other real world objects, are the most preferred instrument that humans use to navigate through and retrieve visual content from large multimedia databases. Semantic annotation of visual content in large collections is therefore essential if ease of access and use is to be ensured. Classification of images into broad categories such as indoor/outdoor, building/non-building, urban/landscape, people/no-people, etc., allows us to obtain the semantic labels without the full knowledge of all objects in the scene.
Inferring the presence of high-level semantic concepts from low-level visual features is a research topic that has been attracting a significant amount of interest lately. However, the power of low-level visual features alone has been shown to be limited when faced with the task of semantic scene classification in heterogeneous, unconstrained, broad-topic image collections. Multi-modal fusion, or combination of information from different modalities, has been identified as one possible way of overcoming the limitations of single-mode approaches. In the field of digital photography, the incorporation of readily available camera metadata, i.e. information about the image capture conditions stored in the EXIF header of each image, along with GPS information, offers a way to move towards a better understanding of the imaged scene.
In this thesis we focus on detection of semantic concepts such as artificial text in video and large buildings in digital photographs, and examine how fusion of low-level visual features with selected camera metadata, using a Support Vector Machine as an integration device, affects the performance of the building detector on a genuine personal photo collection. We implemented two approaches to detection of buildings that combine content-based and context-based information, and an approach to indoor/outdoor classification based exclusively on camera metadata. An outdoor detection rate of 85.6% was obtained using camera metadata only. The first approach to building detection, based on simple edge orientation-based features extracted at three different scales, has been tested on a dataset of 1720 outdoor images, with a classification accuracy of 88.22%. The second approach integrates the edge orientation-based features with the camera metadata-based features, both at the feature and at the decision level. The fusion approaches have been evaluated using an unconstrained dataset of 8000 genuine consumer photographs. The experiments demonstrate that the fusion approaches outperform the visual-features-only approach by 2-3% on average regardless of the operating point chosen, while all the performance measures are approximately 4% below the upper limit of performance. The early fusion approach consistently improves all performance measures.