Visualization of and Access to CloudSat Vertical Data through Google Earth
Online tools, pioneered by Google Earth (GE), are making it easier for scientists and the general public to interact with geospatial data in true three dimensions. However, even in Google Earth, there is no method for depicting vertical geospatial data derived from remote sensing satellites as an orbit curtain seen from above. Here, an effective solution is proposed to automatically render vertical atmospheric data on Google Earth. The data are first processed through the Giovanni system and then converted into 15-second vertical data images. A generalized COLLADA model is devised based on the 15-second vertical data profile. Using the designed COLLADA models and satellite orbit coordinates, a satellite orbit model is designed and implemented in KML format to render the vertical atmospheric data vividly across spatial and temporal ranges. The whole orbit model consists of repeated model slices. The model slices, each representing 15 seconds of vertical data, are placed on the CloudSat orbit based on the size, scale, and angle with the longitude line, which are calculated precisely and separately on the fly for each slice from the CloudSat orbit coordinates. The resulting vertical scientific data can be viewed transparently or opaquely on Google Earth. This research not only connects the science and data with scientists and the general public through a widely used platform, but also makes possible the simultaneous visualization and efficient exploration of relationships among quantitative geospatial data, e.g. comparing the vertical data profiles with MODIS and AIRS precipitation data.
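The abstract does not give the formula used to orient each slice, so the following is only a minimal sketch of one way the angle with the longitude line could be obtained: the standard initial great-circle bearing between consecutive orbit points. The function names and the (lat, lon) track layout are hypothetical and are not taken from the paper.

```python
import math

def initial_bearing_deg(lat1, lon1, lat2, lon2):
    """Initial great-circle bearing from point 1 to point 2,
    in degrees clockwise from north (i.e. from the meridian)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    x = math.sin(dlon) * math.cos(phi2)
    y = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.degrees(math.atan2(x, y)) % 360.0

def slice_headings(track):
    """Assumed input: a list of (lat, lon) orbit points, one per 15-second slice
    boundary. Returns one heading per slice, from each point to the next."""
    return [initial_bearing_deg(a[0], a[1], b[0], b[1])
            for a, b in zip(track, track[1:])]
```

A track that runs eastward along the equator yields a heading of 90 degrees for every slice; a heading of this kind is the sort of value a KML model orientation would take, although how the authors actually derive it is not stated.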
Video browsing interfaces and applications: a review
We present a comprehensive review of the state of the art in video browsing and retrieval systems, with special emphasis on interfaces and applications. There has been a significant increase in activity (e.g., storage, retrieval, and sharing) employing video data in the past decade, both for personal and professional use. The ever-growing amount of video content available for human consumption and the inherent characteristics of video data—which, if presented in its raw format, is rather unwieldy and costly—have become driving forces for the development of more effective solutions to present video contents and allow rich user interaction. As a result, there are many contemporary research efforts toward developing better video browsing solutions, which we summarize. We review more than 40 different video browsing and retrieval interfaces and classify them into three groups: applications that use video-player-like interaction, video retrieval applications, and browsing solutions based on video surrogates. For each category, we present a summary of existing work, highlight the technical aspects of each solution, and compare them against each other
Strategies for image visualisation and browsing
PhD
The exploration of large information spaces has remained a challenging task even
though database management systems and state-of-the-art retrieval algorithms
are becoming pervasive. Significant research attention in the
multimedia domain is focused on finding automatic algorithms for organising digital
image collections into meaningful structures and providing high-semantic image
indices. On the other hand, the utilisation of graphical and interactive methods from
the information visualisation domain provides a promising direction for creating efficient
user-oriented systems for image management. Methods such as exploratory browsing
and query, as well as intuitive visual overviews of an image collection, can assist
users in finding patterns and developing an understanding of structures and
content in complex image data-sets.
The focus of the thesis is combining the features of automatic data processing
algorithms with information visualisation. The first part of this thesis focuses on
the layout method for displaying a collection of images indexed by low-level visual
descriptors. The proposed solution generates a graphical overview of the data-set as
a combination of similarity-based visualisation and a random layout approach.
The second part of the thesis deals with the problem of visualisation and exploration for
hierarchical organisation of images. Due to the absence of semantic information,
images are considered the only source of high-level information. The content preview
and display of hierarchical structure are combined in order to support image
retrieval. In addition, novel exploration and navigation methods are proposed
to enable the user to find their way through the database structure and retrieve
the content.
On the other hand, semantic information is available in cases where automatic
or semi-automatic image classifiers are employed. The automatic annotation of
image items provides what is referred to as higher-level information. This type
of information is a cornerstone of the multi-concept visualisation framework which is
developed as the third part of this thesis. This solution enables dynamic generation
of user queries by combining semantic concepts, supported by content overview and
information filtering.
Comparative analysis and user tests, performed for the evaluation of the proposed
solutions, focus on the ways information visualisation affects image content
exploration and retrieval; how efficient and comfortable users are when
using different interaction methods; and the ways users seek information through
different types of database organisation.
SMAN: Stacked Multi-Modal Attention Network for cross-modal image-text retrieval
This article tackles the task of cross-modal image-text retrieval, which has been an interdisciplinary topic in both the computer vision and natural language processing communities. Existing global representation alignment-based methods fail to pinpoint the semantically meaningful portions of images and texts, while local representation alignment schemes suffer from the huge computational burden of exhaustively aggregating the similarity of visual fragments and textual words. In this article, we propose a stacked multimodal attention network (SMAN) that makes use of the stacked multimodal attention mechanism to exploit the fine-grained interdependencies between image and text, thereby mapping the aggregation of attentive fragments into a common space for measuring cross-modal similarity. Specifically, we sequentially employ intramodal information and multimodal information as guidance to perform multiple-step attention reasoning so that the fine-grained correlation between image and text can be modeled. As a consequence, we are capable of discovering the semantically meaningful visual regions or words in a sentence, which contribute to measuring the cross-modal similarity more precisely. Moreover, we present a novel bidirectional ranking loss that pulls paired multimodal instances closer together. Doing so allows us to make full use of pairwise supervised information to preserve the manifold structure of heterogeneous pairwise data. Extensive experiments on two benchmark datasets demonstrate that our SMAN consistently yields competitive performance compared to state-of-the-art methods.
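The abstract does not define its bidirectional ranking loss, so the sketch below shows only the commonly used hinge-based form of such a loss, summed over all mismatched pairs in both retrieval directions. It is an illustration of the general idea, not the loss proposed in the article; the function name, the similarity-matrix layout, and the margin value are assumptions.

```python
def bidirectional_ranking_loss(sim, margin=0.2):
    """Generic hinge-based bidirectional ranking loss (illustrative only).

    sim[i][j] is the similarity between image i and text j; matching
    image-text pairs are assumed to lie on the diagonal.
    """
    n = len(sim)
    loss = 0.0
    for i in range(n):
        pos = sim[i][i]  # similarity of the matching pair
        for j in range(n):
            if j == i:
                continue
            # image -> text: a mismatched text should score below the match by `margin`
            loss += max(0.0, margin - pos + sim[i][j])
            # text -> image: a mismatched image should score below the match by `margin`
            loss += max(0.0, margin - pos + sim[j][i])
    return loss
```

When every matching pair already beats every mismatched pair by at least the margin, the loss is zero; otherwise each violation contributes in proportion to its size.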
MusA: Using Indoor Positioning and Navigation to Enhance Cultural Experiences in a museum
In recent years there has been a growing interest in the use of multimedia mobile guides in museum environments. Mobile devices can detect the user context and provide information that helps visitors discover and follow the logical and emotional connections that develop during the visit. In this scenario, location-based services (LBS) currently represent an asset, and the choice of the technology to determine users' position, combined with the definition of methods that can effectively convey information, become key issues in the design process. In this work, we present MusA (Museum Assistant), a general framework for the development of multimedia interactive guides for mobile devices. Its main feature is a vision-based indoor positioning system that allows the provision of several LBS, from way-finding to the contextualized communication of cultural contents, aimed at providing a meaningful exploration of exhibits according to visitors' personal interest and curiosity. Starting from a thorough description of the system architecture, the article presents the implementation of two mobile guides, developed to address adults and children respectively, and discusses the evaluation of the user experience and the visitors' appreciation of these applications.
From Keyword Search to Exploration: How Result Visualization Aids Discovery on the Web
A key to the Web's success is the power of search. The elegant way in which search results are returned is usually remarkably effective. However, for exploratory search in which users need to learn, discover, and understand novel or complex topics, there is substantial room for improvement. Human-computer interaction researchers and web browser designers have developed novel strategies to improve Web search by enabling users to conveniently visualize, manipulate, and organize their Web search results. This monograph offers fresh ways to think about search-related cognitive processes and describes innovative design approaches to browsers and related tools. For instance, while keyword search presents users with results for specific information (e.g., what is the capital of Peru), other methods may let users see and explore the contexts of their requests for information (related or previous work, conflicting information), or the properties that associate groups of information assets (group legal decisions by lead attorney). We also consider both the traditional and novel ways in which these strategies have been evaluated. From our review of cognitive processes, browser design, and evaluations, we reflect on the future opportunities and new paradigms for exploring and interacting with Web search results.
Beyond 2D-grids: a dependence maximization view on image browsing
Ideally, one would like to perform image search using an intuitive and friendly approach. Many existing image search engines, however, present users with sets of images arranged on the screen in some default order only, typically by relevance to a query. While this certainly has its advantages, a more flexible and intuitive way would arguably be to sort images into arbitrary structures such as grids, hierarchies, or spheres so that images that are visually or semantically alike are placed together. This paper focuses on designing such a navigation system for image browsers. This is a challenging task because an arbitrary layout structure makes it difficult, if not impossible, to compute cross-similarities between images and structure coordinates, the main ingredient of traditional layout approaches. For this reason, we resort to a recently developed machine learning technique: kernelized sorting. It is a general technique for matching pairs of objects from different domains without requiring cross-domain similarity measures and hence elegantly allows sorting images into arbitrary structures. Moreover, we extend it so that some images can be preselected, for instance to form the tip of the hierarchy, allowing users to subsequently navigate through the search results in the lower levels in an intuitive way.
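To make the matching idea concrete, here is a minimal sketch of the dependence-maximization objective behind kernelized sorting: given one kernel matrix over the images and another over the layout positions, choose the assignment that maximizes the dependence between the two centered kernels. The brute-force search over permutations is a stand-in that only works for a handful of items and is not the optimization procedure used in the paper; all names are hypothetical.

```python
from itertools import permutations

def center(K):
    """Double-center a square kernel matrix (equivalent to H K H)."""
    n = len(K)
    row = [sum(r) / n for r in K]
    col = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    tot = sum(row) / n
    return [[K[i][j] - row[i] - col[j] + tot for j in range(n)]
            for i in range(n)]

def kernelized_sort_bruteforce(K, L):
    """Return perm such that object i is placed at position perm[i].

    K: kernel matrix over the objects (e.g. images).
    L: kernel matrix over the layout positions (e.g. grid cells).
    No cross-domain similarity between objects and positions is needed.
    """
    Kc, Lc = center(K), center(L)
    n = len(K)

    def dependence(perm):
        # Sum of elementwise products of the centered kernels under this assignment
        return sum(Kc[i][j] * Lc[perm[i]][perm[j]]
                   for i in range(n) for j in range(n))

    return max(permutations(range(n)), key=dependence)
```

With linear kernels on three scalar features and three positions on a line, the best assignment orders the objects monotonically along the line; the objective does not distinguish a layout from its mirror image, so either direction is optimal.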
Exploring EEG for Object Detection and Retrieval
This paper explores the potential for using Brain Computer Interfaces (BCI)
as a relevance feedback mechanism in content-based image retrieval. We
investigate if it is possible to capture useful EEG signals to detect if
relevant objects are present in a dataset of realistic and complex images. We
perform several experiments using a rapid serial visual presentation (RSVP) of
images at different rates (5Hz and 10Hz) on 8 users with different degrees of
familiarization with BCI and the dataset. We then use the feedback from the BCI
and mouse-based interfaces to retrieve localized objects in a subset of TRECVid
images. We show that it is indeed possible to detect such objects in complex
images and that users with previous knowledge of the dataset or
experience with the RSVP outperform others. When the users have limited time to
annotate the images (100 seconds in our experiments) both interfaces are
comparable in performance. Comparing our best users in a retrieval task, we
found that EEG-based relevance feedback outperforms mouse-based feedback. The
realistic and complex image dataset differentiates our work from previous
studies on EEG for image retrieval.
Comment: This preprint is the full version of a short paper accepted in the
ACM International Conference on Multimedia Retrieval (ICMR) 2015 (Shanghai,
China)
Designing a training tool for imaging mental models
The training process can be conceptualized as the student acquiring an evolutionary sequence of classification-problem-solving mental models. For example, a physician learns (1) classification systems for patient symptoms, diagnostic procedures, diseases, and therapeutic interventions and (2) interrelationships among these classifications (e.g., how to use diagnostic procedures to collect data about a patient's symptoms in order to identify the disease so that therapeutic measures can be taken). This project developed functional specifications for a computer-based tool, Mental Link, that allows the evaluative imaging of such mental models. The fundamental design approach underlying this representational medium is traversal of virtual cognition space. Typically intangible cognitive entities and the links among them are made visible as a three-dimensional web that represents a knowledge structure. The tool has a high degree of flexibility and customizability to allow extension to other types of uses, such as a front end to an intelligent tutoring system, knowledge base, hypermedia system, or semantic network.