3,291 research outputs found

GeoSay: A Geometric Saliency for Extracting Buildings in Remote Sensing Images

Author: Huang Jin
Lu Qikai
Xia Gui-Song
Xue Nan
Zhu Xiaoxiang
Publication date: 07/11/2018

Automatic extraction of buildings in remote sensing images is an important but challenging task with many applications in fields such as urban planning and navigation. This paper addresses the problem of building extraction in very high-spatial-resolution (VHSR) remote sensing (RS) images, whose spatial resolution is often as fine as half a meter and which provide rich information about buildings. Based on the observation that buildings in VHSR-RS images are always more distinguishable in geometry than in the texture or spectral domain, this paper proposes a geometric building index (GBI) for accurate building extraction, by computing the geometric saliency from VHSR-RS images. More precisely, given an image, the geometric saliency is derived from a mid-level geometric representation based on meaningful junctions that can locally describe the geometrical structures of images. The resulting GBI is finally measured by integrating the derived geometric saliency of buildings. Experiments on three public and commonly used datasets demonstrate that the proposed GBI achieves state-of-the-art performance and shows impressive generalization capability. Additionally, GBI preserves both the exact position and accurate shape of single buildings compared to existing methods.

arXiv.org e-Print Archive

Institute of Transport Research:Publications

Saliency Prediction in the Data Visualization Design Process

Author: BARRERA LEON LUISA FERNANDA
Publication venue: country:Italy
Publication date: 28/10/2022

The abstract is in the attachment.

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

The influence of banner advertisements on attention and memory: human faces with averted gaze can enhance advertising effectiveness

Author: Althoff
Ball
Ball
Baron-Cohen
Publication venue: Frontiers Media SA
Publication date: 01/01/2014

Research suggests that banner advertisements used in online marketing are often overlooked, especially when positioned horizontally on webpages. Such inattention invariably gives rise to an inability to remember advertising brands and messages, undermining the effectiveness of this marketing method. Recent interest has focused on whether human faces within banner advertisements can increase attention to the information they contain, since the gaze cues conveyed by faces can influence where observers look. We report an experiment that investigated the efficacy of faces located in banner advertisements to enhance the attentional processing and memorability of banner contents. We tracked participants’ eye movements when they examined webpages containing either bottom-right vertical banners or bottom-centre horizontal banners. We also manipulated facial information such that banners either contained no face, a face with mutual gaze or a face with averted gaze. We additionally assessed people’s memories for brands and advertising messages. Results indicated that relative to other conditions, the condition involving faces with averted gaze increased attention to the banner overall, as well as to the advertising text and product. Memorability of the brand and advertising message was also enhanced. Conversely, in the condition involving faces with mutual gaze, the focus of attention was localised more on the face region than on the text or product, weakening any memory benefits for the brand and advertising message. This detrimental impact of mutual gaze on attention to advertised products was especially marked for vertical banners. These results demonstrate that the inclusion of human faces with averted gaze in banner advertisements provides a promising means for marketers to increase the attention paid to such adverts, thereby enhancing memory for advertising information.

CLoK

Crossref

Directory of Open Access Journals

Frontiers - Publisher Connector

PubMed Central

Content Recognition and Context Modeling for Document Analysis and Retrieval

Author: Zhu Guangyu
Publication date: 01/01/2009

The nature and scope of available documents are changing significantly in many areas of document analysis and retrieval as complex, heterogeneous collections become accessible to virtually everyone via the web. The increasing level of diversity presents a great challenge for document image content categorization, indexing, and retrieval. Meanwhile, the processing of documents with unconstrained layouts and complex formatting often requires effective leveraging of broad contextual knowledge. In this dissertation, we first present a novel approach for document image content categorization, using a lexicon of shape features. Each lexical word corresponds to a scale and rotation invariant local shape feature that is generic enough to be detected repeatably and is segmentation free. A concise, structurally indexed shape lexicon is learned by clustering and partitioning feature types through graph cuts. Our idea finds successful application in several challenging tasks, including content recognition of diverse web images and language identification on documents composed of mixed machine printed text and handwriting. Second, we address two fundamental problems in signature-based document image retrieval. Facing continually increasing volumes of documents, detecting and recognizing unique, evidentiary visual entities (e.g., signatures and logos) provides a practical and reliable supplement to the OCR recognition of printed text. We propose a novel multi-scale framework to detect and segment signatures jointly from document images, based on the structural saliency under a signature production model. We formulate the problem of signature retrieval in the unconstrained setting of geometry-invariant deformable shape matching and demonstrate state-of-the-art performance in signature matching and verification. Third, we present a model-based approach for extracting relevant named entities from unstructured documents.
In a wide range of applications that require structured information from diverse, unstructured document images, processing OCR text does not give satisfactory results due to the absence of linguistic context. Our approach enables learning of inference rules collectively based on contextual information from both page layout and text features. Finally, we demonstrate the importance of mining general web user behavior data for improving document ranking and other aspects of the web search experience. The context of web user activities reveals their preferences and intents, and we emphasize the analysis of individual user sessions for creating aggregate models. We introduce a novel algorithm for estimating web page and web site importance, and discuss its theoretical foundation based on an intentional surfer model. We demonstrate that our approach significantly improves large-scale document retrieval performance.

CiteSeerX

Digital Repository at the University of Maryland

Understanding the Role of Explanations in Computer Vision Applications

Author: Alqaraawi Ahmed
Publication venue: UCL (University College London)
Publication date: 28/03/2022

Recent advancements in AI show great performance over a range of applications, but its operations are hard to interpret, even for experts. Various explanation algorithms have been proposed to address this issue, yet limited research effort has been reported concerning their user evaluation. Against this background, this thesis reports on four user studies designed to investigate the role of explanations in helping end-users build a better functional understanding of computer vision processes. In addition, we seek to understand what features lay users attend to in order to build such functional understanding, and whether different techniques provide different gains. In particular, we begin by examining the utility of "keypoint markers": coloured dot visualisations that correspond to patterns of interest identified by an underlying algorithm and can be seen in many computer vision applications. We then investigate the utility of saliency maps, a popular group of explanations for the operation of Convolutional Neural Networks (CNNs). The findings indicate that keypoint markers can be helpful if they are presented in line with users' expectations. They also indicate that saliency maps can improve participants' ability to predict the outcome of a CNN, but only moderately. Overall, this thesis contributes by evaluating these explanation techniques through user studies. It also reports a number of key findings that offer practitioners guidelines on how and when to use these explanations, as well as which types of users to target. Furthermore, it proposes and evaluates two novel explanation techniques as well as a number of tools that help researchers and practitioners when designing user studies around the evaluation of explanations. Finally, this thesis highlights a number of implications for the design of explanation techniques and further research in that area.

UCL Discovery