
    The Models of Authority Project: Extending the DigiPal Framework for Script and Decoration

    The DigiPal project for palaeography has featured in previous DH conferences. It includes a generalised framework for the description and analysis of handwriting, initially applied to Old English of the eleventh century but subsequently extended to Latin, Hebrew, and decoration; it incorporates a novel model for describing handwriting; and a recent addition allows the embedding of linked palaeographical images into prose description. The purpose of this poster is to present new developments which form part of two further major grants, one of which is the Models of Authority project. Specifically, the focus here is on the incorporation of textual content into the model for handwriting.

    Pose-Guided Multi-Granularity Attention Network for Text-Based Person Search

    Text-based person search aims to retrieve the matching person images from an image database using a sentence describing the person, which has great potential for applications such as video surveillance. Extracting the visual content that corresponds to the human description is the key to this cross-modal matching problem. Moreover, correlated images and descriptions involve different granularities of semantic relevance, which previous methods usually ignore. To exploit the multilevel corresponding visual content, we propose a pose-guided multi-granularity attention network (PMA). First, we propose a coarse alignment network (CA) that selects the image regions related to the global description using similarity-based attention. To further capture the phrase-related visual body parts, we propose a fine-grained alignment network (FA), which employs pose information to learn a latent semantic alignment between visual body parts and textual noun phrases. To verify the effectiveness of our model, we perform extensive experiments on the CUHK Person Description Dataset (CUHK-PEDES), which is currently the only available dataset for text-based person search. Experimental results show that our approach outperforms the state-of-the-art methods by 15% in terms of the top-1 metric. Comment: published in AAAI 2020 (oral)
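The abstract does not specify how the coarse alignment network computes its attention, so the following is only a generic sketch of similarity-based attention, not the authors' implementation. It assumes the sentence and each image region are already embedded as plain vectors of equal length; the names `cosine`, `attend`, and `temperature`, and the toy vectors, are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def attend(sentence_vec, region_vecs, temperature=1.0):
    """Weight image regions by their similarity to the sentence embedding.

    Returns the softmax attention weights and the attention-pooled
    region feature. `temperature` is an assumed knob, not from the paper.
    """
    scores = [cosine(sentence_vec, r) / temperature for r in region_vecs]
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(region_vecs[0])
    pooled = [sum(w * r[d] for w, r in zip(weights, region_vecs))
              for d in range(dim)]
    return weights, pooled


# Toy data: one sentence embedding and three region embeddings (made up).
sentence = [1.0, 0.0, 1.0]
regions = [[0.9, 0.1, 0.8], [0.0, 1.0, 0.0], [0.5, 0.5, 0.5]]
weights, pooled = attend(sentence, regions)
```

In this toy case the first region points in nearly the same direction as the sentence vector, so it receives the largest weight, and the orthogonal second region receives the smallest.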

    Information Enhancement for Travelogues via a Hybrid Clustering Model

    © 2018 IEEE. Travelogues consist of textual information shared by tourists through web forums or other social media, and they often lack illustrations (images). On image-sharing websites like Flickr, users can post images with rich textual information: 'title', 'tag' and 'description'. The topics of travelogues usually revolve around beautiful scenery. Recommending corresponding landscape images for these travelogues can make them more vivid to read. However, it is difficult to fuse such information because the text attached to each image has diverse meanings/views. In this paper, we propose an unsupervised Hybrid Multiple Kernel K-means (HMKKM) model to link images and travelogues through multiple views. Multi-view matrices are built to reveal the correlations between several aspects. To further improve performance, we add a regularisation term based on textual similarity. To evaluate the effectiveness of the proposed method, a dataset is constructed from TripAdvisor and Flickr to find the related images for each travelogue. Experimental results demonstrate the superiority of the proposed model in comparison with other baselines.
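The abstract gives no formulation for HMKKM, so the sketch below shows only the general idea it builds on: plain kernel k-means run on a fixed weighted sum of per-view kernel matrices. It is a simplification under stated assumptions and omits the hybrid model and the textual-similarity regularisation. The function names, the fixed view weights, the initial labels, and the two toy kernels are all hypothetical.

```python
def combine_kernels(kernels, weights):
    """Weighted sum of same-sized kernel matrices, one per view."""
    n = len(kernels[0])
    return [[sum(w * k[i][j] for k, w in zip(kernels, weights))
             for j in range(n)] for i in range(n)]


def kernel_kmeans(K, init_labels, n_clusters, max_iter=50):
    """Plain kernel k-means on a precomputed kernel matrix K.

    The squared feature-space distance from item i to cluster c is
    K[i][i] - 2 * mean_j K[i][j] + mean_{j,l} K[j][l], with j, l in c.
    """
    n = len(K)
    labels = list(init_labels)
    for _ in range(max_iter):
        members = [[j for j in range(n) if labels[j] == c]
                   for c in range(n_clusters)]
        # Average pairwise similarity inside each cluster (0.0 if empty).
        within = [sum(K[j][l] for j in m for l in m) / len(m) ** 2 if m else 0.0
                  for m in members]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # skip empty clusters
                d = K[i][i] - 2.0 * sum(K[i][j] for j in m) / len(m) + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


# Toy kernels for four items under two views (values are made up).
text_kernel = [[1.0, 0.9, 0.1, 0.0], [0.9, 1.0, 0.0, 0.1],
               [0.1, 0.0, 1.0, 0.8], [0.0, 0.1, 0.8, 1.0]]
tag_kernel = [[1.0, 0.8, 0.2, 0.1], [0.8, 1.0, 0.1, 0.2],
              [0.2, 0.1, 1.0, 0.9], [0.1, 0.2, 0.9, 1.0]]
K = combine_kernels([text_kernel, tag_kernel], [0.5, 0.5])
labels = kernel_kmeans(K, init_labels=[0, 1, 1, 1], n_clusters=2)
```

Kernel k-means is sensitive to its starting assignment: with these toy kernels an initial labelling of `[0, 1, 0, 1]` is already a fixed point, so the result depends on the chosen initialisation.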