
    Attentive neural architecture for ad-hoc structured document retrieval

    The problem of ad-hoc structured document retrieval arises in many information access scenarios, from Web to product search. Yet neither deep neural networks, which have been successfully applied to ad-hoc information retrieval and Web search, nor the attention mechanism, which has been shown to significantly improve the performance of deep neural networks on natural language processing tasks, has been explored in the context of this problem. In this paper, we propose a deep neural architecture for ad-hoc structured document retrieval that uses an attention mechanism to determine the important phrases in keyword queries as well as the relative importance of matching those phrases in different fields of structured documents. Experimental evaluation on publicly available collections for Web document, product, and entity retrieval from knowledge graphs indicates superior retrieval accuracy of the proposed neural architecture relative to both state-of-the-art neural architectures for ad-hoc document retrieval and probabilistic models for ad-hoc structured document retrieval.
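
    As a rough illustration of the core idea, and not the authors' exact model, the following NumPy sketch shows how softmax attention can weight query phrases and, per phrase, the document fields in which a match matters most; all names, dimensions, and the random "learned" vectors are hypothetical.

    import numpy as np

    def softmax(x):
        e = np.exp(x - x.max())
        return e / e.sum()

    # Hypothetical inputs: embeddings for 3 query phrases and 2 document
    # fields (e.g., title and body), each of dimension 4.
    rng = np.random.default_rng(0)
    phrases = rng.normal(size=(3, 4))  # query phrase embeddings
    fields = rng.normal(size=(2, 4))   # field embeddings

    # Attention over query phrases: a scoring vector (random here, learned
    # in a real model) turned into phrase importance weights via softmax.
    w_phrase = rng.normal(size=4)
    phrase_attn = softmax(phrases @ w_phrase)                         # (3,)

    # Attention over fields, conditioned on each phrase: how important is
    # matching this phrase in each field?
    field_attn = np.apply_along_axis(softmax, 1, phrases @ fields.T)  # (3, 2)

    # Cosine similarity between every phrase and every field.
    sim = (phrases @ fields.T) / (
        np.linalg.norm(phrases, axis=1, keepdims=True)
        * np.linalg.norm(fields, axis=1))                             # (3, 2)

    # Relevance score: field-attended match per phrase, then a
    # phrase-attention-weighted sum over phrases.
    score = phrase_attn @ (field_attn * sim).sum(axis=1)
    print(score)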

    Considerations about learning Word2Vec

    Despite the wide diffusion and use of embeddings generated with Word2Vec, there are still many open questions about the reasons for its results and about its real capabilities. In particular, to our knowledge, no author seems to have analysed in detail how learning may be affected by the various choices of hyperparameters. In this work, we try to shed some light on various issues, focusing on a typical dataset. It is shown that the learning rate prevents an exact mapping of the co-occurrence matrix, that Word2Vec is unable to learn syntactic relationships, and that it does not suffer from the problem of overfitting. Furthermore, through the creation of an ad-hoc network, it is also shown how it is possible to improve Word2Vec directly on the analogies, obtaining very high accuracy without damaging the pre-existing embedding. This analogy-enhanced Word2Vec may be convenient in various NLP scenarios, but it is used here as an optimal starting point to evaluate the limits of Word2Vec.
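
    For readers who want to run this kind of hyperparameter study themselves, a minimal sketch with gensim follows; the toy corpus and parameter values are placeholders, not the dataset or settings used in the paper.

    from gensim.models import Word2Vec

    # Toy corpus; a real experiment would use a sizeable dataset.
    sentences = [
        ["king", "rules", "the", "kingdom"],
        ["queen", "rules", "the", "kingdom"],
        ["man", "walks", "in", "the", "city"],
        ["woman", "walks", "in", "the", "city"],
    ]

    # alpha is the initial learning rate, whose effect on mapping the
    # co-occurrence matrix the paper analyses; the other values are
    # arbitrary starting points for experimentation.
    model = Word2Vec(
        sentences,
        vector_size=50,  # embedding dimensionality
        window=2,        # context window size
        min_count=1,     # keep every token in this toy corpus
        sg=1,            # skip-gram (0 selects CBOW)
        alpha=0.025,     # initial learning rate
        epochs=100,
    )

    # Analogy-style probe: king - man + woman ~ queen (results will be
    # noisy on a corpus this small).
    print(model.wv.most_similar(positive=["king", "woman"],
                                negative=["man"], topn=3))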

    Understanding, Categorizing and Predicting Semantic Image-Text Relations

    Two modalities are often used to convey information in a complementary and beneficial manner, e.g., in online news, videos, educational resources, or scientific publications. The automatic understanding of semantic correlations between text and associated images, as well as of their interplay, has great potential for enhanced multimodal web search and recommender systems. However, automatic understanding of multimodal information is still an unsolved research problem. Recent approaches such as image captioning focus on precisely describing visual content and translating it to text, but typically address neither semantic interpretations nor the specific role or purpose of an image-text constellation. In this paper, we go beyond previous work and investigate, inspired by research in visual communication, useful semantic image-text relations for multimodal information retrieval. We derive a categorization of eight semantic image-text classes (e.g., "illustration" or "anchorage") and show how they can be systematically characterized by a set of three metrics: cross-modal mutual information, semantic correlation, and the status relation of image and text. Furthermore, we present a deep learning system to predict these classes by utilizing multimodal embeddings. To obtain a sufficiently large amount of training data, we have automatically collected and augmented data from a variety of data sets and web resources, which enables future research on this topic. Experimental results on a demanding test set demonstrate the feasibility of the approach.
    Comment: 8 pages, 8 figures, 5 tables
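
    A minimal sketch of the prediction step, assuming precomputed image and text embeddings: the fusion by concatenation, the dimensions, and the classifier below are illustrative stand-ins, not the paper's implementation.

    import torch
    import torch.nn as nn

    NUM_CLASSES = 8  # the eight semantic image-text classes

    class ImageTextRelationClassifier(nn.Module):
        """Fuse multimodal embeddings and predict a relation class."""

        def __init__(self, img_dim=512, txt_dim=300, hidden=256):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(img_dim + txt_dim, hidden),
                nn.ReLU(),
                nn.Dropout(0.3),
                nn.Linear(hidden, NUM_CLASSES),
            )

        def forward(self, img_emb, txt_emb):
            # Simple fusion by concatenation; the paper's system may
            # combine the modalities differently.
            fused = torch.cat([img_emb, txt_emb], dim=-1)
            return self.mlp(fused)  # raw logits over the 8 classes

    # Smoke test with random tensors standing in for real encoder outputs.
    model = ImageTextRelationClassifier()
    logits = model(torch.randn(4, 512), torch.randn(4, 300))
    print(logits.argmax(dim=-1))  # predicted class index per image-text pair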