Location Embedding and Deep Convolutional Neural Networks for Next Location Prediction

Author: Bachir Abdelmalik
Bechkit Walid
Brahimi Mohammed
Sassi Abdessamed
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/10/2019

International audience

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Cross-Domain Image Retrieval with Attention Modeling

Author: Ji Xin
Wang Wei
Yang Yang
Zhang Meihui
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 06/09/2017

With the proliferation of e-commerce websites and the ubiquity of smart phones, cross-domain image retrieval using images taken by smart phones as queries to search products on e-commerce websites is emerging as a popular application. One challenge of this task is to locate the attention of both the query and database images. In particular, database images, e.g. of fashion products, on e-commerce websites are typically displayed with other accessories, and the images taken by users contain noisy background and large variations in orientation and lighting. Consequently, their attention is difficult to locate. In this paper, we exploit the rich tag information available on the e-commerce websites to locate the attention of database images. For query images, we use each candidate image in the database as the context to locate the query attention. Novel deep convolutional neural network architectures, namely TagYNet and CtxYNet, are proposed to learn the attention weights and then extract effective representations of the images. Experimental results on public datasets confirm that our approaches significantly improve on the existing methods in terms of retrieval accuracy and efficiency. Comment: 8 pages with an extra reference page

arXiv.org e-Print Archive

Crossref

Straight to Shapes: Real-time Detection of Encoded Shapes

Author: Golodetz Stuart
Jetley Saumya
Sapienza Michael
Torr Philip H. S.
Publication date: 05/07/2017

Current object detection approaches predict bounding boxes, but these provide little instance-specific information beyond location, scale and aspect ratio. In this work, we propose to directly regress to objects' shapes in addition to their bounding boxes and categories. It is crucial to find an appropriate shape representation that is compact and decodable, and in which objects can be compared for higher-order concepts such as view similarity, pose variation and occlusion. To achieve this, we use a denoising convolutional auto-encoder to establish an embedding space, and place the decoder after a fast end-to-end network trained to regress directly to the encoded shape vectors. This yields what to the best of our knowledge is the first real-time shape prediction network, running at ~35 FPS on a high-end desktop. With higher-order shape reasoning well-integrated into the network pipeline, the network shows the useful practical quality of generalising to unseen categories similar to the ones in the training set, something that most existing approaches fail to handle. Comment: 16 pages including appendix; Published at CVPR 201

arXiv.org e-Print Archive

Crossref

ABC-CNN: An Attention Based Convolutional Neural Network for Visual Question Answering

Author: Chen Kan
Chen Liang-Chieh
Gao Haoyuan
Nevatia Ram
Wang Jiang
Xu Wei
Publication date: 03/04/2016

We propose a novel attention-based deep learning architecture for the visual question answering (VQA) task. Given an image and an image-related natural language question, VQA generates the natural language answer for the question. Generating the correct answers requires the model's attention to focus on the regions corresponding to the question, because different questions inquire about the attributes of different image regions. We introduce an attention-based configurable convolutional neural network (ABC-CNN) to learn such question-guided attention. ABC-CNN determines an attention map for an image-question pair by convolving the image feature map with configurable convolutional kernels derived from the question's semantics. We evaluate the ABC-CNN architecture on three benchmark VQA datasets: Toronto COCO-QA, DAQUAR, and the VQA dataset. The ABC-CNN model achieves significant improvements over state-of-the-art methods on these datasets. The question-guided attention generated by ABC-CNN is also shown to reflect the regions that are highly relevant to the questions.

arXiv.org e-Print Archive

CiteSeerX
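
The ABC-CNN abstract above describes computing an attention map by convolving an image feature map with a kernel derived from the question. The following is only a rough sketch of that idea in plain Python, not the authors' implementation: the function name `question_guided_attention`, the linear-plus-sigmoid kernel generator `proj`, the zero padding, and the spatial softmax are all assumptions made for illustration.

```python
import math
import random


def question_guided_attention(feat, q_emb, proj, ksize=3):
    """Return an H x W attention map for one image-question pair.

    feat:  C x H x W nested lists (image feature map)
    q_emb: length-D list (question embedding)
    proj:  (C * ksize * ksize) x D nested lists; hypothetical weights
           that turn the question embedding into a convolution kernel
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])

    # Assumed kernel generator: linear projection followed by a sigmoid.
    flat = [
        1.0 / (1.0 + math.exp(-sum(w * x for w, x in zip(row, q_emb))))
        for row in proj
    ]

    def kernel(c, dy, dx):
        return flat[(c * ksize + dy) * ksize + dx]

    pad = ksize // 2
    logits = [[0.0] * W for _ in range(H)]
    for i in range(H):
        for j in range(W):
            s = 0.0
            for c in range(C):
                for dy in range(ksize):
                    for dx in range(ksize):
                        y, x = i + dy - pad, j + dx - pad
                        # Positions outside the map count as zeros.
                        if 0 <= y < H and 0 <= x < W:
                            s += feat[c][y][x] * kernel(c, dy, dx)
            logits[i][j] = s

    # Softmax over all spatial positions so the map sums to 1.
    m = max(max(row) for row in logits)
    exps = [[math.exp(v - m) for v in row] for row in logits]
    total = sum(sum(row) for row in exps)
    return [[v / total for v in row] for row in exps]


if __name__ == "__main__":
    random.seed(0)
    C, H, W, D = 2, 4, 5, 3
    feat = [[[random.random() for _ in range(W)] for _ in range(H)]
            for _ in range(C)]
    q_emb = [random.uniform(-1, 1) for _ in range(D)]
    proj = [[random.uniform(-1, 1) for _ in range(D)]
            for _ in range(C * 3 * 3)]
    attn = question_guided_attention(feat, q_emb, proj)
    print(len(attn), len(attn[0]))
```

As in most deep learning code, the "convolution" here is a sliding-window cross-correlation. Changing `q_emb` changes the kernel and therefore which regions receive high attention, which is the question-guided behaviour the abstract refers to.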