111,349 research outputs found
Learning Convolutional Text Representations for Visual Question Answering
Visual question answering is a recently proposed artificial intelligence task
that requires a deep understanding of both images and texts. In deep learning,
images are typically modeled through convolutional neural networks, and texts
are typically modeled through recurrent neural networks. While the requirement
for modeling images is similar to traditional computer vision tasks, such as
object recognition and image classification, visual question answering raises a
different need for textual representation as compared to other natural language
processing tasks. In this work, we perform a detailed analysis on natural
language questions in visual question answering. Based on the analysis, we
propose to rely on convolutional neural networks for learning textual
representations. By exploring the various properties of convolutional neural
networks specialized for text data, such as width and depth, we present our
"CNN Inception + Gate" model. We show that our model improves question
representations and thus the overall accuracy of visual question answering
models. We also show that the text representation requirement in visual
question answering is more complicated and comprehensive than that in
conventional natural language processing tasks, making it a better task to
evaluate textual representation methods. Shallow models like fastText, which
can obtain comparable results with deep learning models in tasks like text
classification, are not suitable in visual question answering.Comment: Conference paper at SDM 2018. https://github.com/divelab/sva
TOWARDS DEEP LEARNING FOR ARCHITECTURE: A MONUMENT RECOGNITION MOBILE APP
Abstract. In recent years, the diffusion of large image datasets and an unprecedented computational power have boosted the development of a class of artificial intelligence (AI) algorithms referred to as deep learning (DL). Among DL methods, convolutional neural networks (CNNs) have proven particularly effective in computer vision, finding applications in many disciplines. This paper introduces a project aimed at studying CNN techniques in the field of architectural heritage, a still to be developed research stream. The first steps and results in the development of a mobile app to recognize monuments are discussed. While AI is just beginning to interact with the built environment through mobile devices, heritage technologies have long been producing and exploring digital models and spatial archives. The interaction between DL algorithms and state-of-the-art information modeling is addressed, as an opportunity to both exploit heritage collections and optimize new object recognition techniques.</p
Classification of Time-Series Images Using Deep Convolutional Neural Networks
Convolutional Neural Networks (CNN) has achieved a great success in image
recognition task by automatically learning a hierarchical feature
representation from raw data. While the majority of Time-Series Classification
(TSC) literature is focused on 1D signals, this paper uses Recurrence Plots
(RP) to transform time-series into 2D texture images and then take advantage of
the deep CNN classifier. Image representation of time-series introduces
different feature types that are not available for 1D signals, and therefore
TSC can be treated as texture image recognition task. CNN model also allows
learning different levels of representations together with a classifier,
jointly and automatically. Therefore, using RP and CNN in a unified framework
is expected to boost the recognition rate of TSC. Experimental results on the
UCR time-series classification archive demonstrate competitive accuracy of the
proposed approach, compared not only to the existing deep architectures, but
also to the state-of-the art TSC algorithms.Comment: The 10th International Conference on Machine Vision (ICMV 2017
- …