7 research outputs found
Measurement of Pedestrian Traffic Using Feature-based Regression in the Spatiotemporal Domain
NICE 2023 Zero-shot Image Captioning Challenge
In this report, we introduce NICE
project\footnote{\url{https://nice.lgresearch.ai/}} and share the results and
outcomes of NICE challenge 2023. This project is designed to challenge the
computer vision community to develop robust image captioning models that
advance the state-of-the-art both in terms of accuracy and fairness. Through
the challenge, the image captioning models were tested using a new evaluation
dataset that includes a large variety of visual concepts from many domains.
There was no specific training data provided for the challenge, and therefore
the challenge entries were required to adapt to new types of image descriptions
that had not been seen during training. This report includes information on the
newly proposed NICE dataset, evaluation methods, challenge results, and
technical details of top-ranking entries. We expect that the outcomes of the
challenge will contribute to the improvement of AI models on various
vision-language tasks.Comment: Tech report, project page https://nice.lgresearch.ai
Strength Development of Alkali-Activated Fly Ash Exposed to a Carbon Dioxide-Rich Environment at an Early Age
Superpixel Image Classification with Graph Convolutional Neural Networks Based on Learnable Positional Embedding
Graph convolutional neural networks (GCNNs) have been successfully applied to a wide range of problems, including low-dimensional Euclidean structural domains representing images, videos, and speech and high-dimensional non-Euclidean domains, such as social networks and chemical molecular structures. However, in computer vision, the existing GCNNs are not provided with positional information to distinguish between graphs of new structures; therefore, the performance of the image classification domain represented by arbitrary graphs is significantly poor. In this work, we introduce how to initialize the positional information through a random walk algorithm and continuously learn the additional position-embedded information of various graph structures represented over the superpixel images we choose for efficiency. We call this method the graph convolutional network with learnable positional embedding applied on images (IMGCN-LPE). We apply IMGCN-LPE to three graph convolutional models (the Chebyshev graph convolutional network, graph convolutional network, and graph attention network) to validate performance on various benchmark image datasets. As a result, although not as impressive as convolutional neural networks, the proposed method outperforms various other conventional convolutional methods and demonstrates its effectiveness among the same tasks in the field of GCNNs