ATRank: An Attention-Based User Behavior Modeling Framework for Recommendation

Authors: Bai Jinze, Chen Xiusi, Gao Jun, Liu Xiaofei, Song Junshuai, Zhao Zhengchao, Zhou Chang
Publication date: 27/11/2017

A user can be represented as what he/she does along the history. A common way to deal with the user modeling problem is to manually extract all kinds of aggregated features over the heterogeneous behaviors, which may fail to fully represent the data itself due to limited human instinct. Recent works usually use RNN-based methods to give an overall embedding of a behavior sequence, which then could be exploited by the downstream applications. However, this can only preserve very limited information, or aggregated memories of a person. When a downstream application requires to facilitate the modeled user features, it may lose the integrity of the specific highly correlated behavior of the user, and introduce noises derived from unrelated behaviors. This paper proposes an attention based user behavior modeling framework called ATRank, which we mainly use for recommendation tasks. Heterogeneous user behaviors are considered in our model that we project all types of behaviors into multiple latent semantic spaces, where influence can be made among the behaviors via self-attention. Downstream applications then can use the user behavior vectors via vanilla attention. Experiments show that ATRank can achieve better performance and faster training process. We further explore ATRank to use one unified model to predict different types of user behaviors at the same time, showing a comparable performance with the highly optimized individual models. Comment: AAAI 201
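The two attention steps named in the abstract (self-attention among behaviors, then vanilla attention from a downstream query) can be sketched roughly as below. This is a minimal illustration of generic scaled dot-product attention, not the ATRank implementation: the function names, the toy vectors, and the omission of learned projections into multiple latent spaces are all assumptions made for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(query, keys, values):
    """Scaled dot-product attention of one query over key/value vectors."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]

def self_attention(behaviors):
    """Each behavior attends over all behaviors (no learned projections here)."""
    return [attend(b, behaviors, behaviors) for b in behaviors]

# Hypothetical toy data: three behavior embeddings in a shared 2-d space.
behaviors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
contextual = self_attention(behaviors)

# A downstream task queries the contextualized behaviors with its own vector
# (an assumed candidate-item embedding).
candidate_item = [1.0, 0.0]
user_vector = attend(candidate_item, contextual, contextual)
```

In this sketch the resulting `user_vector` leans toward the behaviors most similar to the candidate item, which is the intuition behind letting the downstream task choose which behaviors matter instead of relying on a single fixed sequence embedding.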

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

TouchStone: Evaluating Vision-Language Models by Language Models

Authors: Bai Jinze, Bai Shuai, Lin Junyang, Wang Peng, Wang Xinggang, Yang Shusheng, Zhang Xingxuan, Zhou Chang, Zhou Jingren
Publication date: 04/09/2023

Large vision-language models (LVLMs) have recently witnessed rapid advancements, exhibiting a remarkable capacity for perceiving, understanding, and processing visual information by connecting visual receptor with large language models (LLMs). However, current assessments mainly focus on recognizing and reasoning abilities, lacking direct evaluation of conversational skills and neglecting visual storytelling abilities. In this paper, we propose an evaluation method that uses strong LLMs as judges to comprehensively evaluate the various abilities of LVLMs. Firstly, we construct a comprehensive visual dialogue dataset TouchStone, consisting of open-world images and questions, covering five major categories of abilities and 27 subtasks. This dataset not only covers fundamental recognition and comprehension but also extends to literary creation. Secondly, by integrating detailed image annotations we effectively transform the multimodal input content into a form understandable by LLMs. This enables us to employ advanced LLMs for directly evaluating the quality of the multimodal dialogue without requiring human intervention. Through validation, we demonstrate that powerful LVLMs, such as GPT-4, can effectively score dialogue quality by leveraging their textual capabilities alone, aligning with human preferences. We hope our work can serve as a touchstone for LVLMs' evaluation and pave the way for building stronger LVLMs. The evaluation code is available at https://github.com/OFA-Sys/TouchStone.

arXiv.org e-Print Archive

Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities

Authors: Bai Jinze, Bai Shuai, Lin Junyang, Tan Sinan, Wang Peng, Wang Shijie, Yang Shusheng, Zhou Chang, Zhou Jingren
Publication date: 24/08/2023

We introduce the Qwen-VL series, a set of large-scale vision-language models designed to perceive and understand both text and images. Comprising Qwen-VL and Qwen-VL-Chat, these models exhibit remarkable performance in tasks like image captioning, question answering, visual localization, and flexible interaction. The evaluation covers a wide range of tasks including zero-shot captioning, visual or document visual question answering, and grounding. We demonstrate the Qwen-VL outperforms existing Large Vision Language Models (LVLMs). We present their architecture, training, capabilities, and performance, highlighting their contributions to advancing multimodal artificial intelligence. Code, demo and models are available at https://github.com/QwenLM/Qwen-VL.

arXiv.org e-Print Archive

Qwen Technical Report

Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive language model series that encompasses distinct models with varying parameter counts. It includes Qwen, the base pretrained language models, and Qwen-Chat, the chat models finetuned with human alignment techniques. The base language models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained using Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for creating agent applications, showcasing impressive performance even when compared to bigger models on complex tasks like utilizing a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, which are built upon base language models. These models demonstrate significantly improved performance in comparison with open-source models, and slightly fall behind the proprietary models. Comment: 59 pages, 5 figure

arXiv.org e-Print Archive

Fractal complex transform technology for fractal Korteweg-de Vries equation within a local fractional derivative

Authors: Jian-Hong Wang, Jinze Xu, Ping Cui, Yunru Bai, Zeng-Shun Chen
Publication venue: National Library of Serbia
Publication date: 01/01/2016

Crossref

Fractal complex transform technology for fractal Korteweg-de Vries equation within a local fractional derivative

Authors: Bai Yunru, Chen Zeng-Shun, Cui Ping, Wang Jian-Hong, Xu Jinze
Publication venue: VINCA Institute of Nuclear Sciences
Publication date: 01/01/2016

In this paper, we present the fractal complex transform via a local fractional derivative. The traveling wave solutions for the fractal Korteweg-de Vries equations within local fractional derivative are obtained based on the special functions defined on Cantor sets. The technology is a powerful tool for solving the local fractional non-linear partial differential equations.

Directory of Open Access Journals

Fabrication of Ordered SnO2 Nanostructures with Enhanced Humidity Sensing Performance

Authors: Chao Ding, Gang Bai, Jie Xu, Jinze Li, Juyan Liu, Qingying Ren, Wei Li
Publication venue: MDPI AG
Publication date: 01/10/2017

Ordered SnO2 nanostructures were prepared as humidity sensors by nanosphere lithography with the magnetron sputtering technique. The X-ray diffraction patterns of SnO2 nanostructures show that all intense diffraction peaks correspond to the crystallographic planes of SnO2. The Atomic Force Microscope (AFM) image shows that these SnO2 nanostructures exhibited a classic honeycomb structure. The resistance of this sensor was measured to show that the resistance of the sensor decreases with an increase from lower relative humidity (RH) to higher RH. Additionally, the longest response/recovery time was 32 s/42 s for 11–96% RH. The hysteresis for the SnO2 nanostructure sensor was <5%.

Directory of Open Access Journals

Enhancement of Detecting Permanent Water and Temporary Water in Flood Disasters by Fusing Sentinel-1 and Sentinel-2 Imagery Using Deep Learning Algorithms: Demonstration of Sen1Floods11 Benchmark Datasets

Authors: Bo Zhao, Erick Mas, Hanfang Yang, Jinze Yu, Shunichi Koshimura, Wenqi Wu, Xing Liu, Yanbing Bai, Zhengxin Yang
Publication venue: MDPI AG
Publication date: 01/06/2021

Identifying permanent water and temporary water in flood disasters efficiently has mainly relied on change detection method from multi-temporal remote sensing imageries, but estimating the water type in flood disaster events from only post-flood remote sensing imageries still remains challenging. Research progress in recent years has demonstrated the excellent potential of multi-source data fusion and deep learning algorithms in improving flood detection, while this field has only been studied initially due to the lack of large-scale labelled remote sensing images of flood events. Here, we present new deep learning algorithms and a multi-source data fusion driven flood inundation mapping approach by leveraging a large-scale publicly available Sen1Flood11 dataset consisting of roughly 4831 labelled Sentinel-1 SAR and Sentinel-2 optical imagery gathered from flood events worldwide in recent years. Specifically, we proposed an automatic segmentation method for surface water, permanent water, and temporary water identification, and all tasks share the same convolutional neural network architecture. We utilize focal loss to deal with the class (water/non-water) imbalance problem. Thorough ablation experiments and analysis confirmed the effectiveness of various proposed designs. In comparison experiments, the method proposed in this paper is superior to other classical models. Our model achieves a mean Intersection over Union (mIoU) of 52.99%, Intersection over Union (IoU) of 52.30%, and Overall Accuracy (OA) of 92.81% on the Sen1Flood11 test set. On the Sen1Flood11 Bolivia test set, our model also achieves very high mIoU (47.88%), IoU (76.74%), and OA (95.59%) and shows good generalization ability
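For readers unfamiliar with the quantities named in the abstract, the sketch below shows the standard binary focal loss and the IoU, mIoU, and overall accuracy computed from pixel counts. The `gamma` and `alpha` defaults, the function names, and the toy pixel labels are assumptions for illustration; the abstract does not state the paper's settings or exactly how its metrics are averaged.

```python
import math

def binary_focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Focal loss for one pixel: p is the predicted water probability, y is 0 or 1.

    gamma and alpha are commonly used defaults, assumed here for illustration.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # The (1 - p_t) ** gamma factor down-weights pixels that are already easy.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)

def segmentation_metrics(pred, truth):
    """Return (water IoU, mean IoU over both classes, overall accuracy).

    Assumes both classes occur at least once, so no denominator is zero.
    """
    tp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    tn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 0)
    fp = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 0)
    fn = sum(1 for p, t in zip(pred, truth) if p == 0 and t == 1)
    iou_water = tp / (tp + fp + fn)
    iou_background = tn / (tn + fp + fn)
    accuracy = (tp + tn) / len(truth)
    return iou_water, (iou_water + iou_background) / 2, accuracy

# Hypothetical toy labels: 1 = water, 0 = non-water.
truth = [1, 1, 0, 0, 0, 0, 0, 0]
pred = [1, 0, 1, 0, 0, 0, 0, 0]
iou, miou, oa = segmentation_metrics(pred, truth)
```

A high overall accuracy alongside a much lower water IoU, as in this toy example, is typical when water pixels are rare, which is the kind of class imbalance focal loss is meant to address.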

Multidisciplinary Digital Publishing Institute

Directory of Open Access Journals

Fabrication of Ordered SnO2 Nanostructures with Enhanced Humidity Sensing Performance

Authors: Chao Ding, Duy, Gang Bai, Jie Xu, Jinze Li, Juyan Liu, Morrison, Qingying Ren, Wei Li
Publication venue: MDPI AG

Crossref