3D Question Answering
Visual Question Answering (VQA) has witnessed tremendous progress in recent
years. However, most efforts focus only on 2D image question answering tasks.
In this paper, we present the first attempt at extending VQA to the 3D
domain, which can facilitate artificial intelligence's perception of 3D
real-world scenarios. Unlike image-based VQA, 3D Question Answering (3DQA)
takes a colored point cloud as input and requires comprehension of both
appearance and 3D geometry to answer 3D-related questions. To this end,
we propose a novel transformer-based 3DQA framework, "3DQA-TR", which consists
of two encoders that exploit appearance and geometry information, respectively.
The multi-modal appearance, geometry, and question features then attend to one
another via a 3D-Linguistic BERT to predict the target answers. To verify the
effectiveness of our proposed 3DQA
framework, we further develop the first 3DQA dataset, "ScanQA", which builds on
the ScanNet dataset and contains 6K questions and 30K answers over its scenes.
Extensive experiments on this dataset demonstrate the clear superiority of our
proposed 3DQA framework over existing VQA frameworks and the effectiveness of
our major designs. Our code and dataset will be made
publicly available to facilitate research in this direction.

Comment: To appear in IEEE Transactions on Visualization and Computer Graphics (TVCG) 202
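As a rough illustration of this two-encoder design, the sketch below wires per-point appearance and geometry encoders together with question embeddings and lets a plain transformer encoder stand in for the 3D-Linguistic BERT. All module choices, dimensions, and the mean-pooled answer head are illustrative assumptions, not the authors' implementation.

# Minimal two-encoder 3DQA sketch in the spirit of 3DQA-TR (assumptions, not the paper's code).
import torch
import torch.nn as nn

class ToyThreeDQA(nn.Module):
    def __init__(self, vocab_size=10000, hidden=256, num_answers=1000):
        super().__init__()
        self.appearance_enc = nn.Linear(3, hidden)   # per-point RGB -> appearance token
        self.geometry_enc = nn.Linear(3, hidden)     # per-point XYZ -> geometry token
        self.word_emb = nn.Embedding(vocab_size, hidden)
        # Stand-in for the 3D-Linguistic BERT: appearance, geometry, and question
        # tokens attend to one another inside a plain transformer encoder.
        layer = nn.TransformerEncoderLayer(d_model=hidden, nhead=8, batch_first=True)
        self.fusion = nn.TransformerEncoder(layer, num_layers=4)
        self.answer_head = nn.Linear(hidden, num_answers)

    def forward(self, rgb, xyz, question_ids):
        # rgb, xyz: (B, N, 3) colored point cloud; question_ids: (B, L) token ids.
        tokens = torch.cat([
            self.appearance_enc(rgb),
            self.geometry_enc(xyz),
            self.word_emb(question_ids),
        ], dim=1)
        fused = self.fusion(tokens)
        return self.answer_head(fused.mean(dim=1))   # answer logits over a fixed vocabulary

model = ToyThreeDQA()
logits = model(torch.rand(2, 1024, 3), torch.rand(2, 1024, 3),
               torch.randint(0, 10000, (2, 12)))
print(logits.shape)  # torch.Size([2, 1000])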
LOGEN: Few-shot Logical Knowledge-Conditioned Text Generation with Self-training
Natural language generation from structured data mainly focuses on
surface-level descriptions, suffering from uncontrollable content selection and
low fidelity. Previous works leverage logical forms to facilitate logical
knowledge-conditioned text generation. Though achieving remarkable progress,
they are data-hungry, which makes adoption in real-world applications with
limited data challenging. To this end, this paper proposes a unified
framework for logical knowledge-conditioned text generation in the few-shot
setting. With only a few seed logical forms (e.g., 20/100-shot), our approach
leverages self-training and samples pseudo logical forms based on content and
structure consistency. Experimental results demonstrate that our approach can
obtain better few-shot performance than baselines.

Comment: Work in progress
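The self-training loop behind such an approach can be sketched generically: pseudo logical forms are predicted for unlabeled texts and kept only if they pass content and structure consistency checks before being added to the training pool. The toy consistency checks, the stand-in predict_lf model, and the fixed number of rounds below are assumptions for illustration, not the paper's actual criteria.

# Generic self-training with consistency-filtered pseudo logical forms (illustrative only).
from dataclasses import dataclass

@dataclass
class Example:
    logical_form: str
    text: str

def structure_consistent(lf: str) -> bool:
    # Toy structural check: the form is at least well bracketed.
    return lf.count("(") == lf.count(")") and lf.count("(") > 0

def content_consistent(lf: str, text: str) -> bool:
    # Toy content check: every argument of the logical form is mentioned in the text.
    args = lf.replace("(", " ").replace(")", " ").replace(",", " ").split()[1:]
    return all(a.lower() in text.lower() for a in args)

def self_train(seed, unlabeled, predict_lf, rounds=3):
    labeled, remaining = list(seed), list(unlabeled)
    for _ in range(rounds):
        kept, still_unlabeled = [], []
        for text in remaining:
            lf = predict_lf(text, labeled)            # model (re)trained on current labeled set
            if structure_consistent(lf) and content_consistent(lf, text):
                kept.append(Example(lf, text))        # keep only consistent pseudo pairs
            else:
                still_unlabeled.append(text)
        labeled.extend(kept)
        remaining = still_unlabeled
    return labeled

# Tiny usage example with a trivial stand-in "model".
seed = [Example("capital(France, Paris)", "Paris is the capital of France.")]
unlabeled = ["Berlin is the capital of Germany."]
fake_model = lambda text, labeled: "capital(Germany, Berlin)"
print(len(self_train(seed, unlabeled, fake_model)))   # 2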
Contrastive Demonstration Tuning for Pre-trained Language Models
Pretrained language models can be effectively stimulated by textual prompts
or demonstrations, especially in low-data scenarios. Recent works have focused
on automatically searching for discrete or continuous prompts or optimized
verbalizers, yet studies of demonstrations remain limited. In fact, the choice
of demonstration examples is crucial to the final performance of
prompt-tuning. In this paper, we propose a novel pluggable, extensible, and
efficient approach named contrastive demonstration tuning, which is free of
demonstration sampling. Furthermore, the proposed approach can be (i) plugged
into any previous prompt-tuning approach and (ii) extended to a wide range of
classification tasks with a large number of categories. Experimental results on
16 datasets illustrate that our method integrated with previous approaches
LM-BFF and P-tuning can yield better performance. Code is available at
https://github.com/zjunlp/PromptKG/tree/main/research/Demo-Tuning.

Comment: Work in progress
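One way such a contrastive term over demonstrations can be bolted onto an existing prompt-tuning objective is sketched below: two learnable "virtual demonstration" embeddings per class act as positive pairs under an InfoNCE loss that is simply added to the task loss. The pairing scheme, loss form, and weighting are assumptions made for illustration, not the exact objective used in the paper.

# Illustrative contrastive term over learnable demonstration embeddings.
import torch
import torch.nn.functional as F

def info_nce(view_a, view_b, temperature=0.1):
    # view_a, view_b: (num_classes, hidden); matching rows are positives.
    a = F.normalize(view_a, dim=-1)
    b = F.normalize(view_b, dim=-1)
    logits = a @ b.t() / temperature          # (C, C) similarity matrix
    targets = torch.arange(a.size(0))         # positives lie on the diagonal
    return F.cross_entropy(logits, targets)

num_classes, hidden = 4, 256
demo_a = torch.nn.Parameter(torch.randn(num_classes, hidden))   # virtual demonstrations, view A
demo_b = torch.nn.Parameter(torch.randn(num_classes, hidden))   # virtual demonstrations, view B

task_loss = torch.tensor(0.9)                 # stands in for the usual prompt-tuning loss
loss = task_loss + 0.1 * info_nce(demo_a, demo_b)
loss.backward()                               # gradients flow into the demonstration embeddings
print(float(loss))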
Joint Inference for Knowledge Base Population
Populating a Knowledge Base (KB) with new knowledge facts from reliable text resources usually consists of linking name mentions to KB entities and identifying relationships between entity pairs. However, the task often suffers from errors propagating from upstream entity linkers to downstream relation extractors. In this paper, we propose a novel joint inference framework that allows interactions between the two subtasks and finds an optimal assignment by addressing the coherence among preliminary local predictions: whether the types of entities meet the expectations of the relations, explicitly or implicitly, and whether the local predictions are globally compatible. We further measure the confidence of the extracted triples by examining the details of the complete extraction process. Experiments show that the proposed framework significantly reduces error propagation and thus obtains more reliable facts, outperforming competitive baselines with state-of-the-art relation extraction models.
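To make the joint-assignment idea concrete, the toy search below scores every combination of entity-linking and relation candidates and rewards assignments whose linked entity types match the relation's expected argument types. The candidates, type inventory, and scoring weights are invented for illustration and do not reproduce the paper's inference procedure.

# Toy joint decoding over entity-linking and relation-extraction candidates.
from itertools import product

# Local candidates with confidences from (hypothetical) upstream models.
entity_candidates = {
    "Jordan": [("Michael_Jordan/PERSON", 0.6), ("Jordan_(country)/LOCATION", 0.4)],
    "Chicago Bulls": [("Chicago_Bulls/ORGANIZATION", 0.9)],
}
relation_candidates = [("plays_for", 0.7, ("PERSON", "ORGANIZATION")),
                       ("located_in", 0.3, ("LOCATION", "LOCATION"))]

def joint_decode(entity_candidates, relation_candidates, coherence_weight=1.0):
    mentions = list(entity_candidates)
    best, best_score = None, float("-inf")
    for links in product(*(entity_candidates[m] for m in mentions)):
        for rel, rel_conf, expected_types in relation_candidates:
            types = tuple(link[0].split("/")[1] for link in links)
            coherent = types == expected_types                    # relation/type agreement
            score = (sum(conf for _, conf in links) + rel_conf
                     + coherence_weight * coherent)
            if score > best_score:
                best, best_score = (dict(zip(mentions, links)), rel), score
    return best, best_score

# The coherence bonus steers decoding toward Michael_Jordan + plays_for rather than
# locally plausible but globally incompatible alternatives.
print(joint_decode(entity_candidates, relation_candidates))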
Harder Tasks Need More Experts: Dynamic Routing in MoE Models
In this paper, we introduce a novel dynamic expert selection framework for
Mixture of Experts (MoE) models, aiming to enhance computational efficiency and
model performance by adjusting the number of activated experts based on input
difficulty. Unlike traditional MoE approaches that rely on fixed Top-K routing,
which activates a predetermined number of experts regardless of the input's
complexity, our method dynamically selects experts based on the confidence
level in expert selection for each input. This allows for a more efficient
utilization of computational resources, activating more experts for complex
tasks requiring advanced reasoning and fewer for simpler tasks. Through
extensive evaluations, our dynamic routing method demonstrates substantial
improvements over conventional Top-2 routing across various benchmarks,
achieving an average improvement of 0.7% while activating fewer than 90% of the
parameters. Further analysis shows that our model dispatches more experts to tasks
requiring complex reasoning skills, like BBH, confirming its ability to
dynamically allocate computational resources in alignment with the input's
complexity. Our findings also highlight a variation in the number of experts
needed across different layers of the transformer model, offering insights into
the potential for designing heterogeneous MoE frameworks. The code and models
are available at https://github.com/ZhenweiAn/Dynamic_MoE
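A minimal way to express this confidence-based routing is to add experts in order of router probability until their cumulative probability crosses a threshold, so tokens with flat (low-confidence) router distributions receive more experts. The threshold value, the tiny MLP experts, and the per-token loop below are simplifying assumptions rather than the released implementation.

# Confidence-thresholded dynamic routing sketch (illustrative, not the released code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class DynamicMoE(nn.Module):
    def __init__(self, hidden=64, num_experts=8, threshold=0.5):
        super().__init__()
        self.router = nn.Linear(hidden, num_experts)
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(hidden, hidden), nn.GELU(), nn.Linear(hidden, hidden))
            for _ in range(num_experts)
        ])
        self.threshold = threshold

    def forward(self, x):                              # x: (tokens, hidden)
        probs = F.softmax(self.router(x), dim=-1)      # (tokens, num_experts)
        sorted_p, order = probs.sort(dim=-1, descending=True)
        # Per-token number of experts needed to reach the confidence threshold.
        need = (sorted_p.cumsum(-1) < self.threshold).sum(-1) + 1
        out = torch.zeros_like(x)
        for t in range(x.size(0)):                     # simple loop; real kernels batch this
            k = int(need[t])
            idx = order[t, :k]
            w = probs[t, idx] / probs[t, idx].sum()    # renormalize the kept routing weights
            out[t] = sum(w[j] * self.experts[int(idx[j])](x[t]) for j in range(k))
        return out, need                               # need = experts activated per token

moe = DynamicMoE()
y, counts = moe(torch.randn(4, 64))
print(counts.tolist())   # number of experts activated for each token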