129 research outputs found
Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation
In this study, we introduce the concept of OKR-Agent, designed to enhance the
capabilities of Large Language Models (LLMs) in task-solving. Our approach
utilizes both self-collaboration and self-correction mechanisms, facilitated by
hierarchical agents, to address the inherent complexities in task-solving. Our
key observations are two-fold: first, effective task-solving demands in-depth
domain knowledge and intricate reasoning, for which deploying specialized
agents for individual sub-tasks can markedly enhance LLM performance. Second,
task-solving intrinsically adheres to a hierarchical execution structure,
comprising both high-level strategic planning and detailed task execution.
Towards this end, our OKR-Agent paradigm aligns closely with this hierarchical
structure, promising enhanced efficacy and adaptability across a range of
scenarios. Specifically, our framework includes two novel modules: hierarchical
Objects and Key Results generation and multi-level evaluation, each
contributing to more efficient and robust task-solving. In practice,
hierarchical OKR generation decomposes Objects into multiple sub-Objects and
assigns new agents based on key results and agent responsibilities. These
agents subsequently elaborate on their designated tasks and may further
decompose them as necessary. Such generation operates recursively and
hierarchically, culminating in a comprehensive set of detailed solutions. The
multi-level evaluation module of OKR-Agent refines solutions by leveraging
feedback from all associated agents, optimizing each step of the process. This
ensures the solution is accurate, practical, and effectively addresses intricate task
requirements, enhancing the overall reliability and quality of the outcome.
Experimental results also show that our method outperforms previous methods on
several tasks. Code and demo are available at https://okr-agent.github.io
Embedding Heterogeneous Networks into Hyperbolic Space Without Meta-path
Networks found in the real-world are numerous and varied. A common type of
network is the heterogeneous network, where the nodes (and edges) can be of
different types. Accordingly, there have been efforts at learning
representations of these heterogeneous networks in low-dimensional space.
However, most of the existing heterogeneous network embedding methods suffer
from the following two drawbacks: (1) The target space is usually Euclidean.
Conversely, many recent works have shown that complex networks may have
hyperbolic latent geometry, which is non-Euclidean. (2) These methods usually
rely on meta-paths, which require domain-specific prior knowledge for meta-path
selection. Additionally, different downstream tasks on the same network
might require different meta-paths in order to generate task-specific
embeddings. In this paper, we propose a novel self-guided random walk method
that does not require meta-paths for embedding heterogeneous networks into
hyperbolic space. We conduct thorough experiments for the tasks of network
reconstruction and link prediction on two public datasets, showing that our
model outperforms a variety of well-known baselines across all tasks.Comment: In proceedings of the 35th AAAI Conference on Artificial Intelligenc
Implicit Neural Deformation for Sparse-View Face Reconstruction
In this work, we present a new method for 3D face reconstruction from
sparse-view RGB images. Unlike previous methods which are built upon 3D
morphable models (3DMMs) with limited details, we leverage an implicit
representation to encode rich geometric features. Our overall pipeline consists
of two major components, including a geometry network, which learns a
deformable neural signed distance function (SDF) as the 3D face representation,
and a rendering network, which learns to render on-surface points of the neural
SDF to match the input images via self-supervised optimization. To handle
in-the-wild sparse-view input of the same target with different expressions at
test time, we propose a residual latent code to effectively expand the shape
space of the learned implicit face representation as well as a novel
view-switch loss to enforce consistency among different views. Our experimental
results on several benchmark datasets demonstrate that our approach outperforms
alternative baselines and achieves superior face reconstruction results
compared to state-of-the-art methods.
Comment: 10 pages, 6 figures. The 30th Pacific Conference on Computer Graphics and Applications, Pacific Graphics (PG) 2022.
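As a minimal sketch of the kind of latent-conditioned SDF the pipeline builds on, the PyTorch module below queries signed distance at 3D points given a base identity code plus a per-scan residual code; the layer sizes and the simple additive residual are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class LatentSDF(nn.Module):
    """Signed distance function conditioned on a latent code. A residual
    code added to the base code (illustrative) expands the shape space,
    e.g. for unseen expressions of the same subject at test time."""
    def __init__(self, latent_dim: int = 64, hidden: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 + latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),  # signed distance to the face surface
        )

    def forward(self, xyz, code, residual):
        z = (code + residual).expand(xyz.shape[0], -1)
        return self.mlp(torch.cat([xyz, z], dim=-1))

sdf = LatentSDF()
pts = torch.rand(1024, 3) * 2 - 1               # query points in [-1, 1]^3
base = torch.zeros(1, 64)                       # shared identity code
res = torch.zeros(1, 64, requires_grad=True)    # optimized per test scan
d = sdf(pts, base, res)                         # (1024, 1) signed distances
```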
Multi-Modal Face Stylization with a Generative Prior
In this work, we introduce a new approach for artistic face stylization.
Despite existing methods achieving impressive results in this task, there is
still room for improvement in generating high-quality stylized faces with
diverse styles and accurate facial reconstruction. Our proposed framework,
MMFS, supports multi-modal face stylization by leveraging the strengths of
StyleGAN and integrating it into an encoder-decoder architecture. Specifically,
we use the mid-resolution and high-resolution layers of StyleGAN as the decoder
to generate high-quality faces, while aligning its low-resolution layer with
the encoder to extract and preserve input facial details. We also introduce a
two-stage training strategy, where we train the encoder in the first stage to
align the feature maps with StyleGAN and enable a faithful reconstruction of
input faces. In the second stage, the entire network is fine-tuned with
artistic data for stylized face generation. To enable the fine-tuned model to
be applied in zero-shot and one-shot stylization tasks, we train an additional
mapping network from the large-scale Contrastive-Language-Image-Pre-training
(CLIP) space to a latent space of fine-tuned StyleGAN. Qualitative and
quantitative experiments show that our framework achieves superior face
stylization performance in both one-shot and zero-shot stylization tasks,
outperforming state-of-the-art methods by a large margin.
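To give a sense of the zero-/one-shot mechanism, here is a hypothetical sketch of the extra mapping network: an MLP from a CLIP embedding (a text prompt or a single reference image) into a latent that conditions the fine-tuned StyleGAN decoder. Dimensions and depth are assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

class ClipToStyle(nn.Module):
    """Maps a CLIP embedding into a StyleGAN-like latent space so a style
    can be named by text (zero-shot) or one example image (one-shot)."""
    def __init__(self, clip_dim: int = 512, w_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(clip_dim, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, 512), nn.LeakyReLU(0.2),
            nn.Linear(512, w_dim),
        )

    def forward(self, clip_embedding: torch.Tensor) -> torch.Tensor:
        return self.net(clip_embedding)

mapper = ClipToStyle()
e = torch.randn(1, 512)   # stand-in for a CLIP text/image embedding
w = mapper(e)             # latent that would drive the stylization decoder
```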
Task-Aware Sampling Layer for Point-Wise Analysis
Sampling, grouping, and aggregation are three important components in the
multi-scale analysis of point clouds. In this paper, we present a novel
data-driven sampler learning strategy for point-wise analysis tasks. Unlike the
widely used sampling technique, Farthest Point Sampling (FPS), we propose to
learn sampling and downstream applications jointly. Our key insight is that
uniform sampling methods like FPS are not always optimal for different tasks:
in segmentation, for example, sampling more points around boundary areas can
make point-wise classification easier. Towards this end, we propose a novel
sampler learning strategy that learns sampling point displacement supervised by
task-related ground truth information and can be trained jointly with the
underlying tasks. We further demonstrate our methods in various point-wise
analysis tasks, including semantic part segmentation, point cloud completion,
and keypoint detection. Our experiments show that joint learning of the
sampler and the task brings better performance than using FPS in various
point-based networks.
Comment: 14 pages, 13 figures, and 14 tables.
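For contrast with the learned sampler, the sketch below implements the FPS baseline and shows, in spirit, how a predicted per-point displacement would shift samples toward task-relevant regions; `predict_delta` is a hypothetical stand-in for the trainable displacement network.

```python
import numpy as np

def farthest_point_sampling(points: np.ndarray, k: int) -> np.ndarray:
    """Uniform baseline: greedily pick the point farthest from all chosen."""
    chosen = [0]
    dist = np.linalg.norm(points - points[0], axis=1)
    for _ in range(k - 1):
        idx = int(np.argmax(dist))
        chosen.append(idx)
        dist = np.minimum(dist, np.linalg.norm(points - points[idx], axis=1))
    return points[chosen]

def task_aware_sample(points, k, predict_delta):
    """Displace FPS seeds with a learned offset so samples concentrate in
    task-relevant regions (e.g. part boundaries for segmentation)."""
    seeds = farthest_point_sampling(points, k)
    return seeds + predict_delta(seeds)

pts = np.random.rand(2048, 3)
samples = task_aware_sample(pts, 128, lambda s: np.zeros_like(s))  # zero offset = plain FPS
```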
An Empirical Survey of Unsupervised Text Representation Methods on Twitter Data
The field of NLP has seen unprecedented achievements in recent years. Most
notably, with the advent of large-scale pre-trained Transformer-based language
models, such as BERT, there has been a noticeable improvement in text
representation. It is, however, unclear whether these improvements translate to
noisy user-generated text, such as tweets. In this paper, we present an
experimental survey of a wide range of well-known text representation
techniques for the task of text clustering on noisy Twitter data. Our results
indicate that the more advanced models do not necessarily work best on tweets
and that more exploration in this area is needed.
Comment: In proceedings of the 6th Workshop on Noisy User-generated Text (W-NUT) at EMNLP 2020.
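A minimal version of the survey's experimental setup, embed tweets and then cluster them, might look like the sketch below; the toy tweets are invented, and the TF-IDF baseline stands in for the many representation methods compared (a BERT-style model would simply replace `X` with its sentence vectors).

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

tweets = [  # invented examples of noisy user-generated text
    "cant wait for the game tonight!!",
    "who else is watching the match??",
    "new phone just dropped, specs look insane",
    "battery life on this phone is unreal",
]

X = TfidfVectorizer().fit_transform(tweets)      # one representation method
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels, silhouette_score(X, labels))       # cluster quality, higher is better
```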