8 research outputs found

    Fusion-Eval: Integrating Evaluators with LLMs

    Evaluating Large Language Models (LLMs) is a complex task, especially considering the intricacies of natural language understanding and the expectations for high-level reasoning. Traditional evaluations typically lean on human-based, model-based, or automatic-metrics-based paradigms, each with its own advantages and shortcomings. We introduce "Fusion-Eval", a system that employs LLMs not solely for direct evaluations, but to skillfully integrate insights from diverse evaluators. This gives Fusion-Eval flexibility, enabling it to work effectively across diverse tasks and make optimal use of multiple references. In testing on the SummEval dataset, Fusion-Eval achieved a Spearman correlation of 0.96, outperforming other evaluators. The success of Fusion-Eval underscores the potential of LLMs to produce evaluations that closely align with human perspectives, setting a new standard in the field of LLM evaluation.
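
    The sketch below illustrates the general idea of fusing auxiliary evaluator scores through an LLM prompt. It is an assumption-laden reconstruction, not the authors' system: `call_llm`, the prompt wording, and the 1-5 scale are hypothetical placeholders.

```python
# Minimal sketch of the Fusion-Eval idea (not the authors' code): an LLM is
# prompted with the scores produced by several auxiliary evaluators and asked
# to fuse them into a single judgment. `call_llm` is a hypothetical stand-in
# for whatever LLM backend is available.
from typing import Callable, Dict

def fusion_eval(
    source: str,
    summary: str,
    evaluator_scores: Dict[str, float],
    call_llm: Callable[[str], str],
) -> float:
    """Ask an LLM to integrate auxiliary evaluator scores into one rating."""
    score_lines = "\n".join(
        f"- {name}: {score:.3f}" for name, score in evaluator_scores.items()
    )
    prompt = (
        "You are judging the quality of a summary on a 1-5 scale.\n"
        f"Source document:\n{source}\n\n"
        f"Candidate summary:\n{summary}\n\n"
        "Scores from auxiliary evaluators (higher is better):\n"
        f"{score_lines}\n\n"
        "Weigh these signals together with your own reading and reply with "
        "a single number between 1 and 5."
    )
    return float(call_llm(prompt).strip())

# Example with a dummy LLM backend standing in for a real model.
if __name__ == "__main__":
    def dummy_llm(prompt: str) -> str:
        return "4.0"

    score = fusion_eval(
        source="The quick brown fox jumps over the lazy dog.",
        summary="A fox jumps over a dog.",
        evaluator_scores={"ROUGE-L": 0.42, "BERTScore": 0.88},
        call_llm=dummy_llm,
    )
    print(score)
```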

    SiRA: Sparse Mixture of Low Rank Adaptation

    Parameter Efficient Tuning has been a prominent approach for adapting Large Language Models to downstream tasks. Most previous works consider adding dense trainable parameters, where all parameters are used to adapt a certain task. We found this to be less effective empirically: using LoRA as an example, introducing more trainable parameters does not help. Motivated by this, we investigate the importance of leveraging "sparse" computation and propose SiRA: sparse mixture of low-rank adaptation. SiRA leverages a Sparse Mixture of Experts (SMoE) to boost the performance of LoRA. Specifically, it enforces top-k expert routing with a capacity limit restricting the maximum number of tokens each expert can process. We propose a novel and simple expert dropout on top of the gating network to reduce over-fitting. Through extensive experiments, we verify that SiRA performs better than LoRA and other mixture-of-experts approaches across different single-task and multitask settings.
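
    A rough Python sketch of such a layer is given below: several LoRA experts, a gating network with top-k routing, a per-expert capacity limit, and dropout applied to the gating logits as a simple stand-in for the paper's expert dropout. All dimensions and hyperparameters are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class SiRALayer(nn.Module):
    """Sketch of a SiRA-style layer: LoRA experts with sparse top-k routing."""

    def __init__(self, d_in, d_out, rank=4, n_experts=8, top_k=2,
                 capacity=64, expert_dropout=0.1):
        super().__init__()
        self.base = nn.Linear(d_in, d_out, bias=False)  # stands in for the frozen pretrained weight
        self.base.weight.requires_grad_(False)
        self.A = nn.Parameter(torch.randn(n_experts, d_in, rank) * 0.01)  # LoRA down-projections
        self.B = nn.Parameter(torch.zeros(n_experts, rank, d_out))        # LoRA up-projections
        self.gate = nn.Linear(d_in, n_experts)
        self.top_k, self.capacity = top_k, capacity
        self.expert_dropout = nn.Dropout(expert_dropout)  # crude stand-in for the paper's expert dropout

    def forward(self, x):  # x: (n_tokens, d_in)
        logits = self.expert_dropout(self.gate(x))
        weights, idx = logits.softmax(dim=-1).topk(self.top_k, dim=-1)  # top-k expert routing
        out = self.base(x)
        load = [0] * self.A.size(0)
        for k in range(self.top_k):
            for e in range(self.A.size(0)):
                tokens = (idx[:, k] == e).nonzero(as_tuple=True)[0]
                keep = tokens[: max(self.capacity - load[e], 0)]  # capacity limit per expert
                load[e] += keep.numel()
                if keep.numel() > 0:
                    delta = (x[keep] @ self.A[e]) @ self.B[e]
                    out = out.index_add(0, keep, weights[keep, k].unsqueeze(-1) * delta)
        return out

# Tiny smoke test on random tokens.
layer = SiRALayer(d_in=16, d_out=16)
print(layer(torch.randn(10, 16)).shape)  # torch.Size([10, 16])
```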

    SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition

    Methods that extract policy primitives from offline demonstrations using deep generative models have shown promise at accelerating reinforcement learning (RL) for new tasks. Intuitively, these methods should also help to train safe RL agents because they enforce useful skills. However, we identify that these techniques are not well equipped for safe policy learning because they ignore negative experiences (e.g., unsafe or unsuccessful ones), focusing only on positive experiences, which harms their ability to generalize to new tasks safely. Rather, we model the latent safety context using principled contrastive training on an offline dataset of demonstrations from many tasks, including both negative and positive experiences. Using this latent variable, our RL framework, SAFEty skill pRiors (SAFER), extracts task-specific safe primitive skills to safely and successfully generalize to new tasks. In the inference stage, policies trained with SAFER learn to compose safe skills into successful policies. We theoretically characterize why SAFER can enforce safe policy learning and demonstrate its effectiveness on several complex safety-critical robotic grasping tasks inspired by the game Operation, in which SAFER outperforms state-of-the-art primitive learning methods in success and safety.
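
    The minimal sketch below shows one way the contrastive part could look, assuming an encoder that maps trajectory windows to a latent safety context and an InfoNCE-style loss that treats safe windows as positives and unsafe ones as negatives. It is an illustration of the idea, not the authors' code; the encoder, window format, and loss details are assumptions.

```python
# Illustrative sketch of a contrastive latent safety context (assumptions, not
# the SAFER implementation): safe and unsafe trajectory windows from offline
# demonstrations are encoded, and an InfoNCE-style loss pulls safe windows
# toward the anchor while pushing unsafe windows away.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SafetyContextEncoder(nn.Module):
    def __init__(self, obs_dim, z_dim=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, z_dim)
        )

    def forward(self, window):  # window: (batch, obs_dim)
        return F.normalize(self.net(window), dim=-1)

def safety_contrastive_loss(z_anchor, z_safe, z_unsafe, temperature=0.1):
    """InfoNCE-style loss: the safe window is the positive, unsafe ones negatives."""
    pos = (z_anchor * z_safe).sum(-1, keepdim=True) / temperature  # (B, 1)
    neg = z_anchor @ z_unsafe.T / temperature                      # (B, B)
    logits = torch.cat([pos, neg], dim=-1)
    labels = torch.zeros(z_anchor.size(0), dtype=torch.long)       # positive sits at index 0
    return F.cross_entropy(logits, labels)

# Smoke test with random data standing in for offline demonstrations.
enc = SafetyContextEncoder(obs_dim=8)
anchor, safe, unsafe = (torch.randn(16, 8) for _ in range(3))
loss = safety_contrastive_loss(enc(anchor), enc(safe), enc(unsafe))
loss.backward()
print(float(loss))
```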

    ActionBert: Leveraging User Actions for Semantic Understanding of User Interfaces

    As mobile devices are becoming ubiquitous, regularly interacting with a variety of user interfaces (UIs) is a common aspect of daily life for many people. To improve the accessibility of these devices and to enable their usage in a variety of settings, building models that can assist users and accomplish tasks through the UI is vitally important. However, there are several challenges to achieving this. First, UI components of similar appearance can have different functionalities, making understanding their function more important than just analyzing their appearance. Second, domain-specific features like the Document Object Model (DOM) in web pages and the View Hierarchy (VH) in mobile applications provide important signals about the semantics of UI elements, but these features are not in a natural language format. Third, owing to the large diversity in UIs and the absence of standard DOM or VH representations, building a UI understanding model with high coverage requires large amounts of training data. Inspired by the success of pre-training based approaches in NLP for tackling a variety of problems in a data-efficient way, we introduce a new pre-trained UI representation model called ActionBert. Our methodology is designed to leverage visual, linguistic and domain-specific features in user interaction traces to pre-train generic feature representations of UIs and their components. Our key intuition is that user actions, e.g., a sequence of clicks on different UI components, reveal important information about their functionality. We evaluate the proposed model on a wide variety of downstream tasks, ranging from icon classification to UI component retrieval based on a natural language description. Experiments show that the proposed ActionBert model outperforms multi-modal baselines across all downstream tasks by up to 15.5%.
    Comment: Accepted to the AAAI Conference on Artificial Intelligence (AAAI-21).
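
    As a hedged illustration of this kind of pre-training, the sketch below embeds each UI element from visual, text, and view-hierarchy features, encodes a screen with a Transformer, and trains a head to predict which element the user clicked. The feature pipeline, dimensions, and objective are assumptions rather than the released ActionBert model.

```python
# Hedged sketch of an ActionBert-style pre-training step (not the actual model):
# UI elements are embedded from visual, text, and view-hierarchy (VH) features,
# fed to a Transformer encoder, and a head scores which element was clicked next.
import torch
import torch.nn as nn

class UITraceEncoder(nn.Module):
    def __init__(self, visual_dim=64, text_dim=64, vh_dim=32, d_model=128, n_layers=2):
        super().__init__()
        self.proj = nn.Linear(visual_dim + text_dim + vh_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.click_head = nn.Linear(d_model, 1)  # score each element as the "next click"

    def forward(self, visual, text, vh):
        # each input: (batch, n_elements, feature_dim) for one UI screen in a trace
        h = self.encoder(self.proj(torch.cat([visual, text, vh], dim=-1)))
        return self.click_head(h).squeeze(-1)    # (batch, n_elements) logits

# Toy pre-training step: predict the index of the clicked element.
model = UITraceEncoder()
visual, text, vh = torch.randn(4, 10, 64), torch.randn(4, 10, 64), torch.randn(4, 10, 32)
clicked = torch.randint(0, 10, (4,))             # which element was tapped
loss = nn.functional.cross_entropy(model(visual, text, vh), clicked)
loss.backward()
print(float(loss))
```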