Is Prompt All You Need? No. A Comprehensive and Broader View of Instruction Learning
Task semantics can be expressed by a set of input-to-output examples or a
piece of textual instruction. Conventional machine learning approaches for
natural language processing (NLP) mainly rely on the availability of
large-scale sets of task-specific examples. Two issues arise: first, collecting
task-specific labeled examples does not apply to scenarios where tasks may be
too complicated or costly to annotate, or the system is required to handle a
new task immediately; second, this is not user-friendly since end-users are
probably more willing to provide a task description than a set of examples
before using the system. Therefore, the community is showing increasing interest
in a new supervision-seeking paradigm for NLP: learning from task instructions.
Despite its impressive progress, there are some common issues that the
community struggles with. This survey paper tries to summarize the current
research on instruction learning, particularly, by answering the following
questions: (i) what is task instruction, and what instruction types exist? (ii)
how to model instructions? (iii) what factors influence and explain the
instructions' performance? (iv) what challenges remain in instruction learning?
To our knowledge, this is the first comprehensive survey of textual instructions.
Comment: Early draft; will be further reorganized and polished. The paper list is available at https://github.com/RenzeLou/awesome-instruction-learnin
A Survey on Knowledge Graphs: Representation, Acquisition and Applications
Human knowledge provides a formal understanding of the world. Knowledge
graphs that represent structural relations between entities have become an
increasingly popular research direction towards cognition and human-level
intelligence. In this survey, we provide a comprehensive review of knowledge
graphs, covering overall research topics on 1) knowledge graph representation
learning, 2) knowledge acquisition and completion, 3) temporal knowledge graph,
and 4) knowledge-aware applications, and summarize recent breakthroughs and
perspective directions to facilitate future research. We propose a full-view
categorization and new taxonomies on these topics. Knowledge graph embedding is
organized along four aspects: representation space, scoring function, encoding
models, and auxiliary information. For knowledge acquisition, especially
knowledge graph completion, embedding methods, path inference, and logical rule
reasoning are reviewed. We further explore several emerging topics, including
meta relational learning, commonsense reasoning, and temporal knowledge graphs.
To facilitate future research on knowledge graphs, we also provide a curated
collection of datasets and open-source libraries on different tasks. Finally,
we offer a thorough outlook on several promising research directions.
Automatic Learning Improves Human-Robot Interaction in Productive Environments: A Review
In the creation of new industries, products and services -- all advances of the Fourth Industrial Revolution -- human-robot interaction that incorporates automatic learning and computer vision is an element to consider, since it promotes collaborative environments between people and robots. Machine learning and computer vision provide the tools needed to increase productivity and minimize delivery reaction times by assisting in the optimization of complex production planning processes. This review of the state of the art presents the main trends that seek to improve human-robot interaction in productive environments, and identifies challenges in research as well as in industrial-technological development on this topic. In addition, this review offers a proposal on the use of artificial intelligence in all processes of Industry 4.0 as a crucial linking element among humans, robots, and intelligent and traditional machines, as well as a mechanism for quality control and occupational safety. This work has been funded by the Spanish Government [TIN2016-76515-R] grant for the COMBAHO project, supported with Feder funds.
Integration of multi-scale protein interactions for biomedical data analysis
With the advancement of modern technologies, we observe an increasing accumulation of biomedical data about diseases. There is a need for computational methods to sift through and extract knowledge from the diverse data available in order to improve our mechanistic understanding of diseases and improve patient care. Biomedical data come in various forms, as exemplified by the various omics data. Existing studies have shown that each form of omics data gives only partial information on the cell's state, which has motivated jointly mining multi-omics, multi-modal data to extract integrated system knowledge. The interactome is of particular importance, as it enables the modelling of dependencies arising from molecular interactions. This thesis takes a special interest in the multi-scale protein interactome and its integration with computational models to extract relevant information from biomedical data. We define multi-scale interactions at different omics scales that involve proteins: pairwise protein-protein interactions, multi-protein complexes, and biological pathways. Using hypergraph representations, we motivate considering higher-order protein interactions, highlighting the complementary biological information contained in the multi-scale interactome. Based on those results, we further investigate how those multi-scale protein interactions can be used as either prior knowledge or auxiliary data to develop machine learning algorithms. First, we design a neural network that uses the multi-scale organization of proteins in a cell into biological pathways as prior knowledge and train it to predict a patient's diagnosis based on transcriptomics data. From the trained models, we develop a strategy to extract biomedical knowledge pertaining to the diseases investigated. Second, we propose a general framework based on Non-negative Matrix Factorization to integrate the multi-scale protein interactome with multi-omics data.
We show that our approach outperforms existing methods, provides biomedical insights, and suggests relevant hypotheses for specific cancer types.
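As a sketch of the factorization machinery underlying such a framework, the classic multiplicative-update NMF below decomposes a non-negative data matrix into low-rank factors. This is a minimal toy version: the thesis' actual framework couples several factorizations with interactome constraints, which are omitted here.

```python
import numpy as np

def nmf(X, k, iters=300, eps=1e-9, seed=0):
    """Basic Lee-Seung multiplicative-update NMF: X ~ W @ H with W, H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = X.shape
    W = rng.random((n, k))
    H = rng.random((k, m))
    for _ in range(iters):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # update H with W fixed
        W *= (X @ H.T) / (W @ H @ H.T + eps)   # update W with H fixed
    return W, H

# Toy non-negative "omics" matrix with rank-2 structure (patients x genes).
rng = np.random.default_rng(1)
X = rng.random((50, 2)) @ rng.random((2, 40))
W, H = nmf(X, k=2)
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.4f}")
```

The multiplicative updates keep both factors non-negative throughout, which is what makes the factors interpretable as additive parts (e.g. patient groups and gene signatures).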
Counterfactual inference to predict causal knowledge graph for relational transfer learning by assimilating expert knowledge -- Relational feature transfer learning algorithm
Transfer learning (TL) is a machine learning (ML) method in which knowledge is transferred from existing models of related problems to the model for solving the problem at hand. Relational TL enables ML models to transfer relationship networks from one domain to another. However, it has two critical issues. One is determining the proper way of extracting and expressing relationships among data features in the source domain such that the relationships can be transferred to the target domain. The other is how to carry out the transfer procedure. Knowledge graphs (KGs) are knowledge bases that represent data and logic in graph-structured form; they are helpful tools for dealing with the first issue. The proposed relational feature transfer learning algorithm (RF-TL) embodies an extended structural equation modelling (SEM) as a method for constructing KGs. Additionally, in fields such as medicine, economics, and law, which concern people's lives, property, safety, and security, the knowledge of domain experts is a gold standard. This paper introduces causal analysis and counterfactual inference into the TL domain to direct the transfer procedure. Different from traditional feature-based TL algorithms like transfer component analysis (TCA) and CORrelation ALignment (CORAL), RF-TL not only considers relations between feature items but also utilizes causality knowledge, enabling it to perform well in practical cases. The algorithm was tested on two different healthcare-related datasets — sleep apnea questionnaire study data and COVID-19 case data on ICU admission — and its performance was compared with that of TCA and CORAL. The experimental results show that RF-TL generates better transferred models that give more accurate predictions with fewer input features.
FireAct: Toward Language Agent Fine-tuning
Recent efforts have augmented language models (LMs) with external tools or
environments, leading to the development of language agents that can reason and
act. However, most of these agents rely on few-shot prompting techniques with
off-the-shelf LMs. In this paper, we investigate and argue for the overlooked
direction of fine-tuning LMs to obtain language agents. Using a setup of
question answering (QA) with a Google search API, we explore a variety of base
LMs, prompting methods, fine-tuning data, and QA tasks, and find language
agents are consistently improved after fine-tuning their backbone LMs. For
example, fine-tuning Llama2-7B with 500 agent trajectories generated by GPT-4
leads to a 77% HotpotQA performance increase. Furthermore, we propose FireAct,
a novel approach to fine-tuning LMs with trajectories from multiple tasks and
prompting methods, and show having more diverse fine-tuning data can further
improve agents. Along with other findings regarding scaling effects,
robustness, generalization, efficiency and cost, our work establishes
comprehensive benefits of fine-tuning LMs for agents, and provides an initial
set of experimental designs, insights, as well as open questions toward
language agent fine-tuning.
Comment: Code, data, and models are available at https://fireact-agent.github.i
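The fine-tuning data described above consists of agent trajectories. As an illustrative sketch, a ReAct-style trajectory of thought/action/observation steps could be flattened into a prompt/completion record for supervised fine-tuning. The field names and trajectory format here are assumptions for illustration, not the paper's exact schema:

```python
import json

def trajectory_to_record(question, steps, answer):
    """Flatten a ReAct-style trajectory (thought/action/observation triples)
    into a single prompt/completion pair for supervised fine-tuning."""
    lines = [f"Question: {question}"]
    for i, (thought, action, obs) in enumerate(steps, 1):
        lines.append(f"Thought {i}: {thought}")
        lines.append(f"Action {i}: {action}")
        lines.append(f"Observation {i}: {obs}")
    prompt = "\n".join(lines[:1])                           # model sees the question...
    completion = "\n".join(lines[1:] + [f"Answer: {answer}"])  # ...and learns to emit the trajectory
    return {"prompt": prompt, "completion": completion}

record = trajectory_to_record(
    "Where was the author of 'Hamlet' born?",
    [("I should search for Hamlet's author.",
      "search[Hamlet author]",
      "Hamlet was written by William Shakespeare."),
     ("Now I need Shakespeare's birthplace.",
      "search[William Shakespeare birthplace]",
      "Shakespeare was born in Stratford-upon-Avon.")],
    "Stratford-upon-Avon",
)
print(json.dumps(record, indent=2))
```

Records like this, generated by a stronger model such as GPT-4 and filtered for correct final answers, are what the backbone LM would then be fine-tuned on.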
RODE: Learning Roles to Decompose Multi-Agent Tasks
Role-based learning holds the promise of achieving scalable multi-agent
learning by decomposing complex tasks using roles. However, it is largely
unclear how to efficiently discover such a set of roles. To solve this problem,
we propose to first decompose joint action spaces into restricted role action
spaces by clustering actions according to their effects on the environment and
other agents. Learning a role selector based on action effects makes role
discovery much easier because it forms a bi-level learning hierarchy -- the
role selector searches in a smaller role space and at a lower temporal
resolution, while role policies learn in significantly reduced primitive
action-observation spaces. We further integrate information about action
effects into the role policies to boost learning efficiency and policy
generalization. By virtue of these advances, our method (1) outperforms the
current state-of-the-art MARL algorithms on 10 of the 14 scenarios that
comprise the challenging StarCraft II micromanagement benchmark and (2)
achieves rapid transfer to new environments with three times the number of
agents. Demonstrative videos are available at
https://sites.google.com/view/rode-marl
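The role-discovery step described above — clustering actions by their effects to obtain restricted role action spaces — can be sketched with plain k-means over per-action effect vectors. The effect features below are synthetic stand-ins; RODE itself learns action effects from the environment rather than hand-coding them:

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialization."""
    centroids = [X[0]]
    for _ in range(k - 1):
        # next centroid: the point farthest from all chosen centroids
        d = ((X[:, None, :] - np.array(centroids)[None]) ** 2).sum(-1).min(1)
        centroids.append(X[int(np.argmax(d))])
    C = np.array(centroids, dtype=float)
    for _ in range(iters):
        labels = ((X[:, None, :] - C[None]) ** 2).sum(-1).argmin(1)
        for c in range(k):
            if np.any(labels == c):
                C[c] = X[labels == c].mean(axis=0)
    return labels

# Synthetic "action effect" vectors: rows are actions, columns are the average
# change each action induces (e.g. on enemy health, own position, own health).
effects = np.array([
    [0.9, 0.0, 0.1],   # attack-like actions
    [1.0, 0.1, 0.0],
    [0.8, 0.0, 0.2],
    [0.0, 0.9, 0.0],   # movement-like actions
    [0.1, 1.0, 0.1],
    [0.0, 0.8, 0.0],
])
roles = kmeans(effects, k=2)
# Actions with similar effects share a role, giving each role a restricted action space.
role_actions = {r: np.where(roles == r)[0].tolist() for r in set(roles.tolist())}
print(role_actions)
```

Each resulting cluster defines one role's restricted action space; the role selector then only has to choose among roles, while each role policy searches the much smaller clustered action set.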
A Survey of Knowledge Graph Reasoning on Graph Types: Static, Dynamic, and Multimodal
Knowledge graph reasoning (KGR), aiming to deduce new facts from existing
facts based on mined logic rules underlying knowledge graphs (KGs), has become
a fast-growing research direction. It has been proven to significantly benefit
the usage of KGs in many AI applications, such as question answering and
recommendation systems. According to the graph types, existing KGR
models can be roughly divided into three categories, i.e., static models,
temporal models, and multi-modal models. Early works in this domain mainly
focus on static KGR, and recent works try to leverage the temporal and
multi-modal information, which is more practical and closer to real-world scenarios.
However, no survey papers and open-source repositories comprehensively
summarize and discuss models in this important direction. To fill the gap, we
conduct the first survey of knowledge graph reasoning, tracing from static to
temporal and then to multi-modal KGs. Concretely, the models are reviewed based
on a bi-level taxonomy, i.e., top-level (graph types) and base-level (techniques
and scenarios). Besides, the performances, as well as datasets, are summarized
and presented. Moreover, we point out the challenges and potential
opportunities to enlighten the readers. The corresponding open-source
repository is shared on GitHub
https://github.com/LIANGKE23/Awesome-Knowledge-Graph-Reasoning.
Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible.
Intelligent Knowledge Transfer for Multi-Stage and Multi-Task Learning
Machine learning plays a significant role in powering artificial intelligence advances in many areas like natural language processing and personalized recommendation, aiming to build models that fit the labeled training data and then predict on held-out testing data. A key challenge for these machine learning models is the imbalance between scarce labeled data and continuously increasing model capacity. On the one hand, the labeled data of many tasks are scarce because human annotations are expensive, which is especially true for specialized domains like biomedicine. On the other hand, the capacity of models has grown continuously over the last decade, with parameters ranging from millions to billions. Without enough labeled data, such large-scale models may overfit on low-resource tasks, resulting in performance deterioration. Recently, many works have demonstrated that transferring useful knowledge from pre-training stages or jointly trained related tasks to the target task may alleviate the label scarcity problem and significantly boost the performance of the target task. Despite the progress achieved in recent work, there are still many challenges and open problems to be explored in knowledge transfer. First, transferring domain-specific knowledge from pre-training stages to large-scale language models remains under-explored, which limits natural language understanding performance in the corresponding domains. Second, training multiple tasks jointly hinders the performance on individual tasks, which is more serious in transformer-based multi-task co-training because all tasks share a single set of parameters. Third, transferring knowledge from the source might have a negative impact on the target learner, leading to worse results than training the target task alone. To overcome these challenges, three contributions are made in this dissertation:
• To transfer disease knowledge to enhance BERT-like language models over health-related
tasks, we propose a new pre-training procedure named disease knowledge infusion, which
efficiently exploits the self-supervised learning signals of Wikipedia pages.
• The second contribution is a novel method named HyperPrompt that utilizes HyperNetworks to generate task-conditioned prompts for multi-task learning, where the task-specific knowledge can be flexibly shared via the HyperNetworks.
• To alleviate the negative transfer problem from the perspective of gradient magnitudes, we
propose a novel algorithm named MetaBalance to dynamically and adaptively balance the
gradients of auxiliary tasks to better assist the target task.
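The gradient-magnitude balancing idea behind MetaBalance can be sketched as follows. This is a simplified version under stated assumptions: the actual algorithm operates per parameter group with moving averages of gradient norms, whereas this toy version simply rescales each auxiliary gradient toward the target gradient's norm, with a relax factor trading off the two:

```python
import numpy as np

def balance_aux_grads(target_grad, aux_grads, relax=0.7):
    """Rescale auxiliary-task gradients so their magnitudes track the
    target-task gradient, reducing the risk of negative transfer."""
    target_norm = np.linalg.norm(target_grad)
    balanced = []
    for g in aux_grads:
        g_norm = np.linalg.norm(g)
        if g_norm == 0:
            balanced.append(g)
            continue
        # relax=1 matches the target norm exactly; relax=0 leaves g unchanged
        scale = relax * (target_norm / g_norm) + (1.0 - relax)
        balanced.append(g * scale)
    return balanced

target = np.array([0.1, -0.2, 0.05])   # small target-task gradient
aux = [np.array([5.0, -3.0, 8.0])]     # much larger auxiliary gradient
out = balance_aux_grads(target, aux, relax=1.0)
print(np.linalg.norm(out[0]), np.linalg.norm(target))
```

The rescaling preserves each auxiliary gradient's direction and only adjusts its magnitude, so auxiliary tasks can still help without drowning out the target task's updates.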
Hypernetworks Analysis of RoboCup Interactions
Robotic soccer simulations are controlled environments in which the rich variety of interactions among agents makes them good candidates to be studied as complex adaptive systems. The challenge is to create an autonomous team of soccer agents that can adapt and improve its behaviour as it plays other teams. By analogy with chess, the movements of the soccer agents and the ball form ever-changing networks as players in one team form structures that give their team an advantage. For example, the Defender’s Dilemma involves relationships between an attacker with the ball, a team-mate and a defender. The defender must choose between tackling the player with the ball, or taking a position to intercept a pass to the other attacker. Since these structures involve more than two interacting entities, it is necessary to go beyond networks to multidimensional hypernetworks. In this context, this thesis investigates: (i) is it possible to identify patterns of play that lead a team to obtain an advantage? (ii) is it possible to forecast with a good degree of accuracy whether a certain game action or sequence of game actions is going to be successful, before it has been completed? and (iii) is it possible to make behavioural patterns emerge in the game without specifying the behavioural rules in detail? To investigate these research questions we devised two methods to analyse the interactions between robotic players, one based on traditional programming and one based on Deep Learning. The first method identified thousands of Defender’s Dilemma configurations from RoboCup 2D simulator games and found a statistically significant association between winning and the creation of the Defender’s Dilemma by the attackers of the winning team. The second method showed that a feedforward Artificial Neural Network trained on thousands of games can take as input the current game configuration and forecast to a high degree of accuracy whether the current action will end in a goal or not.
Finally, we designed our own fast and simple robotic soccer simulator for investigating Reinforcement Learning. This showed that Reinforcement Learning using Proximal Policy Optimization could train two agents in the task of scoring a goal, using only basic actions and no pre-built hand-programmed skills. These experiments provide evidence that it is possible: to identify advantageous patterns of play; to forecast whether an action or sequence of actions will be successful; and to make behavioural patterns emerge in the game without specifying the behavioural rules in detail.