224 research outputs found

    LLaMA Rider: Spurring Large Language Models to Explore the Open World

    Recently, various studies have leveraged Large Language Models (LLMs) to support decision-making and planning in environments, and have tried to align the LLMs' knowledge with the world conditions. Nonetheless, the capacity of LLMs to continuously acquire environmental knowledge and adapt in an open world remains uncertain. In this paper, we propose an approach to spur LLMs to explore the open world, gather experiences, and learn to improve their task-solving capabilities. In this approach, a multi-round feedback-revision mechanism is utilized to encourage LLMs to actively select appropriate revision actions guided by feedback information from the environment. This facilitates exploration and enhances the model's performance. In addition, we integrate sub-task relabeling to help LLMs maintain consistency in sub-task planning and learn the combinatorial nature between tasks, enabling them to complete a wider range of tasks through training on the acquired exploration experiences. By evaluation in Minecraft, an open-ended sandbox world, we demonstrate that our approach, LLaMA-Rider, enhances the efficiency of the LLM in exploring the environment and effectively improves the LLM's ability to accomplish more tasks through fine-tuning with merely 1.3k instances of collected data, showing minimal training costs compared to the baseline using reinforcement learning. Comment: 18 pages

    UniCode: Learning a Unified Codebook for Multimodal Large Language Models

    In this paper, we propose UniCode, a novel approach within the domain of multimodal large language models (MLLMs) that learns a unified codebook to efficiently tokenize visual, text, and potentially other types of signals. This innovation addresses a critical limitation in existing MLLMs: their reliance on a text-only codebook, which restricts MLLMs' ability to generate images and texts in a multimodal context. Towards this end, we propose a language-driven iterative training paradigm, coupled with an in-context pre-training task we term "image decompression", enabling our model to interpret compressed visual data and generate high-quality images. The unified codebook empowers our model to extend visual instruction tuning to non-linguistic generation tasks. Moreover, UniCode is adaptable to diverse stacked quantization approaches in order to compress visual signals into a more compact token representation. Despite using significantly fewer parameters and less data during training, UniCode demonstrates promising capabilities in visual reconstruction and generation. It also achieves performance comparable to leading MLLMs across a spectrum of VQA benchmarks. Comment: 14 pages, 2 figures, 11 tables

    MA2QL: A Minimalist Approach to Fully Decentralized Multi-Agent Reinforcement Learning

    Decentralized learning has shown great promise for cooperative multi-agent reinforcement learning (MARL). However, non-stationarity remains a significant challenge in decentralized learning. In this paper, we tackle the non-stationarity problem in the simplest and most fundamental way and propose multi-agent alternate Q-learning (MA2QL), where agents take turns to update their Q-functions by Q-learning. MA2QL is a minimalist approach to fully decentralized cooperative MARL but is theoretically grounded. We prove that when each agent guarantees an ε-convergence at each turn, their joint policy converges to a Nash equilibrium. In practice, MA2QL only requires minimal changes to independent Q-learning (IQL). We empirically evaluate MA2QL on a variety of cooperative multi-agent tasks. Results show that MA2QL consistently outperforms IQL, which verifies the effectiveness of MA2QL despite such minimal changes.
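The turn-taking idea in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a stateless two-agent game with a shared reward, a hypothetical payoff matrix `PAYOFF`, and that the agent whose turn it is not simply acts greedily on its frozen Q-values. All names and hyperparameters below are illustrative.

```python
import random

# Hypothetical common-payoff matrix game: PAYOFF[a0][a1] is the shared reward
# when agent 0 plays a0 and agent 1 plays a1 (an assumption for illustration).
PAYOFF = [
    [10.0, 0.0, 0.0],
    [0.0, 6.0, 5.0],
    [0.0, 5.0, 7.0],
]
N_ACTIONS = 3


def greedy(q):
    """Index of the highest Q-value (first one wins on ties)."""
    return max(range(len(q)), key=lambda a: q[a])


def alternate_q_learning(turns=6, steps_per_turn=300, alpha=0.5, eps=0.3, seed=0):
    """Agents take turns: only the current learner updates its Q-values."""
    rng = random.Random(seed)
    q = [[0.0] * N_ACTIONS for _ in range(2)]  # one stateless Q-table per agent
    for turn in range(turns):
        learner = turn % 2
        other = 1 - learner
        for _ in range(steps_per_turn):
            # The learner explores epsilon-greedily on its own Q-values.
            if rng.random() < eps:
                a_learn = rng.randrange(N_ACTIONS)
            else:
                a_learn = greedy(q[learner])
            # The other agent is frozen during this turn (assumed greedy).
            a_other = greedy(q[other])
            a0, a1 = (a_learn, a_other) if learner == 0 else (a_other, a_learn)
            reward = PAYOFF[a0][a1]
            # Stateless Q-learning update toward the observed reward.
            q[learner][a_learn] += alpha * (reward - q[learner][a_learn])
    return q


def is_nash(a0, a1):
    """True if neither agent can gain by deviating unilaterally."""
    best0 = all(PAYOFF[a0][a1] >= PAYOFF[a][a1] for a in range(N_ACTIONS))
    best1 = all(PAYOFF[a0][a1] >= PAYOFF[a0][b] for b in range(N_ACTIONS))
    return best0 and best1


q_tables = alternate_q_learning()
joint_action = (greedy(q_tables[0]), greedy(q_tables[1]))
print(joint_action, is_nash(*joint_action))
```

Because only one agent learns at a time, each learner faces a stationary problem during its turn, which is the intuition behind the alternating scheme described above.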

    APANet: Adaptive Prototypes Alignment Network for Few-Shot Semantic Segmentation

    Few-shot semantic segmentation aims to segment novel-class objects in a given query image with only a few labeled support images. Most advanced solutions exploit a metric learning framework that performs segmentation by matching each query feature to a learned class-specific prototype. However, this framework suffers from biased classification due to incomplete feature comparisons. To address this issue, we present an adaptive prototype representation that introduces class-specific and class-agnostic prototypes and thus constructs complete sample pairs for learning semantic alignment with query features. This complementary feature learning effectively enriches feature comparison and helps yield an unbiased segmentation model in the few-shot setting. It is implemented with a two-branch end-to-end network (i.e., a class-specific branch and a class-agnostic branch), which generates prototypes and then combines query features to perform comparisons. In addition, the proposed class-agnostic branch is simple yet effective. In practice, it can adaptively generate multiple class-agnostic prototypes for query images and learn feature alignment in a self-contrastive manner. Extensive experiments on PASCAL-5i and COCO-20i demonstrate the superiority of our method. At no expense of inference efficiency, our model achieves state-of-the-art results in both 1-shot and 5-shot settings for few-shot semantic segmentation.
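The prototype-matching step that this abstract builds on can be sketched as follows. This is a generic illustration of metric-based matching, not the APANet network: it assumes cosine similarity as the metric, and the prototype names and toy two-dimensional feature vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def match_to_prototypes(query_features, prototypes):
    """Label each query feature with the name of its most similar prototype."""
    return [
        max(prototypes, key=lambda name: cosine(feature, prototypes[name]))
        for feature in query_features
    ]


# Hypothetical prototypes: one for the target class, one for everything else.
prototypes = {"foreground": [1.0, 0.1], "background": [0.1, 1.0]}
# Hypothetical per-pixel query features.
query_features = [[0.9, 0.2], [0.2, 0.8], [0.7, 0.6]]

labels = match_to_prototypes(query_features, prototypes)
print(labels)  # ['foreground', 'background', 'foreground']
```

Comparing each query feature against more than one prototype, rather than against a single class-specific one, is the kind of fuller comparison the abstract argues for.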

    Plan4MC: Skill Reinforcement Learning and Planning for Open-World Minecraft Tasks

    We study building a multi-task agent in Minecraft. Without human demonstrations, solving long-horizon tasks in this open-ended environment with reinforcement learning (RL) is extremely sample-inefficient. To tackle the challenge, we decompose solving Minecraft tasks into learning basic skills and planning over the skills. We propose three types of fine-grained basic skills in Minecraft, and use RL with intrinsic rewards to accomplish basic skills with high success rates. For skill planning, we use Large Language Models to find the relationships between skills and build a skill graph in advance. When the agent is solving a task, our skill search algorithm walks on the skill graph and generates the proper skill plans for the agent. In experiments, our method accomplishes 24 diverse Minecraft tasks, many of which require sequentially executing more than 10 skills. Our method outperforms baselines in most tasks by a large margin. The project's website and code can be found at https://sites.google.com/view/plan4mc. Comment: 19 pages
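The abstract does not spell out how the skill search works. One plausible reading is a depth-first walk over prerequisite edges, sketched below. The skill names and the graph are hypothetical, the graph is assumed to be acyclic, and item quantities are ignored.

```python
# Hypothetical skill graph: each skill maps to the skills it depends on.
SKILL_GRAPH = {
    "craft_wooden_pickaxe": ["craft_stick", "craft_planks", "place_crafting_table"],
    "craft_stick": ["craft_planks"],
    "place_crafting_table": ["craft_crafting_table"],
    "craft_crafting_table": ["craft_planks"],
    "craft_planks": ["harvest_log"],
    "harvest_log": [],
}


def plan_skills(target, graph):
    """Return skills in an order where every prerequisite precedes its dependent."""
    plan, done = [], set()

    def visit(skill):
        if skill in done:
            return
        for prerequisite in graph[skill]:
            visit(prerequisite)  # resolve dependencies first (post-order)
        done.add(skill)
        plan.append(skill)

    visit(target)
    return plan


plan = plan_skills("craft_wooden_pickaxe", SKILL_GRAPH)
print(plan)
```

A fuller planner would also track how many of each item a skill consumes and produces, which this sketch deliberately leaves out.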

    Comparative analysis of adipokinetic hormones and their receptors in Blattodea reveals novel patterns of gene evolution

    Adipokinetic hormone (AKH) is a neuropeptide produced in the insect corpora cardiaca that plays an essential role in mobilising carbohydrates and lipids from the fat body to the haemolymph. AKH acts by binding to a rhodopsin-like G protein-coupled receptor (GPCR), the adipokinetic hormone receptor (AKHR). In this study, we tackle AKH ligand and receptor gene evolution as well as the evolutionary origins of AKH gene paralogues from the order Blattodea (termites and cockroaches). Phylogenetic analyses of AKH precursor sequences point to an ancient AKH gene duplication event in the common ancestor of Blaberoidea, yielding a new group of putative decapeptides. In total, 16 different AKH peptides from 90 species were obtained. Two octapeptides and seven putatively novel decapeptides are predicted for the first time. AKH receptor sequences from 18 species, spanning solitary cockroaches and subsocial wood roaches as well as lower and higher termites, were subsequently acquired using classical molecular methods and in silico approaches employing transcriptomic data. Aligned AKHR open reading frames revealed 7 highly conserved transmembrane regions, a typical arrangement for GPCRs. Phylogenetic analyses based on AKHR sequences support accepted relationships among termite, subsocial (Cryptocercus spp.) and solitary cockroach lineages to a large extent, while putative post-translational modification sites do not greatly differ between solitary and subsocial roaches and social termites. Our study provides important information not only for AKH and AKHR functional research but also for further analyses of their potential as candidates for biorational pest control agents against invasive termites and cockroaches.

    Evolutionary rates are correlated between cockroach symbionts and mitochondrial genomes

    Bacterial endosymbionts evolve under strong host-driven selection. Factors influencing host evolution might affect symbionts in similar ways, potentially leading to correlations between the molecular evolutionary rates of hosts and symbionts. Although there is evidence of rate correlations between mitochondrial and nuclear genes, similar investigations of hosts and symbionts are lacking. Here, we demonstrate a correlation in molecular rates between the genomes of an endosymbiont (Blattabacterium cuenoti) and the mitochondrial genomes of their hosts (cockroaches). We used partial genome data for multiple strains of B. cuenoti to compare phylogenetic relationships and evolutionary rates for 55 cockroach/symbiont pairs. The phylogenies inferred for B. cuenoti and the mitochondrial genomes of their hosts were largely congruent, as expected from their identical maternal and cytoplasmic mode of inheritance. We found a correlation between evolutionary rates of the two genomes, based on comparisons of root-to-tip distances and on comparisons of the branch lengths of phylogenetically independent species pairs. Our results underscore the profound effects that long-term symbiosis can have on the biology of each symbiotic partner.

    Evidence for reduced immune gene diversity and activity during the evolution of termites

    This study was supported by Freie Universität Internal Research Funding and Deutsche Forschungsgemeinschaft (DFG, grant no. MC 436/5-1) to D.P.M. S.H., P.S. and J.S. are supported by ‘EVA4.0’ (no. CZ.02.1.01/0.0/0.0/16_019/0000803), and P.S. and J.S. are supported by CIGA no. 20184306. Y.C. and Z.W. are supported by the National Natural Science Foundation of China (grant no. 31672329). The evolution of biological complexity is associated with the emergence of bespoke immune systems that maintain and protect organism integrity. Unlike the well-studied immune systems of cells and individuals, little is known about the origins of immunity during the transition to eusociality, a major evolutionary transition comparable to the evolution of multicellular organisms from single-celled ancestors. We aimed to tackle this by characterizing the immune gene repertoire of 18 cockroach and termite species, spanning the spectrum of solitary, subsocial and eusocial lifestyles. We find that key transitions in termite sociality are correlated with immune gene family contractions. In cross-species comparisons of immune gene expression, we find evidence for a caste-specific social defence system in termites, which appears to operate at the expense of individual immune protection. Our study indicates that a major transition in organismal complexity may have entailed a fundamental reshaping of the immune system optimized for group over individual defence. Peer reviewed