12 research outputs found

    Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning

    Full text link
    Deep cooperative multi-agent reinforcement learning has demonstrated its remarkable success over a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions still intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method, to disentangle not only the joint value function into agent-wise value functions for decentralized execution, but also the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities. OPT facilitates filtering the noisy interactions between irrelevant entities and thus significantly improves generalizability as well as interpretability. Specifically, OPT introduces a sparse disagreement mechanism to encourage sparsity and diversity among discovered interaction prototypes. Then the model selectively restructures these prototypes into a compact interaction pattern by an aggregator with learnable weights. To alleviate the training instability issue caused by partial observability, we propose to maximize the mutual information between the aggregation weights and the history behaviors of each agent. Experiments on both single-task and multi-task benchmarks demonstrate that the proposed method yields results superior to the state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/OPT

    Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition

    Full text link
    Value Decomposition (VD) aims to deduce the contributions of agents for decentralized policies in the presence of only global rewards, and has recently emerged as a powerful credit assignment paradigm for tackling cooperative Multi-Agent Reinforcement Learning (MARL) problems. One of the main challenges in VD is to promote diverse behaviors among agents, and existing methods directly encourage the diversity of learned agent networks with various strategies. However, we argue that these dedicated designs for agent networks are still limited by the indistinguishable VD network, leading to homogeneous agent behaviors and thus degrading the cooperation capability. In this paper, we propose a novel Contrastive Identity-Aware learning (CIA) method, explicitly boosting the credit-level distinguishability of the VD network to break the bottleneck of multi-agent diversity. Specifically, our approach leverages contrastive learning to maximize the mutual information between the temporal credits and identity representations of different agents, encouraging the full expressiveness of credit assignment and, in turn, the emergence of individualities. The implementation of the proposed CIA module is simple yet effective and can be readily incorporated into various VD architectures. Experiments on the SMAC benchmarks and across different VD backbones demonstrate that the proposed method yields results superior to the state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/CIA
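The abstract above says only that contrastive learning is used to maximize mutual information between temporal credits and identity representations. As a rough illustration of that general idea, and not the authors' implementation, the sketch below computes a standard InfoNCE loss, whose minimization maximizes a lower bound on mutual information. The function name `info_nce`, the temperature value, and the toy 3-dimensional embeddings are all assumptions made for this example.

```python
import math


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def info_nce(credits, identities, temperature=0.1):
    """Mean InfoNCE loss (hypothetical helper, not from the CIA code base).

    Row i of `credits` is treated as the positive match for row i of
    `identities`; every other identity row acts as a negative.
    """
    losses = []
    for i, credit in enumerate(credits):
        logits = [dot(credit, ident) / temperature for ident in identities]
        # Log-sum-exp with the max subtracted for numerical stability.
        peak = max(logits)
        log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
        losses.append(log_denominator - logits[i])
    return sum(losses) / len(losses)


# Toy data (assumed): three agents with one-hot identity embeddings.
identities = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Credits that line up with each agent's identity ...
distinct_credits = [[0.9, 0.1, 0.0], [0.0, 0.9, 0.1], [0.1, 0.0, 0.9]]
# ... versus identical credits for every agent (homogeneous behavior).
homogeneous_credits = [[0.5, 0.5, 0.5]] * 3

loss_distinct = info_nce(distinct_credits, identities)
loss_homogeneous = info_nce(homogeneous_credits, identities)
print(loss_distinct < loss_homogeneous)
```

In this toy setting the homogeneous credits give a loss of exactly log 3, the value obtained when credits carry no information about identity, while distinguishable credits drive the loss lower.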

    Complete mitochondrial genome sequence and phylogenetic analysis of the Taiwan tai Argyrops bleekeri (Spariformes: Sparidae)

    No full text
    In this study, the complete mitochondrial genome of the Taiwan tai Argyrops bleekeri was determined for the first time by next-generation sequencing. The circular mtDNA molecule was 16,646 bp in size and the overall base composition was A (27.77%), C (28.95%), G (16.60%), and T (26.68%), with a slight bias toward A + T. The complete mitogenome encoded 13 protein-coding genes (PCGs), 22 tRNA genes, two rRNA genes, and a control region. Phylogenetic analysis based on the 13 PCGs of the Sparidae family revealed that Argyrops appears to be most closely related to Pagrus and Parargyrops, but further research is needed
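The reported base composition can be checked with a few lines of arithmetic. The sketch below uses only the percentages quoted in the abstract; the variable names are chosen for this example.

```python
# Base composition (percent) as reported in the abstract for the 16,646 bp mitogenome.
composition = {"A": 27.77, "C": 28.95, "G": 16.60, "T": 26.68}

total = round(sum(composition.values()), 2)
at_content = round(composition["A"] + composition["T"], 2)
gc_content = round(composition["G"] + composition["C"], 2)

print(f"total = {total:.2f}%")
print(f"A+T = {at_content:.2f}%, G+C = {gc_content:.2f}%")
```

The four percentages sum to 100.00%, and A + T comes to 54.45% against 45.55% for G + C, which is consistent with the "slight bias toward A + T" stated in the abstract.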

    Spatiotemporal-Augmented Graph Neural Networks for Human Mobility Simulation

    Full text link
    Human mobility patterns have significant applications in policy-decision scenarios and economic behavior research. The human mobility simulation task aims to generate human mobility trajectories given a small set of trajectory data, and has attracted much attention due to the scarcity and sparsity of human mobility data. Existing methods mostly rely on the static relationships of locations, while largely neglecting the dynamic spatiotemporal effects of locations. On the one hand, spatiotemporal correspondences of visit distributions reveal the spatial proximity and the functional similarity of locations. On the other hand, the varying durations in different locations hinder the iterative generation process of the mobility trajectory. Therefore, we propose a novel framework to model the dynamic spatiotemporal effects of locations, namely SpatioTemporal-Augmented gRaph neural networks (STAR). The STAR framework designs various spatiotemporal graphs to capture the spatiotemporal correspondences and builds a novel dwell branch to simulate the varying durations in locations; the framework is finally optimized in an adversarial manner. Comprehensive experiments over four real datasets for human mobility simulation have verified the superiority of STAR to state-of-the-art methods. Our code will be made publicly available

    Visual Boundary Knowledge Translation for Foreground Segmentation

    No full text
    When confronted with objects of unknown types in an image, humans can effortlessly and precisely tell their visual boundaries. This recognition mechanism and underlying generalization capability seem to contrast with state-of-the-art image segmentation networks that rely on large-scale category-aware annotated training samples. In this paper, we make an attempt towards building models that explicitly account for visual boundary knowledge, in the hope of reducing the training effort on segmenting unseen categories. Specifically, we investigate a new task termed Boundary Knowledge Translation (BKT). Given a set of fully labeled categories, BKT aims to translate the visual boundary knowledge learned from the labeled categories to a set of novel categories, each of which is provided with only a few labeled samples. To this end, we propose a Translation Segmentation Network (Trans-Net), which comprises a segmentation network and two boundary discriminators. The segmentation network, combined with a boundary-aware self-supervised mechanism, is devised to conduct foreground segmentation, while the two discriminators work together in an adversarial manner to ensure an accurate segmentation of the novel categories under light supervision. Exhaustive experiments demonstrate that, with only tens of labeled samples as guidance, Trans-Net achieves results on par with fully supervised methods

    Ask-AC: An Initiative Advisor-in-the-Loop Actor-Critic Framework

    Full text link
    Despite the promising results achieved, state-of-the-art interactive reinforcement learning schemes rely on passively receiving supervision signals from advisor experts, in the form of either continuous monitoring or pre-defined rules, which inevitably results in a cumbersome and expensive learning process. In this paper, we introduce a novel initiative advisor-in-the-loop actor-critic framework, termed Ask-AC, that replaces the unilateral advisor-guidance mechanism with a bidirectional learner-initiative one, and thereby enables a customized and efficacious message exchange between learner and advisor. At the heart of Ask-AC are two complementary components, namely an action requester and an adaptive state selector, that can be readily incorporated into various discrete actor-critic architectures. The former component allows the agent to proactively seek advisor intervention in the presence of uncertain states, while the latter identifies the unstable states potentially missed by the former, especially when the environment changes, and then learns to promote the ask action on such states. Experimental results on both stationary and non-stationary environments and across different actor-critic backbones demonstrate that the proposed framework significantly improves the learning efficiency of the agent, and achieves performance on par with that obtained by continuous advisor monitoring

    Distribution-Aware Graph Representation Learning for Transient Stability Assessment of Power System

    Full text link
    The real-time transient stability assessment (TSA) plays a critical role in the secure operation of the power system. Although the classic numerical integration method, i.e., time-domain simulation (TDS), has been widely used in industry practice, it is inevitably trapped in a high computational complexity due to the high latitude sophistication of the power system. In this work, a data-driven power system estimation method is proposed to quickly predict the stability of the power system before TDS reaches the end of simulating time windows, which can reduce the average simulation time of stability assessment without loss of accuracy. As the topology of the power system is in the form of graph structure, graph neural network based representation learning is naturally suitable for learning the status of the power system. Motivated by observing the distribution information of crucial active power and reactive power on the power system's bus nodes, we thus propose a distribution-aware learning (DAL) module to explore an informative graph representation vector for describing the status of a power system. Then, TSA is re-defined as a binary classification task, and the stability of the system is determined directly from the resulting graph representation without numerical integration. Finally, we apply our method to the online TSA task. The case studies on the IEEE 39-bus system and Polish 2383-bus system demonstrate the effectiveness of our proposed method. Comment: 8 pages, 6 figures, 4 tables

    Larval Spatiotemporal Distribution of Six Fish Species: Implications for Sustainable Fisheries Management in the East China Sea

    No full text
    The larval distributions of the small-sized fishes Omobranchus elegans, Erisphex pottii, Benthosema pterotum, Acropoma japonicum, Upeneus bensasi, and Apogonichthys lineatus in the East China Sea ecosystem are important due to their ecological and economic benefits. To date, however, there have been few studies describing their population distributions and dynamics. In the current study, ichthyoplankton surveys were carried out from April to July 2018 to analyze variations in the larval abundance, distribution, and development stages of these species. In addition, the spatiotemporal larval distribution was investigated in terms of measured environmental variables. It was found that larvae were mainly distributed at depths of 5.00–66.00 m, in areas with sea surface temperature of 4.40–29.60 °C, sea surface salinity of 16.54–34.60 psu, pH of 7.00–9.00, and dissolved oxygen concentration of 2.54–8.70 mg/L. Benthosema pterotum and A. lineatus migrated from 30.00–31.00° N 123.17–123.50° E in June to 30.00–32.50° N 122.22–123.50° E in July. The results of this study can help to preserve spawning and nursery grounds and contribute to sustainable coastal fisheries management
