1,040 research outputs found

    SDRL: Interpretable and Data-efficient Deep Reinforcement Learning Leveraging Symbolic Planning

    Deep reinforcement learning (DRL) has achieved great success by learning directly from high-dimensional sensory inputs, yet is notorious for its lack of interpretability. Interpretability of the subtasks is critical in hierarchical decision-making, as it increases the transparency of black-box-style DRL approaches and helps RL practitioners better understand the high-level behavior of the system. In this paper, we introduce symbolic planning into DRL and propose a framework of Symbolic Deep Reinforcement Learning (SDRL) that can handle both high-dimensional sensory inputs and symbolic planning. Task-level interpretability is enabled by relating symbolic actions to options. The framework features a planner-controller-meta-controller architecture, whose components take charge of subtask scheduling, data-driven subtask learning, and subtask evaluation, respectively. The three components cross-fertilize each other and eventually converge to an optimal symbolic plan along with the learned subtasks, combining the long-term planning capability of symbolic knowledge with end-to-end reinforcement learning directly from high-dimensional sensory input. Experimental results validate the interpretability of the subtasks, along with improved data efficiency compared with state-of-the-art approaches.

    Variance-Constrained H∞ finite-horizon filtering for multi-rate time-varying networked systems based on stochastic protocols

    In this paper, the variance-constrained H∞ finite-horizon filtering problem is investigated for a class of time-varying nonlinear systems under a multi-rate communication network and a stochastic protocol (SP). The stochastic protocol is employed to determine which sensor obtains access to the multi-rate communication network, in order to relieve the communication burden. A novel mapping technique is applied to characterize the randomly switching behavior of the data transmission that results from using the SP in the multi-rate communication network. Using a relaxation method, sufficient conditions are derived for the existence of a finite-horizon filter satisfying both the prescribed H∞ performance and the covariance requirement on the filtering errors, and the filters satisfying these performance indices are obtained by solving linear matrix inequalities. Finally, the validity and effectiveness of the proposed filter scheme are verified by numerical simulation.
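The scheduling idea in this abstract, where a stochastic protocol decides which single sensor may transmit at each step, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names `stochastic_protocol` and `received_measurements`, the i.i.d. selection with fixed probabilities, and the zero-order hold applied to sensors that do not transmit. The multi-rate sampling and the filter design are not modeled.

```python
import random


def stochastic_protocol(num_sensors, probs, steps, seed=0):
    """Hypothetical helper: draw, for each time step, the index of the one
    sensor granted network access (i.i.d. draws are an assumption)."""
    rng = random.Random(seed)
    return [
        rng.choices(range(num_sensors), weights=probs, k=1)[0]
        for _ in range(steps)
    ]


def received_measurements(measurements, schedule):
    """Hypothetical helper: what the filter side sees under the schedule.

    Only the scheduled sensor's entry is refreshed at each step; all other
    entries keep their last received value (zero-order hold, an assumption).
    """
    held = [0.0] * len(measurements[0])
    received = []
    for y, sigma in zip(measurements, schedule):
        held[sigma] = y[sigma]  # only sensor `sigma` transmits at this step
        received.append(list(held))
    return received


# Three sensors, fifty steps, synthetic measurements.
schedule = stochastic_protocol(3, [0.5, 0.3, 0.2], steps=50, seed=1)
measurements = [[float(t), float(t) + 10.0, float(t) + 20.0] for t in range(50)]
received = received_measurements(measurements, schedule)
```

Because only one entry of the received vector changes per step, the data seen by the filter switches randomly between sensors, which is the behavior the abstract says the mapping technique is meant to characterize.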

    MATLABER: Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR

    Based on powerful text-to-image diffusion models, text-to-3D generation has made significant progress in generating compelling geometry and appearance. However, existing methods still struggle to recover high-fidelity object materials, either considering only Lambertian reflectance or failing to disentangle BRDF materials from the environment lights. In this work, we propose Material-Aware Text-to-3D via LAtent BRDF auto-EncodeR (MATLABER), which leverages a novel latent BRDF auto-encoder for material generation. We train this auto-encoder on large-scale real-world BRDF collections and ensure the smoothness of its latent space, which implicitly acts as a natural distribution of materials. During appearance modeling in text-to-3D generation, the latent BRDF embeddings, rather than BRDF parameters, are predicted via a material network. Through exhaustive experiments, our approach demonstrates superiority over existing methods in generating realistic and coherent object materials. Moreover, high-quality materials naturally enable multiple downstream tasks such as relighting and material editing. Code and model will be publicly available at https://sheldontsui.github.io/projects/Matlaber.

    Reconstructive Neuron Pruning for Backdoor Defense

    Deep neural networks (DNNs) have been found to be vulnerable to backdoor attacks, raising security concerns about their deployment in mission-critical applications. While existing defense methods have demonstrated promising results, it is still not clear how to effectively remove backdoor-associated neurons in backdoored DNNs. In this paper, we propose a novel defense called Reconstructive Neuron Pruning (RNP) to expose and prune backdoor neurons via an unlearning and then recovering process. Specifically, RNP first unlearns the neurons by maximizing the model's error on a small subset of clean samples and then recovers the neurons by minimizing the model's error on the same data. In RNP, unlearning is operated at the neuron level while recovering is operated at the filter level, forming an asymmetric reconstructive learning procedure. We show that such an asymmetric process on only a few clean samples can effectively expose and prune the backdoor neurons implanted by a wide range of attacks, achieving a new state-of-the-art defense performance. Moreover, the unlearned model at the intermediate step of our RNP can be directly used to improve other backdoor defense tasks including backdoor removal, trigger recovery, backdoor label detection, and backdoor sample detection. Code is available at https://github.com/bboylyg/RNP. Comment: Accepted by ICML2

    Fault Detection of Networked Control Systems Based on Sliding Mode Observer

    This paper is concerned with the network-based fault detection problem for a class of nonlinear discrete-time networked control systems with multiple communication delays and bounded disturbances. First, a sliding-mode-based nonlinear discrete observer is proposed. Sufficient conditions for asymptotic stability of the sliding motion are then derived by means of the linear matrix inequality (LMI) approach on a designed surface. Next, a discrete-time sliding-mode fault observer is designed that guarantees the discrete-time sliding-mode reaching condition of the specified sliding surface. Finally, an illustrative example is provided to show the usefulness and effectiveness of the proposed design method.