57 research outputs found

    Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation

    Robustness has been extensively studied in reinforcement learning (RL) to handle various forms of uncertainty such as random perturbations, rare events, and malicious attacks. In this work, we consider one critical type of robustness against spurious correlation, where different portions of the state have no causal relationship but are correlated through unobserved confounders. These spurious correlations are ubiquitous in real-world tasks; for instance, a self-driving car usually observes heavy traffic in the daytime and light traffic at night due to unobservable human activity. A model that learns such useless or even harmful correlations could fail catastrophically when the confounder at test time deviates from the one seen in training. Although well motivated, enabling robustness against spurious correlation poses significant challenges, since the uncertainty set, shaped by the unobserved confounder and causal structure, is difficult to characterize and identify. Existing robust algorithms that assume simple and unstructured uncertainty sets are therefore inadequate to address this challenge. To solve this issue, we propose Robust State-Confounded Markov Decision Processes (RSC-MDPs) and theoretically demonstrate their superiority in avoiding learning spurious correlations compared with other robust RL counterparts. We also design an empirical algorithm to learn the robust optimal policy for RSC-MDPs, which outperforms all baselines in eight realistic self-driving and manipulation tasks. Comment: Accepted to NeurIPS 202

    MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing

    Recognizing and localizing events in videos is a fundamental task for video understanding. Since events may occur in auditory and visual modalities, detailed multimodal perception is essential for complete scene comprehension. Most previous works attempted to analyze videos from a holistic perspective. However, they do not consider semantic information at multiple scales, which makes it difficult for the model to localize events of different lengths. In this paper, we present a Multimodal Pyramid Attentional Network (MM-Pyramid) for event localization. Specifically, we first propose the attentive feature pyramid module. This module captures temporal pyramid features via several stacked pyramid units, each composed of a fixed-size attention block and a dilated convolution block. We also design an adaptive semantic fusion module, which leverages a unit-level attention block and a selective fusion block to integrate pyramid features interactively. Extensive experiments on audio-visual event localization and weakly-supervised audio-visual video parsing tasks verify the effectiveness of our approach. Comment: ACM MM 202

    Large Language Models are Complex Table Parsers

    With the Generative Pre-trained Transformer 3.5 (GPT-3.5) exhibiting remarkable reasoning and comprehension abilities in Natural Language Processing (NLP), most Question Answering (QA) research has centered on general QA tasks based on GPT, neglecting the specific challenges posed by Complex Table QA. In this paper, we propose to incorporate GPT-3.5 to address such challenges: complex tables are reconstructed into tuples, and specific prompt designs are employed for dialogues. Specifically, we encode each cell's hierarchical structure, position information, and content as a tuple. By enhancing the prompt template with an explanatory description of the meaning of each tuple and the logical reasoning process of the task, we effectively improve the hierarchical structure awareness of GPT-3.5 so that it can better parse complex tables. Extensive experiments on Complex Table QA datasets, i.e., the open-domain dataset HiTAB and the aviation-domain dataset AIT-QA, show that our approach significantly outperforms previous work on both datasets, leading to state-of-the-art (SOTA) performance. Comment: EMNLP 2023 Mai
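    The tuple encoding described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the tuple layout (row header path, column header path, position, content), the function names `encode_cells` and `build_prompt`, and the prompt wording are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

# Hypothetical tuple layout (an assumption, not the paper's exact format):
# (row header path, column header path, (row index, column index), content)
Cell = Tuple[Tuple[str, ...], Tuple[str, ...], Tuple[int, int], str]


def encode_cells(
    row_paths: Sequence[Sequence[str]],
    col_paths: Sequence[Sequence[str]],
    values: Sequence[Sequence[object]],
) -> List[Cell]:
    """Flatten a hierarchical table into one tuple per data cell."""
    cells: List[Cell] = []
    for r, row in enumerate(values):
        for c, content in enumerate(row):
            cells.append(
                (tuple(row_paths[r]), tuple(col_paths[c]), (r, c), str(content))
            )
    return cells


def build_prompt(cells: Sequence[Cell], question: str) -> str:
    """Prefix the tuples with a description of what each field means."""
    lines = [
        "Each tuple is (row header path, column header path, "
        "(row, column), cell content)."
    ]
    lines.extend(repr(cell) for cell in cells)
    lines.append(f"Question: {question}")
    return "\n".join(lines)


# Toy table with a two-level column header (illustrative data only).
cells = encode_cells(
    row_paths=[["2022"], ["2023"]],
    col_paths=[["Revenue", "Domestic"], ["Revenue", "Export"]],
    values=[[10, 20], [30, 40]],
)
prompt = build_prompt(cells, "What was export revenue in 2023?")
```

    Keeping the header paths inside every tuple means each cell carries its own hierarchy, so the flattened text does not depend on the table's two-dimensional layout.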

    Pushing the Limits of Machine Design: Automated CPU Design with AI

    Design activity -- constructing an artifact description satisfying given goals and constraints -- distinguishes humanity from other animals and traditional machines, and endowing machines with design abilities at the human level or beyond has been a long-term pursuit. Though machines have already demonstrated their abilities in designing new materials, proteins, and computer programs with advanced artificial intelligence (AI) techniques, the search space for designing such objects is relatively small, and thus, "Can machines design like humans?" remains an open question. To explore the boundary of machine design, here we present a new AI approach to automatically design a central processing unit (CPU), the brain of a computer and one of the world's most intricate devices humanity has ever designed. This approach generates the circuit logic of the CPU design, represented by a graph structure called a Binary Speculation Diagram (BSD), from only external input-output observations instead of formal program code. During the generation of the BSD, Monte Carlo-based expansion and the distance of Boolean functions are used to guarantee accuracy and efficiency, respectively. By efficiently exploring a search space of unprecedented size 10^{10^{540}}, which is to the best of our knowledge the largest of all machine-designed objects, and thus pushing the limits of machine design, our approach generates an industrial-scale RISC-V CPU within only 5 hours. The taped-out CPU successfully runs the Linux operating system and performs comparably against the human-designed Intel 80486SX CPU. In addition to learning the world's first CPU only from input-output observations, which may reform the semiconductor industry by significantly reducing the design cycle, our approach even autonomously discovers human knowledge of the von Neumann architecture. Comment: 28 page

    Annealing novel nucleobase-lipids with oligonucleotides or plasmid DNA based on H-bonding or π-π interaction: Assemblies and transfections

    Lipid derivatives of nucleoside analogs have been highlighted for their potential for effective gene delivery. A novel class of nucleobase-lipids is rationally designed and readily synthesized, comprising thymine/cytosine, an ester/amide linker, and an oleyl lipid. The diversity of four nucleobase-lipids termed DXBAs (DOTA, DNTA, DOCA and DNCA) is investigated. In addition, DNCA is demonstrated to be an effective neutral transfection material for nucleic acid delivery, which is able to bind to oligonucleotides via H-bonding and π-π stacking with reduced toxicity in vitro and in vivo. Several kinds of nucleic acid drugs, including aptamers, ssRNA, antisense oligonucleotides, and plasmid DNAs, can be delivered by DXBAs, especially DNCA. In particular, G4-aptamer AS1411 encapsulated by DNCA exhibits cellular uptake enhancement, lysosome degradation reduction, cell apoptosis promotion, and cell cycle phase alteration in vitro, and duration prolongation in vivo, resulting in significant anti-proliferative activity. Our results demonstrate that DNCA is a promising transfection agent for G4-aptamers and shows good prospects for improving the permeation of single-stranded oligonucleotides or plasmid DNAs.

    Investigating changes in the gas-phase conformation of Antithrombin III upon binding of Arixtra using traveling wave ion mobility spectrometry (TWIMS)

    We validate the utility of ion mobility to measure protein conformational changes induced by the binding of glycosaminoglycan ligands.

    Computational Study on the Catalytic Performance of Single-Atom Catalysts Anchored on g-CN for Electrochemical Oxidation of Formic Acid

    The electrochemical formic acid oxidation reaction (FAOR) has attracted great attention due to its high volumetric energy density and high theoretical efficiency for future portable electronic applications, for which the development of highly efficient and low-cost electrocatalysts is of great significance. In this work, taking single-atom catalysts (SACs) supported on graphitic carbon nitrides (g-CN) as potential catalysts, their catalytic performance for the FAOR was systematically explored by means of density functional theory computations. Our results revealed that the strong hybridization with the unpaired lone electrons of N atoms in the g-CN substrate ensured the high stability of these anchored SACs and endowed them with excellent electrical conductivity. Based on the computed free energy changes of all possible elementary steps, we predicted that a highly efficient FAOR could be achieved on Ru/g-CN with a low limiting potential of −0.15 V along a direct pathway of HCOOH(aq) → HCOOH* → HCOO* → CO2* → CO2(g), in which the formation of HCOO* was identified as the potential-determining step, while the rate-determining step was located at the CO2* formation, with a moderate kinetic barrier of 0.89 eV. Remarkably, the moderate d-band center and polarized charge of the Ru active site caused the Ru/g-CN catalyst to exhibit an optimal binding strength with various reaction intermediates, which explains its superior FAOR catalytic performance. Hence, the single Ru atom anchored on g-CN could be utilized as a promising SAC for the FAOR, which opens a new avenue to further develop novel catalysts for a sustainable FAOR in formic-acid-based fuel cells.

    A Practical Approach to Forgetting in Description Logics with Nominals

    This paper investigates the problem of forgetting in description logics with nominals. In particular, we develop a practical method for forgetting concept and role names from ontologies specified in the description logic ALCO, which extends the basic ALC with nominals. The method always terminates, and is sound in the sense that the forgetting solution computed by the method has the same logical consequences as the original ontology. The method is so far the only approach to deductive forgetting in description logics with nominals. An evaluation of a prototype implementation shows that the method achieves a significant speed-up and notably better success rates than the Lethe tool, which performs deductive forgetting for ALC-ontologies. Compared to Fame, a semantic forgetting tool for ALCOIH-ontologies, better success rates are attained. From the perspective of ontology engineering this is very useful, as it provides ontology curators with a powerful tool to produce views of ontologies.

    BEER: Fast O(1/T) rate for decentralized nonconvex optimization with communication compression

    Communication efficiency has been widely recognized as the bottleneck for large-scale decentralized machine learning applications in multi-agent or federated environments. To tackle the communication bottleneck, there have been many efforts to design communication-compressed algorithms for decentralized nonconvex optimization, where the clients are only allowed to communicate a small amount of quantized information (aka bits) with their neighbors over a predefined graph topology. Despite significant efforts, the state-of-the-art algorithm in the nonconvex setting still suffers from a slower rate of convergence O((G/T)^{2/3}) compared with its uncompressed counterpart, where G measures the data heterogeneity across different clients, and T is the number of communication rounds. This paper proposes BEER, which adopts communication compression with gradient tracking, and shows it converges at a faster rate of O(1/T). This significantly improves over the state-of-the-art rate, by matching the rate without compression even under arbitrary data heterogeneity. Numerical experiments are also provided to corroborate our theory and confirm the practical superiority of BEER in the data heterogeneous regime. Comment: NeurIPS 202
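    To make "communication compression" concrete, the sketch below shows top-k sparsification, one commonly used compression operator in this literature. The abstract does not say which compressor BEER uses, so this is an illustrative assumption rather than the paper's method, and the name `top_k` is hypothetical.

```python
from typing import List, Sequence


def top_k(x: Sequence[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of x and zero out the rest.

    Illustrative compressor only; not taken from the BEER paper.
    """
    if not 0 < k <= len(x):
        raise ValueError("k must satisfy 0 < k <= len(x)")
    # Indices of the k entries with the largest absolute value.
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(x)]


def sq_norm(x: Sequence[float]) -> float:
    """Squared Euclidean norm."""
    return sum(v * v for v in x)


# A client would transmit only the k surviving (index, value) pairs.
gradient = [0.1, -3.0, 2.0, 0.5]
compressed = top_k(gradient, k=2)
error = sq_norm([c - g for c, g in zip(compressed, gradient)])
```

    For a vector of dimension d, top-k satisfies the contraction property ||top_k(x) - x||^2 <= (1 - k/d) ||x||^2, which is the kind of bounded-error guarantee that analyses of compressed methods typically rely on.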