57 research outputs found
Seeing is not Believing: Robust Reinforcement Learning against Spurious Correlation
Robustness has been extensively studied in reinforcement learning (RL) to
handle various forms of uncertainty such as random perturbations, rare events,
and malicious attacks. In this work, we consider one critical type of
robustness against spurious correlation, where different portions of the state
do not have causality but have correlations induced by unobserved confounders. These spurious
correlations are ubiquitous in real-world tasks, for instance, a self-driving
car usually observes heavy traffic in the daytime and light traffic at night
due to unobservable human activity. A model that learns such useless or even
harmful correlation could catastrophically fail when the confounder in the test
case deviates from the training one. Although motivated, enabling robustness
against spurious correlation poses significant challenges since the uncertainty
set, shaped by the unobserved confounder and causal structure, is difficult to
characterize and identify. Existing robust algorithms that assume simple and
unstructured uncertainty sets are therefore inadequate to address this
challenge. To solve this issue, we propose Robust State-Confounded Markov
Decision Processes (RSC-MDPs) and theoretically demonstrate their superiority in
avoiding learning spurious correlations compared with other robust RL
counterparts. We also design an empirical algorithm to learn the robust optimal
policy for RSC-MDPs, which outperforms all baselines in eight realistic
self-driving and manipulation tasks.
Comment: Accepted to NeurIPS 202
MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing
Recognizing and localizing events in videos is a fundamental task for video
understanding. Since events may occur in auditory and visual modalities,
multimodal detailed perception is essential for complete scene comprehension.
Most previous works attempted to analyze videos from a holistic perspective.
However, they do not consider semantic information at multiple scales, which
makes it difficult for the model to localize events of different lengths. In this
paper, we present a Multimodal Pyramid Attentional Network
(MM-Pyramid) for event localization. Specifically, we first propose
the attentive feature pyramid module. This module captures temporal pyramid
features via several stacked pyramid units, each of which is composed of a
fixed-size attention block and dilated convolution block. We also design an
adaptive semantic fusion module, which leverages a unit-level attention block
and a selective fusion block to integrate pyramid features interactively.
Extensive experiments on audio-visual event localization and weakly-supervised
audio-visual video parsing tasks verify the effectiveness of our approach.
Comment: ACM MM 202
Large Language Models are Complex Table Parsers
With the Generative Pre-trained Transformer 3.5 (GPT-3.5) exhibiting
remarkable reasoning and comprehension abilities in Natural Language Processing
(NLP), most Question Answering (QA) research has primarily centered around
general QA tasks based on GPT, neglecting the specific challenges posed by
Complex Table QA. In this paper, we propose to incorporate GPT-3.5 to address
such challenges, in which complex tables are reconstructed into tuples and
specific prompt designs are employed for dialogues. Specifically, we encode
each cell's hierarchical structure, position information, and content as a
tuple. By enhancing the prompt template with an explanatory description of the
meaning of each tuple and the logical reasoning process of the task, we
effectively improve the hierarchical structure awareness capability of GPT-3.5
to better parse the complex tables. Extensive experiments and results on
Complex Table QA datasets, i.e., the open-domain dataset HiTAB and the aviation
domain dataset AIT-QA show that our approach significantly outperforms previous
work on both datasets, leading to state-of-the-art (SOTA) performance.
Comment: EMNLP 2023 Mai
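The cell-to-tuple encoding described in this abstract can be illustrated with a minimal sketch. The tuple layout, the function names `encode_cells` and `build_prompt`, the prompt wording, and the toy table are all assumptions made for illustration; the abstract does not specify the exact format the authors use.

```python
from typing import List, Sequence, Tuple

# Hypothetical tuple layout: (row header path, column header path, (row, col), content)
Cell = Tuple[Tuple[str, ...], Tuple[str, ...], Tuple[int, int], str]


def encode_cells(
    row_paths: Sequence[Sequence[str]],
    col_paths: Sequence[Sequence[str]],
    values: Sequence[Sequence[object]],
) -> List[Cell]:
    """Flatten a hierarchical table into one tuple per data cell."""
    cells: List[Cell] = []
    for r, row_path in enumerate(row_paths):
        for c, col_path in enumerate(col_paths):
            cells.append((tuple(row_path), tuple(col_path), (r, c), str(values[r][c])))
    return cells


def build_prompt(cells: Sequence[Cell], question: str) -> str:
    """Prepend a plain-language description of the tuple format to the cells."""
    header = (
        "Each tuple is (row header path, column header path, "
        "(row index, column index), cell content)."
    )
    body = "\n".join(repr(cell) for cell in cells)
    return f"{header}\n{body}\nQuestion: {question}"


# Toy table invented for this example (not taken from HiTAB or AIT-QA).
row_paths = [["Revenue", "Domestic"], ["Revenue", "International"]]
col_paths = [["2022", "Q1"], ["2022", "Q2"]]
values = [[10, 12], [7, 9]]

cells = encode_cells(row_paths, col_paths, values)
prompt = build_prompt(cells, "What was international revenue in 2022 Q2?")
```

Keeping the full header path in every tuple means each cell carries its own hierarchy, so the prompt does not depend on the model reconstructing merged headers from layout.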
Pushing the Limits of Machine Design: Automated CPU Design with AI
Design activity -- constructing an artifact description satisfying given
goals and constraints -- distinguishes humanity from other animals and
traditional machines, and endowing machines with design abilities at the human
level or beyond has been a long-term pursuit. Though machines have already
demonstrated their abilities in designing new materials, proteins, and computer
programs with advanced artificial intelligence (AI) techniques, the search
space for designing such objects is relatively small, and thus, "Can machines
design like humans?" remains an open question. To explore the boundary of
machine design, here we present a new AI approach to automatically design a
central processing unit (CPU), the brain of a computer, and one of the world's
most intricate devices humanity has ever designed. This approach generates the
circuit logic, which is represented by a graph structure called Binary
Speculation Diagram (BSD), of the CPU design from only external input-output
observations instead of formal program code. During the generation of BSD,
Monte Carlo-based expansion and the distance of Boolean functions are used to
guarantee accuracy and efficiency, respectively. By efficiently exploring a
search space of unprecedented size 10^{10^{540}}, which is the largest one of
all machine-designed objects to our best knowledge, and thus pushing the limits
of machine design, our approach generates an industrial-scale RISC-V CPU within
only 5 hours. The taped-out CPU successfully runs the Linux operating system
and performs comparably against the human-designed Intel 80486SX CPU. In
addition to learning the world's first CPU only from input-output observations,
which may reform the semiconductor industry by significantly reducing the
design cycle, our approach even autonomously discovers human knowledge of the
von Neumann architecture.
Comment: 28 page
Annealing novel nucleobase-lipids with oligonucleotides or plasmid DNA based on H-bonding or π-π interaction: Assemblies and transfections
Lipid derivatives of nucleoside analogs have been highlighted for their potential for effective gene delivery. A novel class of nucleobase-lipids is rationally designed and readily synthesized, comprising thymine/cytosine, an ester/amide linker and an oleyl lipid. The diversity of four nucleobase-lipids termed DXBAs (DOTA, DNTA, DOCA and DNCA) is investigated. In addition, DNCA is demonstrated to be an effective neutral transfection material for nucleic acid delivery, which is able to bind to oligonucleotides via H-bonding and π-π stacking with reduced toxicity in vitro and in vivo. Several kinds of nucleic acid drugs, including aptamers, ssRNA, antisense oligonucleotides, and plasmid DNAs, can be delivered by DXBAs, especially DNCA. In particular, G4-aptamer AS1411 encapsulated by DNCA exhibits enhanced cellular uptake, reduced lysosomal degradation, promoted cell apoptosis, and altered cell cycle phases in vitro, as well as prolonged duration in vivo, resulting in significant anti-proliferative activity. Our results demonstrate that DNCA is a promising transfection agent for G4-aptamers and exhibits bright application prospects in improving the permeation of single-stranded oligonucleotides or plasmid DNAs
Investigating changes in the gas-phase conformation of Antithrombin III upon binding of Arixtra using traveling wave ion mobility spectrometry (TWIMS)
We validate the utility of ion mobility to measure protein conformational changes induced by the binding of glycosaminoglycan ligands
Computational Study on the Catalytic Performance of Single-Atom Catalysts Anchored on g-CN for Electrochemical Oxidation of Formic Acid
The electrochemical formic acid oxidation reaction (FAOR) has attracted great attention due to its high volumetric energy density and high theoretical efficiency for future portable electronic applications, for which the development of highly efficient and low-cost electrocatalysts is of great significance. In this work, taking single-atom catalysts (SACs) supported on graphitic carbon nitrides (g-CN) as potential catalysts, their catalytic performance for the FAOR was systematically explored by means of density functional theory computations. Our results revealed that the strong hybridization with the unpaired lone electrons of N atoms in the g-CN substrate ensured the high stability of these anchored SACs and endowed them with excellent electrical conductivity. Based on the computed free energy changes of all possible elementary steps, we predicted that a highly efficient FAOR could be achieved on Ru/g-CN with a low limiting potential of −0.15 V along a direct pathway of HCOOH(aq) → HCOOH* → HCOO* → CO2* → CO2(g), in which the formation of HCOO* was identified as the potential-determining step, while the rate-determining step was located at the CO2* formation, with a moderate kinetic barrier of 0.89 eV. Remarkably, the moderate d-band center and polarized charge of the Ru active site caused the Ru/g-CN catalyst to exhibit an optimal binding strength with various reaction intermediates, which explains its superior FAOR catalytic performance. Hence, the single Ru atom anchored on g-CN could be utilized as a promising SAC for the FAOR, which opens a new avenue to further develop novel catalysts for a sustainable FAOR in formic-acid-based fuel cells
A Practical Approach to Forgetting in Description Logics with Nominals
This paper investigates the problem of forgetting in description logics with nominals. In particular, we develop a practical method for forgetting concept and role names from ontologies specified in the description logic ALCO, which extends the basic ALC with nominals. The method always terminates, and is sound in the sense that the forgetting solution computed by the method has the same logical consequences as the original ontology. The method is so far the only approach to deductive forgetting in description logics with nominals. An evaluation of a prototype implementation shows that the method achieves a significant speed-up and notably better success rates than the Lethe tool, which performs deductive forgetting for ALC-ontologies. Compared to Fame, a semantic forgetting tool for ALCOIH-ontologies, better success rates are attained. From the perspective of ontology engineering, this is very useful, as it provides ontology curators with a powerful tool to produce views of ontologies
BEER: Fast O(1/T) rate for decentralized nonconvex optimization with communication compression
Communication efficiency has been widely recognized as the bottleneck for
large-scale decentralized machine learning applications in multi-agent or
federated environments. To tackle the communication bottleneck, there have been
many efforts to design communication-compressed algorithms for decentralized
nonconvex optimization, where the clients are only allowed to communicate a
small amount of quantized information (aka bits) with their neighbors over a
predefined graph topology. Despite significant efforts, the state-of-the-art
algorithm in the nonconvex setting still suffers from a slower rate of
convergence, O((G/T)^{2/3}), compared with its uncompressed counterpart,
where G measures the data heterogeneity across different clients, and T is
the number of communication rounds. This paper proposes BEER, which adopts
communication compression with gradient tracking, and shows it converges at a
faster rate of O(1/T). This significantly improves over the state-of-the-art
rate, by matching the rate without compression even under arbitrary data
heterogeneity. Numerical experiments are also provided to corroborate our
theory and confirm the practical superiority of BEER in the data heterogeneous
regime.
Comment: NeurIPS 202
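To make "communication compression" concrete, here is a minimal sketch of one widely used compression operator, top-k sparsification, in which a client sends only the k largest-magnitude entries of a vector. This is an illustrative assumption: the abstract does not say which operator BEER uses, and the sketch does not implement BEER or gradient tracking. The function names are hypothetical.

```python
from typing import List


def top_k(vector: List[float], k: int) -> List[float]:
    """Keep the k largest-magnitude entries of `vector` and zero out the rest."""
    if k <= 0:
        return [0.0] * len(vector)
    # Indices sorted by decreasing magnitude; ties keep their original order.
    order = sorted(range(len(vector)), key=lambda i: abs(vector[i]), reverse=True)
    keep = set(order[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vector)]


def squared_error(original: List[float], compressed: List[float]) -> float:
    """Squared Euclidean distance between a vector and its compressed version."""
    return sum((a - b) ** 2 for a, b in zip(original, compressed))


gradient = [0.1, -3.0, 2.0, 0.5]
sparse = top_k(gradient, k=2)  # [0.0, -3.0, 2.0, 0.0]

# Top-k satisfies ||C(x) - x||^2 <= (1 - k/d) * ||x||^2 for a d-dimensional x.
norm_sq = sum(v * v for v in gradient)
bound = (1 - 2 / len(gradient)) * norm_sq
error = squared_error(gradient, sparse)
```

Only the two retained values and their indices would need to be transmitted, at the cost of the compression error measured by `squared_error`.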