Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos
The task of video grounding, which temporally localizes a natural language
description in a video, plays an important role in understanding videos.
Existing studies have adopted strategies of sliding window over the entire
video or exhaustively ranking all possible clip-sentence pairs in a
pre-segmented video, which inevitably suffer from exhaustively enumerated
candidates. To alleviate this problem, we formulate this task as a problem of
sequential decision making by learning an agent which regulates the temporal
grounding boundaries progressively based on its policy. Specifically, we
propose a reinforcement-learning-based framework improved by multi-task
learning, which shows steady performance gains when additional supervised
boundary information is considered during training. Our proposed framework
achieves state-of-the-art performance on the ActivityNet'18 DenseCaption dataset
and the Charades-STA dataset while observing only 10 or fewer clips per video. Comment: AAAI 201
Distribution-Specific Auditing For Subgroup Fairness
We study the problem of auditing classifiers with the notion of statistical
subgroup fairness. Kearns et al. (2018) have shown that the problem of auditing
combinatorial subgroup fairness is as hard as agnostic learning. Essentially
all work on remedying statistical measures of discrimination against subgroups
assumes access to an oracle for this problem, despite the fact that no
efficient algorithms are known for it. If we assume the data distribution is
Gaussian, or even merely log-concave, then a recent line of work has discovered
efficient agnostic learning algorithms for halfspaces. Unfortunately, the
reduction of Kearns et al. was formulated in terms of weak, "distribution-free"
learning, and thus did not establish a connection for families such as
log-concave distributions.
In this work, we give positive and negative results on auditing for Gaussian
distributions: On the positive side, we present an alternative approach to
leverage these advances in agnostic learning and thereby obtain the first
polynomial-time approximation scheme (PTAS) for auditing nontrivial
combinatorial subgroup fairness: we show how to audit statistical notions of
fairness over homogeneous halfspace subgroups when the features are Gaussian.
On the negative side, we find that under cryptographic assumptions, no
polynomial-time algorithm can guarantee any nontrivial auditing, even under
Gaussian feature distributions, for general halfspace subgroups.
Spatio-Temporal Dynamics of Global Potential Vegetation Distributions Simulated by CSCS Approach
The study of Potential Natural Vegetation (PNV) has been proposed as a way to examine the impact of changes in climate on the distribution of vegetation. This study analyzes the influence of climate change on the potential vegetation distribution at the global scale, using the Comprehensive Sequential Classification System (CSCS) approach to explore the changes in area, shift distance, and direction for each broad vegetation category.
Coordinate-based Neural Network for Fourier Phase Retrieval
Fourier phase retrieval is essential for high-definition imaging of nanoscale
structures across diverse fields, notably coherent diffraction imaging. This
study presents the Single impliCit neurAl Network (SCAN), a tool built upon
coordinate neural networks meticulously designed for enhanced phase retrieval
performance. Remedying the drawbacks of conventional iterative methods, which
are easily trapped in local minima and sensitive to noise, SCAN
adeptly connects object coordinates to their amplitude and phase within a
unified network in an unsupervised manner. While many existing methods
primarily use Fourier magnitude in their loss function, our approach
incorporates both the predicted magnitude and phase, enhancing retrieval
accuracy. Comprehensive tests validate SCAN's superiority over traditional and
other deep learning models regarding accuracy and noise robustness. We also
demonstrate that SCAN excels in the ptychography setting.
ELECTRO-ACUPUNCTURE AT JIANSHI (PC5) AND NEIGUAN (PC6) ALTERS HEART RATE VARIABILITY (HRV) IN FRIGHTENED VOLUNTEERS
Background: Fear is one of the most widely studied emotions and is closely associated with the autonomic nervous system (ANS). Previous studies have proven that acupuncture directly impacts the ANS, influences the heart rate (HR) and the heart rate variability (HRV) and exerts other effects. The aim of this study was to explore the effect of Jianshi (PC5) and Neiguan (PC6) electro-acupuncture on HRV during fear-invoking auditory stimulation using an Actiheart ECG recorder.
Materials and Methods: Two hundred healthy subjects were recruited. Using a random number table, subjects were assigned to fear-invoking auditory stimulation (n=40) or neutral auditory stimulation (n=40). After confirming that the fear-invoking auditory stimulation produced fear, the other 120 subjects were similarly divided into an electro-acupuncture (EA) group, which received PC5 and PC6 electro-acupuncture, and a control group, which received no intervention.
Results: The fear score of the fear-invoking auditory group was significantly higher than that of the neutral auditory group. The EA group showed higher SD, RMSSD, and high frequency (HF) components of HRV than those of the control group.
Conclusion: The primary result suggests that PC5 and PC6 electro-acupuncture affects cardiac autonomic neural regulation, mainly via the parasympathetic system, in subjects exposed to fear-invoking auditory stimulation.
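For readers unfamiliar with the RMSSD measure reported above, a minimal sketch of its standard definition follows: the root mean square of successive differences between adjacent beat-to-beat (RR) intervals. The function name `rmssd` and the sample intervals are hypothetical and are not data from the study.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("at least two RR intervals are required")
    # Differences between each interval and the one that follows it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Hypothetical RR intervals in milliseconds, for illustration only.
rr = [800.0, 810.0, 790.0, 805.0]
value = rmssd(rr)
print(round(value, 2))
```

Larger RMSSD values reflect greater short-term beat-to-beat variation, which is why the measure is commonly read as an indicator of parasympathetic activity.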
Hierarchical Large Language Models in Cloud Edge End Architecture for Heterogeneous Robot Cluster Control
Despite their powerful semantic understanding and code generation
capabilities, Large Language Models (LLMs) still face challenges when dealing
with complex tasks. Multi-agent strategy generation and motion control are
highly complex domains that inherently require experts from multiple fields to
collaborate. To enhance multi-agent strategy generation and motion control, we
propose an innovative architecture that employs the concept of a cloud edge end
hierarchical structure. By leveraging multiple large language models with
distinct areas of expertise, we can efficiently generate strategies and perform
task decomposition. By introducing a cosine similarity approach that aligns task
decomposition instructions with robot task sequences at the vector level, we
can identify subtasks with incomplete task decomposition and iterate on them
multiple times to ultimately generate executable machine task sequences. The
robot is guided through these task sequences to complete tasks of higher
complexity. With this architecture, we implement the process of natural
language control of robots to perform complex tasks, and successfully address
the challenge of multi-agent execution of open tasks in open scenarios and the
problem of task decomposition.
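The vector-level alignment described in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the embedding vectors are made up, the function name `find_incomplete_subtasks` is hypothetical, and the 0.8 threshold is an arbitrary choice. An instruction whose best cosine similarity to any known robot task falls below the threshold is flagged for further decomposition.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


def find_incomplete_subtasks(instruction_vecs, robot_task_vecs, threshold=0.8):
    """Indices of instructions with no sufficiently similar robot task."""
    incomplete = []
    for i, instr in enumerate(instruction_vecs):
        best = max(
            (cosine_similarity(instr, task) for task in robot_task_vecs),
            default=0.0,
        )
        if best < threshold:
            incomplete.append(i)
    return incomplete


# Made-up embeddings: the first instruction matches a robot task exactly in
# direction, while the second is only partially aligned.
instructions = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
robot_tasks = [[2.0, 0.0, 0.0], [0.0, 0.0, 3.0]]
flagged = find_incomplete_subtasks(instructions, robot_tasks)
print(flagged)
```

In an iterative setup, the flagged instructions would be sent back for another round of decomposition until every instruction aligns with an executable task.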