    Read, Watch, and Move: Reinforcement Learning for Temporally Grounding Natural Language Descriptions in Videos

    The task of video grounding, which temporally localizes a natural language description in a video, plays an important role in understanding videos. Existing studies have adopted strategies of sliding a window over the entire video or exhaustively ranking all possible clip-sentence pairs in a pre-segmented video, both of which inevitably suffer from exhaustively enumerated candidates. To alleviate this problem, we formulate the task as a sequential decision-making problem by learning an agent that progressively regulates the temporal grounding boundaries based on its policy. Specifically, we propose a reinforcement-learning-based framework improved by multi-task learning, which shows steady performance gains by considering additional supervised boundary information during training. Our proposed framework achieves state-of-the-art performance on the ActivityNet'18 DenseCaption dataset and the Charades-STA dataset while observing only 10 or fewer clips per video. Comment: AAAI 201

    Distribution-Specific Auditing For Subgroup Fairness

    We study the problem of auditing classifiers with the notion of statistical subgroup fairness. Kearns et al. (2018) have shown that the problem of auditing combinatorial subgroup fairness is as hard as agnostic learning. Essentially all work on remedying statistical measures of discrimination against subgroups assumes access to an oracle for this problem, despite the fact that no efficient algorithms are known for it. If we assume the data distribution is Gaussian, or even merely log-concave, then a recent line of work has discovered efficient agnostic learning algorithms for halfspaces. Unfortunately, the reduction of Kearns et al. was formulated in terms of weak, "distribution-free" learning, and thus did not establish a connection for families such as log-concave distributions. In this work, we give positive and negative results on auditing for Gaussian distributions. On the positive side, we present an alternative approach to leverage these advances in agnostic learning and thereby obtain the first polynomial-time approximation scheme (PTAS) for auditing nontrivial combinatorial subgroup fairness: we show how to audit statistical notions of fairness over homogeneous halfspace subgroups when the features are Gaussian. On the negative side, we find that under cryptographic assumptions, no polynomial-time algorithm can guarantee any nontrivial auditing, even under Gaussian feature distributions, for general halfspace subgroups.

    Spatio-Temporal Dynamics of Global Potential Vegetation Distributions Simulated by CSCS Approach

    The study of Potential Natural Vegetation (PNV) has been proposed as a way to examine the impact of climate change on the distribution of vegetation. This study analyzes the influence of climate change on the potential vegetation distribution at the global scale, using the Comprehensive Sequential Classification System (CSCS) approach to explore the changes in area, shift distance, and shift direction for each broad vegetation category.

    Coordinate-based Neural Network for Fourier Phase Retrieval

    Fourier phase retrieval is essential for high-definition imaging of nanoscale structures across diverse fields, notably coherent diffraction imaging. This study presents the Single impliCit neurAl Network (SCAN), a tool built upon coordinate neural networks and designed for enhanced phase retrieval performance. Remedying the drawbacks of conventional iterative methods, which are easily trapped in local minima and are sensitive to noise, SCAN connects object coordinates to their amplitude and phase within a unified network in an unsupervised manner. While many existing methods primarily use the Fourier magnitude in their loss function, our approach incorporates both the predicted magnitude and phase, enhancing retrieval accuracy. Comprehensive tests validate SCAN's superiority over traditional and other deep learning models regarding accuracy and noise robustness. We also demonstrate that SCAN excels in the ptychography setting.

    ELECTRO-ACUPUNCTURE AT JIANSHI (PC5) AND NEIGUAN (PC6) ALTERS HEART RATE VARIABILITY (HRV) IN FRIGHTENED VOLUNTEERS

    Background: Fear is one of the most widely studied emotions and is closely associated with the autonomic nervous system (ANS). Previous studies have proven that acupuncture directly impacts the ANS, influences the heart rate (HR) and the heart rate variability (HRV), and exerts other effects. The aim of this study was to explore the effect of Jianshi (PC5) and Neiguan (PC6) electro-acupuncture on HRV during fear-invoking auditory stimulation using an Actiheart ECG recorder. Materials and Methods: Two hundred healthy subjects were recruited. Using a random number table, subjects were grouped for exposure to fear-invoking auditory stimulation (n=40) or neutral auditory stimulation (n=40). After determining that our fear-invoking auditory stimulation produced the fear emotion, the other 120 subjects were similarly divided into an electro-acupuncture group (EA group), which received PC5 and PC6 electro-acupuncture, and a control group, which received no intervention. Results: The fear score of the fear-invoking auditory group was significantly higher than that of the neutral auditory group. The EA group showed higher SD, RMSSD, and high frequency (HF) components of HRV than those of the control group. Conclusion: The primary result suggests that PC5 and PC6 electro-acupuncture affects cardiac autonomic neural regulation, mainly via the parasympathetic system, in subjects exposed to fear-invoking auditory stimulation.
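
    For readers unfamiliar with RMSSD, one of the HRV measures reported above, the following is a minimal sketch of its standard definition: the root mean square of successive differences between adjacent RR (beat-to-beat) intervals. The abstract does not describe how the study computed it, so the function name `rmssd` and the sample interval values are illustrative assumptions, not the authors' procedure or data.

```python
import math


def rmssd(rr_intervals_ms):
    """Root mean square of successive differences of RR intervals (in ms).

    Illustrative helper (hypothetical name); not taken from the study.
    """
    # Differences between each pair of adjacent RR intervals.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    if not diffs:
        raise ValueError("need at least two RR intervals")
    # Mean of the squared differences, then the square root.
    return math.sqrt(sum(d * d for d in diffs) / len(diffs))


# Made-up RR intervals in milliseconds, for illustration only.
example_rr = [800, 810, 790, 800]
print(round(rmssd(example_rr), 3))
```

    Higher RMSSD values are generally read as stronger short-term (vagally mediated) variability, which is consistent with the abstract's parasympathetic interpretation.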

    Hierarchical Large Language Models in Cloud Edge End Architecture for Heterogeneous Robot Cluster Control

    Despite their powerful semantic understanding and code generation capabilities, Large Language Models (LLMs) still face challenges when dealing with complex tasks. Multi-agent strategy generation and motion control are highly complex domains that inherently require experts from multiple fields to collaborate. To enhance multi-agent strategy generation and motion control, we propose an architecture that employs a cloud-edge-end hierarchical structure. By leveraging multiple large language models with distinct areas of expertise, we can efficiently generate strategies and perform task decomposition. By introducing a cosine similarity approach that aligns task decomposition instructions with robot task sequences at the vector level, we can identify subtasks with incomplete task decomposition and iterate on them multiple times to ultimately generate executable machine task sequences. The robot is guided through these task sequences to complete tasks of higher complexity. With this architecture, we implement natural language control of robots performing complex tasks, and address both the challenge of multi-agent execution of open tasks in open scenarios and the problem of task decomposition.
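
    The cosine-similarity alignment step can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the embedding vectors, the `threshold` value, and the names `cosine_similarity` and `find_unmatched_subtasks` are all hypothetical, and the abstract does not say how instructions are embedded or what cutoff is used.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # Treat a zero vector as matching nothing.
    return dot / (norm_a * norm_b)


def find_unmatched_subtasks(subtask_vecs, primitive_vecs, threshold=0.8):
    """Return subtasks whose best match among robot primitives is below threshold.

    Such subtasks would be candidates for another round of decomposition.
    The threshold of 0.8 is an assumed value for illustration.
    """
    unmatched = []
    for name, vec in subtask_vecs.items():
        best = max(
            (cosine_similarity(vec, p) for p in primitive_vecs.values()),
            default=0.0,
        )
        if best < threshold:
            unmatched.append(name)
    return unmatched


# Hypothetical 3-dimensional embeddings; real embeddings would come from a model.
primitives = {"move_to": [1.0, 0.0, 0.0], "grasp": [0.0, 1.0, 0.0]}
subtasks = {"go to shelf": [0.9, 0.1, 0.0], "tidy the room": [0.5, 0.5, 0.7]}
print(find_unmatched_subtasks(subtasks, primitives))
```

    In this toy data, "go to shelf" lies close to the `move_to` primitive, while "tidy the room" matches no single primitive well and would be sent back for further decomposition.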