186 research outputs found

    Revisiting the Spatial and Temporal Modeling for Few-shot Action Recognition

    Spatial and temporal modeling is one of the core aspects of few-shot action recognition. Most previous works focus on long-term temporal relation modeling based on high-level spatial representations, without considering the crucial low-level spatial features and short-term temporal relations. The former can provide rich local semantic information, and the latter can represent the motion characteristics of adjacent frames. In this paper, we propose SloshNet, a new framework that revisits the spatial and temporal modeling for few-shot action recognition in a finer manner. First, to exploit the low-level spatial features, we design a feature fusion architecture search module to automatically search for the best combination of the low-level and high-level spatial features. Next, inspired by the recent transformer, we introduce a long-term temporal modeling module to model the global temporal relations based on the extracted spatial appearance features. Meanwhile, we design another short-term temporal modeling module to encode the motion characteristics between adjacent frame representations. After that, the final predictions can be obtained by feeding the embedded rich spatial-temporal features to a common frame-level class prototype matcher. We extensively validate the proposed SloshNet on four few-shot action recognition datasets, including Something-Something V2, Kinetics, UCF101, and HMDB51. It achieves favorable results against state-of-the-art methods on all datasets.
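The last step of the abstract above, matching a query against frame-level class prototypes, can be illustrated with a minimal sketch. This is a generic prototype matcher under stated assumptions, not SloshNet's actual implementation: the function names, the cosine similarity, the per-frame averaging, and the toy 2-D features are all hypothetical choices made for illustration.

```python
import math

def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def frame_prototypes(support_videos):
    """Average the support videos of one class frame by frame.

    support_videos: list of videos, each a list of T frame feature vectors.
    Returns T prototype vectors, one per frame index.
    """
    return [mean_vector(list(frames)) for frames in zip(*support_videos)]

def classify(query_video, support_by_class):
    """Score each class by mean frame-to-prototype similarity (assumed metric)."""
    scores = {}
    for label, videos in support_by_class.items():
        protos = frame_prototypes(videos)
        sims = [cosine(q, p) for q, p in zip(query_video, protos)]
        scores[label] = sum(sims) / len(sims)
    return max(scores, key=scores.get), scores

# Toy 2-way 2-shot episode with 2 frames per video (made-up features).
support = {
    "push": [[[1.0, 0.0], [0.9, 0.1]], [[0.8, 0.2], [1.0, 0.0]]],
    "pull": [[[0.0, 1.0], [0.1, 0.9]], [[0.2, 0.8], [0.0, 1.0]]],
}
query = [[0.9, 0.1], [0.95, 0.05]]
label, scores = classify(query, support)
print(label, scores)
```

In a real system the feature vectors would come from the spatial and temporal modules described in the abstract; here they are hard-coded so the matching step can be read in isolation.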

    Blind2Sound: Self-Supervised Image Denoising without Residual Noise

    Self-supervised blind denoising for Poisson-Gaussian noise remains a challenging task. Pseudo-supervised pairs constructed from single noisy images re-corrupt the signal and degrade the performance. The visible blindspots solve the information loss in masked inputs. However, without explicit noise sensing, mean square error as an objective function cannot adjust denoising intensities for dynamic noise levels, leading to noticeable residual noise. In this paper, we propose Blind2Sound, a simple yet effective approach to overcome residual noise in denoised images. The proposed adaptive re-visible loss senses noise levels and performs personalized denoising without noise residues while keeping the signal lossless. The theoretical analysis of intermediate medium gradients guarantees stable training, while the Cramer Gaussian loss acts as a regularization to facilitate the accurate perception of noise levels and improve the performance of the denoiser. Experiments on synthetic and real-world datasets show the superior performance of our method, especially for single-channel images.

    Two Candidate Obscured Tidal Disruption Events Coincident with High-energy Neutrinos

    Recently, three optical tidal disruption event (TDE) candidates discovered by the Zwicky Transient Facility (ZTF) have been suggested to be coincident with high-energy neutrinos. They all exhibit unusually strong dust infrared (IR) echoes, with their peak times matching the neutrino arrival time even better than the optical peaks. We hereby report on two new TDE candidates that are spatially and temporally coincident with neutrinos by matching our sample of mid-infrared outbursts in nearby galaxies (MIRONG) with Gold alerts of IceCube high-energy neutrino events up to June 2022. The two candidates show negligible optical variability according to their ZTF light curves and can therefore be classified as part of the growing population of obscured TDE candidates. The chance probability of finding two such candidates is $\sim3\%$ when the MIRONG sources are redistributed randomly in the SDSS footprint, and it drops to $\sim0.1\%$ (or $\sim0.2\%$) if we limit the sample to sources with increased fluxes (or variability amplitudes) comparable with those of the two matched sources. Our findings further support the potential connection between high-energy neutrinos and TDEs in dusty environments by increasing the total number of neutrino-associated TDEs and TDE candidates to five, although the underlying physics remains poorly understood. Comment: Published, ApJL, 953, L1

    AT2018dyk Revisited: a Tidal Disruption Event Candidate with Prominent Infrared Echo and Delayed X-ray Emission in a LINER Galaxy

    We have revisited the multiwavelength data of the nuclear transient AT2018dyk, initially discovered as a changing-look low-ionization nuclear emission-line region (LINER) galaxy, and found them to be in agreement with a tidal disruption event (TDE) scenario. The optical light curve of AT2018dyk declines approximately as a power law with index -5/3, while its X-ray emission lags behind the optical peak by $\sim140$ days; both are typical characteristics of TDEs. The X-ray spectra are softer than those of normal active galactic nuclei (AGNs), although they show a slight trend of hardening. Interestingly, its rise time is among the longest of known TDEs, yet it is consistent with the theoretical prediction from its relatively large supermassive black hole (SMBH) mass ($\sim10^{7.38} M_{\odot}$). Moreover, a prominent infrared echo with peak luminosity $\sim7.4\times10^{42}~\text{erg}~\text{s}^{-1}$ has also been detected in AT2018dyk, implying an unusually dusty subparsec nuclear environment in contrast with other TDEs. In our sample, LINERs share similar covering factors with AGNs, which indicates the existence of the dusty torus in these objects. Our work suggests that the nature of nuclear transients in LINERs needs to be carefully identified and that their infrared echoes offer a unique opportunity for exploring the environment of SMBHs at low accretion rates, which has so far been poorly explored but is crucial for understanding SMBH activity. Comment: 9 pages, 6 figures, 1 table. Accepted for publication in MNRA

    LLaMA Rider: Spurring Large Language Models to Explore the Open World

    Recently, various studies have leveraged Large Language Models (LLMs) to help decision-making and planning in environments, and have tried to align the LLMs' knowledge with the world conditions. Nonetheless, the capacity of LLMs to continuously acquire environmental knowledge and adapt in an open world remains uncertain. In this paper, we propose an approach to spur LLMs to explore the open world, gather experiences, and learn to improve their task-solving capabilities. In this approach, a multi-round feedback-revision mechanism is utilized to encourage LLMs to actively select appropriate revision actions guided by feedback information from the environment. This facilitates exploration and enhances the model's performance. In addition, we integrate sub-task relabeling to assist LLMs in maintaining consistency in sub-task planning and to help the model learn the combinatorial nature between tasks, enabling it to complete a wider range of tasks through training based on the acquired exploration experiences. Through evaluation in Minecraft, an open-ended sandbox world, we demonstrate that our approach, LLaMA-Rider, enhances the efficiency of the LLM in exploring the environment and effectively improves the LLM's ability to accomplish more tasks through fine-tuning with merely 1.3k instances of collected data, showing minimal training costs compared to the baseline using reinforcement learning. Comment: 18 page

    AT 2023clx: the Faintest and Closest Optical Tidal Disruption Event Discovered in Nearby Star-forming Galaxy NGC 3799

    We report the discovery of a faint optical tidal disruption event (TDE) in the nearby star-forming galaxy NGC 3799. Identification of the TDE is based on its position at the galaxy nucleus, a light curve declining as t^-5/3, a blue continuum with an almost constant blackbody temperature of ~12,000 K, broad (~15,000 km s^-1) Balmer lines, and characteristic He II 4686 Å emission. The light curve of AT 2023clx peaked at an absolute magnitude of -17.16 mag in the g-band and a maximum blackbody bolometric luminosity of 4.56 × 10^42 erg s^-1, making it the faintest TDE discovered to date. With a redshift of 0.01107 and a corresponding luminosity distance of 47.8 Mpc, it is also the closest optical TDE ever discovered, to the best of our knowledge. Furthermore, our analysis of Swift/XRT observations of AT 2023clx yields a very tight 3-sigma upper limit of 9.53 × 10^39 erg s^-1 in the range 0.3-10 keV. AT 2023clx, together with very few other faint TDEs such as AT 2020wey, proves that there are probably a large number of faint TDEs yet to be discovered at higher redshifts, which is consistent with the prediction of luminosity functions (LFs). The upcoming deeper optical time-domain surveys, such as the Legacy Survey of Space and Time (LSST) and the Wide-Field Survey Telescope (WFST), will discover more TDEs at even lower luminosities, allowing for a more precise constraint of the low end of the LF. Comment: 9 pages, 6 figures; Accepted for ApJL (July, 2023
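The relation between the quoted redshift, luminosity distance, and peak absolute magnitude can be checked with a short worked sketch. The cosmology is an assumption (H0 = 70 km/s/Mpc, flat ΛCDM with Ωm = 0.3), since the abstract does not state which one the authors used. The distance uses the standard low-redshift expansion rather than a full integral, and the apparent magnitude ignores extinction and K-correction.

```python
import math

C_KM_S = 299792.458           # speed of light in km/s
H0 = 70.0                     # ASSUMED Hubble constant in km/s/Mpc (not given in the abstract)
OMEGA_M, OMEGA_L = 0.3, 0.7   # ASSUMED flat LambdaCDM density parameters

def luminosity_distance_mpc(z):
    """Second-order low-redshift approximation: d_L ~ (c z / H0) * (1 + (1 - q0) z / 2)."""
    q0 = OMEGA_M / 2.0 - OMEGA_L   # deceleration parameter for the assumed cosmology
    return (C_KM_S * z / H0) * (1.0 + 0.5 * (1.0 - q0) * z)

def apparent_magnitude(abs_mag, d_mpc):
    """Distance modulus: m = M + 5 log10(d / 10 pc), with d given in Mpc."""
    return abs_mag + 5.0 * math.log10(d_mpc * 1e6 / 10.0)

d_l = luminosity_distance_mpc(0.01107)   # redshift quoted in the abstract
m_g = apparent_magnitude(-17.16, 47.8)   # peak absolute magnitude and distance from the abstract

print(f"approximate luminosity distance: {d_l:.1f} Mpc")
print(f"implied peak apparent g magnitude: {m_g:.2f}")
```

Under these assumptions the approximate distance lands close to the 47.8 Mpc quoted above, and the implied peak apparent magnitude is about 16.2, which gives a sense of how bright the "faintest" TDE still appears at this distance.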

    Robust Point Cloud Registration Framework Based on Deep Graph Matching (TPAMI Version)

    3D point cloud registration is a fundamental problem in computer vision and robotics. Recently, learning-based point cloud registration methods have made great progress. However, these methods are sensitive to outliers, which lead to more incorrect correspondences. In this paper, we propose a novel deep graph matching-based framework for point cloud registration. Specifically, we first transform point clouds into graphs and extract deep features for each point. Then, we develop a module based on deep graph matching to calculate a soft correspondence matrix. By using graph matching, not only the local geometry of each point but also its structure and topology in a larger range are considered in establishing correspondences, so that more correct correspondences are found. We train the network with a loss directly defined on the correspondences, and in the test stage the soft correspondences are transformed into hard one-to-one correspondences so that registration can be performed by a correspondence-based solver. Furthermore, we introduce a transformer-based method to generate edges for graph construction, which further improves the quality of the correspondences. Extensive experiments on object-level and scene-level benchmark datasets show that the proposed method achieves state-of-the-art performance. The code is available at: https://github.com/fukexue/RGM. Comment: accepted by TPAMI 2022. arXiv admin note: substantial text overlap with arXiv:2103.0425
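The test-stage step mentioned above, turning a soft correspondence matrix into hard one-to-one correspondences, can be illustrated with a small sketch. A simple greedy assignment is used here as a stand-in; the abstract does not say which assignment procedure the authors use, and the function name and toy matrix are hypothetical.

```python
def greedy_hard_assignment(soft):
    """Convert a soft correspondence matrix into one-to-one matches.

    soft[i][j] is the score that source point i matches target point j.
    Pairs are accepted in order of decreasing score, and each row and each
    column is used at most once. Returns a dict mapping i -> j.
    """
    candidates = sorted(
        ((score, i, j) for i, row in enumerate(soft) for j, score in enumerate(row)),
        reverse=True,
    )
    used_rows, used_cols, match = set(), set(), {}
    for _score, i, j in candidates:
        if i not in used_rows and j not in used_cols:
            match[i] = j
            used_rows.add(i)
            used_cols.add(j)
    return match

# Made-up scores: rows 0 and 1 both prefer column 0, so a plain row-wise
# argmax would assign two source points to the same target point.
soft = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.1, 0.2, 0.7],
]
hard = greedy_hard_assignment(soft)
print(hard)
```

Greedy selection is not guaranteed to maximize the total score; an optimal assignment method such as the Hungarian algorithm would be the usual alternative when that matters. The resulting pairs are what a correspondence-based solver would then consume to estimate the rigid transform.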