104 research outputs found

    MoviePuzzle: Visual Narrative Reasoning through Multimodal Order Learning

    We introduce MoviePuzzle, a novel challenge that targets visual narrative reasoning and holistic movie understanding. Despite notable progress in video understanding, most prior work does not present tasks or models that address holistic video understanding and the innate visual narrative structures of long-form videos. To address this gap, we put forth the MoviePuzzle task, which strengthens the temporal feature learning and structure learning of video models by reshuffling the shot, frame, and clip layers of movie segments in the presence of video-dialogue information. We start by establishing a carefully refined dataset based on MovieNet, dissecting movies into hierarchical layers and randomly permuting their order. Besides benchmarking MoviePuzzle with prior art on movie understanding, we devise a Hierarchical Contrastive Movie Clustering (HCMC) model that considers the underlying structure and visual semantic order for movie reordering. Specifically, through a pairwise and contrastive learning approach, we train models to predict the correct order of each layer. This equips them to decipher the visual narrative structure of movies and to handle the disorder present in video data. Experiments show that our approach outperforms existing state-of-the-art methods on the MoviePuzzle benchmark, underscoring its efficacy.
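
    The pairwise ordering idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the HCMC model: it only assumes that some model has already produced a score for "item i precedes item j", and it recovers a full order by ranking items on their summed scores. The function name, the score matrix, and its values are hypothetical.

```python
from typing import List


def order_from_pairwise(scores: List[List[float]]) -> List[int]:
    """Recover an ordering from pairwise precedence scores.

    scores[i][j] is an assumed estimate of the probability that
    item i comes before item j (the diagonal is ignored).
    """
    n = len(scores)
    # Sum the evidence that each item precedes every other item.
    totals = [sum(scores[i][j] for j in range(n) if j != i) for i in range(n)]
    # Items with more "comes first" evidence are placed earlier.
    return sorted(range(n), key=lambda i: -totals[i])


# Hypothetical scores for three shuffled shots whose true order is 2, 0, 1.
example_scores = [
    [0.0, 0.9, 0.2],
    [0.1, 0.0, 0.1],
    [0.8, 0.9, 0.0],
]
recovered = order_from_pairwise(example_scores)
```

    Summing scores is only one simple way to aggregate pairwise predictions; the abstract does not say how HCMC combines them.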

    Shuo Wen Jie Zi: Rethinking Dictionaries and Glyphs for Chinese Language Pre-training

    We introduce CDBERT, a new learning paradigm that enhances the semantic understanding ability of Chinese PLMs with dictionary knowledge and the structure of Chinese characters. We name the two core modules of CDBERT Shuowen and Jiezi, where Shuowen refers to the process of retrieving the most appropriate meaning from Chinese dictionaries and Jiezi refers to the process of enhancing characters' glyph representations with structure understanding. To facilitate dictionary understanding, we propose three pre-training tasks, i.e., Masked Entry Modeling, Contrastive Learning for Synonym and Antonym, and Example Learning. We evaluate our method on both the modern Chinese understanding benchmark CLUE and the ancient Chinese benchmark CCLUE. Moreover, we propose a new polysemy discrimination task, PolyMRC, based on the collected dictionary of ancient Chinese. Our paradigm demonstrates consistent improvements over previous Chinese PLMs across all tasks. Moreover, our approach yields a significant boost in the few-shot setting of ancient Chinese understanding. Comment: To appear at ACL 2023 Findings

    A Search for Spectral Galaxy Pairs of Overlapping Galaxies based on Fuzzy Recognition

    Spectral Galaxy Pairs (SGPs) are defined as composite galaxy spectra that contain two independent redshift systems. These spectra are useful for studying the dust properties of the foreground galaxies. In this paper, a total of 165 SGP spectra are mined from Sloan Digital Sky Survey (SDSS) Data Release 9 (DR9) using the concept of membership degree from fuzzy set theory, defined specifically to suit the fuzzy identification of emission lines. The spectra and images of this sample are classified according to their membership degree and their image features, respectively. Many of the second redshift systems are too small or too dim to select from the SDSS images alone, making the sample a potentially unique source of information on dust effects in low-luminosity or low-surface-brightness galaxies that are underrepresented in morphological pair samples. The dust extinction of the objects with a high membership degree is also estimated from the Balmer decrement. Additionally, analyses of a series of spectroscopic observations of one SGP among the 165 systems indicate that a new star-forming region of our Milky Way might be present. Comment: 16 pages, 6 figures

    Quasi-4-dimension ionospheric modeling and its application in PPP

    The version of record of this article, first published in Satellite Navigation, is available online at Publisher’s website: http://dx.doi.org/10.1186/s43020-022-00085-z Ionospheric delay modeling is not only important for GNSS-based space weather study and monitoring, but also an efficient tool for overcoming the long convergence time of PPP. In this study, a novel model, denoted Q4DIM (Quasi-4-dimension ionospheric modeling), is proposed for wide-area, high-precision ionospheric delay correction. In Q4DIM, the LOS (line of sight) ionospheric delays from a GNSS station network are divided into different clusters according to not only latitude and longitude, but also elevation and azimuth. Both GIM (global ionosphere map) and SID (slant ionospheric delay), which are traditionally used for wide-area and regional ionospheric delay modeling, respectively, can be regarded as special cases of Q4DIM by defining proper grids in latitude, longitude, elevation and azimuth. Thus, Q4DIM presents a resilient model that is capable of both wide-area coverage and high precision. Four different sets of clusters are then defined to illustrate the properties of Q4DIM based on 200 EPN stations. The results suggest that Q4DIM is compatible with the widely acknowledged GIM products. Moreover, it is shown that by introducing the elevation- and azimuth-dependent residuals, the precision of the 2-dimensional GIM-like model, i.e., Q4DIM-2D, is improved from around 1.5 TECU to better than 0.5 TECU. In addition, when Q4DIM is treated as a 4-dimensional matrix in latitude, longitude, elevation and azimuth, its sparsity is less than 5%, which guarantees its feasibility in bandwidth-sensitive applications, e.g., satellite-based PPP-RTK service. Finally, the advantage of Q4DIM in single-frequency PPP over the 2-dimensional models is demonstrated with one month’s data from 30 EPN stations. Peer Reviewed. Postprint (published version)
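
    The clustering by latitude, longitude, elevation and azimuth described above can be pictured with a rough sketch. It is not the Q4DIM estimation procedure, which the abstract does not spell out: it simply assumes a regular grid in the four coordinates and averages the delays that fall into each cell. The function name, the grid steps, and the sample observations are all hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (latitude deg, longitude deg, elevation deg, azimuth deg, delay in TECU)
Obs = Tuple[float, float, float, float, float]
Cell = Tuple[int, int, int, int]


def bin_observations(
    observations: Iterable[Obs],
    steps: Tuple[float, float, float, float],
) -> Dict[Cell, float]:
    """Average line-of-sight delays per 4-D grid cell (assumed regular grid)."""
    sums: Dict[Cell, list] = defaultdict(lambda: [0.0, 0])
    for lat, lon, elev, azim, delay in observations:
        cell = (
            int((lat + 90.0) // steps[0]),    # latitude index
            int((lon + 180.0) // steps[1]),   # longitude index
            int(elev // steps[2]),            # elevation index
            int((azim % 360.0) // steps[3]),  # azimuth index
        )
        sums[cell][0] += delay
        sums[cell][1] += 1
    # Only occupied cells are stored, so the result is a sparse grid.
    return {cell: total / count for cell, (total, count) in sums.items()}


# Hypothetical observations: the first two share a cell, the third does not.
sample_obs = [
    (50.2, 8.1, 32.0, 110.0, 4.0),
    (50.9, 8.7, 38.0, 115.0, 6.0),
    (50.2, 8.1, 75.0, 300.0, 2.0),
]
cells = bin_observations(sample_obs, steps=(1.0, 1.0, 10.0, 30.0))
```

    Choosing elevation and azimuth steps that cover the whole range (for example 90 and 360 degrees) collapses the grid to latitude and longitude only, which mirrors the abstract's remark that a GIM-like model is a special case.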