20 research outputs found

    Video Question Answering via Attribute-Augmented Attention Network Learning

    Video Question Answering is a challenging problem in visual information retrieval, which provides the answer to the referenced video content according to the question. However, the existing visual question answering approaches mainly tackle the problem of static image question answering, which may be ineffective for video question answering because they do not adequately model the temporal dynamics of video contents. In this paper, we study the problem of video question answering by modeling its temporal dynamics with a frame-level attention mechanism. We propose the attribute-augmented attention network learning framework that enables joint frame-level attribute detection and unified video representation learning for video question answering. We then incorporate a multi-step reasoning process into our proposed attention network to further improve the performance. We construct a large-scale video question answering dataset. We conduct experiments on both multiple-choice and open-ended video question answering tasks to show the effectiveness of the proposed method. Comment: Accepted for SIGIR 201
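
    The frame-level attention idea in this abstract can be illustrated with a minimal sketch. It assumes plain dot-product scoring between each frame feature and a question vector; the abstract does not describe the paper's attribute-augmented scoring or its multi-step reasoning, so the function names (`softmax`, `attend_frames`) and the toy vectors below are hypothetical and only show how per-frame weights yield one pooled video representation.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend_frames(frame_features, question_vector):
    """Weight frames by relevance to the question and pool them.

    Assumption: relevance is a plain dot product. This is a stand-in,
    not the scoring function used in the paper.
    """
    scores = [
        sum(f * q for f, q in zip(frame, question_vector))
        for frame in frame_features
    ]
    weights = softmax(scores)
    dim = len(frame_features[0])
    # Weighted sum of frame features gives a single video vector.
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_features))
        for d in range(dim)
    ]
    return weights, pooled


# Toy example: three frames with two-dimensional features.
frames = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
question = [2.0, 0.0]
weights, video_vector = attend_frames(frames, question)
```

    In this toy input the first frame aligns best with the question vector, so it receives the largest weight and dominates the pooled representation.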

    Judging a video by its bitstream cover

    Classifying videos into distinct categories, such as Sport and Music Video, is crucial for multimedia understanding and retrieval, especially in an age where an immense volume of video content is constantly being generated. Traditional methods require video decompression to extract pixel-level features like color, texture, and motion, thereby increasing computational and storage demands. Moreover, these methods often suffer from performance degradation in low-quality videos. We present a novel approach that examines only the post-compression bitstream of a video to perform classification, eliminating the need for decompression. We validate our approach using a custom-built data set comprising over 29,000 YouTube video clips, totaling 6,000 hours and spanning 11 distinct categories. Our preliminary evaluations indicate precision, accuracy, and recall rates well over 80%. The algorithm operates approximately 15,000 times faster than real-time for 30fps videos, outperforming the traditional Dynamic Time Warping (DTW) algorithm by six orders of magnitude.

    Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States

    Portfolio management (PM) is a fundamental financial planning task that aims to achieve investment goals such as maximal profits or minimal risks. Its decision process involves continuous derivation of valuable information from various data sources and sequential decision optimization, which is a prospective research direction for reinforcement learning (RL). In this paper, we propose SARL, a novel State-Augmented RL framework for PM. Our framework aims to address two unique challenges in financial PM: (1) data heterogeneity -- the collected information for each asset is usually diverse, noisy and imbalanced (e.g., news articles); and (2) environment uncertainty -- the financial market is versatile and non-stationary. To incorporate heterogeneous data and enhance robustness against environment uncertainty, SARL augments the asset information with price movement predictions as additional states, where the prediction can be based solely on financial data (e.g., asset prices) or derived from alternative sources such as news. Experiments on two real-world datasets, (i) the Bitcoin market and (ii) the HighTech stock market with 7-year Reuters news articles, validate the effectiveness of SARL over existing PM approaches, in terms of both accumulated profits and risk-adjusted profits. Moreover, extensive simulations are conducted to demonstrate the importance of our proposed state augmentation, providing new insights and boosting performance significantly over the standard RL-based PM method and other baselines. Comment: AAAI 202
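
    The state augmentation described in this abstract can be sketched as a simple concatenation: each asset's observed prices are extended with a predicted price movement before being handed to the RL agent. The sketch below is illustrative only. The names `augment_state` and `naive_movement_predictor` are hypothetical, and the predictor is a trivial stand-in (sign of the change over the window), not the price-based or news-based predictors the abstract refers to.

```python
def naive_movement_predictor(price_window):
    """Hypothetical stand-in predictor.

    Returns +1.0 for an asset whose last price exceeds its first price
    in the window, otherwise -1.0. The paper's predictors are not
    specified in the abstract.
    """
    return [1.0 if prices[-1] > prices[0] else -1.0 for prices in price_window]


def augment_state(price_window, movement_predictions):
    """Append each asset's predicted movement to its price history."""
    if len(price_window) != len(movement_predictions):
        raise ValueError("need exactly one prediction per asset")
    return [
        list(prices) + [prediction]
        for prices, prediction in zip(price_window, movement_predictions)
    ]


# Toy example: two assets, three price observations each.
prices = [[100.0, 101.0, 103.0], [50.0, 49.5, 49.0]]
state = augment_state(prices, naive_movement_predictor(prices))
```

    Keeping the prediction as an extra feature, instead of acting on it directly, leaves the agent free to learn how much to trust a noisy predictor.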

    A surge in cytoplasmic viscosity triggers nuclear remodeling required for Dux silencing and pre-implantation embryo development

    Summary: Embryonic genome activation (EGA) marks the transition from dependence on maternal transcripts to an embryonic transcriptional program. The precise temporal regulation of gene expression, specifically the silencing of the Dux/murine endogenous retrovirus type L (MERVL) program during late 2-cell interphase, is crucial for developmental progression in mouse embryos. How this finely tuned regulation is achieved within this specific window is poorly understood. Here, using particle-tracking microrheology throughout the mouse oocyte-to-embryo transition, we identify a surge in cytoplasmic viscosity specific to late 2-cell interphase brought about by high microtubule and endomembrane density. Importantly, preventing the rise in 2-cell viscosity severely impairs nuclear reorganization, resulting in a persistently open chromatin configuration and failure to silence Dux/MERVL. This, in turn, derails embryo development beyond the 2- and 4-cell stages. Our findings reveal a mechanical role of the cytoplasm in regulating Dux/MERVL repression via nuclear remodeling during a temporally confined period in late 2-cell interphase

    Anisotropic emission of orientation-controlled mixed-dimensional perovskites for light-emitting devices

    Perovskite light-emitting diodes (PeLEDs) are attracting increasing attention owing to their impressive efficiencies and high luminance across the full visible light range. Further improvement of the external quantum efficiency (EQE) of planar PeLEDs is limited by the light out-coupling efficiency. Introducing perovskite emitters with directional emission in PeLEDs is an effective way to improve light extraction. Here, we report that it is possible to achieve directional emission in mixed-dimensional perovskites by controlling the orientation of the emissive center in the film. Multiple characterization methods suggest that our mixed-dimensional perovskite film shows highly orientated transition dipole moments (TDMs) with a horizontal ratio of over 88%, substantially higher than that of isotropic emitters. The horizontally dominated TDMs lead to PeLEDs with an exceptionally high light out-coupling efficiency of over 32%, enabling a high EQE of 18.2%.
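
    The two efficiency figures in this abstract can be related with a short worked estimate. It assumes the common simplification that EQE factors into an out-coupling efficiency and a single lumped internal quantum efficiency; the symbols $\eta_{\text{out}}$ and $\eta_{\text{int}}$ are introduced here for illustration and do not appear in the abstract. Because the abstract reports the out-coupling efficiency only as "over 32%", the result is an upper bound, not a measured value.

```latex
\[
\mathrm{EQE} = \eta_{\text{out}} \, \eta_{\text{int}}
\quad\Longrightarrow\quad
\eta_{\text{int}} = \frac{\mathrm{EQE}}{\eta_{\text{out}}}
< \frac{0.182}{0.32} \approx 0.57 .
\]
```

    Under this assumption, the lumped internal efficiency would be below roughly 57%.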

    Land control and crop booms inside China: implications for how we think about the global land rush

    This paper aims to broaden the scope of analysis of the contemporary global land rush by examining crop booms not only outside, but inside China; and investment flows not only from China, but also within and into China. It does so by examining the eucalyptus and sugarcane sectors in southern China, which have witnessed investment booms during the past decade, with investment coming from both domestic and foreign capital, including Finnish, Indonesian, and Thai companies. Our argument addresses three key issues: (a) explaining why foreign and domestic companies enter into a multitude of lease and grower contracts involving holders of micro-plots, (b) revisiting the notion of extra-economic coercion, and (c) critiquing thinking about flows of large-scale investments that centres primarily on nationality. These issues are central in current debates in the land grabs literature, and our study offers a different perspective from dominant narratives.