1,395 research outputs found

    Understanding Video Scenes through Text: Insights from Text-based Video Question Answering

    Researchers have extensively studied the field of vision and language, discovering that both visual and textual content are crucial for understanding scenes effectively. In particular, comprehending text in videos holds great significance, requiring both scene text understanding and temporal reasoning. This paper explores two recently introduced datasets, NewsVideoQA and M4-ViteVQA, which aim to address video question answering based on textual content. The NewsVideoQA dataset contains question-answer pairs related to the text in news videos, while M4-ViteVQA comprises question-answer pairs from diverse categories such as vlogging, traveling, and shopping. We analyze the formulation of these datasets at various levels, exploring the degree of visual understanding and multi-frame comprehension required to answer the questions. Additionally, the study includes experiments with BERT-QA, a text-only model, which achieves performance comparable to the original methods on both datasets, indicating shortcomings in the formulation of these datasets. Furthermore, we examine domain adaptation by training on M4-ViteVQA and evaluating on NewsVideoQA, and vice versa, thereby shedding light on the challenges and potential benefits of out-of-domain training.

    Successful Treatment of Postoperative External Biliary Fistula by Selective Nasobiliary Drainage

    A 25-year-old man presented with a high-output external biliary fistula after an operation for a giant hydatid cyst of the liver. Endoscopic sphincterotomy was inadequate to close the fistula. A nasobiliary tube was selectively inserted into the leaking hepatic duct and bile was continuously aspirated. The fistula and the residual cavity healed completely. Details of the patient's management using this alternative technique are discussed.

    Reading Between the Lanes: Text VideoQA on the Road

    Text and signs around roads provide crucial information for drivers, vital for safe navigation and situational awareness. Scene text recognition in motion is a challenging problem, while textual cues typically appear for a short time span, and early detection at a distance is necessary. Systems that exploit such information to assist the driver should not only extract and incorporate visual and textual cues from the video stream but also reason over time. To address this issue, we introduce RoadTextVQA, a new dataset for the task of video question answering (VideoQA) in the context of driver assistance. RoadTextVQA consists of 3,222 driving videos collected from multiple countries, annotated with 10,500 questions, all based on text or road signs present in the driving videos. We assess the performance of state-of-the-art video question answering models on our RoadTextVQA dataset, highlighting the significant potential for improvement in this domain and the usefulness of the dataset in advancing research on in-vehicle support systems and text-aware multimodal question answering. The dataset is available at http://cvit.iiit.ac.in/research/projects/cvit-projects/roadtextvq

    Diffusive behavior for randomly kicked Newtonian particles in a spatially periodic medium

    We prove a central limit theorem for the momentum distribution of a particle undergoing an unbiased spatially periodic random forcing at exponentially distributed times without friction. The start is a linear Boltzmann equation for the phase space density, where the average energy of the particle grows linearly in time. Rescaling time, the momentum converges to a Brownian motion, and the position is its time-integral showing superdiffusive scaling with time $t^{3/2}$. The analysis has two parts: (1) to show that the particle spends most of its time at high energy, where the spatial environment is practically invisible; (2) to treat the low energy incursions where the motion is dominated by the deterministic force, with potential drift but where symmetry arguments cancel the ballistic behavior. Comment: 55 pages. Some typos corrected from previous version.