40 research outputs found
An Asynchronous Scheme for the Distributed Evaluation of Interactive Multimedia Retrieval
Evaluation campaigns for interactive multimedia retrieval, such as the Video Browser Showdown (VBS) or the Lifelog Search Challenge (LSC), have so far imposed constraints on both the simultaneity and the locality of all participants, requiring them to solve the same tasks in the same place, at the same time, and under the same conditions. These constraints are in contrast to other evaluation campaigns that do not focus on interactivity, where participants can process the tasks in any place at any time. The recent travel restrictions necessitated the relaxation of the locality constraint of interactive campaigns, enabling participants to take part from an arbitrary location. Born out of necessity, this relaxation turned out to be a boon, since it greatly simplified the evaluation process and enabled the organisation of ad-hoc evaluations outside of the large campaigns. However, it also introduced an additional complication in cases where participants were spread over several time zones. In this paper, we introduce an evaluation scheme for interactive retrieval evaluation that relaxes both the simultaneity and locality constraints, enabling participation from any place at any time within a predefined time frame. This scheme, as implemented in the Distributed Retrieval Evaluation Server (DRES), enables novel ways of conducting interactive retrieval evaluation and bridges the gap between interactive and non-interactive campaigns.
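As a rough illustration of the relaxed constraints described above, the hypothetical Python sketch below models a task that accepts submissions from any location, at any time within a predefined time frame. It is not the actual DRES API; all names and the window logic are invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass
class TaskWindow:
    """A hypothetical evaluation task that stays open for a fixed wall-clock window."""
    name: str
    opens_at: datetime   # start of the predefined time frame (UTC)
    duration: timedelta  # how long the task accepts submissions

    def accepts(self, submitted_at: datetime) -> bool:
        """True if a submission timestamp falls inside the window,
        regardless of the participant's location or time zone."""
        return self.opens_at <= submitted_at < self.opens_at + self.duration

# A task open for 48 hours: teams in any time zone may attempt it
# whenever they like within that frame.
task = TaskWindow(
    name="known-item-search-01",
    opens_at=datetime(2024, 3, 1, 0, 0, tzinfo=timezone.utc),
    duration=timedelta(hours=48),
)
print(task.accepts(datetime(2024, 3, 2, 13, 30, tzinfo=timezone.utc)))  # True
```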
Exploiting BERT for Malformed Segmentation Detection to Improve Scientific Writings
Writing well-structured scientific documents, such as articles and theses, is vital for comprehending a document's argumentation and understanding its messages. Furthermore, it has an impact on the efficiency and time required for studying the document. Proper document segmentation also yields better results when employing automated Natural Language Processing (NLP) algorithms, including summarization and other information retrieval and analysis functions. Unfortunately, inexperienced writers, such as young researchers and graduate students, often struggle to produce well-structured professional documents. Their writing frequently exhibits improper segmentation or lacks semantically coherent segments, a phenomenon referred to as "mal-segmentation." Examples of mal-segmentation include improper paragraph or section divisions and unsmooth transitions between sentences and paragraphs. This research addresses the issue of mal-segmentation in scientific writing by introducing an automated method for detecting mal-segmentation, utilizing Sentence Bidirectional Encoder Representations from Transformers (sBERT) as an encoding mechanism. The experimental results section shows promising results for the detection of mal-segmentation using the sBERT technique.
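The abstract does not give implementation details, but as a hedged illustration of how sentence embeddings can surface one mal-segmentation symptom (an abrupt topic shift between adjacent sentences), the sketch below uses the sentence-transformers library. The model name and the similarity threshold are illustrative choices, not the paper's.

```python
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")  # a small sentence-BERT model

def boundary_scores(sentences):
    """Cosine similarity between each pair of consecutive sentences."""
    emb = model.encode(sentences, convert_to_tensor=True)
    return [float(util.cos_sim(emb[i], emb[i + 1])) for i in range(len(emb) - 1)]

sentences = [
    "We evaluate three retrieval models on the benchmark.",
    "The transformer baseline outperforms both alternatives.",
    "Giraffes are the tallest living terrestrial animals.",  # abrupt topic shift
]
for i, s in enumerate(boundary_scores(sentences)):
    # Threshold of 0.3 is an arbitrary illustrative cut-off.
    flag = "possible missing segment break" if s < 0.3 else "coherent"
    print(f"sentences {i}-{i + 1}: sim={s:.2f} ({flag})")
```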
Audiovisual annotation procedure for multi-view field recordings
The audio and video parts of an audiovisual document interact to produce an audiovisual, or multi-modal, perception. Yet, automatic analyses of these documents are usually based on separate audio and video annotations. With respect to the audiovisual content, these annotations can be incomplete or irrelevant. Moreover, the expanding possibilities for creating audiovisual documents lead us to consider different kinds of content, including videos filmed in uncontrolled conditions (i.e. field recordings) and scenes filmed from different points of view (multi-view). In this paper we propose an original procedure for producing manual annotations in different contexts, including multi-modal and multi-view documents. This procedure, based on using both audio and video annotations, ensures consistency when considering audio or video alone, and additionally provides audiovisual information at a richer level. Finally, different applications become possible when considering such annotated data. In particular, we present an example application in a network of recordings in which our annotations allow multi-source retrieval using mono- or multi-modal queries.
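As a purely hypothetical sketch of the kind of record such a procedure might produce, the snippet below keeps separate audio and video annotation layers alongside a richer joint audiovisual layer for one point of view in a multi-view setup. All field names are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Annotation:
    start: float  # seconds from recording start
    end: float
    label: str

@dataclass
class AudiovisualSegment:
    recording_id: str  # one point of view in a multi-view setup
    audio: list[Annotation] = field(default_factory=list)
    video: list[Annotation] = field(default_factory=list)
    audiovisual: list[Annotation] = field(default_factory=list)

seg = AudiovisualSegment(recording_id="cam02")
seg.audio.append(Annotation(3.2, 5.8, "car engine"))
seg.video.append(Annotation(3.0, 6.1, "car passing"))
# The joint layer records the multi-modal event the two cues describe together.
seg.audiovisual.append(Annotation(3.0, 6.1, "car drives past camera"))
```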
VieLens: an interactive search engine for LSC2019
With the appearance of many wearable devices like smartwatches, recording glasses (such as Google Glass), and smartphones, digital personal profiles have become more readily available nowadays. However, searching and navigating these multi-source, multi-modal, and often unstructured data to extract useful information is still a relatively challenging task. Therefore, the LSC2019 competition has been organized so that researchers can demonstrate novel search engines, as well as exchange ideas and collaborate on these types of problems. In this paper we present our approach for supporting interactive searches of lifelog data by employing a new retrieval system called VieLens, an interactive retrieval system enhanced by natural language processing techniques to extend and improve search results, mainly in the context of a user's activities in their daily life.
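VieLens's exact techniques are not detailed in this abstract, but as one hedged example of NLP-based query extension in the spirit described, the sketch below expands query terms with WordNet synonyms before retrieval. It is an illustration, not the VieLens implementation.

```python
import nltk
from nltk.corpus import wordnet as wn

nltk.download("wordnet", quiet=True)  # one-time corpus download

def expand_query(query: str) -> set[str]:
    """Return the original query terms plus WordNet synonyms for each term."""
    terms = set(query.lower().split())
    for term in list(terms):
        for synset in wn.synsets(term):
            for lemma in synset.lemmas():
                terms.add(lemma.name().replace("_", " "))
    return terms

# Expanding a lifelog-style activity query adds synonyms such as 'feeding',
# broadening recall for retrieval over daily-life activities.
print(expand_query("eating breakfast"))
```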
MultiVENT: Multilingual Videos of Events with Aligned Natural Text
Everyday news coverage has shifted from traditional broadcasts towards a wide range of presentation formats such as first-hand, unedited video footage. Datasets that reflect the diverse array of multimodal, multilingual news sources available online could be used to teach models to benefit from this shift, but existing news video datasets focus on traditional news broadcasts produced for English-speaking audiences. We address this limitation by constructing MultiVENT, a dataset of multilingual, event-centric videos grounded in text documents across five target languages. MultiVENT includes both news broadcast videos and non-professional event footage, which we use to analyze the state of online news videos and how they can be leveraged to build robust, factually accurate models. Finally, we provide a model for complex, multilingual video retrieval to serve as a baseline for information retrieval using MultiVENT.
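The baseline model itself is not specified in this abstract; as a generic sketch of the retrieval step such a baseline implies, the snippet below ranks videos by cosine similarity between a text-query embedding and precomputed video embeddings in a shared space. Random arrays stand in for a real encoder's output, which is out of scope here.

```python
import numpy as np

def rank_videos(query_emb: np.ndarray, video_embs: np.ndarray) -> np.ndarray:
    """Return video indices sorted by descending cosine similarity to the query."""
    q = query_emb / np.linalg.norm(query_emb)
    v = video_embs / np.linalg.norm(video_embs, axis=1, keepdims=True)
    return np.argsort(v @ q)[::-1]

rng = np.random.default_rng(0)
videos = rng.normal(size=(1000, 512))  # stand-in for 1000 video embeddings
query = rng.normal(size=512)           # stand-in for an embedded multilingual text query
print(rank_videos(query, videos)[:5])  # top-5 candidate videos
```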
Multimodal Automated Fact-Checking: A Survey
Misinformation is often conveyed in multiple modalities, e.g. a miscaptioned image. Multimodal misinformation is perceived as more credible by humans, and spreads faster than its text-only counterparts. While an increasing body of research investigates automated fact-checking (AFC), previous surveys mostly focus on text. In this survey, we conceptualise a framework for AFC including subtasks unique to multimodal misinformation. Furthermore, we discuss related terms used in different communities and map them to our framework. We focus on four modalities prevalent in real-world fact-checking: text, image, audio, and video. We survey benchmarks and models, and discuss limitations and promising directions for future research.
Comment: The 2023 Conference on Empirical Methods in Natural Language Processing (EMNLP): Findings