6 research outputs found

WHU-NERCMS at TRECVID 2021: Instance Search Task

Author: Chen Jun
Huang Baojin
Huang Ji
Liang Chao
Lu Ankang
Niu Yanrui
Wang Zhongyuan
Wen Shishi
Xu Dongshu
Yang Jingyao
Zhang Yue
Publication date: 17/06/2022

This paper briefly introduces the experimental methods and results of WHU-NERCMS at TRECVID 2021. This year we participate in the automatic and interactive tasks of Instance Search (INS). For the automatic task, the retrieval target is divided into two parts: person retrieval and action retrieval. For person retrieval, we adopt a two-stage method consisting of face detection and face recognition. For action retrieval, we adopt two kinds of action detection methods: three frame-based human-object interaction detection methods and two video-based general action detection methods. The person retrieval and action retrieval results are then fused to initialize the result ranking lists. In addition, we try complementary methods to further improve search performance. For the interactive task, we test two different interaction strategies on the fusion results. We submit 4 runs each for the automatic and interactive tasks. Each run is described in Table 1. The official evaluations show that the proposed strategies rank 1st in both the automatic and interactive tracks.
Comment: 9 pages, 4 figures

arXiv.org e-Print Archive

An overview on the evaluated video retrieval tasks at TRECVID 2022

Author: Awad George
Butt Asad
Curtis Keith
Delgado Andrew
Diduch Lukas
Fiscus Jonathan
Godard Eliot
Godil Afzal
Graham Yvette
Lee Yooyoung
Liu Jeffrey
Quénot Georges
Publication date: 22/06/2023

The TREC Video Retrieval Evaluation (TRECVID) is a TREC-style video analysis and retrieval evaluation with the goal of promoting progress in research and development of content-based exploitation and retrieval of information from digital video via open, task-based evaluation supported by metrology. Over the last twenty-one years this effort has yielded a better understanding of how systems can effectively accomplish such processing and how one can reliably benchmark their performance. TRECVID has been funded by NIST (National Institute of Standards and Technology) and other US government agencies. In addition, many organizations and individuals worldwide contribute significant time and effort. TRECVID 2022 planned the following six tasks: Ad-hoc Video Search, Video to Text Captioning, Disaster Scene Description and Indexing, Activity in Extended Videos, Deep Video Understanding, and Movie Summarization. In total, 35 teams from various research organizations worldwide signed up to join the evaluation campaign this year. This paper introduces the tasks, datasets used, evaluation frameworks and metrics, as well as a high-level results overview.
Comment: arXiv admin note: substantial text overlap with arXiv:2104.13473, arXiv:2009.0998

arXiv.org e-Print Archive

TRECVID 2018: Benchmarking Video Activity Detection, Video Captioning and Matching, Video Storytelling Linking and Video Search

Author: Awad George
Blasi Saverio
Butt Asad
Curtis Keith
Delgado Andrew
Fiscus Jonathan
Godil Afzal
Graham Yvette
Joy David
Kraaij Wessel
Lee Yooyoung
Magalhaes Joao
Quénot Georges
Semedo David
Smeaton Alan
Publication venue: HAL CCSD
Publication date: 13/11/2018

Hal - Université Grenoble Alpes

TRECVID 2015 – An Overview of the Goals, Tasks, Data, Evaluation Mechanisms, and Metrics

Author: Aly Robin
Awad George
Fiscus Jon
Joy David
Kraaij Wessel
Michel Martial
Ordelman Roeland
Over Paul
Quénot Georges
Smeaton Alan
Publication venue: HAL CCSD
Publication date: 16/11/2015

Intelligent Data Analytics using Deep Learning for Data Science

Author: Presa Reyes Maria E
Publication venue: FIU Digital Commons
Publication date: 13/05/2022

Data science attracts the interest of academics and practitioners because it helps extract significant insights from massive amounts of data. According to the International Data Corporation, the Global Datasphere is expected to grow from 33 zettabytes in 2018 to 175 zettabytes by 2025. This dissertation proposes an intelligent data analytics framework that uses deep learning to tackle several difficulties in implementing a data science application. These difficulties include dealing with high inter-class similarity, the availability and quality of hand-labeled data, and designing a feasible approach for modeling significant correlations in features gathered from various data sources. The proposed framework employs a novel strategy for improving data representation learning by incorporating supplemental data from various sources and structures. First, the research presents a multi-source fusion approach that uses confident learning techniques to improve the quality of data from many noisy sources. Meta-learning methods based on advanced techniques such as the mixture of experts and differential evolution combine the predictive capacity of individual learners with a gating mechanism, ensuring that only the most trustworthy features or predictions are integrated to train the model. Then, a Multi-Level Convolutional Fusion is presented to train a model on the correspondence between local-global deep feature interactions to identify easily confused samples of different classes. The convolutional fusion is further enhanced with Graph Transformers, which aggregate the relevant neighboring features in graph-based input data structures and achieve state-of-the-art performance on a large-scale building damage dataset.
Finally, weakly-supervised strategies, noise regularization, and label propagation are proposed to train a model on sparsely labeled input data, ensuring the model's robustness to errors and supporting the automatic expansion of the training set. The suggested approaches outperformed competing strategies in effectively training a model on a large-scale dataset of 500k photos, with only about 7% of the images annotated by a human. The proposed framework's capabilities have benefited various data science applications, including fluid dynamics, geometric morphometrics, building damage classification from satellite pictures, disaster scene description, and storm-surge visualization.

DigitalCommons@Florida International University