4 research outputs found

    EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition

    We focus on multi-modal fusion for egocentric action recognition, and propose a novel architecture for multi-modal temporal-binding, i.e. the combination of modalities within a range of temporal offsets. We train the architecture with three modalities (RGB, Flow and Audio) and combine them with mid-level fusion alongside sparse temporal sampling of fused representations. In contrast with previous works, modalities are fused before temporal aggregation, with shared modality and fusion weights over time. Our proposed architecture is trained end-to-end, outperforming individual modalities as well as late-fusion of modalities. We demonstrate the importance of audio in egocentric vision, on a per-class basis, for identifying actions as well as interacting objects. Our method achieves state-of-the-art results on both the seen and unseen test sets of the largest egocentric dataset, EPIC-Kitchens, on all metrics using the public leaderboard. Comment: Accepted for presentation at ICCV 201
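    The ordering described in this abstract (fuse the modalities inside each temporal window first, then aggregate over windows, reusing the same fusion weights for every window) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the single linear-plus-ReLU fusion layer, the averaging step, and the toy feature values are all assumptions chosen for illustration, and the per-window features are assumed to have been sampled and extracted upstream.

```python
def linear_relu(x, weights, bias):
    """One fully connected layer followed by ReLU (hypothetical fusion layer)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]


def fuse_then_aggregate(rgb, flow, audio, weights, bias):
    """Fuse modalities per temporal window, then average over windows.

    rgb, flow, audio: lists of per-window feature vectors, one per sampled window.
    The same fusion weights are applied to every window (shared over time).
    """
    fused = []
    for r, f, a in zip(rgb, flow, audio):
        # Concatenate the three modality features of this window, then fuse.
        fused.append(linear_relu(r + f + a, weights, bias))
    n = len(fused)
    # Temporal aggregation (assumed here to be a plain average) happens after fusion.
    return [sum(col) / n for col in zip(*fused)]


# Toy features for two sampled windows (made-up values).
rgb = [[1.0, 0.0], [0.0, 1.0]]
flow = [[0.5], [0.5]]
audio = [[2.0], [-2.0]]
weights = [[1.0, 1.0, 1.0, 1.0], [1.0, -1.0, 0.0, -1.0]]
bias = [0.0, 0.0]

result = fuse_then_aggregate(rgb, flow, audio, weights, bias)
print(result)  # [1.75, 0.5]
```

    A late-fusion baseline would instead aggregate each modality over time separately and combine only the final per-modality outputs, so the non-linear fusion layer would never see the modalities of a single window together.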

    Rescaling Egocentric Vision: Collection Pipeline and Challenges for EPIC-KITCHENS-100

    This paper introduces the pipeline to extend the largest dataset in egocentric vision, EPIC-KITCHENS. The effort culminates in EPIC-KITCHENS-100, a collection of 100 hours, 20M frames, 90K actions in 700 variable-length videos, capturing long-term unscripted activities in 45 environments, using head-mounted cameras. Compared to its previous version (Damen in Scaling egocentric vision: ECCV, 2018), EPIC-KITCHENS-100 has been annotated using a novel pipeline that allows denser (54% more actions per minute) and more complete annotations of fine-grained actions (+128% more action segments). This collection enables new challenges such as action detection and evaluating the “test of time”, i.e. whether models trained on data collected in 2018 can generalise to new footage collected two years later. The dataset is aligned with 6 challenges: action recognition (full and weak supervision), action detection, action anticipation, cross-modal retrieval (from captions), as well as unsupervised domain adaptation for action recognition. For each challenge, we define the task, provide baselines and evaluation metrics. Published version. Research at Bristol is supported by Engineering and Physical Sciences Research Council (EPSRC) Doctoral Training Program (DTP), EPSRC Fellowship UMPIRE (EP/T004991/1). Research at Catania is sponsored by Piano della Ricerca 2016-2018 linea di Intervento 2 of DMI, by MISE - PON I&C 2014-2020, ENIGMA project (CUP: B61B19000520008) and by MIUR AIM - Attrazione e Mobilita Internazionale Linea 1 - AIM1893589 - CUP E64118002540007

    Optimization of the Reaction between 5-O-Caffeoylquinic Acid (5-CQA) and Tryptophan—Isolation of the Product and Its Evaluation as a Food Dye

    The food industry is seeking a stable, non-toxic red dye as a substitute for synthetic pigments. This can result from the reaction between 5-O-Caffeoylquinic acid (5-CQA) and tryptophan (TRP). This study explores the reaction kinetics under ultrasound conditions and investigates reaction parameters, such as pH, temperature, and reactants’ concentrations, to accelerate the reaction. At the end of the reaction, the solution was either spray-dried or acidified to isolate the pigment, which was evaluated for its potential as a food dye. Using ultrasound at 40 °C led to a significant acceleration of the reaction that was completed in 8.5 h, marking a 300% improvement compared to the literature. The caffeic acid, and not the quinic acid, moiety of 5-CQA seems to be partly responsible for the formation of the red pigment. The pH had a profound impact on the reaction rate, with an optimal value of pH = 9.5. Increased TRP concentrations led to increased reaction rates, while higher 5-CQA concentrations led to significant deviations from redness (a* value). The pigment, lacking significant antimicrobial activity, exhibited remarkable thermal stability (pH 3–9), delaying food oxidation and color deterioration. The results indicate that the reaction can be significantly accelerated by ultrasound, which will be useful for the scale-up of the process and gives the produced pigment the potential for use as an alternative to artificial coloring.