16 research outputs found

A Web Service for Video Summarization

Author: Fajtl Jiri
Fu Tsu-Jui
Gong Boqing
Potapov Danila
Song Yale
Yuan Li
Zhou Kaiyang
Publication date: 17/06/2020

This paper presents a Web service that supports the automatic generation of video summaries for user-submitted videos. The developed Web application decomposes the video into segments, evaluates the fitness of each segment to be included in the video summary and selects appropriate segments until a pre-defined time budget is filled. The integrated deep-learning-based video analysis and summarization technologies exhibit state-of-the-art performance and, by exploiting the processing capabilities of modern GPUs, offer faster-than-real-time processing. Configurations for generating video summaries that fulfill the specifications for posting on the most common video sharing platforms and social networks are available in the user interface of this application, enabling the one-click generation of distribution-channel-specific summaries.
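
The segment-selection step described in this abstract can be illustrated with a minimal sketch. The abstract does not say how segments are chosen, so the greedy strategy below is an assumption, and the names `Segment`, `select_segments` and `budget_s`, as well as the example scores, are hypothetical.

```python
from typing import List, Tuple

# (start_seconds, end_seconds, fitness_score); a hypothetical representation.
Segment = Tuple[float, float, float]


def select_segments(segments: List[Segment], budget_s: float) -> List[Segment]:
    """Greedily pick the highest-scoring segments that still fit the time budget.

    Assumption: a greedy pass by score. The service itself may use a
    different selection strategy.
    """
    chosen: List[Segment] = []
    remaining = budget_s
    # Visit segments from the highest fitness score to the lowest.
    for seg in sorted(segments, key=lambda s: s[2], reverse=True):
        duration = seg[1] - seg[0]
        if duration <= remaining:
            chosen.append(seg)
            remaining -= duration
    # Return the summary in playback order.
    return sorted(chosen, key=lambda s: s[0])


example = [(0.0, 4.0, 0.2), (4.0, 9.0, 0.9), (9.0, 12.0, 0.6), (12.0, 20.0, 0.8)]
summary = select_segments(example, budget_s=10.0)
```

With a 10-second budget, the 8-second segment is skipped because it no longer fits after the top-scoring 5-second segment is taken. A knapsack formulation would be a natural alternative if the total score has to be maximized exactly.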

Crossref

ZENODO

Queen Mary Research Online

A Comparison of Embedded Deep Learning Methods for Person Detection

Author: Argyriou Vasileios
Braz J
Fajtl Jiri
Farinella GM
Kim Chloe Eunhyang
Oghaz Mandi Maktab Dar
Remagnino Paolo
Tremeau A
Publication date: 01/01/2019

Recent advancements in parallel computing, GPU technology and deep learning provide a new platform for complex image processing tasks such as person detection to flourish. Person detection is a fundamental preliminary operation for several high-level computer vision tasks. One industry that can significantly benefit from person detection is retail. In recent years, various studies have attempted to find an optimal solution for person detection using neural networks and deep learning. This study compares state-of-the-art deep-learning-based object detectors, with a focus on person detection performance in indoor environments. The performance of various implementations of YOLO, SSD, RCNN, R-FCN and SqueezeDet has been assessed using our in-house proprietary dataset, which consists of over 10 thousand indoor images captured from shopping malls, retail outlets and stores. Experimental results indicate that Tiny YOLO-416 and SSD (VGG-300) are the fastest, and Faster-RCNN (Inception ResNet-v2) and R-FCN (ResNet-101) are the most accurate, detectors investigated in this study. Further analysis shows that YOLO v3-416 delivers relatively accurate results in a reasonable amount of time, which makes it an ideal model for person detection on embedded platforms.

arXiv.org e-Print Archive

Durham Research Online

Crossref

Two-year recall for people with no diabetic retinopathy: a multiethnic population-based retrospective cohort study using real-world data to quantify the effect

Author: Anderson John
Barman Sarah
Bolter Louis
Chambers Ryan
Chew Emily Y
Egan Catherine
Fajtl Jiri
Ferris Frederick L.
Hingorani Aroon D
Lee Aaron
Olvera-Barrios Abraham
Owen Christopher G
Remagnino Paolo
Rudnicka Alicja R
Sofat Reecha
the ARIAS Research Group
Tufail Adnan
Warwick Alasdair N
Welikala Roshan
Wu Yue
Publication venue: BMJ Publishing Group

Kingston University Research Repository

Sladkovodni sedimenty, rizika a moznosti zpracovani (Freshwater sediments, risks and treatment options).

Author: Fajtl Jiri
Publication venue: Ceske Budejovice (Czech Republic) : Jihoceska univerzita v Ceskych Budejovicich
Publication date: 01/01/2002

The aim of this thesis is to evaluate chemical changes after aeration of a contaminated freshwater sediment (changes in oxidation-reduction potential and pH values, sulphate leaching and leaching of selected cations of potentially toxic metals), and to explore two new techniques for sulphate treatment: (i) inhibition of sulphate production via suppression of microbial oxidative processes in the sediment by application of NaOH,
Available from STL Prague, CZ / NTK - National Technical Library; SIGLE; CZ; Czech Republic

OpenGrey Repository

Visual memories

Author: Fajtl Jiri
Publication venue: Kingston University

Despite the rapid progress in the field of artificial intelligence, there are still important new areas to be explored and existing methods to be enhanced to make machines think like humans. This thesis conducts research in four machine learning and computer vision areas in this direction. First, we study what makes some images more memorable than others and propose a new machine learning method to learn and predict image memorability, closely matching human performance. A spatial attention function is learnt to localize the image regions responsible for the retention of an image in memory. To identify meaningful temporal segments in a video stream, we study episodic segmentation in our memory and design a novel algorithm for video summarization to mimic human capabilities. A soft self-attention method without a recurrent network is used to learn frame importance scores for video summarization. This simple algorithm demonstrates performance superior to the current state-of-the-art methods. Inspired by our brain’s ability to project high-dimensional visual information to computationally efficient, meaningful representations, we propose a method for learning latent binary representations, together with methods for operations in this discrete latent space, such as interpolation, novel image generation and attribute modification, that outperform more complex published methods. To advance methods targeting catastrophic interference, one of the most fundamental problems of artificial neural networks, we study elementary neural mechanisms mitigating this phenomenon in our brain’s memory. Building on our insights into the function of pattern separation in the hippocampus, we propose a conceptually simple and resource-efficient method to learn high-dimensional sparse binary representations for continual learning. By performing the elementary binary operations OR and AND over a continual stream of sparse representations of novel classes, our method exhibits performance significantly exceeding the current state-of-the-art meta-learning methods on identical benchmarks.
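
The closing idea of this abstract, binary OR and AND over sparse binary representations, can be illustrated with a toy sketch. The abstract does not spell out how the two operations are combined, so the scheme below is an assumption: each class memory is the OR of its sample codes, and a query is scored by the size of its AND overlap with each memory. The function names, class labels and bit patterns are hypothetical.

```python
from typing import Dict, Iterable


def learn_class(memory: Dict[str, int], label: str, codes: Iterable[int]) -> None:
    """Accumulate a class memory as the bitwise OR of its sparse binary codes."""
    for code in codes:
        memory[label] = memory.get(label, 0) | code


def classify(memory: Dict[str, int], code: int) -> str:
    """Return the label whose memory shares the most set bits (AND) with the code."""
    return max(memory, key=lambda label: bin(memory[label] & code).count("1"))


# Classes arrive one after another; earlier memories are never rewritten.
memory: Dict[str, int] = {}
learn_class(memory, "cat", [0b0000_0011, 0b0000_0110])
learn_class(memory, "dog", [0b0011_0000, 0b0110_0000])
prediction = classify(memory, 0b0000_0101)
```

Because adding a new class only writes to that class's own entry, learning "dog" leaves the "cat" memory untouched, which is the property that matters for continual learning in this toy setting.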

Kingston University Research Repository

Improving Dataset Volumes and Model Accuracy With Semi-Supervised Iterative Self-Learning

Author: Argyriou Vasileios
Dupre Robert
Fajtl Jiri
Remagnino Paolo
Publication venue: Institute of Electrical and Electronics Engineers
Publication date: 01/01/2020

Durham Research Online

Kingston University Research Repository

AMNet: Memorability Estimation with Attention

Author: Argyriou Vasileios
Fajtl Jiri
Monekosso Dorothy
Remagnino Paolo
Publication date: 09/04/2018

In this paper we present the design and evaluation of an end-to-end trainable deep neural network with a visual attention mechanism for memorability estimation in still images. We analyze the suitability of transfer learning of deep models from image classification to the memorability task. We further study the impact of the attention mechanism on memorability estimation and evaluate our network on the SUN Memorability and LaMem datasets. Our network outperforms the existing state-of-the-art models on both datasets in terms of Spearman's rank correlation as well as mean squared error, closely matching human consistency.
Comment: To appear at CVPR 201
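
The two evaluation metrics named in this abstract, Spearman's rank correlation and mean squared error, can be computed as in the following minimal sketch. It uses only the standard library, and the score lists are made-up illustrative values, not results from the paper.

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes both inputs have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def mse(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error between predictions and targets."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


# Illustrative memorability scores (hypothetical values).
predicted = [0.62, 0.80, 0.71, 0.90]
ground_truth = [0.60, 0.85, 0.65, 0.95]
rho = spearman(predicted, ground_truth)
error = mse(predicted, ground_truth)
```

Rank correlation rewards getting the ordering of images right even when the absolute scores are off, while mean squared error penalizes the absolute deviation, which is why the two are reported together.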

arXiv.org e-Print Archive

Durham Research Online

Crossref

Kingston University Research Repository

Leeds Beckett Repository

Deep Residual Network with Subclass Discriminant Analysis for Crowd Behavior Recognition

Author: Argyriou Vasileios
Fajtl Jiri
Mandal Bappaditya
Monekosso Dorothy
Remagnino Paolo
Publication date: 06/09/2018

In this work, we extract rich representations of crowd behavior from video using a fine-tuned deep convolutional neural residual network. Using spatial partitioning trees, we create subclasses within the feature maps from each of the crowd behavior attributes (classes). Features from these subclasses are then regularized using an eigen modeling scheme. This makes it possible to model the variance arising from the intra-subclass information. Low-dimensional discriminative features are extracted using the total subclass scatter information. Dynamic time warping is applied to the cosine distance measure to compute the similarity between videos. A 1-nearest neighbor (NN) classifier is used to find the respective crowd behavior attribute classes from the normal videos. Experimental results on a large crowd behavior video database show the superior performance of our proposed framework compared to the baseline and current state-of-the-art methodologies for the crowd behavior recognition task.
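
The matching stage described in this abstract, dynamic time warping over a cosine distance followed by a 1-nearest-neighbor decision, can be sketched as follows. This is a textbook DTW under stated assumptions rather than the authors' implementation: per-frame features are short non-zero vectors, and the feature values and class labels are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def cosine_distance(u: Frame, v: Frame) -> float:
    """1 minus cosine similarity; assumes neither vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def dtw(seq_a: Sequence[Frame], seq_b: Sequence[Frame]) -> float:
    """Classic dynamic time warping cost between two feature sequences."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = cosine_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of the three admissible warping steps.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def nearest_label(query: Sequence[Frame],
                  labelled: List[Tuple[Sequence[Frame], str]]) -> str:
    """1-nearest-neighbor: return the label of the sequence with the lowest DTW cost."""
    return min(labelled, key=lambda item: dtw(query, item[0]))[1]


# Hypothetical per-frame features and labels.
train = [
    ([[1.0, 0.0], [1.0, 0.1], [0.9, 0.2]], "walking"),
    ([[0.0, 1.0], [0.1, 1.0], [0.2, 0.9]], "running"),
]
query = [[1.0, 0.05], [0.95, 0.15]]
label = nearest_label(query, train)
```

DTW lets the two-frame query align against a three-frame reference, so videos of different lengths can be compared without resampling.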

Durham Research Online

Keele Research Repository

Crossref

Kingston University Research Repository

Summarizing videos with attention

Author: Argyriou Vasileios
Fajtl Jiri
Monekosso Dorothy
Remagnino Paolo
Sadeghi Sokeh Hajar

Kingston University Research Repository

Latent Bernoulli Autoencoder

Author: Argyriou Vasileios
Daume H
Fajtl Jiri
Monekosso Dorothy
Remagnino Paolo
Singh A
Publication date: 31/12/2020

Durham Research Online