Search CORE

326 research outputs found

Learning to detect video events from zero or very few video examples

Author: Galanopoulos Damianos
Mezaris Vasileios
Patras Ioannis
Tzelepis Christos
Publication venue: 'Elsevier BV'
Publication date: 25/11/2015
Field of study

In this work we deal with the problem of high-level event detection in video. Specifically, we study the challenging problems of i) learning to detect video events from solely a textual description of the event, without using any positive video examples, and ii) additionally exploiting very few positive training samples together with a small number of ``related'' videos. For learning only from an event's textual description, we first identify a general learning framework and then study the impact of different design choices for various stages of this framework. For additionally learning from example videos, when true positive training samples are scarce, we employ an extension of the Support Vector Machine that allows us to exploit ``related'' event videos by automatically introducing different weights for subsets of the videos in the overall training set. Experimental evaluations performed on the large-scale TRECVID MED 2014 video dataset provide insight on the effectiveness of the proposed methods.Comment: Image and Vision Computing Journal, Elsevier, 2015, accepted for publicatio

arXiv.org e-Print Archive

City Research Online

Deliverable D1.4 Visual, text and audio information analysis for hypervideo, final release

Author: Apostolidis E. (Evlampios)
et al.
Publication venue
Publication date: 30/09/2014
Field of study

Having extensively evaluated the performance of the technologies included in the first release of WP1 multimedia analysis tools, using content from the LinkedTV scenarios and by participating in international benchmarking activities, concrete decisions regarding the appropriateness and the importance of each individual method or combination of methods were made, which, combined with an updated list of information needs for each scenario, led to a new set of analysis requirements that had to be addressed through the release of the final set of analysis techniques of WP1. To this end, coordinated efforts on three directions, including (a) the improvement of a number of methods in terms of accuracy and time efficiency, (b) the development of new technologies and (c) the definition of synergies between methods for obtaining new types of information via multimodal processing, resulted in the final bunch of multimedia analysis methods for video hyperlinking. Moreover, the different developed analysis modules have been integrated into a web-based infrastructure, allowing the fully automatic linking of the multitude of WP1 technologies and the overall LinkedTV platform

CWI's Institutional Repository

Deliverable D1.1 State of the art and requirements analysis for hypervideo

Author: Apostolidis E. (Evlampios)
et al.
Publication venue
Publication date: 30/09/2012
Field of study

This deliverable presents a state-of-art and requirements analysis report for hypervideo authored as part of the WP1 of the LinkedTV project. Initially, we present some use-case (viewers) scenarios in the LinkedTV project and through the analysis of the distinctive needs and demands of each scenario we point out the technical requirements from a user-side perspective. Subsequently we study methods for the automatic and semi-automatic decomposition of the audiovisual content in order to effectively support the annotation process. Considering that the multimedia content comprises of different types of information, i.e., visual, textual and audio, we report various methods for the analysis of these three different streams. Finally we present various annotation tools which could integrate the developed analysis results so as to effectively support users (video producers) in the semi-automatic linking of hypervideo content, and based on them we report on the initial progress in building the LinkedTV annotation tool. For each one of the different classes of techniques being discussed in the deliverable we present the evaluation results from the application of one such method of the literature to a dataset well-suited to the needs of the LinkedTV project, and we indicate the future technical requirements that should be addressed in order to achieve higher levels of performance (e.g., in terms of accuracy and time-efficiency), as necessary

CWI's Institutional Repository

Multimodal Subspace Support Vector Data Description

Author: Gabbouj Moncef
Iosifidis Alexandros
Raitoharju Jenni
Sohrab Fahad
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020
Field of study

In this paper, we propose a novel method for projecting data from multiple modalities to a new subspace optimized for one-class classification. The proposed method iteratively transforms the data from the original feature space of each modality to a new common feature space along with finding a joint compact description of data coming from all the modalities. For data in each modality, we define a separate transformation to map the data from the corresponding feature space to the new optimized subspace by exploiting the available information from the class of interest only. We also propose different regularization strategies for the proposed method and provide both linear and non-linear formulations. The proposed Multimodal Subspace Support Vector Data Description outperforms all the competing methods using data from a single modality or fusing data from all modalities in four out of five datasets.Comment: 26 pages manuscript (6 tables, 2 figures), 24 pages supplementary material (27 tables, 10 figures). The manuscript and supplementary material are combined as a single .pdf (50 pages) fil

arXiv.org e-Print Archive

Trepo - Institutional Repository of Tampere University

Robust Deep Learning Based Framework for Detecting Cyber Attacks from Abnormal Network Traffic

Author: Narsimha G.
Swathi K.
Publication venue: Auricle Global Society of Education and Research
Publication date: 01/09/2023
Field of study

The internet's recent rapid growth and expansion have raised concerns about cyberattacks, which are constantly evolving and changing. As a result, a robust intrusion detection system was needed to safeguard data. One of the most effective ways to meet this problem was by creating the artificial intelligence subfields of machine learning and deep learning models. Network integration is frequently used to enable remote management, monitoring, and reporting for cyber-physical systems (CPS). This work addresses the primary assault categories such as Denial of Services(DoS), Probe, User to Root(U2R) and Root to Local(R2L) attacks. As a result, we provide a novel Recurrent Neural Networks (RNN) cyberattack detection framework that combines AI and ML techniques. To evaluate the developed system, we employed the Network Security Laboratory-Knowledge Discovery Databases (NSL-KDD), which covered all critical threats. We used normalisation to eliminate mistakes and duplicated data before pre-processing the data. Linear Discriminant Analysis(LDA) is used to extract the characteristics. The fundamental rationale for choosing RNN-LDA for this study is that it is particularly efficient at tackling sequence issues, time series prediction, text generation, machine translation, picture descriptions, handwriting recognition, and other tasks. The proposed model RNN-LDA is used to learn time-ordered sequences of network flow traffic and assess its performance in detecting abnormal behaviour. According to the results of the experiments, the framework is more effective than traditional tactics at ensuring high levels of privacy. Additionally, the framework beats current detection techniques in terms of detection rate, false positive rate, and processing time

International Journal on Recent and Innovation Trends in Computing and Communication

Deliverable D9.3 Final Project Report

Author: et al.
Köhler J. (Joachim)
Publication venue
Publication date: 30/03/2015
Field of study

This document comprises the final report of LinkedTV. It includes a publishable summary, a plan for use and dissemination of foreground and a report covering the wider societal implications of the project in the form of a questionnaire

CWI's Institutional Repository

Vision-based techniques for automatic marine plankton classification

Author: Bandera-Rubio Antonio Jesús
González-García Martín
Hernández León Santiago
Sosa-Trejo David
Publication venue: Springer
Publication date: 01/01/2023
Field of study

Plankton are an important component of life on Earth. Since the 19th century, scientists have attempted to quantify species distributions using many techniques, such as direct counting, sizing, and classification with microscopes. Since then, extraordinary work has been performed regarding the development of plankton imaging systems, producing a massive backlog of images that await classification. Automatic image processing and classification approaches are opening new avenues for avoiding time-consuming manual procedures. While some algorithms have been adapted from many other applications for use with plankton, other exciting techniques have been developed exclusively for this issue. Achieving higher accuracy than that of human taxonomists is not yet possible, but an expeditious analysis is essential for discovering the world beyond plankton. Recent studies have shown the imminent development of real-time, in situ plankton image classification systems, which have only been slowed down by the complex implementations of algorithms on low-power processing hardware. This article compiles the techniques that have been proposed for classifying marine plankton, focusing on automatic methods that utilize image processing, from the beginnings of this field to the present day.Funding for open access charge: Universidad de Málaga / CBUA. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. The authors wish to thank Alonso Hernández-Guerra for his frm support in the development of oceanographic technology. Special thanks to Laia Armengol for her help in the domain of plankton. This study has been funded by Feder of the UE through the RES-COAST Mac-Interreg pro ject (MAC2/3.5b/314). We also acknowledge the European Union projects SUMMER (Grant Agreement 817806) and TRIATLAS (Grant Agreement 817578) from the Horizon 2020 Research and Innovation Programme and the Ministry of Science from the Spanish Government through the Project DESAFÍO (PID2020-118118RB-I00)

Repositorio Institucional Universidad de Málaga

Low-Rank and Sparse Decomposition for Hyperspectral Image Enhancement and Clustering

Author: Tian Long
Publication venue: Scholars Junction
Publication date: 03/05/2019
Field of study

In this dissertation, some new algorithms are developed for hyperspectral imaging analysis enhancement. Tensor data format is applied in hyperspectral dataset sparse and low-rank decomposition, which could enhance the classification and detection performance. And multi-view learning technique is applied in hyperspectral imaging clustering. Furthermore, kernel version of multi-view learning technique has been proposed, which could improve clustering performance. Most of low-rank and sparse decomposition algorithms are based on matrix data format for HSI analysis. As HSI contains high spectral dimensions, tensor based extended low-rank and sparse decomposition (TELRSD) is proposed in this dissertation for better performance of HSI classification with low-rank tensor part, and HSI detection with sparse tensor part. With this tensor based method, HSI is processed in 3D data format, and information between spectral bands and pixels maintain integrated during decomposition process. This proposed algorithm is compared with other state-of-art methods. And the experiment results show that TELRSD has the best performance among all those comparison algorithms. HSI clustering is an unsupervised task, which aims to group pixels into different groups without labeled information. Low-rank sparse subspace clustering (LRSSC) is the most popular algorithms for this clustering task. The spatial-spectral based multi-view low-rank sparse subspace clustering (SSMLC) algorithms is proposed in this dissertation, which extended LRSSC with multi-view learning technique. In this algorithm, spectral and spatial views are created to generate multi-view dataset of HSI, where spectral partition, morphological component analysis (MCA) and principle component analysis (PCA) are applied to create others views. Furthermore, kernel version of SSMLC (k-SSMLC) also has been investigated. The performance of SSMLC and k-SSMLC are compared with sparse subspace clustering (SSC), low-rank sparse subspace clustering (LRSSC), and spectral-spatial sparse subspace clustering (S4C). It has shown that SSMLC could improve the performance of LRSSC, and k-SSMLC has the best performance. The spectral clustering has been proved that it equivalent to non-negative matrix factorization (NMF) problem. In this case, NMF could be applied to the clustering problem. In order to include local and nonlinear features in data source, orthogonal NMF (ONMF), graph-regularized NMF (GNMF) and kernel NMF (k-NMF) has been proposed for better clustering performance. The non-linear orthogonal graph NMF combine both kernel, orthogonal and graph constraints in NMF (k-OGNMF), which push up the clustering performance further. In the HSI domain, kernel multi-view based orthogonal graph NMF (k-MOGNMF) is applied for subspace clustering, where k-OGNMF is extended with multi-view algorithm, and it has better performance and computation efficiency

Mississippi State University Libraries ETD database

Scholars Junction - Mississippi State University Institutional Repository

Recommended from our members

Exploiting Concepts In Videos For Video Event Detection

Author: Can Ethem
Publication venue: ScholarWorks@UMass Amherst
Publication date: 09/11/2015
Field of study

Video event detection is the task of searching videos for events of interest to a user where an event is a complex activity which is localized in time and space. The video event detection problem has gained more importance as the amount of online video is increasing by more than 300 hours every minute on Youtube alone. In this thesis, we tackle three major video event detection problems: video event detection with exemplars (VED-ex), where a large number of example videos are associated with queries; video event detection with few exemplars (VED-ex_few), in which only a small number of example videos are associated with queries; and zero-shot video event detection (VED-zero), where no exemplar videos are associated with queries. We first define a new way of describing videos concisely, one that is built around using query-independent concepts (e.g., a fixed set of concepts for all queries) with a space-efficient representation. Using query-independent concepts enables us to learn a retrieval model for any query without requiring a new set of concepts. Our space-efficient representation helps reduce the amount of time required to train/test a retrieval model and the amount of space to store video representations on disk. When the number of example videos associated with a query decreases, the retrieval accuracy decreases as well. We present a method that incorporates multiple one-exemplar models into video event detection aiming at improving retrieval accuracies when there are few exemplars available. By incorporating multiple one-exemplar models into video event detection with few exemplars, we are able to obtain significant improvements in terms of mean average precision compared to the case of a monolithic model. Having no exemplar videos associated with queries makes the video event detection problem more challenging as we cannot train a retrieval model using example videos. It is also more realistic since compiling a number of example videos might be costly. We tackle this problem by providing a new and effective zero-shot video event detection model that exploits dependencies of concepts in videos. Our dependency work uses a Markov Random Field (MRF) based retrieval model and assumes three dependency settings: 1) full independence, where each concept is considered independently; 2) spatial dependence, where the co-occurrence of two concepts in the same video frame is treated as important; and 3) temporal dependence, where having concepts co-occur in consecutive frames is treated as important. Our MRF based retrieval model improves retrieval accuracies significantly compared to the common bag-of-concepts approach with an independence assumption

ScholarWorks@UMass Amherst