101 research outputs found

Deep Learning based Novel Anomaly Detection Methods for Diabetic Retinopathy Screening

Author: Sutradhar Shaon
Publication date: 01/01/2023

Official Doctoral Program in Computing (Programa Oficial de Doutoramento en Computación), 5009V01. [Abstract] Computer-Aided Screening (CAS) systems are gaining popularity in disease diagnosis. Modern CAS systems exploit data-driven machine learning algorithms, including supervised and unsupervised methods. In medical imaging, annotating pathological samples is much harder and more time-consuming than annotating healthy samples. Therefore, healthy samples are abundant while annotated and labelled pathological samples are scarce. Unsupervised anomaly detection algorithms can be used to develop CAS systems from the widely available healthy samples, especially when a disease/no-disease decision is what matters for screening. This thesis proposes unsupervised machine learning methodologies for anomaly detection in retinal fundus images. A novel patch-based image reconstructor architecture for diabetic retinopathy (DR) detection is presented that addresses the shortcomings of standard autoencoder-based reconstructors. Furthermore, a full-size image-based anomaly map generation methodology is presented, in which potential DR lesions can be visualized at the pixel level. Afterwards, a novel methodology is proposed to extend the patch-based architecture to a fully convolutional architecture for one-shot full-size image reconstruction. Finally, a novel methodology for supervised DR classification is proposed that utilizes the anomaly maps

Repositorio da Universidade da Coruña
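
As a rough illustration of the reconstruction-based anomaly maps mentioned in the abstract above, the following minimal sketch computes a per-pixel error map between an image and its reconstruction, plus one image-level score. It assumes the reconstruction has already been produced by some model trained on healthy samples. The function names `anomaly_map` and `anomaly_score`, the absolute-difference error, and the mean-based score are illustrative assumptions, not the thesis's actual method.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def anomaly_map(image: Image, reconstruction: Image) -> Image:
    """Per-pixel absolute reconstruction error (assumed error measure)."""
    same_rows = len(image) == len(reconstruction)
    if not same_rows or any(len(a) != len(b) for a, b in zip(image, reconstruction)):
        raise ValueError("image and reconstruction must have the same shape")
    return [
        [abs(p - q) for p, q in zip(row, rec_row)]
        for row, rec_row in zip(image, reconstruction)
    ]


def anomaly_score(amap: Image) -> float:
    """Image-level score: mean of the anomaly map (one possible choice)."""
    values = [v for row in amap for v in row]
    return sum(values) / len(values)


# Hypothetical 2x2 example: the reconstructor fails to reproduce one bright pixel.
image = [[0.0, 1.0], [0.5, 0.5]]
reconstruction = [[0.0, 0.5], [0.5, 0.5]]
amap = anomaly_map(image, reconstruction)
score = anomaly_score(amap)
```

A screening decision could then threshold `score`, with the threshold chosen on held-out data; that step is left out here because the abstract does not describe it.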

Towards Universal Image Embeddings: A Large-Scale Dataset and Challenge for Generic Image Representations

Author: Araujo André
Bluntschli Boris
Cao Bingyi
Chen Kaifeng
Chum Ondřej
Dogan-Schönberger Pelin
Lipovský Mário
Makosa Grzegorz
Seyedhosseini Mojtaba
Ypsilantis Nikolaos-Antonios
Publication date: 04/09/2023

Fine-grained and instance-level recognition methods are commonly trained and evaluated on specific domains, in a model per domain scenario. Such an approach, however, is impractical in real large-scale applications. In this work, we address the problem of universal image embedding, where a single universal model is trained and used in multiple domains. First, we leverage existing domain-specific datasets to carefully construct a new large-scale public benchmark for the evaluation of universal image embeddings, with 241k query images, 1.4M index images and 2.8M training images across 8 different domains and 349k classes. We define suitable metrics, training and evaluation protocols to foster future research in this area. Second, we provide a comprehensive experimental evaluation on the new dataset, demonstrating that existing approaches and simplistic extensions lead to worse performance than an assembly of models trained for each domain separately. Finally, we conducted a public research competition on this topic, leveraging industrial datasets, which attracted the participation of more than 1k teams worldwide. This exercise generated many interesting research ideas and findings which we present in detail. Project webpage: https://cmp.felk.cvut.cz/univ_emb/ Comment: ICCV 2023 Accepte

arXiv.org e-Print Archive
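
To make the query/index setup in the abstract above concrete, here is a minimal sketch of nearest-neighbor retrieval over a shared embedding space, scored with Recall@1. The cosine similarity, the `recall_at_1` function, and the toy labels are assumptions chosen for illustration; the benchmark's own metrics and protocols are defined in the paper and may differ.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def recall_at_1(
    query_emb: List[Sequence[float]],
    query_labels: List[str],
    index_emb: List[Sequence[float]],
    index_labels: List[str],
) -> float:
    """Fraction of queries whose most similar index item shares their label."""
    hits = 0
    for q, label in zip(query_emb, query_labels):
        best = max(range(len(index_emb)), key=lambda i: cosine(q, index_emb[i]))
        hits += int(index_labels[best] == label)
    return hits / len(query_emb)


# Hypothetical embeddings from a single model covering two domains.
index_emb = [[1.0, 0.0], [0.0, 1.0]]
index_labels = ["car_model_a", "bird_species_b"]
query_emb = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3]]
query_labels = ["car_model_a", "bird_species_b", "bird_species_b"]

r1 = recall_at_1(query_emb, query_labels, index_emb, index_labels)
```

In this toy case the third query lands nearest to the wrong index item, so two of the three queries are retrieved correctly.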

Violence detection based on spatio-temporal feature and fisher vector

Author: Cai H
He X
Huang X
Jiang H
Yang J
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2018

© Springer Nature Switzerland AG 2018. A novel framework based on local spatio-temporal features and a Bag-of-Words (BoW) model is proposed for violence detection. The framework uses Dense Trajectories (DT) and the MPEG flow video descriptor (MF) as feature descriptors and employs the Fisher Vector (FV) for feature coding. DT and MF are descriptive and robust because they combine several feature descriptors, which capture trajectory shape, appearance, motion, and motion boundaries. FV transforms low-level features into high-level features. It preserves much information because it records not only the assignment of descriptors to codebook entries but also the first- and second-order statistics used to represent videos. Several techniques, namely PCA, K-means++, and the choice of codebook size, are used to improve the final video classification performance. The proposed method is analysed with respect to accuracy, speed, and application scenarios. Experimental results show that the proposed approach outperforms the state-of-the-art approaches for violence detection in both crowd and non-crowd scenes.

OPUS - University of Technology Sydney
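
The Fisher Vector coding described in the abstract above can be sketched as follows. This is the commonly used formulation with a diagonal-covariance Gaussian mixture as the codebook, keeping the gradients with respect to the means (first-order statistics) and standard deviations (second-order statistics). It is an assumption that the paper uses exactly this variant. The GMM parameters are taken as given inputs, and the PCA step and any normalization of the final vector are omitted.

```python
import math
from typing import List, Sequence


def fisher_vector(
    descriptors: Sequence[Sequence[float]],  # T local descriptors, each D-dimensional
    weights: Sequence[float],                # K mixture weights summing to 1
    means: Sequence[Sequence[float]],        # K x D component means
    sigmas: Sequence[Sequence[float]],       # K x D component standard deviations
) -> List[float]:
    """Return a 2*K*D vector: mean gradients followed by sigma gradients."""
    T, K, D = len(descriptors), len(weights), len(means[0])
    g_mu = [[0.0] * D for _ in range(K)]
    g_sigma = [[0.0] * D for _ in range(K)]

    for x in descriptors:
        # Log of weight * diagonal Gaussian density for each component.
        log_p = []
        for k in range(K):
            lp = math.log(weights[k])
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                lp += -0.5 * z * z - math.log(sigmas[k][d]) - 0.5 * math.log(2 * math.pi)
            log_p.append(lp)
        # Soft assignment (posterior) via a numerically stable softmax.
        top = max(log_p)
        exps = [math.exp(v - top) for v in log_p]
        total = sum(exps)
        for k in range(K):
            gamma = exps[k] / total
            for d in range(D):
                z = (x[d] - means[k][d]) / sigmas[k][d]
                g_mu[k][d] += gamma * z               # first-order statistic
                g_sigma[k][d] += gamma * (z * z - 1)  # second-order statistic

    fv: List[float] = []
    for k in range(K):
        fv.extend(v / (T * math.sqrt(weights[k])) for v in g_mu[k])
    for k in range(K):
        fv.extend(v / (T * math.sqrt(2 * weights[k])) for v in g_sigma[k])
    return fv


# Hypothetical toy codebook: two components over 2-D descriptors.
fv_example = fisher_vector(
    descriptors=[[0.1, 0.2], [1.9, 2.1], [2.2, 1.8]],
    weights=[0.5, 0.5],
    means=[[0.0, 0.0], [2.0, 2.0]],
    sigmas=[[1.0, 1.0], [1.0, 1.0]],
)
```

The output length is 2·K·D regardless of how many descriptors a video yields, which is what lets a variable number of local features be fed to a fixed-size classifier.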

Median K-flats for hybrid linear modeling with many outliers

Author: Lerman Gilad
Szlam Arthur
Zhang Teng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 16/09/2009

We describe the Median K-Flats (MKF) algorithm, a simple online method for hybrid linear modeling, i.e., for approximating data by a mixture of flats. This algorithm simultaneously partitions the data into clusters while finding their corresponding best approximating l1 d-flats, so that the cumulative l1 error is minimized. The current implementation restricts d-flats to be d-dimensional linear subspaces. It requires a negligible amount of storage, and its complexity, when modeling data consisting of N points in D-dimensional Euclidean space with K d-dimensional linear subspaces, is of order O(n K d D+n d^2 D), where n is the number of iterations required for convergence (empirically on the order of 10^4). Since it is an online algorithm, data can be supplied to it incrementally and it can incrementally produce the corresponding output. The performance of the algorithm is carefully evaluated using synthetic and real data

arXiv.org e-Print Archive

Crossref
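
For the Median K-Flats abstract above, the following sketch illustrates only the quantity being minimized, not the online optimization itself. It assumes that the "l1 error" is the sum of unsquared Euclidean distances from each point to its nearest subspace, and that each d-flat is supplied as d orthonormal basis vectors. The function names and the toy data are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]
Basis = Sequence[Vector]  # d orthonormal vectors spanning one linear subspace


def distance_to_subspace(x: Vector, basis: Basis) -> float:
    """Euclidean distance from x to span(basis); basis must be orthonormal."""
    residual = list(x)
    for b in basis:
        coeff = sum(xi * bi for xi, bi in zip(x, b))
        residual = [r - coeff * bi for r, bi in zip(residual, b)]
    return math.sqrt(sum(r * r for r in residual))


def assign_and_cost(points: Sequence[Vector], flats: Sequence[Basis]) -> Tuple[List[int], float]:
    """Assign each point to its nearest flat and sum the unsquared distances."""
    labels: List[int] = []
    total = 0.0
    for x in points:
        dists = [distance_to_subspace(x, flat) for flat in flats]
        k = min(range(len(flats)), key=dists.__getitem__)
        labels.append(k)
        total += dists[k]
    return labels, total


# Hypothetical data: K = 2 one-dimensional subspaces (the axes) in D = 2.
flats = [[[1.0, 0.0]], [[0.0, 1.0]]]
points = [[3.0, 0.5], [0.2, -4.0]]
labels, cost = assign_and_cost(points, flats)
```

Because distances are not squared, a far-away outlier contributes to the cost linearly rather than quadratically, which is consistent with the title's emphasis on data with many outliers.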