Deep Neural Network with l2-norm Unit for Brain Lesions Detection
Automated brain lesion detection is an important and very challenging
clinical diagnostic task because lesions vary in size, shape, contrast,
and location. Deep learning has recently shown promising progress in many
application fields, which motivates us to apply this technology to such an
important problem. In this paper, we propose a novel, end-to-end trainable
approach for brain lesion classification and detection using a deep
Convolutional Neural Network (CNN). To investigate its applicability, we
applied our approach to several brain diseases, including high- and
low-grade glioma tumors, ischemic stroke, and Alzheimer's disease, using
brain Magnetic Resonance Images (MRI) as input for the analysis. We propose
a new operating unit which receives features from several projections of a
subset of units in the bottom layer and computes a normalized l2-norm for
the next layer. We evaluated the proposed approach on two different CNN
architectures and a number of popular benchmark datasets. The experimental
results demonstrate the superior ability of the proposed approach.
Comment: Accepted for presentation in ICONIP-201
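The l2-norm operating unit described in the abstract can be sketched in plain numpy. This is a minimal illustration, not the paper's exact configuration: the feature subsets, the number of projections per unit, and the normalization across the layer are all assumptions.

```python
import numpy as np

def l2_unit(x, W, subset):
    """One l2-norm unit (sketch): project a subset of bottom-layer
    activations through several learned directions (rows of W) and
    return the l2-norm of the responses."""
    z = W @ x[subset]
    return np.sqrt(np.sum(z ** 2))

def l2_layer(x, units, eps=1e-8):
    """A layer of l2-norm units whose joint output is normalized
    before being passed to the next layer (assumed normalization)."""
    y = np.array([l2_unit(x, W, s) for W, s in units])
    return y / (np.linalg.norm(y) + eps)

# usage: 8 input features; two units, each with 3 projections of 4 features
rng = np.random.default_rng(0)
x = rng.normal(size=8)
units = [(rng.normal(size=(3, 4)), np.arange(0, 4)),
         (rng.normal(size=(3, 4)), np.arange(4, 8))]
y = l2_layer(x, units)
```

In a trained network the projection matrices `W` would be learned end-to-end; here they are random stand-ins.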
End-to-End Localization and Ranking for Relative Attributes
We propose an end-to-end deep convolutional network to simultaneously
localize and rank relative visual attributes, given only weakly-supervised
pairwise image comparisons. Unlike previous methods, our network jointly learns
the attribute's features, localization, and ranker. The localization module of
our network discovers the most informative image region for the attribute,
which is then used by the ranking module to learn a ranking model of the
attribute. Our end-to-end framework also significantly speeds up processing and
is much faster than previous methods. We show state-of-the-art ranking results
on various relative attribute datasets, and our qualitative localization
results clearly demonstrate our network's ability to learn meaningful image
patches.
Comment: Appears in European Conference on Computer Vision (ECCV), 201
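The abstract does not specify the ranking module's loss; a standard pairwise objective it could resemble is the RankNet-style loss, sketched here on a hypothetical one-dimensional linear ranker (all names and numbers are illustrative).

```python
import numpy as np

def ranknet_loss(s_pos, s_neg):
    """Pairwise ranking loss: small when the image that should show the
    attribute more strongly (score s_pos) outranks the other (s_neg)."""
    return np.log1p(np.exp(-(s_pos - s_neg)))

# Toy linear ranker w*x, fit on ordered pairs (x_pos ranked above x_neg).
w = 0.0
pairs = [(2.0, 1.0), (3.0, 0.5), (1.5, 1.0)]
for _ in range(200):
    for x_pos, x_neg in pairs:
        margin = w * x_pos - w * x_neg
        # gradient of log1p(exp(-margin)) with respect to w
        grad = -(x_pos - x_neg) / (1.0 + np.exp(margin))
        w -= 0.1 * grad
```

The loss only needs ordered pairs, which matches the weakly-supervised pairwise comparisons the paper trains on.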
The Visual Extent of an Object
The visual extent of an object reaches beyond the object itself. This is a long-standing fact in psychology and is reflected in image retrieval techniques, which aggregate statistics from the whole image in order to identify the object within. However, it is unclear to what degree and how the visual extent of an object affects classification performance. In this paper we investigate the visual extent of an object on the Pascal VOC dataset using a Bag-of-Words implementation with (colour) SIFT descriptors. Our analysis is performed from two angles. (a) Not knowing the object location, we determine where in the image the support for object classification resides. We call this the normal situation. (b) Assuming that the object location is known, we evaluate the relative potential of the object and its surround, and of the object border and object interior. We call this the ideal situation. Our most important discoveries are: (i) Surroundings can adequately distinguish between groups of classes: furniture, animals, and land vehicles. For distinguishing categories within one group, the surroundings become a source of confusion. (ii) The physically rigid plane, bike, bus, car, and train classes are recognised by interior boundaries and shape, not by texture. The non-rigid animals dog, cat, cow, and sheep are recognised primarily by texture, i.e. fur, as their projected shape varies greatly. (iii) We confirm an early observation from human psychology (Biederman in Perceptual Organization, pp. 213-263, 1981): in the ideal situation with known object locations, recognition is no longer improved by considering surroundings. In contrast, in the normal situation with unknown object locations, the surroundings significantly contribute to the recognition of most classes.
Receptive Field Block Net for Accurate and Fast Object Detection
Current top-performing object detectors depend on deep CNN backbones, such
as ResNet-101 and Inception, benefiting from their powerful feature
representations but suffering from high computational costs. Conversely,
some detectors based on lightweight models achieve real-time processing,
while their accuracy is often criticized. In this paper, we explore an
alternative way to build a fast and accurate detector by strengthening
lightweight features using a hand-crafted mechanism. Inspired by the
structure of Receptive Fields (RFs) in human visual systems, we propose a
novel RF Block (RFB) module, which takes the relationship between the size
and eccentricity of RFs into account, to enhance feature discriminability
and robustness. We further assemble the RFB module on top of SSD,
constructing the RFB Net detector. To evaluate its effectiveness,
experiments are conducted on two major benchmarks, and the results show
that RFB Net is able to match the performance of advanced very deep
detectors while maintaining real-time speed. Code is available at
https://github.com/ruinmessi/RFBNet.
Comment: Accepted by ECCV 201
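The RFB module's core idea, per the abstract, is pairing receptive-field size with eccentricity. A minimal one-dimensional sketch under that reading: multiple branches, each a convolution whose dilation rate grows with its intended eccentricity, summed with a shortcut. The actual RFB operates on 2-D feature maps with learned kernels; everything below is illustrative.

```python
import numpy as np

def dilated_conv1d(x, k, d):
    """'Same'-padded 1-D convolution of signal x with kernel k, dilation d."""
    c = len(k) // 2
    xp = np.pad(x, c * d)
    return np.array([sum(k[j] * xp[i + j * d] for j in range(len(k)))
                     for i in range(len(x))])

def rfb_1d(x, branches):
    """RFB-style block (sketch): each branch pairs a kernel with a dilation
    rate (larger receptive field -> larger 'eccentricity'); branch outputs
    are summed and combined with a shortcut connection."""
    out = sum(dilated_conv1d(x, k, d) for k, d in branches)
    return out + x

# usage: two smoothing branches at different dilation rates
feat = np.arange(8.0)
smooth = np.array([0.25, 0.5, 0.25])
out = rfb_1d(feat, [(smooth, 1), (smooth, 3)])
```

Increasing dilation widens a branch's receptive field without adding kernel weights, which is why the design suits lightweight backbones.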
A Diagram Is Worth A Dozen Images
Diagrams are common tools for representing complex concepts, relationships
and events, often when it would be difficult to portray the same information
with natural images. Understanding natural images has been extensively studied
in computer vision, while diagram understanding has received little attention.
In this paper, we study the problem of diagram interpretation and reasoning,
the challenging task of identifying the structure of a diagram and the
semantics of its constituents and their relationships. We introduce Diagram
Parse Graphs (DPG) as our representation to model the structure of diagrams. We
define syntactic parsing of diagrams as learning to infer DPGs for diagrams and
study semantic interpretation and reasoning of diagrams in the context of
diagram question answering. We devise an LSTM-based method for syntactic
parsing of diagrams and introduce a DPG-based attention model for diagram
question answering. We compile a new dataset of diagrams with exhaustive
annotations of constituents and relationships for over 5,000 diagrams and
15,000 questions and answers. Our results show the significance of our models
for syntactic parsing and question answering in diagrams using DPGs.
The devil is in the decoder
Many machine vision applications require a prediction for every pixel of the input image (for example, semantic segmentation and boundary detection). Models for such problems usually consist of encoders, which decrease spatial resolution while learning a high-dimensional representation, followed by decoders, which recover the original input resolution and produce low-dimensional predictions. While encoders have been studied rigorously, relatively few studies address the decoder side. Therefore, this paper presents an extensive comparison of a variety of decoders for a variety of pixel-wise prediction tasks. Our contributions are: (1) Decoders matter: we observe significant variance in results between different types of decoders on various problems. (2) We introduce a novel decoder: bilinear additive upsampling. (3) We introduce new residual-like connections for decoders. (4) We identify two decoder types which give consistently high performance.
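Bilinear additive upsampling, as named in the abstract, can be sketched as bilinear upsampling followed by summing groups of consecutive channels, giving a parameter-free decoder step. This numpy version is a sketch under that reading; the group size `n` and shapes are free choices, not the paper's settings.

```python
import numpy as np

def up2x(a):
    """Linear 2x upsampling along the last axis via interpolation."""
    m = a.shape[-1]
    new = np.linspace(0, m - 1, 2 * m)
    return np.apply_along_axis(lambda v: np.interp(new, np.arange(m), v), -1, a)

def bilinear_additive_upsample(x, n=2):
    """Sketch: bilinearly upsample a (C, H, W) feature map by 2x in both
    spatial axes, then sum each group of n consecutive channels,
    yielding (C // n, 2H, 2W) with no learned parameters."""
    up = np.swapaxes(up2x(np.swapaxes(up2x(x), -1, -2)), -1, -2)
    c = up.shape[0]
    return up.reshape(c // n, n, *up.shape[1:]).sum(axis=1)

# usage: 8-channel 4x4 map -> 2-channel 8x8 map
feat = np.random.default_rng(1).normal(size=(8, 4, 4))
out = bilinear_additive_upsample(feat, n=4)
```

Because the step has no weights, the paper's residual-like connections can wrap it cheaply around learned convolutions.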
Realtime Video Classification Using Dense HOF/HOG
The current state of the art in video classification is based on Bag-of-Words using local visual descriptors, most commonly Histogram of Oriented Gradients (HOG) and Histogram of Optical Flow (HOF) descriptors. While such a system is very powerful for classification, it is also computationally expensive. This paper addresses the problem of computational efficiency. Specifically: (1) We propose several speed-ups for densely sampled HOG and HOF descriptors and release Matlab code. (2) We investigate the trade-off between accuracy and computational efficiency of descriptors in terms of frame sampling rate and type of Optical Flow method. (3) We investigate the trade-off between accuracy and computational efficiency for the video representation, using either a k-means or hierarchical k-means based visual vocabulary, a Random Forest based vocabulary, or the Fisher kernel.
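The Bag-of-Words video representation with a k-means vocabulary, as used above, can be sketched in numpy. The descriptor dimensionality and vocabulary size here are illustrative; a real pipeline would quantize densely sampled HOG/HOF descriptors rather than random vectors.

```python
import numpy as np

rng = np.random.default_rng(0)

def kmeans(X, k, iters=20):
    """Tiny k-means for building a visual vocabulary (sketch)."""
    C = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        assign = np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(assign == j):
                C[j] = X[assign == j].mean(axis=0)
    return C

def bow_histogram(descriptors, vocab):
    """Quantize local descriptors against the vocabulary and return an
    L1-normalized word histogram (the video-level representation)."""
    assign = np.argmin(((descriptors[:, None] - vocab[None]) ** 2).sum(-1),
                       axis=1)
    h = np.bincount(assign, minlength=len(vocab)).astype(float)
    return h / h.sum()

# usage: random stand-ins for densely sampled HOG/HOF descriptors
desc = rng.normal(size=(200, 8))
vocab = kmeans(desc, k=16)
h = bow_histogram(desc, vocab)
```

Quantization cost scales with vocabulary size, which is exactly the accuracy/efficiency trade-off the paper measures against hierarchical k-means, Random Forest vocabularies, and the Fisher kernel.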