The Visual Extent of an Object
The visual extent of an object reaches beyond the object itself. This is a long-standing fact in psychology and is reflected in image retrieval techniques which aggregate statistics from the whole image in order to identify the object within. However, it is unclear to what degree and how the visual extent of an object affects classification performance. In this paper we investigate the visual extent of an object on the Pascal VOC dataset using a Bag-of-Words implementation with (colour) SIFT descriptors. Our analysis is performed from two angles. (a) Not knowing the object location, we determine where in the image the support for object classification resides. We call this the normal situation. (b) Assuming that the object location is known, we evaluate the relative potential of the object and its surround, and of the object border and object interior. We call this the ideal situation. Our most important discoveries are: (i) Surroundings can adequately distinguish between groups of classes: furniture, animals, and land-vehicles. For distinguishing categories within one group the surroundings become a source of confusion. (ii) The physically rigid plane, bike, bus, car, and train classes are recognised by interior boundaries and shape, not by texture. The non-rigid animals dog, cat, cow, and sheep are recognised primarily by texture, i.e. fur, as their projected shape varies greatly. (iii) We confirm an early observation from human psychology (Biederman in Perceptual Organization, pp. 213-263, 1981): in the ideal situation with known object locations, recognition is no longer improved by considering surroundings. In contrast, in the normal situation with unknown object locations, the surroundings significantly contribute to the recognition of most classes.
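For orientation, the pipeline named above (Bag-of-Words over SIFT descriptors) can be sketched in a few lines. This is a minimal, generic sketch, not the authors' implementation: the image paths, the 100-word vocabulary, and the use of OpenCV and scikit-learn are illustrative assumptions.

```python
# Minimal Bag-of-Words-over-SIFT sketch (illustrative only, not the paper's pipeline).
import cv2
import numpy as np
from sklearn.cluster import KMeans

def sift_descriptors(path):
    """Extract 128-D SIFT descriptors from a greyscale image."""
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    _, desc = cv2.SIFT_create().detectAndCompute(img, None)
    return desc if desc is not None else np.empty((0, 128), np.float32)

def bow_histogram(desc, vocabulary):
    """Assign descriptors to their nearest visual word and count them."""
    words = vocabulary.predict(desc)
    hist, _ = np.histogram(words, bins=np.arange(vocabulary.n_clusters + 1))
    return hist / max(hist.sum(), 1)              # normalised word histogram

train_paths = ["img_0001.jpg", "img_0002.jpg"]    # hypothetical image files
all_desc = np.vstack([sift_descriptors(p) for p in train_paths])
vocabulary = KMeans(n_clusters=100, n_init=10, random_state=0).fit(all_desc)
features = np.array([bow_histogram(sift_descriptors(p), vocabulary) for p in train_paths])
# 'features' would then feed a per-class classifier (e.g. an SVM), as in typical BoW setups.
```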
High-level feature detection from video in TRECVid: a 5-year retrospective of achievements
Successful and effective content-based access to digital video requires fast, accurate and scalable methods to determine the video content automatically. A variety of contemporary approaches to this rely on text taken from speech within the video, or on matching one video frame against others using low-level characteristics like colour, texture, or shapes, or on determining and matching objects appearing within the video. Possibly the most important technique, however, is one which determines the presence or absence of a high-level or semantic feature within a video clip or shot. By utilizing dozens, hundreds or even thousands of such semantic features we can support many kinds of content-based video navigation. Critically however, this depends on being able to determine whether each feature is or is not present in a video clip.
The last 5 years have seen much progress in the development of techniques to determine the presence of semantic features within video. This progress can be tracked in the annual TRECVid benchmarking activity, where dozens of research groups measure the effectiveness of their techniques on common data and using an open, metrics-based approach. In this chapter we summarise the work done on the TRECVid high-level feature task, showing the progress made year-on-year. This provides a fairly comprehensive statement on where the state of the art is regarding this important task, not just for one research group or for one approach, but across the spectrum. We then use this past and on-going work as a basis for highlighting the trends that are emerging in this area, and the questions which remain to be addressed before we can achieve large-scale, fast and reliable high-level feature detection on video.
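To make the core task concrete: high-level feature (concept) detection is typically cast as one binary presence/absence decision per concept per shot, and the resulting confidence scores are used to rank shots for search and navigation. The sketch below is schematic only and assumes precomputed shot-level feature vectors, invented concept names, and logistic regression classifiers; it is not any TRECVid participant's system.

```python
# Schematic per-concept shot classification (illustrative assumptions throughout).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 64))             # 200 annotated shots, 64-D features (toy data)
y_train = {"Outdoor": rng.integers(0, 2, 200),   # presence/absence labels per concept
           "Face": rng.integers(0, 2, 200)}
X_test = rng.normal(size=(50, 64))               # 50 unlabelled shots to rank

detectors = {c: LogisticRegression(max_iter=1000).fit(X_train, y) for c, y in y_train.items()}
for concept, clf in detectors.items():
    scores = clf.predict_proba(X_test)[:, 1]     # confidence that the concept is present
    ranking = np.argsort(-scores)                # shots ranked for retrieval and navigation
    print(concept, "top shots:", ranking[:5])
```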
Fast anisotropic Gauss filtering
We derive the decomposition of the anisotropic Gaussian into a one-dimensional Gauss filter in the x-direction followed by a one-dimensional filter in a non-orthogonal direction φ. Thus the anisotropic Gaussian, too, can be decomposed by dimension, which appears to be extremely efficient from a computing perspective. An implementation scheme for normal convolution and for recursive filtering is proposed. Directed derivative filters are also demonstrated. For the recursive implementation, filtering a 512 × 512 image is performed within 65 msec, independent of the standard deviations and orientation of the filter. Accuracy of the filters is still reasonable when compared to truncation error or recursive approximation error. The anisotropic Gaussian filtering method allows fast calculation of edge and ridge maps, with high spatial and angular accuracy. For tracking applications, the normal anisotropic convolution scheme is more advantageous, with applications in the detection of dashed lines in engineering drawings. The recursive implementation is more attractive in feature detection applications, for instance in affine-invariant edge and ridge detection in computer vision. The proposed computational filtering method enables the practical applicability of orientation scale-space analysis.
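A minimal sketch of the underlying decomposition, assuming it is obtained by matching covariance matrices: the x-aligned pass and the pass along the non-orthogonal direction φ must add up to the covariance of the rotated anisotropic Gaussian. The notation and parameterisation here are my own and may differ from the paper's.

```python
# Split a rotated anisotropic Gaussian into two 1-D passes by covariance matching.
import numpy as np

def decompose_aniso_gauss(sigma_u, sigma_v, theta):
    """Return (sigma_x, phi, sigma_phi): an x-aligned 1-D Gaussian pass followed by
    a 1-D pass along direction phi reproduces the Gaussian with axis standard
    deviations sigma_u, sigma_v rotated by theta."""
    a = sigma_u**2 * np.cos(theta)**2 + sigma_v**2 * np.sin(theta)**2
    b = (sigma_u**2 - sigma_v**2) * np.cos(theta) * np.sin(theta)
    c = sigma_u**2 * np.sin(theta)**2 + sigma_v**2 * np.cos(theta)**2
    sigma_x = np.sqrt(a - b**2 / c)               # std dev of the x-aligned pass
    phi = np.arctan2(c, b)                        # direction of the second pass
    sigma_phi = np.sqrt(b**2 + c**2) / np.sqrt(c) # std dev along that direction
    return sigma_x, phi, sigma_phi

# Check: the two 1-D passes compose to the target covariance matrix.
su, sv, th = 5.0, 2.0, np.deg2rad(30)
sx, phi, sphi = decompose_aniso_gauss(su, sv, th)
R = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
target = R @ np.diag([su**2, sv**2]) @ R.T
d = np.array([np.cos(phi), np.sin(phi)])
assert np.allclose(target, np.diag([sx**2, 0.0]) + sphi**2 * np.outer(d, d))
# In practice each 1-D pass can be run as a recursive filter along (interpolated)
# image lines, which keeps the cost independent of the filter's scale and orientation.
```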
Shockwaves and the Rolling Stones: An Overview of Pediatric Stone Disease
Urinary stone disease is a common problem in adults, with an estimated 10% to 20% lifetime risk of developing a stone and an annual incidence of almost 1%. In contrast, in children, even though the incidence appears to be increasing, urinary tract stones are a rare problem, with an estimated incidence of approximately 5 to 36 per 100,000 children. Consequently, typical complications of rare diseases, such as delayed diagnosis, lack of awareness and specialist knowledge, and difficulties accessing specific treatments, also affect children with stone disease. Indeed, because stone disease is such a common problem in adults, frequently it is adult practitioners who will first be asked to manage affected children. Yet there are unique aspects to pediatric urolithiasis, such that treatment practices common in adults cannot necessarily be transferred to children. Here, we review the epidemiology, etiology, presentation, investigation, and management of pediatric stone disease; we highlight those aspects that separate its management from that in adults and make a case for a specialized, multidisciplinary approach to pediatric stone disease.
The State of the Art of Medical Imaging Technology: from Creation to Archive and Back
Medical imaging has become well established in modern medicine and has revolutionized the medical industry in the last 30 years. Stemming from the discovery of X-rays by Nobel laureate Wilhelm Roentgen, radiology was born, leading to the creation of large quantities of digital images as opposed to a film-based medium. While this rich supply of images provides immeasurable information that would otherwise not be possible to obtain, medical images pose great challenges: archiving them safely against corruption, loss, and misuse; retrieving them from databases of huge size with varying forms of metadata; and reusing them when new tools for data mining and new media for data storage become available. This paper provides a summative account of the creation of medical imaging tomography, the development of image archiving systems, and innovation from the existing pools of acquired image data. The focus of this paper is on content-based image retrieval (CBIR), in particular for 3D images, which is exemplified by our online e-learning system, MIRAGE, home to a repository of medical images from a variety of domains and of different dimensions. In terms of novelties, CBIR facilities for 3D images, coupled with fully automatic image annotation, have been developed and implemented in the system, pointing towards versatile, flexible and sustainable medical image databases that can reap new innovations.
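For orientation, the core CBIR loop can be reduced to two steps: compute a compact feature vector per image, then rank the archive by distance to the query's feature. The sketch below uses a grey-level histogram and Euclidean distance purely for illustration on toy 3D volumes; it is not the MIRAGE system, and the array shapes and bin count are invented.

```python
# Toy content-based retrieval by intensity-histogram distance (illustrative only).
import numpy as np

def histogram_feature(volume, bins=32):
    """Normalised intensity histogram; works for 2-D slices or 3-D volumes alike."""
    hist, _ = np.histogram(volume, bins=bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

rng = np.random.default_rng(1)
archive = [rng.random((64, 64, 16)) for _ in range(20)]      # 20 toy 3-D "scans"
features = np.array([histogram_feature(v) for v in archive])

query = rng.random((64, 64, 16))
distances = np.linalg.norm(features - histogram_feature(query), axis=1)
print("closest matches:", np.argsort(distances)[:5])         # indices of best-matching scans
```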
Interpretation, Evaluation and the Semantic Gap ... What if we Were on a Side-Track?
A significant amount of research in Document Image Analysis, and in Machine Perception in general, relies on the extraction and analysis of signal cues with the goal of interpreting them into higher-level information. This paper gives an overview of how this interpretation process is usually considered, and of how the research communities proceed in evaluating existing approaches and methods developed for realizing these processes. Evaluation being an essential part of measuring the quality of research and assessing the progress of the state of the art, our work aims at showing that classical evaluation methods are not necessarily well suited for interpretation problems, or, at least, that they introduce a strong bias, not necessarily visible at first sight, and that new ways of comparing methods and measuring performance are necessary. It also shows that the infamous Semantic Gap seems to be an inherent and unavoidable part of the general interpretation process, especially when considered within the framework of traditional evaluation. The use of Formal Concept Analysis is put forward to leverage these limitations into a new tool for the analysis and comparison of interpretation contexts.