374 research outputs found

Visual region understanding: unsupervised extraction and abstraction

Author: Gupta G.

The ability to gain a conceptual understanding of the world in uncontrolled environments is the ultimate goal of vision-based computer systems. Technological societies today are heavily reliant on surveillance and security infrastructure, robotics, medical image analysis, visual data categorisation and search, and smart device user interaction, to name a few. Of all the complex problems tackled by computer vision today in the context of these technologies, the one that lies closest to the original goals of the field is the subarea of unsupervised scene analysis or scene modelling. However, its common use of low level features does not provide a good balance between generality and discriminative ability, both a result and a symptom of the sensory and semantic gaps existing between low level computer representations and high level human descriptions. In this research we explore a general framework that addresses the fundamental problem of universal unsupervised extraction of semantically meaningful visual regions and their behaviours. For this purpose we address issues related to (i) spatial and spatiotemporal segmentation for region extraction, (ii) region shape modelling, and (iii) the online categorisation of visual object classes and the spatiotemporal analysis of their behaviours. Under this framework we propose (a) a unified region merging method and spatiotemporal region reduction, (b) shape representation by the optimisation and novel simplification of contour-based growing neural gases, and (c) a foundation for the analysis of visual object motion properties using a shape and appearance based nearest-centroid classification algorithm and trajectory plots for the obtained region classes. Specifically, we formulate a region merging spatial segmentation mechanism that combines and adapts features shown previously to be individually useful, namely parallel region growing, the best merge criterion, a time adaptive threshold, and region reduction techniques.
For spatiotemporal region refinement we consider both scalar intensity differences and vector optical flow. To model the shapes of the visual regions thus obtained, we adapt the growing neural gas for rapid region contour representation and propose a contour simplification technique. A fast unsupervised nearest-centroid online learning technique next groups observed region instances into classes, for which we are then able to analyse spatial presence and spatiotemporal trajectories. The analysis results show semantic correlations to real world object behaviour. Evaluation of all steps against standard metrics and datasets validates their performance.
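
As a rough illustration of what an unsupervised online nearest-centroid grouping step could look like, here is a minimal sketch. It is not the thesis's implementation: the class name `OnlineNearestCentroid`, the Euclidean distance, the running-mean update, and the `new_class_distance` threshold are all assumptions made for the example, and the feature vectors are made-up numbers standing in for shape and appearance descriptors.

```python
import math


class OnlineNearestCentroid:
    """Hypothetical online grouping: one running-mean centroid per class."""

    def __init__(self, new_class_distance):
        # Assumed rule: a sample farther than this from every centroid
        # starts a new class.
        self.new_class_distance = new_class_distance
        self.centroids = []  # one feature vector per class
        self.counts = []     # number of samples merged into each class

    def observe(self, x):
        """Assign feature vector x to a class and return the class index."""
        best, best_d = None, math.inf
        for i, c in enumerate(self.centroids):
            d = math.dist(x, c)  # Euclidean distance (Python 3.8+)
            if d < best_d:
                best, best_d = i, d
        if best is None or best_d > self.new_class_distance:
            self.centroids.append(list(x))
            self.counts.append(1)
            return len(self.centroids) - 1
        # Incremental mean update of the winning centroid.
        self.counts[best] += 1
        n = self.counts[best]
        self.centroids[best] = [
            c + (xi - c) / n for c, xi in zip(self.centroids[best], x)
        ]
        return best


clf = OnlineNearestCentroid(new_class_distance=1.0)
labels = [clf.observe(p) for p in [(0.0, 0.0), (0.2, 0.0), (5.0, 5.0), (5.2, 5.0)]]
print(labels)
```

Each region instance is seen once and either updates an existing class or founds a new one, so no labelled training set is required; the threshold controls how readily new classes appear.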

WestminsterResearch

VastMM-Tag: Semantic Indexing and Browsing of Videos for E-Learning

Author: Morris Mitchell Joseph
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2012

Quickly accessing the contents of a video is challenging for users, particularly for unstructured video, which contains no intentional shot boundaries, no chapters, and no apparent edited format. We approach this problem in the domain of lecture videos through the use of machine learning, to gather semantic information about the videos; and through user interface design, to enable users to fully utilize this new information. First, we use machine learning techniques to gather the semantic information. We develop a system for rapid automatic semantic tagging using a heuristic-based feature selection algorithm called Sort-Merge, by using large initial heterogeneous low-level feature sets (cardinality greater than 1K). We explore applying Sort-Merge to heterogeneous feature sets through two methods: early fusion and late fusion. Each takes a different approach to handling the different kinds of features in the heterogeneous set. We determine the most predictive feature sets for key-frame filters such as "has text", "has computer source code", or "has instructor motion". Specifically we explore the usefulness of Haar Wavelets, Fast Fourier Transforms, Color Coherence Vectors, Line Detectors, Ink Features and Pan/Tilt/Zoom detectors. For evaluation, we introduce a "keeper" heuristic for feature sets, which provides a method of performance comparison against a baseline. Second, we create a user interface to allow the user to make use of the semantic tags we gathered through our computer vision and machine learning process. The interface is integrated into an existing video browser, which detected shot-like boundaries and presented a multi-timeline view. The content within shot-like boundaries is represented by frames to which our new interface applies the generated semantic tags. Specifically, we make accessible the semantic concepts of 'text', 'code', 'presenter', and 'person motion'.
The tags are detected in the simulated shots using the filters generated with our machine learning approach and are displayed to users using a user-customizable multi-timeline view. We also generate tags based on ASR-generated transcripts that have been limited to the words provided in the index of the course text book. Each of these occurrences is aligned with the simulated shots. Each spoken word becomes a tag analogous to the visual concepts. A full Boolean algebra over the tags is provided to enable new composite tags such as 'text or code, but no presenter'. Finally, we quantify the effectiveness of our features and our browser through user studies, both observational and task driven. We find that users who use the full suite of tools performed a search task in 60% of the time of users without access to tags. We find that when users are asked to perform search tasks they follow a nearly fixed pattern of accesses, alternating between the use of tags and Keyframes, or between the use of Word Bubbles and the media player. Based on user behavior and feedback, we redesigned the interface to group spatially interface components that are used together, removed unused components, and redesigned the display of Word Bubbles to match that of the Visual Tags. We found that users strongly preferred the Keyframe tool, as well as both kinds of tags. Users also either found the algebra very useful or not useful at all.
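
The composite tag 'text or code, but no presenter' mentioned above can be read as a set expression over shots. The sketch below is only an illustration of that idea, not the browser's actual code: the `shots` mapping, the shot identifiers, and the helper `shots_with` are hypothetical.

```python
# Hypothetical mapping from shot id to the tags detected in that shot.
shots = {
    "shot_01": {"text", "presenter"},
    "shot_02": {"code"},
    "shot_03": {"text"},
    "shot_04": {"presenter"},
}


def shots_with(tag):
    """Return the set of shot ids that carry the given tag."""
    return {shot for shot, tags in shots.items() if tag in tags}


# 'text or code, but no presenter' as union followed by set difference.
composite = (shots_with("text") | shots_with("code")) - shots_with("presenter")
print(sorted(composite))
```

Union, intersection, and difference on sets correspond to OR, AND, and AND NOT over tags, which is enough to express any Boolean combination of them.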

Columbia University Academic Commons

Image segmentation, evaluation, and applications

Author: McGuinness Kevin
Publication venue: Dublin City University. CLARITY: The Centre for Sensor Web Technologies
Publication date: 01/03/2010

This thesis aims to advance research in image segmentation by developing robust techniques for evaluating image segmentation algorithms. The key contributions of this work are as follows. First, we investigate the characteristics of existing measures for supervised evaluation of automatic image segmentation algorithms. We show which of these measures is most effective at distinguishing perceptually accurate image segmentation from inaccurate segmentation. We then apply these measures to evaluating four state-of-the-art automatic image segmentation algorithms, and establish which best emulates human perceptual grouping. Second, we develop a complete framework for evaluating interactive segmentation algorithms by means of user experiments. Our system comprises evaluation measures, ground truth data, and implementation software. We validate our proposed measures by showing their correlation with perceived accuracy. We then use our framework to evaluate four popular interactive segmentation algorithms, and demonstrate their performance. Finally, acknowledging that user experiments are sometimes prohibitive in practice, we propose a method of evaluating interactive segmentation by algorithmically simulating the user interactions. We explore four strategies for this simulation, and demonstrate that the best of these produces results very similar to those from the user experiments

Irish Universities

DCU Online Research Access Service

Algorithms for video retargeting

Author: A Fox
A Shamir
A Vetro
B Bai
B Tseng
Benjamin Guthier
D Farin
DG Lowe
F Mokhtarian
H Bay
H Schneiderman
HA Rowley
I Nurnett
JF Canny
Johannes Kiess
JS Kim
K Curran
L Itti
M Fischler
M Hossain
M Rubinstein
M Zwicker
N Björk
O Steiger
P Beek
P Krähenbühl
P Schaber
R Han
R Mohan
RO Duda
S Kopf
S Nepal
Stephan Kopf
T Ren
T Shanableh
Thomas Haenselmann
V Cardellini
W Dong
W Lum
WH Cheng
Wolfgang Effelsberg
Y Boykov
Y Guo
Y Li
Y Linde
YF Ma
YS Wang
Z Lei
Z Obrenovic
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Pattern Recognition

Publication venue: 'IntechOpen'
Publication date: 20/04/2021

Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one, two or three dimensional; the processing is done in real time or takes hours or days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and encompasses several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. The authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

Directory of Open Access Books (DOAB)

Image similarity in medical images

Author: Aluru N. R.
Cheng Chuan
Ng K. Y.
Ngan A. H. W.
Publication venue: Ghent University. Faculty of Sciences
Publication date: 01/01/2013

Recent experiments have indicated a strong influence of the substrate grain orientation on the self-ordering in anodic porous alumina. Anodic porous alumina with straight pore channels grown in a stable, self-ordered manner is formed on (001) oriented Al grain, while disordered porous pattern is formed on (101) oriented Al grain with tilted pore channels growing in an unstable manner. In this work, numerical simulation of the pore growth process is carried out to understand this phenomenon. The rate-determining step of the oxide growth is assumed to be the Cabrera-Mott barrier at the oxide/electrolyte (o/e) interface, while the substrate is assumed to determine the ratio β between the ionization and oxidation reactions at the metal/oxide (m/o) interface. By numerically solving the electric field inside a growing porous alumina during anodization, the migration rates of the ions and hence the evolution of the o/e and m/o interfaces are computed. The simulated results show that pore growth is more stable when β is higher. A higher β corresponds to more Al ionized and migrating away from the m/o interface rather than being oxidized, and hence a higher retained O:Al ratio in the oxide. Experimentally measured oxygen content in the self-ordered porous alumina on (001) Al is indeed found to be about 3% higher than that in the disordered alumina on (101) Al, in agreement with the theoretical prediction. The results, therefore, suggest that ionization on (001) Al substrate is relatively easier than on (101) Al, and this leads to the more stable growth of the pore channels on (001) Al

Crossref

Ghent University Academic Bibliography

Warwick Research Archives Portal Repository

HKU Scholars Hub

Image similarity in medical images

Author: Gál Viktor
Publication venue: Ghent University. Faculty of Sciences
Publication date: 01/01/2016

Ghent University Academic Bibliography

Coronal loop detection from solar images and extraction of salient contour groups from cluttered images.

Author: Durak Nurcan
Publication venue: ThinkIR: The University of Louisville's Institutional Repository
Publication date: 01/08/2011

This dissertation addresses two different problems: 1) coronal loop detection from solar images; and 2) salient contour group extraction from cluttered images. In the first part, we propose two different solutions to the coronal loop detection problem. The first solution is a block-based coronal loop mining method that detects coronal loops from solar images by dividing the solar image into fixed sized blocks, labeling the blocks as Loop or Non-Loop, extracting features from the labeled blocks, and finally training classifiers to generate learning models that can classify new image blocks. The block-based approach achieves 64% accuracy in 10-fold cross validation experiments. To improve the accuracy and scalability, we propose a contour-based coronal loop detection method that extracts contours from cluttered regions, then labels the contours as Loop or Non-Loop, and extracts geometric features from the labeled contours. The contour-based approach achieves 85% accuracy in 10-fold cross validation experiments, which is a 20% increase compared to the block-based approach. In the second part, we propose a method to extract semi-elliptical open curves from cluttered regions. Our method consists of the following steps: obtaining individual smooth contours along with their saliency measures; then starting from the most salient contour, searching for possible grouping options for each contour; and continuing the grouping until an optimum solution is reached. Our work involved the design and development of a complete system for coronal loop mining in solar images, which required the formulation of new Gestalt perceptual rules and a systematic methodology to select and combine them in a fully automated judicious manner using machine learning techniques that eliminate the need to manually set various weight and threshold values to define an effective cost function.
After finding salient contour groups, we close the gaps within the contours in each group and perform B-spline fitting to obtain smooth curves. Our methods were successfully applied on cluttered solar images from TRACE and STEREO/SECCHI to discern coronal loops. Aerial road images were also used to demonstrate the applicability of our grouping techniques to other contour types in other real applications.

University of Louisville

Text Segmentation in Web Images Using Colour Perception and Topological Features

Author: Karatzas Dimosthenis
Publication date: 01/01/2002

The research presented in this thesis addresses the problem of Text Segmentation in Web images. Text is routinely created in image form (headers, banners etc.) on Web pages, as an attempt to overcome the stylistic limitations of HTML. This text, however, has a potentially high semantic value in terms of indexing and searching for the corresponding Web pages. As current search engine technology does not allow for text extraction and recognition in images, the text in image form is ignored. Moreover, it is desirable to obtain a uniform representation of all visible text of a Web page (for applications such as voice browsing or automated content analysis). This thesis presents two methods for text segmentation in Web images using colour perception and topological features. The nature of Web images and the problems they pose for text segmentation are described, and a study is performed to assess the magnitude of the problem and establish the need for automated text segmentation methods. Two segmentation methods are subsequently presented: the Split-and-Merge segmentation method and the Fuzzy segmentation method. Although each method approaches it in a distinctly different way, the safe assumption that a human being should be able to read the text in any given Web image is the foundation of both methods' reasoning. This anthropocentric character of the methods, along with the use of topological features of connected components, constitutes the underlying working principle of the methods. An approach for classifying the connected components resulting from the segmentation methods as either characters or parts of the background is also presented.

CiteSeerX

Southampton (e-Prints Soton)

OpenGrey Repository