Search CORE

168 research outputs found

Semi-Supervised Pattern Recognition and Machine Learning for Eye-Tracking

Author: Kinsman Thomas B
Publication venue: RIT Scholar Works
Publication date: 20/11/2015
Field of study

The first step in monitoring an observer’s eye gaze is identifying and locating the image of their pupils in video recordings of their eyes. Current systems work under a range of conditions, but fail in bright sunlight and rapidly varying illumination. A computer vision system was developed to assist with the recognition of the pupil in every frame of a video, in spite of the presence of strong first-surface reflections off of the cornea. A modified Hough Circle detector was developed that incorporates knowledge that the pupil is darker than the surrounding iris of the eye, and is able to detect imperfect circles, partial circles, and ellipses. As part of processing the image is modified to compensate for the distortion of the pupil caused by the out-of-plane rotation of the eye. A sophisticated noise cleaning technique was developed to mitigate first surface reflections, enhance edge contrast, and reduce image flare. Semi-supervised human input and validation is used to train the algorithm. The final results are comparable to those achieved using a human analyst, but require only a tenth of the human interaction

RIT Scholar Works

Model-driven and Data-driven Approaches for some Object Recognition Problems

Author: Gopalan Raghuraman
Publication venue
Publication date: 01/01/2011
Field of study

Recognizing objects from images and videos has been a long standing problem in computer vision. The recent surge in the prevalence of visual cameras has given rise to two main challenges where, (i) it is important to understand different sources of object variations in more unconstrained scenarios, and (ii) rather than describing an object in isolation, efficient learning methods for modeling object-scene `contextual' relations are required to resolve visual ambiguities. This dissertation addresses some aspects of these challenges, and consists of two parts. First part of the work focuses on obtaining object descriptors that are largely preserved across certain sources of variations, by utilizing models for image formation and local image features. Given a single instance of an object, we investigate the following three problems. (i) Representing a 2D projection of a 3D non-planar shape invariant to articulations, when there are no self-occlusions. We propose an articulation invariant distance that is preserved across piece-wise affine transformations of a non-rigid object `parts', under a weak perspective imaging model, and then obtain a shape context-like descriptor to perform recognition; (ii) Understanding the space of `arbitrary' blurred images of an object, by representing an unknown blur kernel of a known maximum size using a complete set of orthonormal basis functions spanning that space, and showing that subspaces resulting from convolving a clean object and its blurred versions with these basis functions are equal under some assumptions. We then view the invariant subspaces as points on a Grassmann manifold, and use statistical tools that account for the underlying non-Euclidean nature of the space of these invariants to perform recognition across blur; (iii) Analyzing the robustness of local feature descriptors to different illumination conditions. We perform an empirical study of these descriptors for the problem of face recognition under lighting change, and show that the direction of image gradient largely preserves object properties across varying lighting conditions. The second part of the dissertation utilizes information conveyed by large quantity of data to learn contextual information shared by an object (or an entity) with its surroundings. (i) We first consider a supervised two-class problem of detecting lane markings from road video sequences, where we learn relevant feature-level contextual information through a machine learning algorithm based on boosting. We then focus on unsupervised object classification scenarios where, (ii) we perform clustering using maximum margin principles, by deriving some basic properties on the affinity of `a pair of points' belonging to the same cluster using the information conveyed by `all' points in the system, and (iii) then consider correspondence-free adaptation of statistical classifiers across domain shifting transformations, by generating meaningful `intermediate domains' that incrementally convey potential information about the domain change

CiteSeerX

Digital Repository at the University of Maryland

Recommended from our members

Automatic Multilevel Feature Abstraction in Adaptable Machine Vision Systems

Author: Rose Valerie
Publication venue
Publication date: 01/01/2010
Field of study

Vision is a complex task which can be accomplished with apparent ease by biological systems, but for which the design of artificial systems is difficult. Although machine vision systems can be successfully designed for a specific task, under certain conditions, they are likely to fail if circumstances change. This was the motivation for the research into ways in which systems can be self-designing and adaptable to new visual tasks. The research was conducted in three vital areas of concern for machine vision systems. The first area is finding a suitable architecture for forming an appropriate representation for the current task. The research investigated the application of Hypernetworks theory to building a multilevel, generally-applicable representation, through repeated application of a fundamental 'self-similarity' principle, that parts of objects assembled under a particular relation at one level, form whole objects at the next. Results show that this is potentially a powerful approach for autonomously generating an adaptable system-architecture suitable for multiple visual tasks. The second area is the autonomous extraction of suitable low-level features, which the research investigated through random generation of minimally-constrained pixel-configurations and algorithmic generation of homogeneous and heterogeneous polygons. The results suggest that, despite the simplicity of the features making them vulnerable to image transformations, these are promising approaches worth developing further. The third area is automatic feature selection. The research explored management of 'dimensionality' and of 'combinatorial explosion', as well as how to locate relevant features at multiple representation levels, in the context of 'emergence' of structure. Results indicate that this approach can find useful 'intermediate-level' constructs through analysis of the connectivity of the simplices representing objects at higher levels. The research concludes that the proposed novel approaches to tackling the above issues, in particular the application of hypernetworks to the formation of multilevel representations and the resulting emergence of higher-level structure, is fruitful

Open Research Online (The Open University)

Environment detection and classification for safety-critical railway systems

Author: Vasco Macedo da Costa
Publication venue
Publication date: 20/07/2023
Field of study

Repositório Aberto da Universidade do Porto

Feature based dynamic intra-video indexing

Author: Asghar Muhammad Nabeel
Publication venue: University of Bedfordshire
Publication date: 01/09/2014
Field of study

A thesis submitted in partial fulfillment for the degree of Doctor of PhilosophyWith the advent of digital imagery and its wide spread application in all vistas of life, it has become an important component in the world of communication. Video content ranging from broadcast news, sports, personal videos, surveillance, movies and entertainment and similar domains is increasing exponentially in quantity and it is becoming a challenge to retrieve content of interest from the corpora. This has led to an increased interest amongst the researchers to investigate concepts of video structure analysis, feature extraction, content annotation, tagging, video indexing, querying and retrieval to fulfil the requirements. However, most of the previous work is confined within specific domain and constrained by the quality, processing and storage capabilities. This thesis presents a novel framework agglomerating the established approaches from feature extraction to browsing in one system of content based video retrieval. The proposed framework significantly fills the gap identified while satisfying the imposed constraints of processing, storage, quality and retrieval times. The output entails a framework, methodology and prototype application to allow the user to efficiently and effectively retrieved content of interest such as age, gender and activity by specifying the relevant query. Experiments have shown plausible results with an average precision and recall of 0.91 and 0.92 respectively for face detection using Haar wavelets based approach. Precision of age ranges from 0.82 to 0.91 and recall from 0.78 to 0.84. The recognition of gender gives better precision with males (0.89) compared to females while recall gives a higher value with females (0.92). Activity of the subject has been detected using Hough transform and classified using Hiddell Markov Model. A comprehensive dataset to support similar studies has also been developed as part of the research process. A Graphical User Interface (GUI) providing a friendly and intuitive interface has been integrated into the developed system to facilitate the retrieval process. The comparison results of the intraclass correlation coefficient (ICC) shows that the performance of the system closely resembles with that of the human annotator. The performance has been optimised for time and error rate

University of Bedfordshire Repository

A fast robust geometric fitting method for parabolic curves.

Author: Blázquez-Parra Elidia Beatriz
De-Cózar-Macías Óscar
Ladrón-de-Guevara-Muñoz María del Carmen
López-Rubio Ezequiel
Thurnhofer-Hemsi Karl
Publication venue: Elsevier
Publication date: 18/07/2018
Field of study

Fitting discrete data obtained by image acquisition devices to a curve is a common task in many fields of science and engineering. In particular, the parabola is some of the most employed shape features in electrical engineering and telecommunication applications. Standard curve fitting techniques to solve this problem involve the minimization of squared errors. However, most of these procedures are sensitive to noise. Here, we propose an algorithm based on the minimization of absolute errors accompanied by a normalization of the directrix vector that leads to an improved stability of the method. This way, our proposal is substantially resilient to noisy samples in the input dataset. Experimental results demonstrate the good performance of the algorithm in terms of speed and accuracy when compared to previous approaches, both for synthetic and real data.This work is partially supported by the Ministry of Economy and Competitiveness of Spain [grant number TIN2014-53465-R], project name Video surveillance by active search of anomalous events. It is also partially supported by the Autonomous Government of Andalusia (Spain) [grant number TIC-6213], project name Development of Self-Organizing Neural Networks for Information Technologies; and [grant number TIC-657], project name Self-organizing systems and robust estimators for video surveillance. All of them include funds from the European Regional Development Fund (ERDF). The authors thankfully acknowledge the computer resources, technical expertise and assistance provided by the SCBI (Supercomputing and Bioinformatics) center of the University of Málaga. They have also been supported by the Biomedic Research Institute of Málaga (IBIMA). They also gratefully acknowledge the support of NVIDIA Corporation with the donation of the Titan X GPU. Karl Thurnhofer-Hemsi is funded by a Ph.D. scholarship from the Spanish Ministry of Education, Culture and Sport under the FPU program [grant number FPU15/06512]

Repositorio Institucional Universidad de Málaga

Vision-Based 2D and 3D Human Activity Recognition

Author: Holte Michael Boelstoft
Publication venue: Department of Architecture, Design & Media Technology, Aalborg University
Publication date: 01/01/2012
Field of study

VBN

State of the Art in Face Recognition

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

Notwithstanding the tremendous effort to solve the face recognition problem, it is not possible yet to design a face recognition system with a potential close to human performance. New computer vision and pattern recognition approaches need to be investigated. Even new knowledge and perspectives from different fields like, psychology and neuroscience must be incorporated into the current field of face recognition to design a robust face recognition system. Indeed, many more efforts are required to end up with a human like face recognition system. This book tries to make an effort to reduce the gap between the previous face recognition research state and the future state

Directory of Open Access Books (DOAB)

Digital Image Access & Retrieval

Author: Heidorn P. Bryan
Sandore Beth
Publication venue: Graduate School of Library and Information Science, University of Illinois at Urbana-Champaign
Publication date: 01/01/1997
Field of study

The 33th Annual Clinic on Library Applications of Data Processing, held at the University of Illinois at Urbana-Champaign in March of 1996, addressed the theme of "Digital Image Access & Retrieval." The papers from this conference cover a wide range of topics concerning digital imaging technology for visual resource collections. Papers covered three general areas: (1) systems, planning, and implementation; (2) automatic and semi-automatic indexing; and (3) preservation with the bulk of the conference focusing on indexing and retrieval.published or submitted for publicatio

Illinois Digital Environment for Access to Learning and Scholarship Repository