65 research outputs found

Visual Human Tracking and Group Activity Analysis: A Video Mining System for Retail Marketing

Author: Leykin Alex
Publication venue: [Bloomington, Ind.] : Indiana University
Publication date: 01/01/2007

Thesis (PhD) - Indiana University, Computer Sciences, 2007. In this thesis we present a system for automatic human tracking and activity recognition from video sequences. The problem of automated analysis of visual information in order to derive descriptors of high-level human activities has intrigued the computer vision community for decades and is considered to be largely unsolved. Part of this interest derives from the vast range of applications in which such a solution may be useful. We attempt to find efficient formulations of these tasks as applied to extracting customer behavior information in a retail marketing context. Based on these formulations, we present a system that visually tracks customers in a retail store and performs a number of activity analysis tasks based on the output from the tracker. In tracking, we introduce new techniques for pedestrian detection, initialization of the body model, and a formulation of temporal tracking as a global trans-dimensional optimization problem. Initial human detection is addressed by a novel method for head detection, which incorporates knowledge of the camera projection model. The initialization of the human body model is addressed by newly developed shape and appearance descriptors. Temporal tracking of customer trajectories is performed by a human body tracking system designed as a Bayesian jump-diffusion filter. This approach demonstrates the ability to overcome model dimensionality ambiguities as people leave and enter the scene. Following the tracking, we developed a two-stage group activity formulation based on ideas from swarming research. For modeling purposes, all moving actors in the scene are viewed as simplistic agents in the swarm. This allows us to define a set of inter-agent interactions, which combine to yield a distance metric used in subsequent swarm clustering. In the first stage, shoppers that belong to the same group are identified by deterministically clustering bodies to detect short-term events; in the second stage, events are post-processed to form clusters of group activities with fuzzy memberships. Quantitative analysis of the tracking subsystem shows an improvement over state-of-the-art methods when used under similar conditions. Finally, based on the output from the tracker, the activity recognition procedure achieves over 80% correct shopper group detection, as validated against human-generated ground truth.

IUScholarWorks (University of Indiana)

Perceptually Optimized Visualization on Autostereoscopic 3D Displays

Author: Boev Atanas
Publication venue: Tampere University of Technology
Publication date: 01/01/2012

The family of displays that aim to visualize a 3D scene with realistic depth is known as "3D displays". Due to technical limitations and design decisions, such displays create visible distortions, which are interpreted by human vision as artefacts. In the absence of a visual reference (e.g. when the original scene is not available for comparison), one can improve the perceived quality of the representations by making the distortions less visible. This thesis proposes a number of signal processing techniques for decreasing the visibility of artefacts on 3D displays. The visual perception of depth is discussed, and the properties (depth cues) of a scene which the brain uses for assessing an image in 3D are identified. Following the physiology of vision, a taxonomy of 3D artefacts is proposed. The taxonomy classifies the artefacts based on their origin and on the way they are interpreted by the human visual system. The principles of operation of the most popular types of 3D displays are explained. Based on the display operation principles, 3D displays are modelled as a signal processing channel. The model is used to explain the process of introducing distortions. It also allows one to identify which optical properties of a display are most relevant to the creation of artefacts. A set of optical properties for dual-view and multiview 3D displays is identified, and a methodology for measuring them is introduced. The measurement methodology allows one to derive the angular visibility and crosstalk of each display element without the need for precision measurement equipment. Based on the measurements, a methodology for creating a quality profile of 3D displays is proposed. The quality profile can be either simulated using the angular brightness function or directly measured from a series of photographs. A comparative study introducing the measurement results on the visual quality and position of the sweet-spots of eleven 3D displays of different types is presented. Knowing the sweet-spot position and the quality profile allows for easy comparison between 3D displays. The shape and size of the passband allow the depth and textures of 3D content to be optimized for a given 3D display. Based on knowledge of 3D artefact visibility and an understanding of distortions introduced by 3D displays, a number of signal processing techniques for artefact mitigation are created. A methodology for creating anti-aliasing filters for 3D displays is proposed. For multiview displays, the methodology is extended towards so-called passband optimization, which addresses Moiré, fixed-pattern-noise and ghosting artefacts, which are characteristic of such displays. Additionally, the design of tuneable anti-aliasing filters is presented, along with a framework which allows the user to select the so-called 3D sharpness parameter according to his or her preferences. Finally, a set of real-time algorithms for view-point-based optimization is presented. These algorithms require active user-tracking, which is implemented as a combination of face and eye-tracking. Once the observer position is known, the image on a stereoscopic display is optimised for the derived observation angle and distance. For multiview displays, the combination of precise light re-direction and less-precise face-tracking is used for extending the head parallax. For some user-tracking algorithms, implementation details are given regarding execution of the algorithm on a mobile device or on a desktop computer with a graphics accelerator.

Trepo - Institutional Repository of Tampere University

A Neurophysiologic Study Of Visual Fatigue In Stereoscopic Related Displays

Author: Nutung-Wiyor Hanniebey Dolphyne
Publication venue: Aggie Digital Collections and Scholarship
Publication date: 01/01/2013

Two tasks were investigated in this study. The first study investigated the effects of alignment display errors on visual fatigue. The experiment revealed the following conclusive results: First, EEG data suggested the possibility of cognitively-induced time compensation changes due to a corresponding effect in real-time brain activity by the eyes trying to compensate for the alignment. The magnification difference error showed more significant effects on all EEG band waves, which were indications of likely visual fatigue as shown by the prevalence of simulator sickness questionnaire (SSQ) increases across all task levels. Vertical shift errors were observed to be prevalent in the theta and beta bands of EEG, which probably induced alertness (in the theta band) as a result of possible stress. Rotation errors were significant in the gamma band, implying the likelihood of cognitive decline because of theta band influence. Second, the hemodynamic responses revealed that significant differences exist between the left and right dorsolateral prefrontal due to alignment errors. There was also a significant difference between the main effect for power band hemisphere and the ATC task sessions. The analyses revealed that there were significant differences between the dorsal frontal lobes in task processing, and interaction effects between the processing lobes and task processing. The second study investigated the effects of cognitive response variables on visual fatigue. Third, the physiologic indicator of pupil dilation was 0.95 mm, which occurred at a mean time of 38.1 min, after which the pupil dilation began to decrease. After the average saccade rest time of 33.71 min, saccade speeds leaned toward a decrease as a possible result of fatigue onset. Fourth, the neural network classifier showed that visual response data from eye movement were the best predictor of visual fatigue, with a classification accuracy of 90.42%. Experimental data confirmed that 11.43% of the participants actually experienced visual fatigue symptoms after the prolonged task.

North Carolina Agricultural and Technical State University: NC A&T SU Bluford Library's Aggie Digital Collections and Scholarship

Scalable light field representation and coding

Author: Monteiro Ricardo Jorge Santos
Publication date: 25/06/2020

This Thesis aims to advance the state-of-the-art in light field representation and coding. In this context, proposals to improve functionalities like light field random access and scalability are also presented. As the light field representation constrains the coding approach to be used, several light field coding techniques to exploit the inherent characteristics of the most popular types of light field representations are proposed and studied, which are normally based on micro-images or sub-aperture-images. To encode micro-images, two solutions are proposed, aiming to exploit the redundancy between neighboring micro-images using a high order prediction model, where the model parameters are either explicitly transmitted or inferred at the decoder, respectively. In both cases, the proposed solutions are able to outperform low order prediction solutions. To encode sub-aperture-images, an HEVC-based solution that exploits their inherent intra and inter redundancies is proposed. In this case, the light field image is encoded as a pseudo video sequence, where the scanning order is signaled, allowing the encoder and decoder to optimize the reference picture lists to improve coding efficiency. A novel hybrid light field representation coding approach is also proposed, by exploiting the combined use of both micro-image and sub-aperture-image representation types, instead of using each representation individually. In order to aid the fast deployment of the light field technology, this Thesis also proposes scalable coding and representation approaches that enable adequate compatibility with legacy displays (e.g., 2D, stereoscopic or multiview) and with future light field displays, while maintaining high coding efficiency. 
Additionally, viewpoint random access, which improves light field navigation and reduces the decoding delay, is also enabled with a flexible trade-off between coding efficiency and viewpoint random access.

Repositório Institucional do ISCTE-IUL

Advanced Visualization and Intuitive User Interface Systems for Biomedical Applications

Author: Quam David
Publication venue: e-Publications@Marquette
Publication date: 01/04/2012

Modern scientific research produces data at rates that far outpace our ability to comprehend and analyze it. Such sources include medical imaging data and computer simulations, where technological advancements and spatiotemporal resolution generate increasing amounts of data from each scan or simulation. A bottleneck has developed whereby medical professionals and researchers are unable to fully use the advanced information available to them. By integrating computer science, computer graphics, artistic ability and medical expertise, scientific visualization of medical data has become a new field of study. The objective of this thesis is to develop two visualization systems that use advanced visualization, natural user interface technologies and the large amount of biomedical data available to produce results that are of clinical utility and overcome the data bottleneck that has developed. Computational Fluid Dynamics (CFD) is a tool used to study the quantities associated with the movement of blood by computer simulation. We developed methods of processing spatiotemporal CFD data and displaying it in stereoscopic 3D with the ability to spatially navigate through the data. We used this method with two sets of display hardware: a full-scale visualization environment and a small-scale desktop system. The advanced display and data navigation abilities provide the user with the means to better understand the relationship between the vessel's form and function. Low-cost 3D, depth-sensing cameras capture and process user body motion to recognize motions and gestures. Such devices allow users to use hand motions as an intuitive interface to computer applications. We developed algorithms to process and prepare the biomedical and scientific data for use with a custom control application. The application interprets user gestures as commands to a visualization tool and allows the user to control the visualization of multi-dimensional data. The intuitive interface allows the user to control the visualization of data without manual contact with an interaction device. In developing these methods and software tools we have leveraged recent trends in advanced visualization and intuitive interfaces in order to efficiently visualize biomedical data in a way that provides meaningful information that can be used to further appreciate it.

epublications@Marquette

Towards markerless orthopaedic navigation with intuitive Optical See-through Head-mounted displays

Author: Hu Xue
Publication venue: Mechanical Engineering, Imperial College London
Publication date: 01/09/2022

The potential of image-guided orthopaedic navigation to improve surgical outcomes has been well-recognised during the last two decades. According to the tracked pose of target bone, the anatomical information and preoperative plans are updated and displayed to surgeons, so that they can follow the guidance to reach the goal with higher accuracy, efficiency and reproducibility. Despite their success, current orthopaedic navigation systems have two main limitations: for target tracking, artificial markers have to be drilled into the bone and calibrated manually to the bone, which introduces the risk of additional harm to patients and increases operating complexity; for guidance visualisation, surgeons have to shift their attention from the patient to an external 2D monitor, which is disruptive and can be mentally stressful. Motivated by these limitations, this thesis explores the development of an intuitive, compact and reliable navigation system for orthopaedic surgery. To this end, conventional marker-based tracking is replaced by a novel markerless tracking algorithm, and the 2D display is replaced by a 3D holographic Optical see-through (OST) Head-mounted display (HMD) precisely calibrated to a user's perspective. Our markerless tracking, facilitated by a commercial RGBD camera, is achieved through deep learning-based bone segmentation followed by real-time pose registration. For robust segmentation, a new network is designed and efficiently augmented by a synthetic dataset. Our segmentation network outperforms the state-of-the-art regarding occlusion-robustness, device-agnostic behaviour, and target generalisability. For reliable pose registration, a novel Bounded Iterative Closest Point (BICP) workflow is proposed. The improved markerless tracking can achieve a clinically acceptable error of 0.95 deg and 2.17 mm according to a phantom test. 
OST displays allow ubiquitous enrichment of the perceived real world with contextually blended virtual aids through semi-transparent glasses. They have been recognised as a suitable visual tool for surgical assistance, since they do not hinder the surgeon's natural eyesight and require no attention shift or perspective conversion. The OST calibration is crucial to ensure locationally coherent surgical guidance. Current calibration methods are either prone to human error or hardly applicable to commercial devices. To this end, we propose an offline camera-based calibration method that is highly accurate yet easy to implement in commercial products, and an online alignment-based refinement that is user-centric and robust against user error. The proposed methods are shown to be superior to similar state-of-the-art (SOTA) methods regarding calibration convenience and display accuracy. Motivated by the ambition to develop the world's first markerless OST navigation system, we integrated the developed markerless tracking and calibration scheme into a complete navigation workflow designed for femur drilling tasks during knee replacement surgery. We verify the usability of our designed OST system with an experienced orthopaedic surgeon in a cadaver study. Our test validates the potential of the proposed markerless navigation system for surgical assistance, although further improvement is required for clinical acceptance.
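
The abstract above does not describe the Bounded ICP (BICP) workflow in detail, so the following is only a sketch of the generic building block that ICP-style pose registration repeats: a least-squares rigid alignment of matched 3D points (the Kabsch solution). The function name `rigid_align` and the assumption that point correspondences are already known are illustrative and not taken from the thesis.

```python
import numpy as np

def rigid_align(src, dst):
    """Least-squares rotation R and translation t with dst ~= src @ R.T + t.

    src, dst: (N, 3) arrays of matched points (correspondences assumed known).
    This is the plain Kabsch step, not the thesis's bounded variant.
    """
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that det(R) = +1.
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t
```

A full ICP loop would alternate this step with a nearest-neighbour search that re-estimates the correspondences until the pose stops changing.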

Spiral - Imperial College Digital Repository

Selective Darkening Filter and Welding Arc Observation for the Manual Welding Process

Author: Hillers Bernd
Publication date: 01/01/2011

An optical see-through LCD (GLCD) with a resolution of n x m pixels gives the ability to selectively control the darkening in the welder's view. The setup of such a Selective Auto Darkening Filter is developed and its applicability tested. The setup is done by integrating a camera into the welding operation for extracting the welding arc position properly. A prototype of a GLCD tailored for welding is mounted in the welder's view. The extraction of the welding arc position requires an enhanced video acquisition during welding. The observation of scenes with high dynamic contrast is an outstanding problem, which occurs when there are very large differences between the darkest and the brightest spot in a scene. The application to welding with its harsh conditions needs the development of supporting hardware. The synchronization of the camera with the flickering light conditions of pulsed welding processes in Gas Metal Arc Welding (GMAW) stabilizes the acquisition process and allows the scene to be flashed precisely, if required, by compact high power LEDs. The image acquisition is enhanced by merging two differently exposed images into the resulting image. These source images cover a wider histogram range than is possible using only a single-shot image with optimal camera parameters. After testing different standard contrast enhancement algorithms, a novel content-based algorithm is developed. It segments the image into areas with similar content and enhances these independently.
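
The merging of two differently exposed images mentioned above can be illustrated with a simple per-pixel weighted blend. This is a minimal sketch under stated assumptions, not the method used in the thesis: the function name `merge_exposures`, the Gaussian "well-exposedness" weight centered on mid-gray, and the `sigma` parameter are all hypothetical.

```python
import numpy as np

def merge_exposures(short_exp, long_exp, sigma=0.2):
    """Blend two exposures of the same scene, favoring well-exposed pixels.

    short_exp, long_exp: float arrays in [0, 1] with identical shape.
    Assumed weighting: pixels near mid-gray (0.5) count most, while
    pixels near black or white (under- or over-exposed) count least.
    """
    stack = np.stack([short_exp, long_exp]).astype(np.float64)
    weights = np.exp(-((stack - 0.5) ** 2) / (2.0 * sigma ** 2))
    weights += 1e-12  # keep the normalization well defined
    weights /= weights.sum(axis=0, keepdims=True)
    # Per-pixel convex combination of the two source images.
    return (weights * stack).sum(axis=0)
```

Because the result is a convex combination at every pixel, each output value lies between the two input values at that pixel.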

E-LIB Dokumentserver - Staats und Universitätsbibliothek Bremen

Design of an endoscopic 3-D Particle-Tracking Velocimetry system and its application in flow measurements within a gravel layer

Author: Klar Michael
Publication date: 01/01/2005

In this thesis a novel method for 3-D flow measurements within a permeable gravel layer is developed. Two fiberoptic endoscopes are used in a stereoscopic arrangement to acquire image sequences of the flow field within a single gravel pore. The images are processed by a 3-D Particle-Tracking Velocimetry (3-D PTV) algorithm, which yields the three-dimensional reconstruction of Lagrangian particle trajectories. The underlying image processing algorithms are significantly enhanced and adapted to the special conditions of endoscopic imagery. This includes methods for image preprocessing, robust camera calibration, image segmentation and particle-tracking. After a performance and accuracy analysis, the measurement technique is applied in extensive systematic investigations of the flow within a gravel layer in an experimental flume at the Federal Waterways Engineering and Research Institute in Karlsruhe. In addition to measurements of the pore flow within three gravel pores, an extended experimental setup enables the simultaneous observation of the near-bed 3-D flow field in the turbulent open-channel flow above the gravel layer and of grain motions in a sand layer beneath the gravel layer. The interaction of the free surface flow and the pore flow can be analyzed for the first time with a high temporal and spatial resolution. The experiments are part of a research project initiated by an international cooperation called Filter and Erosion Research Club (FERC). The long-term goal of this project is to quantify the influence of turbulent velocity and pressure fluctuations on the bed stability of waterways. The obtained experimental data provide new insight into the damping behaviour of a gravel bed and can be used for comparison with numerical, analytical and phenomenological models.
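
Reconstructing a particle's 3-D position from a stereoscopic pair of views, as described above, is commonly done by linear triangulation. The sketch below shows the standard direct linear transform (DLT) for two calibrated pinhole cameras. It is an assumption that a pinhole model is an adequate stand-in here: the abstract does not state the camera model or triangulation method used for the endoscopes, and the names `triangulate` and `project` are illustrative.

```python
import numpy as np

def project(P, X):
    """Project a 3-D point X with a 3x4 camera matrix P to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3-D point from two views.

    P1, P2: 3x4 projection matrices of the two cameras (assumed known
    from calibration). x1, x2: matching (u, v) image coordinates.
    """
    # Each view contributes two linear constraints on the homogeneous point.
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    # The solution is the right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]
```

Applying `triangulate` to the matched particle images in every frame yields the sequence of 3-D positions that makes up a trajectory.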

Heidelberger Dokumentenserver

Earth resources: A continuing bibliography with indexes (issue 52)

This bibliography lists 454 reports, articles, and other documents introduced into the NASA scientific and technical information system between October 1 and December 31, 1986. Emphasis is placed on the use of remote sensing and geophysical instrumentation in spacecraft and aircraft to survey and inventory natural resources and urban areas. Subject matter is grouped according to agriculture and forestry, environmental changes and cultural resources, geodesy and cartography, geology and mineral resources, hydrology and water management, data processing and distribution systems, instrumentation and sensors, and economic analysis.

NASA Technical Reports Server