
    Design of a High-Speed Architecture for Stabilization of Video Captured Under Non-Uniform Lighting Conditions

    Video captured under shaky conditions suffers from unwanted vibrations. A robust algorithm that stabilizes the video by compensating for vibrations arising from the physical setting of the camera is presented in this dissertation. A very high performance hardware architecture on Field Programmable Gate Array (FPGA) technology is also developed for the implementation of the stabilization system. Stabilization of video sequences captured under non-uniform lighting conditions begins with a nonlinear enhancement process. This improves the visibility of the scene captured by physical sensing devices, which have limited dynamic range. This physical limitation causes the saturated region of the image to shadow out the rest of the scene. It is therefore desirable to recover a more uniform scene that eliminates the shadows to a certain extent. Stabilization of video requires the estimation of global motion parameters. By obtaining reliable background motion, the video can be spatially transformed to the reference sequence, thereby eliminating the unintended motion of the camera. A reflectance-illuminance model for video enhancement is used in this research work to improve the visibility and quality of the scene. With fast color space conversion, the computational complexity is reduced to a minimum. The basic video stabilization model is formulated and configured for hardware implementation. Such a model involves evaluation of reliable features for tracking, motion estimation, and affine transformation to map the display coordinates of a stabilized sequence. The multiplications, divisions, and exponentiations are replaced by simple arithmetic and logic operations using improved log-domain computations in the hardware modules. On Xilinx's Virtex II 2V8000-5 FPGA platform, the prototype system consumes 59% of logic slices, 30% of flip-flops, 34% of lookup tables, 35% of embedded RAMs, and two ZBT frame buffers.
    The system is capable of rendering 180.9 million pixels per second (mpps) and consumes approximately 30.6 watts of power at 1.5 volts. With a 1024×1024 frame, the throughput is equivalent to 172 frames per second (fps). Future work will optimize the performance-resource trade-off to meet the specific needs of the applications. It further extends the model for extraction and tracking of moving objects, since the model inherently encapsulates the attributes of spatial distortion and motion prediction to reduce complexity. With these parameters to narrow down the processing range, it is possible to achieve a minimum of 20 fps on desktop computers with Intel Core 2 Duo or Quad Core CPUs and 2GB of DDR2 memory, without dedicated hardware.
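    The throughput figure and the affine mapping step described above can be sanity-checked with a short sketch. The matrix and translation values below are illustrative assumptions, not parameters from the dissertation:

    ```python
    import numpy as np

    # Throughput check from the reported figures: 180.9 million pixels per
    # second over 1024x1024-pixel frames gives roughly the quoted 172 fps.
    pixels_per_second = 180.9e6
    frame_pixels = 1024 * 1024
    fps = pixels_per_second / frame_pixels  # ~172.5

    # Minimal affine warp of frame coordinates, as used to map a shaky frame
    # onto the stabilized reference. A is the 2x2 linear part, t a translation;
    # both are hypothetical example values.
    A = np.array([[0.998, -0.02], [0.02, 0.998]])  # small rotation/scale
    t = np.array([3.0, -1.5])                      # small translation

    def warp(points, A, t):
        """Map (x, y) coordinates into the stabilized reference frame."""
        return points @ A.T + t
    ```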

    A Person-Centric Design Framework for At-Home Motor Learning in Serious Games

    In motor learning, real-time multi-modal feedback is a critical element in guided training. Serious games have been introduced as a platform for at-home motor training due to their highly interactive and multi-modal nature. This dissertation explores the design of a multimodal environment for at-home training in which an autonomous system observes and guides the user in place of a live trainer, providing real-time assessment, feedback, and difficulty adaptation as the subject masters a motor skill. After an in-depth review of the latest solutions in this field, this dissertation proposes a person-centric approach to the design of this environment, in contrast to the standard techniques implemented in related work, to address many of the limitations of those approaches. The unique advantages and restrictions of this approach are presented in the form of a case study in which a system entitled the "Autonomous Training Assistant," consisting of both hardware and software for guided at-home motor learning, is designed and adapted for a specific individual and trainer. In this work, the design of an autonomous motor learning environment is approached from three areas: motor assessment, multimodal feedback, and serious game design. For motor assessment, a three-dimensional assessment framework is proposed which comprises two spatial (posture, progression) and one temporal (pacing) domains of real-time motor assessment. For multimodal feedback, a rod-shaped device called the "Intelligent Stick" is combined with an audio-visual interface to provide feedback to the subject in three modalities (audio, visual, haptic). Assessment domains are mapped to feedback modalities, and feedback is provided whenever the user's performance deviates from the ideal performance level by more than an adaptive threshold. Approaches for multi-modal integration and feedback fading are discussed. Finally, a novel approach for stealth adaptation in serious game design is presented.
    This approach allows serious games to incorporate motor tasks in a more natural way, facilitating self-assessment by the subject. Three different stealth adaptation approaches are presented and evaluated using the flow-state ratio metric. The dissertation concludes with directions for future work in the integration of stealth adaptation techniques across the field of exergames. Doctoral Dissertation, Computer Science, 201
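    The feedback-trigger rule described above (fire feedback when performance deviates from the ideal by more than an adaptive threshold) can be sketched as follows. The domain-to-modality mapping, the exponential threshold update, and all parameter values are illustrative assumptions, not the dissertation's design:

    ```python
    # Hypothetical mapping from assessment domains to feedback modalities.
    MODALITY = {"posture": "visual", "progression": "haptic", "pacing": "audio"}

    def make_trigger(base_threshold=0.2, alpha=0.1):
        """Return a feedback trigger with a simple adaptive threshold:
        loosen slightly after feedback fires, tighten after a quiet step
        (one plausible adaptation rule among many)."""
        state = {"threshold": base_threshold}
        def trigger(deviation):
            fire = deviation > state["threshold"]
            if fire:
                state["threshold"] *= (1 + alpha)  # back off after firing
            else:
                state["threshold"] *= (1 - alpha)  # tighten when on target
            return fire
        return trigger
    ```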

    An Analytic Training Approach for Recognition in Still Images and Videos

    This dissertation proposes a general framework to efficiently identify the objects of interest (OI) in still images, and its application can be further extended to human action recognition in videos. The frameworks utilized in this research to process still images and videos are similar in architecture, except that they have different content representations. Initially, global-level analysis is employed to extract distinctive feature sets from the input data. For the global analysis of data, bidirectional two-dimensional principal component analysis (2D-PCA) is employed to preserve correlation among neighborhood pixels. Furthermore, to cope with the inherent limitations of the holistic approach, local information is introduced into the framework. The local information of an OI is identified utilizing FERNS and affine SIFT (ASIFT) approaches for spatial and temporal datasets, respectively. For supportive local information, feature detection is followed by an effective pruning strategy that divides these features into inliers and outliers. A cluster of inliers represents local features which exhibit stable behavior and geometric consistency. Incremental learning is a significant but often overlooked problem in action recognition. The final part of this dissertation proposes a new action recognition algorithm based on sequential learning and adaptive representation of the human body using Pyramid of Histogram of Oriented Gradients (PHOG) features. The changing shape and appearance of human body parts is tracked based on the weak appearance-constancy assumption. The constantly changing shape of an OI is maximally covered by small blocks to approximate the body contour of a segmented foreground object. In addition, the analytically determined learning phase guarantees a lower computational burden for classification. The utilization of a minimum number of video frames, in a causal way, to recognize an action is also explored in this dissertation.
    The use of PHOG features adaptively extracted from individual frames allows the recognition of an incoming action video using a small group of frames, which eliminates the need for a large look-ahead.
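    The bidirectional 2D-PCA step above (projecting each image matrix from both the row and column directions so neighborhood correlations are preserved) can be sketched in NumPy. The projection sizes are illustrative choices, not values from the dissertation:

    ```python
    import numpy as np

    def bidirectional_2dpca(images, k_rows=4, k_cols=4):
        """Sketch of bidirectional 2D-PCA: project each image A as Z.T @ A @ X,
        where X / Z hold the top eigenvectors of the column- / row-direction
        covariance matrices computed over the image set."""
        mean = np.mean(images, axis=0)
        centered = images - mean
        # column-direction covariance (correlation between image columns)
        G_col = np.mean([a.T @ a for a in centered], axis=0)
        # row-direction covariance (correlation between image rows)
        G_row = np.mean([a @ a.T for a in centered], axis=0)
        _, X = np.linalg.eigh(G_col)
        _, Z = np.linalg.eigh(G_row)
        X = X[:, -k_cols:]   # eigh sorts ascending: take the top eigenvectors
        Z = Z[:, -k_rows:]
        return np.stack([Z.T @ a @ X for a in images])

    # Ten random 16x20 "images" reduce to ten 4x4 feature matrices.
    feats = bidirectional_2dpca(np.random.rand(10, 16, 20))
    ```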

    Application of advanced technology to space automation

    Automated operations in space provide the key to optimized mission design and data acquisition at minimum cost in the future. The results of this study strongly support this statement and should provide further incentive for immediate development of the specific automation technology defined herein. Essential automation technology requirements were identified for future programs. The study was undertaken to address the future role of automation in the space program, the potential benefits to be derived, and the technology efforts that should be directed toward obtaining these benefits.

    Unmanned aerial vehicle visual Simultaneous Localization and Mapping : a survey

    Simultaneous Localization and Mapping (SLAM) has been widely applied in robotics and other vision applications, such as navigation and path planning for unmanned aerial vehicles (UAVs). UAV navigation can be regarded as the process of a robot planning to reach the target location safely and quickly. In order to complete the predetermined task, the drone must fully understand its state, including position, navigation speed, heading, starting point, and target position. With the rapid development of computer vision technology, vision-based navigation has become a powerful tool for autonomous navigation. A visual sensor can provide a wealth of online environmental information, has high sensitivity and strong anti-interference ability, and is suitable for perceiving dynamic environments. Most visual sensors are passive sensors, which keeps the sensing system from being detected. Compared with traditional sensors such as the global positioning system (GPS), lidar, and ultrasonic sensors, visual SLAM can obtain rich visual information such as color, texture, and depth. In this paper, a survey is provided on the development of relevant techniques of visual SLAM, visual odometry, image stabilization, and image denoising, with applications to UAVs. By analyzing the existing developments, some future perspectives are briefly discussed.
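    The frame-to-frame motion estimation at the core of the visual odometry pipelines surveyed above can be illustrated with a minimal sketch: given matched keypoints in two frames, recover the least-squares rigid motion (the classic Kabsch/Procrustes solution, shown here in 2D; feature matching itself is assumed done upstream):

    ```python
    import numpy as np

    def estimate_rigid_2d(src, dst):
        """Least-squares rigid motion (R, t) aligning matched keypoints
        src -> dst, the core step of frame-to-frame visual odometry.
        src and dst are (N, 2) arrays of corresponding points."""
        mu_s, mu_d = src.mean(0), dst.mean(0)
        H = (src - mu_s).T @ (dst - mu_d)       # cross-covariance matrix
        U, _, Vt = np.linalg.svd(H)
        d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
        R = Vt.T @ np.diag([1, d]) @ U.T
        t = mu_d - R @ mu_s
        return R, t
    ```

    Chaining these per-frame estimates gives the odometry trajectory; a full SLAM system additionally closes loops and maintains a map.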

    A Patch-Based Method for Repetitive and Transient Event Detection in Fluorescence Imaging

    Automatic detection and characterization of molecular behavior in large data sets obtained by fast imaging in advanced light microscopy are key issues in deciphering dynamic architectures and their coordination in the living cell. Automatic quantification of the number of sudden and transient events observed in fluorescence microscopy is discussed in this paper. We propose a calibrated method based on the comparison of image patches, designed to distinguish suddenly appearing/vanishing fluorescent spots from other motion behaviors such as lateral movements. We analyze the performance of two statistical control procedures and compare the proposed approach to a frame-difference approach using the same controls on a benchmark of synthetic image sequences. We then selected a molecular model related to membrane trafficking and, for validation, considered real image sequences obtained in cells stably expressing an endocytic-recycling trans-membrane protein, the Langerin-YFP. With this model, we targeted the efficient detection of fast and transient local fluorescence concentrations arising in image sequences from a database provided by two different microscopy modalities: wide field (WF) video microscopy using maximum intensity projection along the axial direction, and total internal reflection fluorescence microscopy. Finally, the proposed detection method is briefly used to statistically explore the effect of several perturbations on the rate of transient events detected in the pilot biological model.
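    The patch-comparison idea above can be sketched as follows: a location is flagged as a transient appearance/vanishing event only when the local patch changes strongly between frames and no nearby patch in the previous frame explains the change (which would instead indicate lateral movement). The patch size, search radius, and threshold are illustrative parameters, not the paper's calibrated statistical controls:

    ```python
    import numpy as np

    def detect_transient(prev, curr, y, x, size=5, search=3, tau=4.0):
        """Flag a sudden appearance/vanishing of a spot at (y, x) between two
        frames. Compares the current patch against all patches in a small
        search neighbourhood of the previous frame; if even the best match is
        poor, the change cannot be explained by lateral movement."""
        h = size // 2
        p_curr = curr[y - h:y + h + 1, x - h:x + h + 1]
        best = np.inf
        for dy in range(-search, search + 1):
            for dx in range(-search, search + 1):
                q = prev[y + dy - h:y + dy + h + 1, x + dx - h:x + dx + h + 1]
                best = min(best, np.mean((p_curr - q) ** 2))
        return best > tau  # no nearby match: transient event, not motion
    ```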

    NTIRE 2023 Quality Assessment of Video Enhancement Challenge

    This paper reports on the NTIRE 2023 Quality Assessment of Video Enhancement Challenge, held in conjunction with the New Trends in Image Restoration and Enhancement (NTIRE) Workshop at CVPR 2023. The challenge addresses a major problem in the field of video processing, namely, video quality assessment (VQA) for enhanced videos. It uses the VQA Dataset for Perceptual Video Enhancement (VDPVE), which contains a total of 1211 enhanced videos: 600 videos with color, brightness, and contrast enhancements, 310 videos with deblurring, and 301 deshaken videos. The challenge attracted a total of 167 registered participants. During the development phase, 61 participating teams submitted their prediction results, with a total of 3168 submissions. A total of 176 submissions were made by 37 participating teams during the final testing phase. Finally, 19 participating teams submitted their models and fact sheets and detailed the methods they used. Some methods achieved better results than the baseline methods, and the winning methods demonstrated superior prediction performance.