136 research outputs found

    Deep learning-based anomalous object detection system powered by microcontroller for PTZ cameras

    Automatic video surveillance systems are usually designed to detect anomalous objects that are present in a scene or behaving dangerously. In order to perform adequately, they must incorporate models able to achieve accurate pattern recognition in an image, and deep learning neural networks excel at this task. However, an exhaustive scan of the full image yields many image blocks or windows to analyze, which can make the time performance of the system very poor when implemented on low-cost devices. This paper presents a system which attempts to detect abnormal moving objects within an area covered by a PTZ camera while it is panning. The choice of which block of the image to analyze is drawn from a mixture distribution with two components: a uniform probability distribution, which represents a blind random selection, and a mixture of Gaussian probability distributions. The Gaussian components represent windows of the image where anomalous objects were previously detected, and they steer the selection of the next window to analyze toward those windows of interest. The system is implemented on a Raspberry Pi microcontroller-based board, which enables the design and implementation of a low-cost monitoring system able to perform image processing.
    Universidad de Málaga. Campus de Excelencia Internacional Andalucía Tech.
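    A minimal sketch of this window-selection scheme, in Python; the exploration weight `p_uniform`, the per-detection spread, and the image size below are illustrative choices, not values from the paper:

    ```python
    import numpy as np

    def sample_next_window(rng, gaussians, img_w, img_h, p_uniform=0.3):
        """Sample the centre of the next window to analyze from a mixture of
        a uniform distribution (blind exploration) and Gaussian components
        centred on windows where anomalies were previously detected."""
        if not gaussians or rng.random() < p_uniform:
            # Blind random selection anywhere in the image.
            return rng.uniform(0, img_w), rng.uniform(0, img_h)
        # Pick one past detection at random and sample a window near it.
        mean, std = gaussians[rng.integers(len(gaussians))]
        x, y = rng.normal(mean, std)
        return np.clip(x, 0, img_w), np.clip(y, 0, img_h)

    rng = np.random.default_rng(0)
    past_detections = [((320.0, 240.0), (40.0, 40.0))]  # (mean, std) per detection
    print(sample_next_window(rng, past_detections, 640, 480))
    ```

    The uniform component keeps the system exploring the whole view, while the Gaussian components concentrate the limited processing budget near past detections.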

    Storytelling with salient stills

    Thesis (M.S.)--Massachusetts Institute of Technology, Program in Media Arts & Sciences, 1996. Includes bibliographical references (p. 59-63). By Michael J. Massey.

    Salient stills

    Thesis (M.S.)--Massachusetts Institute of Technology, Dept. of Architecture, 1992. Includes bibliographical references (leaves 67-70). By Laura A. Teodosio.

    Model- and image-based scene representation.

    Lee Kam Sum. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 97-101). Abstracts in English and Chinese.

    Contents:
    Chapter 1: Introduction
        1.1 Video Representation using Panorama Mosaic and 3D Face Model
        1.2 Mosaic-based Video Representation
        1.3 3D Human Face Modeling
    Chapter 2: Background
        2.1 Video Representation using Mosaic Image
            2.1.1 Traditional Video Compression
        2.2 3D Face Model Reconstruction via Multiple Views
            2.2.1 Shape from Silhouettes
            2.2.2 Head and Face Model Reconstruction
            2.2.3 Reconstruction using Generic Model
    Chapter 3: System Overview
        3.1 Panoramic Video Coding Process
        3.2 3D Face Model Reconstruction Process
    Chapter 4: Panoramic Video Representation
        4.1 Mosaic Construction
            4.1.1 Cylindrical Panorama Mosaic
            4.1.2 Cylindrical Projection of Mosaic Image
        4.2 Foreground Segmentation and Registration
            4.2.1 Segmentation Using Panorama Mosaic
            4.2.2 Determination of Background by Local Processing
            4.2.3 Segmentation from Frame-Mosaic Comparison
        4.3 Compression of the Foreground Regions
            4.3.1 MPEG-1 Compression
            4.3.2 MPEG Coding Method: I/P/B Frames
        4.4 Video Stream Reconstruction
    Chapter 5: Three Dimensional Human Face Modeling
        5.1 Capturing Images for 3D Face Modeling
        5.2 Shape Estimation and Model Deformation
            5.2.1 Head Shape Estimation and Model Deformation
            5.2.2 Face Organs Shaping and Positioning
            5.2.3 Reconstruction with both Intrinsic and Extrinsic Parameters
            5.2.4 Reconstruction with only Intrinsic Parameters
            5.2.5 Essential Matrix
            5.2.6 Estimation of Essential Matrix
            5.2.7 Recovery of 3D Coordinates from Essential Matrix
        5.3 Integration of Head Shape and Face Organs
        5.4 Texture-Mapping
    Chapter 6: Experimental Results & Discussion
        6.1 Panoramic Video Representation
            6.1.1 Compression Improvement from Foreground Extraction
            6.1.2 Video Compression Performance
            6.1.3 Quality of Reconstructed Video Sequence
        6.2 3D Face Model Reconstruction
    Chapter 7: Conclusion and Future Direction
    Bibliography

    Novel Methods and Algorithms for Presenting 3D Scenes

    In recent years, improvements in the acquisition and creation of 3D models gave rise to an increasing availability of 3D content and to a widening of the audience such content is created for, which brought into focus the need for effective ways to visualize and interact with it. Until recently, the task of virtually inspecting a 3D object or navigating inside a 3D scene was carried out using human-machine interaction (HMI) metaphors controlled through mouse and keyboard events. However, this interaction approach may be cumbersome for the general audience. Furthermore, the inception and spread of touch-based mobile devices, such as smartphones and tablets, redefined the interaction problem entirely, since neither mouse nor keyboard is available anymore. The problem is made even worse by the fact that these devices are typically far less powerful than desktop machines, while high-quality rendering is a computationally intensive task. In this thesis, we present a series of novel methods for the easy presentation of 3D content, both when it is already available in digitized form and when it must be acquired from the real world by image-based techniques. In the first case, we propose a method which takes as input the 3D scene of interest and an example video, and automatically produces a video of the input scene that resembles the given video example; in other words, our algorithm allows the user to replicate an existing video, for example one created by a professional animator, on a different 3D scene. In the context of image-based techniques, exploiting the inherent spatial organization of photographs taken for the 3D reconstruction of a scene, we propose an intuitive interface for smooth stereoscopic navigation of the acquired scene, providing an immersive experience without the need for a complete 3D reconstruction. Finally, we propose an interactive framework for improving low-quality 3D reconstructions obtained through image-based reconstruction algorithms. Using a few strokes on the input images, the user can specify high-level geometric hints to improve incomplete or noisy reconstructions caused by conditions that commonly arise for objects such as buildings, streets, and numerous other human-made functional elements.
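    The abstract does not spell out the video-replication algorithm; as a toy illustration of one ingredient such a system needs, the sketch below retargets a camera path from an example scene to a new scene by bounding-box normalisation. This is entirely illustrative and not the thesis method; all names and values are hypothetical:

    ```python
    import numpy as np

    def retarget_camera_path(path, src_bbox, dst_bbox):
        """Map camera positions from the example scene's bounding box into
        the target scene's, preserving the path's relative motion."""
        src_min, src_max = (np.asarray(b, float) for b in src_bbox)
        dst_min, dst_max = (np.asarray(b, float) for b in dst_bbox)
        t = (np.asarray(path, float) - src_min) / (src_max - src_min)
        return dst_min + t * (dst_max - dst_min)

    # An orbit-like path in a unit scene, retargeted to a larger scene.
    angles = np.linspace(0, 2 * np.pi, 60)
    path = np.stack([np.cos(angles), np.full_like(angles, 0.5), np.sin(angles)], axis=1)
    new_path = retarget_camera_path(path, ((-1, 0, -1), (1, 1, 1)),
                                    ((-10, 0, -10), (10, 8, 10)))
    print(new_path[0])
    ```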

    Applying image processing techniques to pose estimation and view synthesis.

    Fung Yiu-fai Phineas. Thesis (M.Phil.)--Chinese University of Hong Kong, 1999. Includes bibliographical references (leaves 142-148). Abstracts in English and Chinese.

    Contents:
    Chapter 1: Introduction
        1.1 Model-based Pose Estimation
            1.1.1 Application: 3D Motion Tracking
        1.2 Image-based View Synthesis
        1.3 Thesis Contribution
        1.4 Thesis Outline
    Chapter 2: General Background
        2.1 Notations
        2.2 Camera Models
            2.2.1 Generic Camera Model
            2.2.2 Full-perspective Camera Model
            2.2.3 Affine Camera Model
            2.2.4 Weak-perspective Camera Model
            2.2.5 Paraperspective Camera Model
        2.3 Model-based Motion Analysis
            2.3.1 Point Correspondences
            2.3.2 Line Correspondences
            2.3.3 Angle Correspondences
        2.4 Panoramic Representation
            2.4.1 Static Mosaic
            2.4.2 Dynamic Mosaic
            2.4.3 Temporal Pyramid
            2.4.4 Spatial Pyramid
        2.5 Image Pre-processing
            2.5.1 Feature Extraction
            2.5.2 Spatial Filtering
            2.5.3 Local Enhancement
            2.5.4 Dynamic Range Stretching or Compression
            2.5.5 YIQ Color Model
    Chapter 3: Model-based Pose Estimation
        3.1 Previous Work
            3.1.1 Estimation from Established Correspondences
            3.1.2 Direct Estimation from Image Intensities
            3.1.3 Perspective-3-Point Problem
        3.2 Our Iterative P3P Algorithm
            3.2.1 Gauss-Newton Method
            3.2.2 Dealing with Ambiguity
            3.2.3 3D-to-3D Motion Estimation
        3.3 Experimental Results
            3.3.1 Synthetic Data
            3.3.2 Real Images
        3.4 Discussions
    Chapter 4: Panoramic View Analysis
        4.1 Advanced Mosaic Representation
            4.1.1 Frame Alignment Policy
            4.1.2 Multi-resolution Representation
            4.1.3 Parallax-based Representation
            4.1.4 Multiple Moving Objects
            4.1.5 Layers and Tiles
        4.2 Panorama Construction
            4.2.1 Image Acquisition
            4.2.2 Image Alignment
            4.2.3 Image Integration
            4.2.4 Significant Residual Estimation
        4.3 Advanced Alignment Algorithms
            4.3.1 Patch-based Alignment
            4.3.2 Global Alignment (Block Adjustment)
            4.3.3 Local Alignment (Deghosting)
        4.4 Mosaic Application
            4.4.1 Visualization Tool
            4.4.2 Video Manipulation
        4.5 Experimental Results
    Chapter 5: Panoramic Walkthrough
        5.1 Problem Statement and Notations
        5.2 Previous Work
            5.2.1 3D Modeling and Rendering
            5.2.2 Branching Movies
            5.2.3 Texture Window Scaling
            5.2.4 Problems with Simple Texture Window Scaling
        5.3 Our Walkthrough Approach
            5.3.1 Cylindrical Projection onto Image Plane
            5.3.2 Generating Intermediate Frames
            5.3.3 Occlusion Handling
        5.4 Experimental Results
        5.5 Discussions
    Chapter 6: Conclusion
    Appendix A: Formulation of Fischler and Bolles' Method for P3P Problems
    Appendix B: Derivation of z1 and z3 in terms of z2
    Appendix C: Derivation of e1 and e2
    Appendix D: Derivation of the Update Rule for Gauss-Newton Method
    Appendix E: Proof of (λ1λ2 - λ4) > 0
    Appendix F: Derivation of φ and h_i
    Appendix G: Derivation of w1j to w4j
    Appendix H: More Experimental Results on Panoramic Stitching Algorithms
    Bibliography

    Towards Data-Driven Large Scale Scientific Visualization and Exploration

    Technological advances have enabled us to acquire extremely large datasets, but it remains a challenge to store, process, and extract information from them. This dissertation builds upon recent advances in machine learning, visualization, and user interaction to facilitate exploration of large-scale scientific datasets. First, we use data-driven approaches to computationally identify regions of interest in the datasets. Second, we use visual presentation for effective user comprehension. Third, we provide interactions that let human users integrate domain knowledge and semantic information into this exploration process. Our research shows how to extract, visualize, and explore informative regions in very large 2D landscape images, 3D volumetric datasets, high-dimensional volumetric mouse brain datasets with thousands of spatially mapped gene expression profiles, and geospatial trajectories that evolve over time. The contributions of this dissertation include: (1) We introduce a sliding-window saliency model that discovers regions of user interest in very large images; (2) We develop visual segmentation of intensity-gradient histograms to identify meaningful components in volumetric datasets; (3) We extract boundary surfaces from a wealth of volumetric gene expression mouse brain profiles to personalize the reference brain atlas; (4) We show how to efficiently cluster geospatial trajectories by mapping each sequence of locations to a high-dimensional point with the kernel distance framework. We aim to discover patterns, relationships, and anomalies that would lead to new scientific, engineering, and medical advances. This work represents one of the first steps toward better visual understanding of large-scale scientific data by combining machine learning and human intelligence.
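    Contribution (4) admits a compact sketch: embed each trajectory as a single high-dimensional point so that Euclidean distance between embeddings approximates the kernel distance between trajectories, then cluster the points. Below is a minimal, illustrative version using random Fourier features as the kernel feature map and scikit-learn's KMeans; the feature dimension, bandwidth, and toy data are assumptions, not the dissertation's settings:

    ```python
    import numpy as np
    from sklearn.cluster import KMeans

    def embed_trajectory(traj, W, b):
        """Map a trajectory (n x 2 array of locations) to one point:
        the mean of random Fourier features of its locations, so that
        distances between embeddings approximate kernel distances."""
        feats = np.sqrt(2.0 / W.shape[1]) * np.cos(traj @ W + b)
        return feats.mean(axis=0)

    rng = np.random.default_rng(0)
    D = 256                       # embedding dimension (illustrative)
    sigma = 1.0                   # Gaussian kernel bandwidth (illustrative)
    W = rng.normal(0, 1.0 / sigma, size=(2, D))
    b = rng.uniform(0, 2 * np.pi, size=D)

    trajectories = [rng.normal(c, 0.1, size=(50, 2))   # toy data
                    for c in ((0, 0), (0, 0), (5, 5), (5, 5))]
    X = np.array([embed_trajectory(t, W, b) for t in trajectories])
    labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
    print(labels)   # trajectories around the same centre share a label
    ```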

    Augmented reality device for first response scenarios

    A prototype of a wearable computer system is proposed and implemented using commercial off-the-shelf components. The system is designed to allow the user to access location-specific information about an environment, and to provide capability for user tracking. Areas of applicability primarily include first response scenarios, with possible applications in the maintenance or construction of buildings and other structures. The necessary preparation of the target environment prior to the system's deployment is limited to noninvasive labeling using optical fiducial markers. The system relies on computational vision methods for registration of labels and user position. With the system, the user has access to on-demand information relevant to a particular real-world location. Team collaboration is assisted by user tracking and real-time visualizations of team member positions within the environment. The user interface and display methods are inspired by Augmented Reality (AR) techniques, incorporating a video-see-through Head Mounted Display (HMD) and a finger-bending sensor glove.
    Note: Augmented reality (AR) is a field of computer research which deals with the combination of real-world and computer-generated data. At present, most AR research is concerned with the use of live video imagery which is digitally processed and augmented by the addition of computer-generated graphics. Advanced research includes the use of motion-tracking data, fiducial marker recognition using machine vision, and the construction of controlled environments containing any number of sensors and actuators. (Source: Wikipedia)
    Note: This dissertation is a compound document (contains both a paper copy and a CD). The CD requires Adobe Acrobat, Microsoft Office, and Windows Media Player or RealPlayer.
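    A minimal sketch of the kind of fiducial-marker registration described above, using OpenCV's ArUco module (4.7+ API) as a stand-in; the dissertation does not specify ArUco, and the camera intrinsics, marker size, dictionary, and file name below are all hypothetical:

    ```python
    import cv2
    import numpy as np

    # Hypothetical camera intrinsics; a real system would calibrate these.
    K = np.array([[800.0, 0, 320], [0, 800.0, 240], [0, 0, 1]])
    dist = np.zeros(5)               # assume negligible lens distortion
    MARKER_SIDE = 0.10               # marker edge length in metres (assumed)

    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())

    frame = cv2.imread("frame.png")  # one frame from the head-mounted camera
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    corners, ids, _ = detector.detectMarkers(gray)

    # Known 3D corner positions of a square marker in its own frame.
    obj = np.array([[-1, 1, 0], [1, 1, 0], [1, -1, 0], [-1, -1, 0]],
                   dtype=np.float32) * MARKER_SIDE / 2

    if ids is not None:
        for marker_id, c in zip(ids.ravel(), corners):
            # Recover the camera pose relative to the labelled location.
            ok, rvec, tvec = cv2.solvePnP(obj, c.reshape(4, 2), K, dist)
            if ok:
                print(f"marker {marker_id}: position {tvec.ravel()}")
    ```

    Each detected marker ties the user's camera pose to a known real-world location, which is what makes both the location-specific information display and the team-position tracking possible.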

    Spherical mosaic construction using physical analogy for consistent image alignment

    The research contained in this thesis is an investigation into mosaic construction. Mosaic techniques are used to obtain images with a large field of view by assembling a sequence of smaller individual overlapping images. In existing methods of mosaic construction, only successive images are aligned. Small alignment errors accumulate, and when the image path returns to a previous position in the mosaic, a significant mismatch between non-consecutive images results (the looping path problem). A new method for consistently aligning all the images in a mosaic is proposed in this thesis. This is achieved by distributing the small alignment errors: each image is allowed to modify its position relative to its neighbour images in the mosaic by a small amount with respect to the computed registration. Two images recorded by a rotating ideal camera are related by the same transformation that relates the camera's sensor planes at the times the images were captured. When two images overlap, the intensity values in both images coincide along the intersection line of the sensor planes, so the images can be seamlessly joined through that line. An analogy between the images and the physical world is proposed to solve the looping path problem. The images correspond to rigid objects, and these are linked with forces which pull them toward the correct positions with respect to their neighbours; that is, every pair of overlapping images is "hinged" through their corresponding intersection line. Aided by a further constraint, named the spherical constraint, this network of self-organising images distributes itself over the surface of a sphere. As a direct result of the new concepts developed in this research work, spherical mosaics (i.e. mosaics with unlimited horizontal and vertical field of view) can be created.
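    A toy sketch of the error-distribution idea, reduced to pan angles on a one-dimensional loop of images; the hinge forces and the spherical constraint of the thesis are not modelled here, and the relaxation scheme, names, and values are illustrative:

    ```python
    import numpy as np

    def relax_loop(rel_angles, iters=2000, step=0.5):
        """Distribute loop-closure error over a ring of images.

        rel_angles[i] is the measured pan from image i to image i+1
        (wrapping around). Each image's absolute angle is nudged so that
        every pair agrees with its registration as well as possible,
        like springs pulling neighbours toward the measured offset.
        """
        n = len(rel_angles)
        # Initial guess: chain the pairwise measurements (error piles up).
        theta = np.concatenate([[0.0], np.cumsum(rel_angles[:-1])])
        for _ in range(iters):
            # Spring residual toward the forward neighbour of each image.
            fwd = theta[(np.arange(n) + 1) % n] - theta - rel_angles
            fwd[-1] += 360.0                    # close the full turn
            bwd = -np.roll(fwd, 1)              # pull from the backward neighbour
            theta += step * 0.5 * (fwd + bwd)   # gradient step on spring energy
            theta -= theta[0]                   # anchor the first image
        return theta

    # Ten images spanning 360 degrees, with noisy pairwise registrations.
    rng = np.random.default_rng(1)
    measured = 36.0 + rng.normal(0, 0.5, size=10)
    print(relax_loop(measured))   # closure error spread over all pairs
    ```

    Chaining the pairwise registrations dumps the entire mismatch onto the last seam; the relaxation instead spreads it evenly, which is the one-dimensional analogue of letting every image shift slightly against its neighbours.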