Flying Target Detection and Recognition by Feature Fusion
This paper presents a near-real-time visual approach to flying-target detection and recognition. Detection is based on fast, robust background modeling and shape extraction, while recognition of target classes is based on querying a-priori built real datasets with fused shape and texture features. The main application areas are passive defense and surveillance scenarios.
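The abstract does not specify the background model beyond "fast and robust"; a minimal running-average sketch in plain NumPy (class name, rates and thresholds all hypothetical, not the paper's method) illustrates the detection idea:

```python
import numpy as np

class RunningAverageBackground:
    """Minimal background model: exponential running average per pixel."""
    def __init__(self, alpha=0.05, threshold=30.0):
        self.alpha = alpha          # adaptation rate
        self.threshold = threshold  # foreground decision threshold
        self.mean = None

    def apply(self, frame):
        frame = frame.astype(np.float64)
        if self.mean is None:
            self.mean = frame.copy()
        # pixels deviating strongly from the running mean are foreground
        mask = np.abs(frame - self.mean) > self.threshold
        # adapt the model toward the current frame
        self.mean = (1 - self.alpha) * self.mean + self.alpha * frame
        return mask

# toy usage: a static scene, then a bright "target" entering
bg = RunningAverageBackground()
static = np.full((8, 8), 100.0)
bg.apply(static)                      # learn the background
moving = static.copy()
moving[2:4, 2:4] = 200.0              # target appears
mask = bg.apply(moving)
print(mask.sum())                     # → 4
```

Shape extraction would then run contour analysis on `mask`; the recognition stage matches the extracted shape and texture against the pre-built dataset.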
Multi-modal video analysis for early fire detection
This dissertation investigates several aspects of an intelligent video-based fire detection system. The first part focuses on the multi-modal processing of visual, infrared and time-of-flight video images, which improves purely visual detection. To keep the processing cost as low as possible, with a view to real-time detection, a set of low-cost fire characteristics that uniquely describe fire and flames was selected for each sensor type. By fusing the different types of information, the number of missed detections and false alarms can be reduced, resulting in a significant improvement in video-based fire detection. To combine the multi-modal detection results, however, the multi-modal images must be registered (aligned). The second part of this dissertation therefore focuses on this fusion of multi-modal data and presents a novel silhouette-based registration method. The third and final part proposes methods for video-based fire analysis and, at a later stage, fire modeling. Each of the proposed techniques for multi-modal detection and multi-view localization has been tested extensively in practice, including successful tests on the early detection of car fires in underground parking garages.
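The abstract does not give the fusion rule; a minimal weighted-vote sketch over per-sensor fire confidences (all sensor names, weights and thresholds hypothetical, plain Python) shows how fusing modalities can suppress a false alarm raised by a single sensor:

```python
def fuse_fire_scores(scores, weights, alarm_threshold=0.5):
    """Weighted average of per-sensor fire confidences in [0, 1].

    scores  - dict: sensor name -> confidence from that detector
    weights - dict: sensor name -> trust in that sensor (sums to 1)
    """
    fused = sum(weights[s] * scores[s] for s in scores)
    return fused, fused >= alarm_threshold

# the visual detector fires spuriously; infrared and time-of-flight do not
scores  = {"visual": 0.9, "infrared": 0.1, "tof": 0.2}
weights = {"visual": 0.4, "infrared": 0.4, "tof": 0.2}
fused, alarm = fuse_fire_scores(scores, weights)
print(round(fused, 2), alarm)   # → 0.44 False
```

A single-modality system with the same threshold would have alarmed on the 0.9 visual score alone; requiring agreement across modalities is what reduces false alarms.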
Non-destructive technologies for fruit and vegetable size determination - a review
Here, we review different methods for non-destructive horticultural produce size determination, focusing on electronic technologies capable of measuring fruit volume. The usefulness of produce size estimation is justified and a comprehensive classification system of the existing electronic techniques to determine dimensional size is proposed. The different systems identified are compared in terms of their versatility, precision and throughput. There is general agreement that online measurement of axes, perimeter and projected area has now been achieved. Nevertheless, rapid and accurate volume determination of irregular-shaped produce, as needed for density sorting, has only become available in the past few years. An important application of density measurement is soluble solids content (SSC) sorting. If the range of SSC in the batch is narrow and a large number of classes are desired, accurate volume determination becomes important. A good alternative for fruit three-dimensional surface reconstruction, from which volume and surface area can be computed, is the combination of height profiles from a range sensor with a two-dimensional object image boundary from a solid-state camera (brightness image) or from the range sensor itself (intensity image). However, one of the most promising technologies in this field is 3-D multispectral scanning, which combines multispectral data with 3-D surface reconstruction.
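In its simplest form, the height-profile-plus-boundary combination described above reduces to integrating the height map over the object's image footprint; a minimal sketch (plain NumPy, all values hypothetical) is:

```python
import numpy as np

def volume_from_height_map(height_mm, mask, pixel_area_mm2):
    """Approximate object volume by summing height over the footprint.

    height_mm      - per-pixel height of the object surface (mm),
                     e.g. from a range sensor
    mask           - boolean 2-D object footprint from the brightness
                     or intensity image boundary
    pixel_area_mm2 - ground-plane area covered by one pixel (mm^2)
    """
    return float(np.sum(height_mm[mask]) * pixel_area_mm2)

# toy "fruit": a 4x4 footprint of uniform 10 mm height, 1 mm^2 pixels
height = np.zeros((6, 6))
mask = np.zeros((6, 6), dtype=bool)
height[1:5, 1:5] = 10.0
mask[1:5, 1:5] = True
print(volume_from_height_map(height, mask, 1.0))   # → 160.0
```

Real produce needs two height profiles (or an assumed symmetry) to close the surface, but the footprint integral is the core of the computation.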
Video fire detection - Review
This is a review article describing recent developments in video-based fire detection (VFD). Video surveillance cameras and computer vision methods are widely used in many security applications. It is also possible to use security cameras and special-purpose infrared surveillance cameras for fire detection. This requires intelligent video processing techniques for the detection and analysis of uncontrolled fire behavior. VFD may help reduce detection time compared to currently available sensors, both indoors and outdoors, because cameras monitor "volumes" and do not suffer the transport delay of traditional "point" sensors. It is possible to cover an area of 100 km² for wildfire detection using a single pan-tilt-zoom camera placed on a hilltop. Another benefit of VFD systems is that they can provide crucial information about the size and growth of the fire and the direction of smoke propagation.
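Many of the VFD methods such reviews cover start from a per-pixel chromatic test; one commonly cited heuristic (R above a threshold with R ≥ G > B; the threshold value here is hypothetical) can be sketched as:

```python
import numpy as np

def fire_color_mask(rgb, red_threshold=150):
    """Classic chromatic fire-pixel test: R >= G > B and R above a threshold.

    rgb - H x W x 3 uint8 image
    """
    r = rgb[..., 0].astype(int)
    g = rgb[..., 1].astype(int)
    b = rgb[..., 2].astype(int)
    return (r >= red_threshold) & (r >= g) & (g > b)

# a flame-colored pixel vs. a sky-colored pixel
img = np.array([[[220, 160, 40], [80, 120, 200]]], dtype=np.uint8)
print(fire_color_mask(img))   # → [[ True False]]
```

Color alone produces many false positives (sunsets, red clothing), which is why practical VFD systems combine such rules with motion and temporal flicker analysis.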
Use of Microsoft Kinect in a dual camera setup for action recognition applications
Conventional human action recognition methods use a single light camera to extract all the information needed to perform the recognition. However, a single light camera poses limitations that cannot be addressed without a hardware change. In this thesis, we propose a novel approach to the multi-camera setup. Our approach utilizes the skeletal pose estimation capabilities of the Microsoft Kinect camera and projects the estimated pose onto the image of the non-depth camera. The approach aims at improving the performance of multi-camera image analysis, which would not be as easy in a typical multi-camera setup. The depth information is shared between the cameras in the form of pose projection, which depends on location awareness between them; the relative locations can be found using chessboard-pattern calibration techniques. Due to the limitations of pattern calibration, we propose a novel calibration refinement approach to increase the detection distance and simplify the long calibration process. Two tests demonstrate that the pose projection process performs with good accuracy given a successful calibration and good Kinect pose estimation, but not with a failed calibration. Three further tests were performed to determine calibration performance. Distance calculations were prone to error, with a mean accuracy of 96% under a 60 cm camera separation and dropping drastically beyond that, while orientation calculation was stable with a mean accuracy of 97%. The last test shows that our refinement approach significantly improves the outcome of the projection after a failed pattern calibration and allows almost double the detectable camera separation, about 120 cm. While the mean orientation accuracy was similar to pattern calibration, the distance accuracy was lower at around 92%; however, its standard deviation remained stable, whereas that of pattern calibration grew with distance.
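The pose-projection step described above maps Kinect 3-D joints into the second camera's image via the calibrated rigid transform and that camera's intrinsics; a minimal pinhole-model sketch (all matrices hypothetical, plain NumPy) is:

```python
import numpy as np

def project_joints(joints_3d, R, t, K):
    """Project Kinect 3-D joints (meters, Kinect frame) into the
    second camera's pixel coordinates.

    R, t - rotation (3x3) and translation (3,) taking points from the
           Kinect frame to the second camera's frame (from calibration)
    K    - 3x3 intrinsic matrix of the second camera
    """
    cam = (R @ joints_3d.T).T + t          # into the second camera's frame
    uv = (K @ cam.T).T                     # pinhole projection
    return uv[:, :2] / uv[:, 2:3]          # perspective divide

# identity extrinsics, a simple intrinsic matrix, one joint 2 m ahead
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
joints = np.array([[0.0, 0.0, 2.0]])
print(project_joints(joints, np.eye(3), np.zeros(3), K))  # → [[320. 240.]]
```

The quality of `R` and `t` is exactly what the thesis's calibration refinement targets: small extrinsic errors translate into pixel offsets that grow with camera separation.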
Modeling and Simulation in Engineering
This book provides an open platform to establish and share knowledge developed by scholars, scientists, and engineers from all over the world about various applications of modeling and simulation in the product design process across engineering fields. The book consists of 12 chapters arranged in two sections (3D Modeling and Virtual Prototyping), reflecting the multidimensionality of applications related to modeling and simulation. Some of the most recent modeling and simulation techniques, as well as some of the most accurate and sophisticated software for treating complex systems, are applied. All the original contributions in this book are joined by the basic principle of a successful modeling and simulation process: as complex as necessary, and as simple as possible. The idea is to manipulate the simplifying assumptions in a way that reduces the complexity of the model (in order to enable real-time simulation) without altering the precision of the results.
Hybrid machine learning approaches for scene understanding: From segmentation and recognition to image parsing
We address the problem of semantic scene understanding through studies of object segmentation/recognition and of scene labeling methods, respectively. We propose new techniques for joint recognition, segmentation and pose estimation of infrared (IR) targets. The problem is formulated in a probabilistic level-set framework, where a shape-constrained generative model provides a multi-class, multi-view shape prior and where the shape model involves a couplet of view and identity manifolds (CVIM). A level-set energy function is then iteratively optimized under the shape constraints provided by the CVIM. Since both the view and identity variables are expressed explicitly in the objective function, this approach naturally accomplishes recognition, segmentation and pose estimation as joint products of the optimization process. For realistic target chips, we solve the resulting multi-modal optimization problem by adopting a particle swarm optimization (PSO) algorithm and then improve computational efficiency by implementing a gradient-boosted PSO (GB-PSO). Evaluation was performed on the Military Sensing Information Analysis Center (SENSIAC) ATR database, and experimental results show that both PSO algorithms reduce the cost of shape matching during CVIM-based shape inference. In particular, GB-PSO outperforms other recent ATR algorithms that require intensive shape matching, either explicitly (with pre-segmentation) or implicitly (without pre-segmentation). On the other hand, for situations where target boundaries are not clearly observed and object shapes cannot be reliably extracted, we explored sparse representation classification (SRC) methods for ATR applications and developed a fusion technique that combines traditional SRC with a group-constrained SRC algorithm regulated by a sparsity concentration index, improving classification accuracy on the Comanche dataset.
Moreover, we present a compact rare-class-oriented scene labeling framework (RCSL) with a global-scene-assisted rare-class retrieval process, in which the retrieved subset is expanded by choosing scene-regulated rare-class patches. A complementary rare-class-balanced CNN is learned to alleviate the imbalanced data distribution problem at lower cost. A superpixel-based re-segmentation was implemented to produce more perceptually meaningful object boundaries. Quantitative results demonstrate the promising performance of the proposed framework in both pixel and class accuracy for scene labeling on the SIFTflow dataset, especially for rare-class objects.
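The PSO used above for shape inference can be sketched in its generic form (a toy sphere objective and all hyper-parameters hypothetical, plain NumPy; this is not the gradient-boosted GB-PSO variant):

```python
import numpy as np

def pso_minimize(f, dim, n_particles=30, iters=100,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic particle swarm optimization of f over [-5, 5]^dim."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-5, 5, (n_particles, dim))   # positions
    v = np.zeros_like(x)                         # velocities
    pbest = x.copy()                             # per-particle best
    pbest_f = np.apply_along_axis(f, 1, x)
    gbest = pbest[np.argmin(pbest_f)].copy()     # swarm best
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        # pull each particle toward its own best and the swarm best
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = x + v
        fx = np.apply_along_axis(f, 1, x)
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        gbest = pbest[np.argmin(pbest_f)].copy()
    return gbest, f(gbest)

# toy objective with its global minimum at the origin
sphere = lambda p: float(np.sum(p ** 2))
best, best_f = pso_minimize(sphere, dim=2)
print(best_f < 1e-2)   # the swarm converges near the minimum
```

In the paper's setting, `f` would be the CVIM-constrained shape-matching cost over view and identity variables rather than a toy function.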
Articulated human tracking and behavioural analysis in video sequences
Recently, there has been a dramatic growth of interest in the observation and tracking of human subjects through video sequences. Arguably, the principal impetus has come from the perceived demand for technological surveillance; however, applications in entertainment, intelligent domiciles and medicine are also increasing. This thesis examines human articulated tracking and the classification of human movement, first separately and then as a sequential process.
First, this thesis considers the development and training of a 3D model of human body structure and dynamics. To process video sequences, an observation model is also designed with a multi-component likelihood based on edge, silhouette and colour. This is defined on the articulated limbs and is visible from a single camera or multiple cameras, each of which may be calibrated from that sequence. Second, for behavioural analysis, we develop a methodology in which actions and activities are described by semantic labels generated from a Movement Cluster Model (MCM). Third, a Hierarchical Partitioned Particle Filter (HPPF) was developed for human tracking that allows multi-level parameter search consistent with the body structure. This tracker relies on the articulated motion prediction provided by the MCM at pose or limb level. Fourth, tracking and movement analysis are integrated to generate a probabilistic activity description with action labels.
The implemented algorithms for tracking and behavioural analysis are tested extensively and independently against ground truth on human tracking and surveillance datasets. Dynamic models are shown to predict and generate synthetic motion, while the MCM recovers both periodic and non-periodic activities, defined either on the whole body or at the limb level. Tracking results are comparable with the state of the art; however,
the integrated behaviour analysis adds to the value of the approach.
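The hierarchical, partitioned variant is beyond an abstract, but one predict-weight-resample step of a generic particle filter (toy 1-D state; motion and observation models and their noise levels are hypothetical, plain NumPy) can be sketched as:

```python
import numpy as np

def particle_filter_step(particles, weights, observation, rng,
                         motion_std=0.5, obs_std=1.0):
    """One generic predict-weight-resample step on a 1-D state."""
    # predict: propagate each particle through a noisy motion model
    particles = particles + rng.normal(0.0, motion_std, particles.shape)
    # weight: Gaussian likelihood of the observation given each particle
    weights = weights * np.exp(-0.5 * ((observation - particles) / obs_std) ** 2)
    weights = weights / weights.sum()
    # resample: draw particles in proportion to their weights
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

rng = np.random.default_rng(0)
particles = rng.uniform(-10, 10, 500)
weights = np.full(500, 1.0 / 500)
for obs in [2.0, 2.1, 1.9]:               # observations near state 2.0
    particles, weights = particle_filter_step(particles, weights, obs, rng)
print(abs(particles.mean() - 2.0) < 0.5)  # estimate concentrates near 2.0
```

In the HPPF, the single scalar state is replaced by partitioned pose parameters searched level by level down the body hierarchy, with the MCM supplying the motion prediction.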
Spatiotemporal analysis of human actions using RGB-D cameras
Markerless human motion analysis has strong potential to provide a cost-efficient solution to action recognition and body pose estimation. Many applications, including human-computer interaction, video surveillance, content-based video indexing and automatic annotation, will benefit from a robust solution to these problems. Depth-sensing technologies in recent years have positively changed the climate of automated vision-based human action recognition, a problem deemed very difficult due to the various ambiguities inherent in conventional video. In this work, a large set of invariant spatiotemporal features is first extracted from skeleton joints in motion (retrieved from the depth sensor) and evaluated as baseline performance. Next, we introduce a discriminative Random Decision Forest-based feature selection framework capable of reaching impressive action recognition performance when combined with a linear SVM classifier. This approach improves upon the baseline obtained with the whole feature set while using significantly fewer features (one tenth of the original). The approach can also be used to provide insights into the spatiotemporal dynamics of human actions. A novel therapeutic action recognition dataset (WorkoutSU-10) is presented; we used this dataset as a benchmark in our tests to evaluate the reliability of the proposed methods. The dataset has recently been released publicly as a contribution to the action recognition community. In addition, an interactive action evaluation application is developed by utilizing the proposed methods to help with real-life problems such as fall detection in elderly people or automated therapy programs for patients with motor disabilities.
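The abstract does not detail the forest-based selection criterion; as a deliberately simpler stand-in, ranking features by a univariate Fisher-style score and keeping a small top fraction (not the thesis's Random Decision Forest method; all data synthetic) sketches the select-then-classify idea:

```python
import numpy as np

def fisher_scores(X, y):
    """Univariate Fisher-style score per feature for a binary label:
    between-class mean separation over within-class variance."""
    a, b = X[y == 0], X[y == 1]
    between = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    within = a.var(axis=0) + b.var(axis=0) + 1e-12
    return between / within

def select_top_fraction(X, y, fraction=0.1):
    """Indices of the top `fraction` of features by score."""
    k = max(1, int(X.shape[1] * fraction))
    return np.argsort(fisher_scores(X, y))[::-1][:k]

# synthetic data: 100 features, only feature 3 separates the classes
rng = np.random.default_rng(0)
X = rng.normal(0, 1, (200, 100))
y = np.repeat([0, 1], 100)
X[y == 1, 3] += 5.0
kept = select_top_fraction(X, y, fraction=0.01)
print(kept)   # → [3]
```

The selected feature subset would then feed a linear SVM, as in the thesis; a forest-based criterion additionally captures feature interactions that a univariate score misses.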