
    Buffer Capacity Computation for Throughput Constrained Streaming Applications with Data-Dependent Inter-Task Communication

    Streaming applications are often implemented as task graphs, in which data is communicated from task to task over buffers. Currently, techniques exist to compute buffer capacities that guarantee satisfaction of the throughput constraint if the amount of data produced and consumed by the tasks is known at design-time. However, applications such as audio and video decoders have tasks that produce and consume an amount of data that depends on the decoded stream. This paper introduces a dataflow model that allows for data-dependent communication, together with an algorithm that computes buffer capacities that guarantee satisfaction of a throughput constraint. The applicability of this algorithm is demonstrated by computing buffer capacities for an H.263 video decoder

    WAVELET BASED DATA HIDING OF DEM IN THE CONTEXT OF REALTIME 3D VISUALIZATION (Visualisation 3D Temps-Réel à Distance de MNT par Insertion de Données Cachées Basée Ondelettes)

    The use of aerial photographs, satellite images, scanned maps and digital elevation models requires strategies for storing and visualizing these data. To obtain a three-dimensional visualization, the images, called textures, must be draped onto the terrain geometry, called the Digital Elevation Model (DEM). In practice, all this information is stored in three different files: the DEM, the texture, and the position/projection of the data in a geo-referenced system. In this paper we propose to store all this information in a single file for the purpose of synchronization. To this end we have developed a wavelet-based embedding method for hiding the data in a color image. The texture images containing the hidden DEM data can then be sent from the server to a client to perform 3D visualization of terrains. The embedding method can be integrated with the JPEG2000 coder to accommodate compression and multi-resolution visualization
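As a rough illustration of what hiding data in wavelet coefficients can look like, the following sketch embeds a few bits into one row of texture samples. Everything specific here is an assumption rather than the paper's method: the reversible integer Haar transform, the choice of the least-significant bit of the detail coefficients as the carrier, and the function names `haar_forward`, `haar_inverse`, `embed_bits` and `extract_bits` are all hypothetical. The abstract does not say which wavelet or embedding rule the authors use.

```python
import numpy as np

def haar_forward(x):
    """One level of the reversible integer Haar transform on an even-length row."""
    x = np.asarray(x, dtype=np.int64)
    a, b = x[0::2], x[1::2]
    d = a - b            # detail (high-pass) coefficients
    s = b + d // 2       # approximation (low-pass) coefficients
    return s, d

def haar_inverse(s, d):
    """Exact inverse of haar_forward."""
    b = s - d // 2
    a = d + b
    x = np.empty(2 * len(s), dtype=np.int64)
    x[0::2], x[1::2] = a, b
    return x

def embed_bits(texture_row, bits):
    """Assumed rule: write each bit into the LSB of a detail coefficient."""
    s, d = haar_forward(texture_row)
    bits = np.asarray(bits, dtype=np.int64)
    d = d.copy()
    d[:len(bits)] = (d[:len(bits)] & ~1) | bits
    # Note: this sketch does not clip the result to the valid pixel range.
    return haar_inverse(s, d)

def extract_bits(stego_row, n_bits):
    """Recover the hidden bits by re-running the forward transform."""
    _, d = haar_forward(stego_row)
    return d[:n_bits] & 1
```

Because the transform maps integers to integers and is exactly invertible, the receiver recovers the embedded bits by transforming the received row again. A real system would apply this in two dimensions and would have to survive the coder's quantization, which this sketch ignores.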

    Spiking neural networks for computer vision

    State-of-the-art computer vision systems use frame-based cameras that sample the visual scene as a series of high-resolution images. These are then processed using convolutional neural networks using neurons with continuous outputs. Biological vision systems use a quite different approach, where the eyes (cameras) sample the visual scene continuously, often with a non-uniform resolution, and generate neural spike events in response to changes in the scene. The resulting spatio-temporal patterns of events are then processed through networks of spiking neurons. Such event-based processing offers advantages in terms of focusing constrained resources on the most salient features of the perceived scene, and those advantages should also accrue to engineered vision systems based upon similar principles. Event-based vision sensors, and event-based processing exemplified by the SpiNNaker (Spiking Neural Network Architecture) machine, can be used to model the biological vision pathway at various levels of detail. Here we use this approach to explore structural synaptic plasticity as a possible mechanism whereby biological vision systems may learn the statistics of their inputs without supervision, pointing the way to engineered vision systems with similar online learning capabilities

    Visual change detection on tunnel linings

    We describe an automated system for detecting, localising, clustering and ranking visual changes on tunnel surfaces. The system is designed to provide assistance to expert human inspectors carrying out structural health monitoring and maintenance on ageing tunnel networks. A three-dimensional tunnel surface model is first recovered from a set of reference images using Structure from Motion techniques. New images are localised accurately within the model and changes are detected versus the reference images and model geometry. We formulate the problem of detecting changes probabilistically and evaluate the use of different feature maps and a novel geometric prior to achieve invariance to noise and nuisance sources such as parallax and lighting changes. A clustering and ranking method is proposed which efficiently presents detected changes and further improves the inspection efficiency. System performance is assessed on a real data set collected using a low-cost prototype capture device and labelled with ground truth. Results demonstrate that our system is a step towards higher frequency visual inspection at a reduced cost. The authors gratefully acknowledge the support by Toshiba Research Europe. This is the accepted manuscript. The final publication is available at Springer via http://dx.doi.org/10.1007/s00138-014-0648-8

    Two-stage sparse representation based abnormal crowd event detection in videos

    Ubiquitous surveillance has become part of our lives to increase security and safety. Despite the wide application of surveillance systems, their efficiency is limited by human factors such as boredom and fatigue, because most of the time nothing unusual happens. In safety-critical applications, time is essential and it is vital to act fast to prevent costly incidents. This thesis proposes a two-stage abnormal crowd event detection framework based on k-means clustering in the first stage, and sparse representation based methods in the second stage, to alleviate the laborious task of video monitoring. We conduct a literature review of 18 studies, focusing specifically on sparse representation based methods. Accordingly, we choose the spatio-temporal gradient feature due to its simplicity, efficiency, and effectiveness in motion representation. After extracting features only from normal events, k-means clustering is applied to separate different motion feature clusters. Then, clusters with fewer samples, which are deemed to contain mostly abnormal features, are removed according to a threshold. In the second stage, we learn a dictionary for each remaining cluster using the approximate K-SVD algorithm. In testing, the reconstruction error of a feature against a learned dictionary and its sparse representation is used to determine an abnormality. We conduct extensive experiments on a standard dataset to evaluate the detection performance of the method. Furthermore, the effect of hyper-parameters in our method is investigated. We also compare our method with different methods to examine its effectiveness. Results indicate that our abnormal event detection framework can successfully detect abnormal events in a scene while running in real-time at 161 frames per second. With a few exceptions, no significant advantage of the two-stage sparse representation approach over a single large dictionary was found. We speculate that these results may be influenced by a small sample size. Nevertheless, because our approach is unsupervised, it can be adapted to different contexts without additional annotation effort, using only normal events from videos, which motivates further development
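The two-stage pipeline described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the thesis implementation. The dictionary step is a deliberate stand-in: instead of approximate K-SVD, each dictionary is built from unit-normalised cluster samples. Sparse coding uses a plain orthogonal matching pursuit, and the score is taken as the smallest reconstruction error over all cluster dictionaries. All function names and parameters (`min_fraction`, `n_atoms`, `n_nonzero`) are hypothetical.

```python
import numpy as np

def kmeans_labels(X, k, iters=100, seed=0):
    """Plain Lloyd's k-means; returns a cluster label per row of X."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(iters):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        updated = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                            else centers[j] for j in range(k)])
        if np.allclose(updated, centers):
            break
        centers = updated
    dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dist.argmin(axis=1)

def prune_small_clusters(X, labels, k, min_fraction):
    """Stage 1: drop clusters holding less than min_fraction of the samples."""
    groups = [X[labels == j] for j in range(k)]
    return [g for g in groups if len(g) >= min_fraction * len(X)]

def omp(D, x, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse code of x over columns of D."""
    residual, support, coef = x, [], np.zeros(D.shape[1])
    for _ in range(n_nonzero):
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:          # no new atom helps; stop early
            break
        support.append(j)
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def fit_normal_model(X, k, min_fraction, n_atoms, seed=0):
    """Stage 2 stand-in: one dictionary of normalised samples per kept cluster."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    labels = kmeans_labels(X, k, seed=seed)
    dictionaries = []
    for members in prune_small_clusters(X, labels, k, min_fraction):
        idx = rng.choice(len(members), size=min(n_atoms, len(members)), replace=False)
        atoms = members[idx].T
        norms = np.linalg.norm(atoms, axis=0)
        dictionaries.append(atoms / np.where(norms > 0, norms, 1.0))
    return dictionaries

def abnormality_score(dictionaries, x, n_nonzero=2):
    """Smallest sparse reconstruction error of x over all dictionaries."""
    x = np.asarray(x, dtype=float)
    return min(np.linalg.norm(x - D @ omp(D, x, n_nonzero)) for D in dictionaries)
```

A feature would be flagged as abnormal when `abnormality_score` exceeds a threshold chosen on held-out normal data. The abstract does not say how the thesis sets its threshold, so that choice is left open here.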