
    Segmentation and tracking of video objects for a content-based video indexing context

    This paper examines the problem of segmentation and tracking of video objects for content-based information retrieval. Segmentation and tracking of video objects play an important role in the index creation and user request definition steps. The object is initially selected using a semi-automatic approach. For this purpose, a user-based selection is required to roughly define the object to be tracked. In this paper, we propose two different methods that allow an accurate contour definition from the user selection. The first is based on an active contour model which progressively refines the selection by fitting the natural edges of the object, while the second uses a binary partition tree with a …
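
    The active-contour refinement step described in this abstract can be sketched as follows. This is a minimal illustration using scikit-image's generic active_contour implementation, not the authors' own model; the initial circle and the parameter values stand in for the rough user selection and the paper's actual settings.

    # Minimal sketch: refine a rough user selection with an active contour (snake).
    # Uses scikit-image's generic implementation; parameters are illustrative only.
    import numpy as np
    from skimage import data
    from skimage.color import rgb2gray
    from skimage.filters import gaussian
    from skimage.segmentation import active_contour

    img = rgb2gray(data.astronaut())          # stand-in video frame

    # Rough user selection: a circle loosely drawn around the object, as (row, col) points.
    s = np.linspace(0, 2 * np.pi, 400)
    init = np.column_stack([100 + 100 * np.sin(s), 220 + 100 * np.cos(s)])

    # The snake iteratively moves toward strong image edges while staying smooth.
    snake = active_contour(
        gaussian(img, sigma=3, preserve_range=False),
        init,
        alpha=0.015,   # elasticity (contour length penalty)
        beta=10,       # rigidity (curvature penalty)
        gamma=0.001,   # step size
    )
    print(snake.shape)  # refined contour points, same shape as the initialization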

    The Proceedings of 15th Australian Information Security Management Conference, 5-6 December, 2017, Edith Cowan University, Perth, Australia

    Conference Foreword The annual Security Congress, run by the Security Research Institute at Edith Cowan University, includes the Australian Information Security and Management Conference. Now in its fifteenth year, the conference remains popular for its diverse content and mixture of technical research and discussion papers. The area of information security and management continues to be varied, as is reflected by the wide variety of subject matter covered by the papers this year. The papers cover topics from vulnerabilities in “Internet of Things” protocols through to improvements in biometric identification algorithms and surveillance camera weaknesses. The conference has drawn interest and papers from within Australia and internationally. All submitted papers were subject to a double-blind peer review process. Twenty-two papers were submitted from Australia and overseas, of which eighteen were accepted for final presentation and publication. We wish to thank the reviewers for kindly volunteering their time and expertise in support of this event. We would also like to thank the conference committee, who have organised yet another successful congress. Events such as this are impossible without the tireless efforts of such people in reviewing and editing the conference papers, and assisting with the planning, organisation and execution of the conference. To our sponsors, also a vote of thanks for both the financial and moral support provided to the conference. Finally, thank you to the administrative and technical staff, and students of the ECU Security Research Institute for their contributions to the running of the conference.

    Segmentation-based mesh design for motion estimation

    In most standard video codecs, motion estimation between two images is generally performed with the block matching algorithm (BMA). BMA represents the evolution of image content by decomposing an image into 2-D blocks in translational motion. This prediction technique usually leads to severe block-artefact distortions when the motion is large. Moreover, the systematic decomposition into regular blocks takes no account of the image content, and some parameters associated with the blocks, although useless, must still be transmitted, which increases the transmission bit-rate. To overcome these shortcomings of BMA, two important objectives in video coding are considered: obtaining good quality on the one hand and achieving very low bit-rate transmission on the other. To combine these two nearly contradictory requirements, a motion compensation technique is needed that yields good subjective characteristics as a transformation and requires only the motion information to be transmitted. This thesis proposes a motion compensation technique that designs 2-D triangular meshes from a segmentation of the image. The mesh decomposition is built from nodes distributed irregularly along the contours of the image, so the resulting decomposition is based on the image content. Moreover, since the same node selection method is applied at the encoder and the decoder, the only information required is the nodes' motion vectors, and a very low transmission bit-rate can thus be achieved. Compared with BMA, our approach improves both subjective and objective quality with much less motion information. The first chapter presents an introduction to the project. The second chapter analyses some compression techniques in the standard codecs, in particular the popular BMA and its shortcomings. The third chapter discusses in detail our proposed algorithm, called segmentation-based active mesh design. Motion estimation and compensation are then described in Chapter 4. Finally, Chapter 5 presents the simulation results and the conclusion. Abstract: In most video compression standards today, the generally accepted method for temporal prediction is motion compensation using the block matching algorithm (BMA). BMA represents the scene content evolution with 2-D rigid translational moving blocks. This kind of predictive scheme usually leads to distortions such as block artefacts, especially when the motion is large. The two most important aims in video coding are to achieve good quality on the one hand and a low bit-rate on the other. This thesis proposes a motion compensation scheme using a segmentation-based 2-D triangular mesh design method. The mesh is constructed from irregularly spread nodal points selected along image contours. Based on this, the generated mesh is, to a great extent, image-content based. Moreover, the nodes are selected with the same method on the encoder and decoder sides, so that the only information that has to be transmitted is their motion vectors, and thus a very low bit-rate can be achieved. Compared with BMA, our approach improves subjective and objective quality with much less motion information.
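
    As an illustration of the mesh-based idea, the toy sketch below shows how the motion vectors of a triangle's three nodes define an affine warp that predicts every pixel inside the triangle from the reference frame. This is not the thesis implementation; the node positions, motion vectors, and reference frame are made up for the example.

    # Toy sketch of triangular-mesh motion compensation (not the thesis code):
    # three node motion vectors define an affine map for the whole triangle.
    import numpy as np

    def affine_from_triangle(src_pts, dst_pts):
        """Solve the 2x3 affine transform mapping the src triangle nodes to the dst nodes."""
        A = np.zeros((6, 6))
        b = np.zeros(6)
        for i, ((xs, ys), (xd, yd)) in enumerate(zip(src_pts, dst_pts)):
            A[2 * i]     = [xs, ys, 1, 0, 0, 0]
            A[2 * i + 1] = [0, 0, 0, xs, ys, 1]
            b[2 * i], b[2 * i + 1] = xd, yd
        return np.linalg.solve(A, b).reshape(2, 3)

    # Current-frame triangle nodes and their motion vectors (toward the reference frame).
    nodes  = np.array([[10.0, 10.0], [50.0, 12.0], [30.0, 45.0]])
    motion = np.array([[1.5, -0.5], [2.0, 0.0], [1.0, 1.0]])
    M = affine_from_triangle(nodes, nodes + motion)

    # Predict a pixel inside the triangle by sampling the reference frame at the
    # warped location (nearest-neighbour here for brevity).
    ref = np.random.rand(64, 64)          # stand-in reference frame
    x, y = 30.0, 22.0                     # pixel inside the triangle
    xr, yr = M @ np.array([x, y, 1.0])
    predicted = ref[int(round(yr)), int(round(xr))]
    print(M, predicted)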

    Emotion recognition in talking-face videos using persistent entropy and neural networks

    The automatic recognition of a person's emotional state has become a very active research field involving scientists specialized in different areas such as artificial intelligence, computer vision, and psychology, among others. Our main objective in this work is to develop a novel approach, using persistent entropy and neural networks as main tools, to recognise and classify emotions from talking-face videos. Specifically, we combine audio-signal and image-sequence information to compute a topological signature (a 9-dimensional vector) for each video. We prove that small changes in the video produce small changes in the signature, ensuring the stability of the method. These topological signatures are used to feed a neural network that distinguishes between the following emotions: calm, happy, sad, angry, fearful, disgust, and surprised. The results are promising and competitive, beating the performances achieved in other state-of-the-art works in the literature. Funding: Agencia Estatal de Investigación PID2019-107339GB-100; Agencia Andaluza del Conocimiento P20-0114
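
    A small sketch of the persistent-entropy building block used by such topological signatures is given below. The persistence intervals are made up for illustration; in the paper they are computed from the audio signal and image sequence of each video, and the full 9-dimensional signature construction is not reproduced here.

    # Sketch of persistent entropy: Shannon entropy of normalized barcode lifetimes.
    import numpy as np

    def persistent_entropy(intervals):
        """Entropy of the normalized lifetimes of a persistence barcode."""
        intervals = np.asarray(intervals, dtype=float)
        lifetimes = intervals[:, 1] - intervals[:, 0]     # death - birth
        lifetimes = lifetimes[lifetimes > 0]
        p = lifetimes / lifetimes.sum()                   # normalized lifetimes
        return float(-(p * np.log(p)).sum())

    # Example barcode: (birth, death) pairs of topological features.
    barcode = [(0.0, 0.7), (0.1, 0.4), (0.2, 0.25), (0.05, 1.0)]
    print(persistent_entropy(barcode))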

    Development of a text reading system on video images

    Since the early days of computer science, researchers have sought to devise a machine that could automatically read text to help people with visual impairments. The problem of extracting and recognising text on document images has been largely resolved, but reading text from images of natural scenes remains a challenge. Scene text can present uneven lighting, complex backgrounds or perspective and lens distortion; it usually appears as short sentences or isolated words and shows a very diverse set of typefaces. However, video sequences of natural scenes provide a temporal redundancy that can be exploited to compensate for some of these deficiencies. Here we present a complete end-to-end, real-time scene text reading system on video images based on perspective-aware text tracking. The main contribution of this work is a system that automatically detects, recognises and tracks text in videos of natural scenes in real-time. The focus of our method is on large text found in outdoor environments, such as shop signs, street names and billboards. We introduce novel efficient techniques for text detection, text aggregation and text perspective estimation. Furthermore, we propose using a set of Unscented Kalman Filters (UKF) to maintain each text region's identity and to continuously track the homography transformation of the text into a fronto-parallel view, thereby being resilient to erratic camera motion and wide baseline changes in orientation. The orientation of each text line is estimated using a method that relies on the geometry of the characters themselves to estimate a rectifying homography. This is done irrespective of the view of the text over a large range of orientations. We also demonstrate a wearable head-mounted device for text reading that encases a camera for image acquisition and a pair of headphones for synthesized speech output. Our system is designed for continuous and unsupervised operation over long periods of time. It is completely automatic and features quick failure recovery and interactive text reading. It is also highly parallelised in order to maximize the usage of available processing power and to achieve real-time operation. We show comparative results that improve the current state-of-the-art when correcting perspective deformation of scene text. The end-to-end system performance is demonstrated on sequences recorded in outdoor scenarios. Finally, we also release a dataset of text tracking videos along with the annotated ground-truth of text regions.
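
    The fronto-parallel rectification idea can be sketched with OpenCV as below. The four corner points and the file names are assumed here for illustration; in the described system the corners are estimated from the character geometry and tracked over time with Unscented Kalman Filters rather than hard-coded.

    # Minimal sketch: warp a detected text region into a fronto-parallel view.
    import cv2
    import numpy as np

    frame = cv2.imread("frame.png")                      # hypothetical input frame

    # Corners of the text region in the image (clockwise from top-left)...
    src = np.float32([[120, 80], [430, 60], [445, 140], [135, 170]])
    # ...and where they should land in the rectified, fronto-parallel patch.
    w, h = 400, 100
    dst = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

    H = cv2.getPerspectiveTransform(src, dst)            # 3x3 rectifying homography
    rectified = cv2.warpPerspective(frame, H, (w, h))    # upright text patch for OCR
    cv2.imwrite("rectified_text.png", rectified)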

    A Framework for Modeling the Growth and Development of Neurons and Networks

    The development of neural tissue is a complex organizing process, in which it is difficult to grasp how the various localized interactions between dividing cells lead relentlessly to global network organization. Simulation is a useful tool for exploring such complex processes because it permits rigorous analysis of observed global behavior in terms of the mechanistic axioms declared in the simulated model. We describe a novel simulation tool, CX3D, for modeling the development of large realistic neural networks such as the neocortex, in a physical 3D space. In CX3D, as in biology, neurons arise by the replication and migration of precursors, which mature into cells able to extend axons and dendrites. Individual neurons are discretized into spherical (for the soma) and cylindrical (for neurites) elements that have appropriate mechanical properties. The growth functions of each neuron are encapsulated in a set of pre-defined modules that are automatically distributed across its segments during growth. The extracellular space is also discretized, and allows for the diffusion of extracellular signaling molecules, as well as the physical interactions of the many developing neurons. We demonstrate the utility of CX3D by simulating three interesting developmental processes: neocortical lamination based on mechanical properties of tissues; a growth model of a neocortical pyramidal cell based on layer-specific guidance cues; and the formation of a neural network in vitro by employing neurite fasciculation. We also provide some examples in which previous models from the literature are re-implemented in CX3D. Our results suggest that CX3D is a powerful tool for understanding neural development.
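
    As a toy sketch of the discretized extracellular space mentioned above (this is not CX3D itself, which is a Java tool), one explicit finite-difference step of the diffusion of a signaling molecule might look like the following; grid size, diffusion constant, and time step are illustrative.

    # Toy sketch: forward-Euler diffusion of a guidance molecule on a 2-D grid.
    import numpy as np

    def diffuse(C, D=1.0, dt=0.1, dx=1.0):
        """One explicit step of dC/dt = D * laplacian(C) with periodic boundaries."""
        lap = (
            np.roll(C, 1, axis=0) + np.roll(C, -1, axis=0) +
            np.roll(C, 1, axis=1) + np.roll(C, -1, axis=1) - 4 * C
        ) / dx**2
        return C + dt * D * lap

    # A point source of signaling molecule released by one cell, diffusing outward.
    conc = np.zeros((50, 50))
    conc[25, 25] = 100.0
    for _ in range(200):
        conc = diffuse(conc)

    # A growth cone could then climb this gradient to steer a growing neurite.
    grad_y, grad_x = np.gradient(conc)
    print(conc.max(), grad_x[25, 30])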

    Cross-layer Optimized Wireless Video Surveillance

    A wireless video surveillance system contains three major components: video capture and preprocessing, video compression and transmission over wireless sensor networks (WSNs), and video analysis at the receiving end. The coordination of these components is important for improving the end-to-end video quality, especially under communication resource constraints. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special properties of the PTU camera motion. In the second project, a binocular PTU camera is adopted for video object tracking. The work presented studies fast disparity estimation and 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project addresses multi-camera motion capture in remote healthcare monitoring, where the challenge is resource allocation across multiple video sequences. The presented cross-layer design incorporates delay-sensitive, content-aware video coding and transmission, and adaptive video coding and transmission, to ensure optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to inform future work. Adviser: Song C
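
    The cross-layer idea of jointly choosing coding and transmission parameters under a channel constraint can be sketched schematically as follows. The coding modes, rate/quality numbers, and delay budget are illustrative assumptions, not the dissertation's actual controller.

    # Schematic sketch: pick the highest-quality coding mode whose per-frame
    # transmission delay fits the budget under the current channel throughput.
    from dataclasses import dataclass

    @dataclass
    class CodingMode:
        name: str
        bitrate_kbps: float   # encoder output rate
        quality_db: float     # expected decoded quality (e.g. a PSNR proxy)

    MODES = [
        CodingMode("coarse", 200, 30.0),
        CodingMode("medium", 600, 34.5),
        CodingMode("fine", 1500, 38.0),
    ]

    def select_mode(channel_kbps, frame_interval_s, delay_budget_s):
        """Return the best mode whose per-frame transmission delay meets the budget."""
        feasible = []
        for m in MODES:
            bits_per_frame = m.bitrate_kbps * 1e3 * frame_interval_s
            delay = bits_per_frame / (channel_kbps * 1e3)   # seconds to send one frame
            if delay <= delay_budget_s:
                feasible.append(m)
        return max(feasible, key=lambda m: m.quality_db) if feasible else MODES[0]

    # Example: as the channel degrades, the controller falls back to a cheaper mode.
    print(select_mode(channel_kbps=2000, frame_interval_s=1 / 30, delay_budget_s=0.05).name)
    print(select_mode(channel_kbps=300,  frame_interval_s=1 / 30, delay_budget_s=0.05).name)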
