
    Robust real-time tracking in smart camera networks


    Visual Object Tracking Approach Based on Wavelet Transforms

    In this thesis, a new visual object tracking (VOT) approach is proposed to overcome the main challenge encountered by existing approaches: significant appearance changes, caused mainly by heavy occlusion and illumination variations. The proposed approach combines deep convolutional neural networks (CNNs), histograms of oriented gradients (HOG) features, and the discrete wavelet packet transform to implement three ideas. First, the problem of illumination variation is addressed by feeding the coefficients of the image's discrete wavelet packet transform, instead of the raw image template, into the CNN, which handles images with high saturation; the inverse discrete wavelet packet transform is applied at the output to extract the CNN features. Second, by combining four learned correlation filters with convolutional features, the target location is deduced from multichannel correlation maps at the CNN output. In addition, the maximum values of the correlation maps produced from the HOG features of the previously obtained image template are used as an updating parameter for the correlation filters extracted from both the CNN and the HOG features; the aim is to provide long-term memory of the target appearance, so that the target can be recovered if tracking fails. Third, to improve the performance of HOG, the coefficients of the discrete wavelet packet transform are again employed instead of the image template. Finally, to validate and evaluate the performance of the proposed tracking approach against state-of-the-art counterparts using standard performance metrics, extensive simulation experiments have been conducted on benchmark datasets such as OTB50, OTB100, TC128, and UAV20.
The obtained results clearly demonstrate the validity of the proposed approach in addressing these visual object tracking problems in almost all of the experimental cases presented in this thesis, compared to other existing tracking approaches.
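The illumination-robustness argument above rests on a basic property of wavelet detail coefficients: an additive brightness shift lives entirely in the low-pass band and leaves the detail bands untouched. A minimal NumPy sketch of a single-level 2D Haar decomposition illustrates this; the thesis uses the full discrete wavelet packet transform, so `haar_dwt2` below is an illustrative stand-in, not the thesis code:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar decomposition of a grayscale image.

    Returns the approximation (LL) and detail (LH, HL, HH) subbands.
    The detail subbands are invariant to a constant illumination
    offset, which is the property exploited when wavelet coefficients
    replace raw pixel templates (sketch only).
    """
    img = img[: img.shape[0] // 2 * 2, : img.shape[1] // 2 * 2].astype(float)
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    LL = (a + b + c + d) / 4.0       # low-pass: local averages
    LH = (a + b - c - d) / 4.0       # horizontal details
    HL = (a - b + c - d) / 4.0       # vertical details
    HH = (a - b - c + d) / 4.0       # diagonal details
    return LL, LH, HL, HH

# A constant brightness shift changes LL but not the detail subbands:
rng = np.random.default_rng(0)
patch = rng.random((8, 8))
_, LH1, HL1, HH1 = haar_dwt2(patch)
_, LH2, HL2, HH2 = haar_dwt2(patch + 50.0)   # heavy illumination offset
print(np.allclose(LH1, LH2), np.allclose(HL1, HL2), np.allclose(HH1, HH2))
# → True True True
```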

    De-identification for privacy protection in multimedia content : A survey

    This document is the Accepted Manuscript version of the following article: Slobodan Ribaric, Aladdin Ariyaeeinia, and Nikola Pavesic, ‘De-identification for privacy protection in multimedia content: A survey’, Signal Processing: Image Communication, Vol. 47, pp. 131-151, September 2016, doi: https://doi.org/10.1016/j.image.2016.05.020. This manuscript version is distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives License CC BY-NC-ND 4.0 (http://creativecommons.org/licenses/by-nc-nd/4.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited, and is not altered, transformed, or built upon in any way.

    Privacy is one of the most important social and political issues in our information society, characterized by a growing range of enabling and supporting technologies and services. Amongst these are communications, multimedia, biometrics, big data, cloud computing, data mining, the internet, social networks, and audio-video surveillance. Each of these can potentially provide the means for privacy intrusion. De-identification is one of the main approaches to privacy protection in multimedia content (text, still images, audio and video sequences, and their combinations). It is a process for concealing or removing personal identifiers, or replacing them with surrogate personal identifiers, in order to prevent the disclosure and use of data for purposes unrelated to the purpose for which the information was originally obtained. Based on the proposed taxonomy, inspired by the Safe Harbour approach, the personal identifiers, i.e., the personally identifiable information, are classified as non-biometric, physiological and behavioural biometric, and soft biometric identifiers. In order to protect the privacy of an individual, all of the above identifiers have to be de-identified in multimedia content.
This paper presents a review of the concepts of privacy and the linkage among privacy, privacy protection, and the methods and technologies designed specifically for privacy protection in multimedia content. The study provides an overview of de-identification approaches for non-biometric identifiers (text, hairstyle, dressing style, license plates), as well as for physiological (face, fingerprint, iris, ear), behavioural (voice, gait, gesture) and soft-biometric (body silhouette, gender, age, race, tattoo) identifiers in multimedia documents. Peer reviewed.
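As a concrete illustration of the simplest class of de-identification filters the survey covers, the sketch below pixelates an image region, collapsing fine detail that could reveal an identifier. The helper name `pixelate_region` and the coordinates are hypothetical; a real system would first localise the identifier (e.g. a face or license plate) with a detector:

```python
import numpy as np

def pixelate_region(img, y0, y1, x0, x1, block=8):
    """Replace a rectangular region with coarse constant blocks,
    a common baseline de-identification filter (illustrative
    helper, not taken from the survey itself)."""
    out = img.copy().astype(float)
    for y in range(y0, y1, block):
        for x in range(x0, x1, block):
            ys = slice(y, min(y + block, y1))
            xs = slice(x, min(x + block, x1))
            out[ys, xs] = out[ys, xs].mean()   # each block collapses to its mean
    return out

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
anon = pixelate_region(img, 16, 48, 16, 48, block=8)
# Outside the region the image is untouched; inside, fine detail is gone.
print(np.array_equal(anon[:16], img[:16]))
# → True
```

Note this is a one-way filter: unlike surrogate-identifier replacement, which the survey also discusses, pixelation discards the original content irreversibly.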

    Temporal Image Forensics for Picture Dating based on Machine Learning

    Temporal image forensics involves the investigation of multimedia digital forensic material related to crime, with the goal of obtaining accurate evidence about activity and timing that can be presented in a court of law. Because of the ever-increasing complexity of crime in the digital age, forensic investigations are increasingly dependent on timing information. The simplest way to extract such information would be to use the EXIF header of picture files, as it contains most of the relevant metadata. However, header data can easily be removed or manipulated and hence cannot be evidential, so estimating the acquisition time of digital photographs has become more challenging. This PhD research proposes to use image contents instead of file headers to solve this problem. The thesis presents a number of contributions in the area of temporal image forensics for picture dating. Firstly, it introduces the unique Northumbria Temporal Image Forensics (NTIF) picture database for temporal image forensics. Using the NTIF database, the changes in Photo Response Non-Uniformity (PRNU) as digital sensors age have been highlighted, and it is concluded that PRNU is not a useful feature for picture dating. Apart from PRNU, defective pixels constitute another sensor imperfection of forensic relevance. Secondly, the thesis shows that the filter-based PRNU technique outperforms deep convolutional neural networks (CNNs) for source camera identification when only limited numbers of images are available to the forensic analyst: because the sensor pattern noise feature is location-sensitive, the performance of the CNN-based approach declines when sensor pattern noise image blocks from the same category are fed into the CNN at different locations.
Thirdly, a deep learning technique is applied to picture dating, showing promising results with performance levels of 80% to 88% depending on the digital camera used. The key finding is that a deep learning approach can successfully learn the temporal changes in image contents, rather than the sensor pattern noise. Finally, the thesis proposes a technique to estimate the acquisition time slots of digital pictures using a set of candidate defective pixel locations in non-overlapping image blocks. The temporal behaviour of camera sensor defects in digital pictures is analyzed using a machine learning technique in which candidate defective pixels are determined from the related pixel neighbourhood and two proposed local variation features. The idea of virtual timescales using halves of real time slots, combined with prediction scores for image blocks, is proposed to enhance performance. When assessed on the NTIF image dataset, the proposed system achieves very promising results, estimating the acquisition times of digital pictures with an accuracy between 88% and 93% and exhibiting clear superiority over relevant state-of-the-art systems.
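The defective-pixel idea can be sketched simply: a stuck or hot pixel deviates sharply from its neighbourhood, while normal scene content varies smoothly. The snippet below flags pixels that differ strongly from their 8-neighbourhood median; this is a simplified stand-in for the thesis's candidate selection, and the exact "local variation" features are not reproduced here:

```python
import numpy as np

def candidate_defects(img, thresh=50.0):
    """Flag pixels deviating strongly from their 8-neighbourhood median.

    A simplified proxy for candidate-defective-pixel selection
    (illustrative sketch, not the thesis's actual features).
    """
    h, w = img.shape
    flags = np.zeros((h, w), dtype=bool)
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            nb = img[y - 1:y + 2, x - 1:x + 2].astype(float).ravel()
            nb = np.delete(nb, 4)            # drop the centre pixel itself
            flags[y, x] = abs(img[y, x] - np.median(nb)) > thresh
    return flags

frame = np.full((16, 16), 100.0)
frame[5, 7] = 250.0                          # a hot (stuck-high) pixel
print(candidate_defects(frame).nonzero())
# → (array([5]), array([7]))
```

In the thesis, the appearance and disappearance of such defects over a camera's lifetime is what carries the dating signal; the machine learning stage then maps per-block defect evidence to acquisition time slots.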

    Learning as a Nonlinear Line of Attraction for Pattern Association, Classification and Recognition

    This dissertation presents the development of a mathematical model for learning a nonlinear line of attraction, in contrast to the conventional recurrent neural network model, in which memory is stored as attractive fixed points at discrete locations in state space. A nonlinear line of attraction encapsulates attractive fixed points scattered in state space as an attractive nonlinear line, describing patterns with similar characteristics as a family of patterns. It is usually of prime importance to guarantee the convergence of the dynamics of a recurrent network for associative learning and recall. We propose to alter this picture: if the brain remembers by converging to states representing familiar patterns, it should also diverge from such states when presented with an unknown encoded representation of a visual image. The design of the nonlinear line attractor network's dynamics to operate between stable and unstable states is the second contribution of this dissertation research. These criteria can be used to circumvent the plasticity-stability dilemma by using the unstable state as an indicator to create a new line for an unfamiliar pattern. This novel learning strategy uses the stability (convergence) and instability (divergence) criteria of the designed dynamics to induce self-organizing behavior. The self-organizing behavior of the nonlinear line attractor model can manifest complex dynamics in an unsupervised manner. The third contribution of this dissertation is the introduction of the concept of a manifold of color perception. The fourth contribution is the development of a nonlinear dimensionality reduction technique that embeds a set of related observations into a low-dimensional space using the learned memory matrices of the nonlinear line attractor network. The development of a system for computing affective states is also presented.
This system is capable of extracting the user's mental state in real time using a low-cost computer, and it has been successfully interfaced with an advanced learning environment for human-computer interaction.
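The contrast between point attractors and a line of attraction can be seen in a toy linear system: give the dynamics eigenvalue 1 along a chosen line and eigenvalue below 1 across it, and every point on the line is a fixed point while every state off the line is pulled onto it. This is only a linear illustration of the concept, not the dissertation's nonlinear model:

```python
import numpy as np

# A line of attraction in the plane: states on span{v} are fixed points,
# states off the line converge to their projection onto it.
v = np.array([1.0, 1.0]) / np.sqrt(2.0)
P = np.outer(v, v)                      # projector onto the line
A = P + 0.5 * (np.eye(2) - P)           # eigenvalue 1 along v, 0.5 across

x0 = np.array([3.0, -1.0])
x = x0.copy()
for _ in range(40):                     # iterate the recurrent dynamics
    x = A @ x

# The iterate converges onto the line, to the projection of x0:
print(np.allclose(x, P @ x0))
# → True
```

The dissertation's divergence test follows the same geometry: a familiar pattern lies near a learned line and the dynamics converge, while an unfamiliar pattern drives the state away, signalling that a new line should be created.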

    State of the art of audio- and video based solutions for AAL

    Working Group 3: Audio- and Video-based AAL Applications. It is a matter of fact that Europe is facing more and more crucial challenges regarding health and social care due to demographic change and the current economic context. The recent COVID-19 pandemic has stressed this situation even further, highlighting the need to take action. Active and Assisted Living (AAL) technologies come as a viable approach to help face these challenges, thanks to their high potential for enabling remote care and support. Broadly speaking, AAL refers to the use of innovative and advanced Information and Communication Technologies to create supportive, inclusive and empowering applications and environments that enable older, impaired or frail people to live independently and stay active longer in society. AAL capitalizes on the growing pervasiveness and effectiveness of sensing and computing facilities to supply persons in need with smart assistance, responding to their needs for autonomy, independence, comfort, security and safety. The application scenarios addressed by AAL are complex, due to the inherent heterogeneity of the end-user population, their living arrangements, and their physical conditions or impairments. Despite aiming at diverse goals, AAL systems should share some common characteristics. They are designed to provide support in daily life in an invisible, unobtrusive and user-friendly manner. Moreover, they are conceived to be intelligent, able to learn and adapt to the requirements and requests of the assisted people, and to synchronise with their specific needs. Nevertheless, to ensure the uptake of AAL in society, potential users must be willing to use AAL applications and to integrate them into their daily environments and lives. In this respect, video- and audio-based AAL applications have several advantages in terms of unobtrusiveness and information richness.
Indeed, cameras and microphones are far less obtrusive than wearable sensors, which may hinder one's activities. In addition, a single camera placed in a room can record most of the activities performed there, thus replacing many other non-visual sensors. Currently, video-based applications are effective in recognising and monitoring the activities, the movements, and the overall conditions of assisted individuals, as well as in assessing their vital parameters (e.g., heart rate, respiratory rate). Similarly, audio sensors have the potential to become one of the most important modalities for interaction with AAL systems, as they have a large sensing range, do not require physical presence at a particular location, and are physically intangible. Moreover, relevant information about individuals' activities and health status can be derived from processing audio signals (e.g., speech recordings). Nevertheless, as the other side of the coin, cameras and microphones are often perceived as the most intrusive technologies from the viewpoint of the privacy of the monitored individuals. This is due to the richness of the information these technologies convey and the intimate settings where they may be deployed. Solutions able to ensure privacy preservation by context and by design, as well as to ensure high legal and ethical standards, are in high demand. After the review of the current state of play and the discussion in GoodBrother, we may claim that the first solutions in this direction are starting to appear in the literature. A multidisciplinary debate among experts and stakeholders is paving the way towards AAL that ensures ergonomics, usability, acceptance and privacy preservation. The DIANA, PAAL, and VisuAAL projects are examples of this fresh approach. This report provides the reader with a review of the most recent advances in audio- and video-based monitoring technologies for AAL.
It has been drafted as a collective effort of WG3 to supply an introduction to AAL, its evolution over time, and its main functional and technological underpinnings. In this respect, the report contributes to the field with the outline of a new generation of ethics-aware AAL technologies and a proposal for a novel comprehensive taxonomy of AAL systems and applications. Moreover, the report allows non-technical readers to gather an overview of the main components of an AAL system and how these function and interact with the end-users. The report illustrates the state of the art of the most successful AAL applications and functions based on audio and video data, namely (i) lifelogging and self-monitoring, (ii) remote monitoring of vital signs, (iii) emotional state recognition, (iv) food intake monitoring, activity and behaviour recognition, (v) activity and personal assistance, (vi) gesture recognition, (vii) fall detection and prevention, (viii) mobility assessment and frailty recognition, and (ix) cognitive and motor rehabilitation. For these application scenarios, the report illustrates the state of play in terms of scientific advances, available products and research projects. The open challenges are also highlighted. The report ends with an overview of the challenges, the hindrances and the opportunities posed by the uptake of AAL technologies in real-world settings. In this respect, the report illustrates the current procedural and technological approaches to coping with acceptability, usability and trust in AAL technology, by surveying strategies and approaches to co-design, to privacy preservation in video and audio data, to transparency and explainability in data processing, and to data transmission and communication. User acceptance and ethical considerations are also debated. Finally, the potential coming from the silver economy is overviewed.

    Tracking dynamic regions of texture and shape

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2007. Includes bibliographical references (p. 137-142). The tracking of visual phenomena is a problem of fundamental importance in computer vision. Tracks are used in many contexts, including object recognition, classification, camera calibration, and scene understanding. However, the use of such data is limited by the types of objects we are able to track and the environments in which we can track them. Objects whose shape or appearance can change in complex ways are difficult to track, as it is difficult to represent or predict the appearance of such objects. Furthermore, other elements of the scene may interact with the tracked object, changing its appearance or hiding part or all of it from view. In this thesis, we address the problem of tracking deformable, dynamically textured regions under challenging conditions involving visual clutter, distractions, and multiple and prolonged occlusions. We introduce a model of appearance capable of compactly representing regions undergoing nonuniform, nonrepeating changes to both their textured appearance and shape. We describe methods of maintaining such a model and show how it enables efficient and effective occlusion reasoning. By treating the visual appearance as a dynamically changing textured region, we show how such a model enables the tracking of groups of people. By tracking groups of people instead of each individual independently, we are able to track in environments where it would otherwise be difficult or impossible. We demonstrate the utility of the model by tracking many regions under diverse conditions, including indoor and outdoor scenes, near-field and far-field camera positions, through occlusion and through complex interactions with other visual elements, and by tracking such varied phenomena as meteorological data, seismic imagery, and groups of people. By Joshua Migdal. Ph.D.

    GEOBIA 2016: Solutions and Synergies, 14-16 September 2016, University of Twente Faculty of Geo-Information and Earth Observation (ITC): open access e-book
