Communication framework for distributed computer vision on stationary and mobile platforms
Recent advances in the complexity and manufacturability of digital video cameras, coupled with the ubiquity of high-speed computers and communication networks, have led to burgeoning research in the fields of computer vision and image understanding. As the resulting vision algorithms become increasingly complex, a need arises for robust communication between remote cameras on mobile units and their associated distributed vision algorithms. A communication framework would provide a basis for modularization and abstraction of a collection of computer vision algorithms; the resulting system would allow for straightforward image capture, simplified communication between algorithms, and easy replacement or upgrade of existing component algorithms. The objective of this thesis is to create such a communication framework and demonstrate its viability and applicability by implementing a relatively complex system of distributed computer vision algorithms. These multi-camera algorithms include body tracking, pose estimation, and face recognition. Although a plethora of research exists documenting individual algorithms that utilize multiple networked cameras, this thesis aims to develop a novel way of sharing information between cameras and algorithms in a distributed computation system. In addition, this thesis strives to extend such an approach to using both stationary and mobile cameras. For this application, a mobile computer vision platform was developed that integrates seamlessly with the aforementioned communication framework, extending both its functionality and robustness.
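As a rough illustration of the kind of modularization such a framework enables, the sketch below (entirely hypothetical, not the thesis's actual design) wires camera sources and vision algorithms together through a topic-based publish/subscribe bus, so a component can be replaced or upgraded without touching its peers:

```python
from collections import defaultdict

class VisionBus:
    """Topic-based publish/subscribe bus for loosely coupled vision components."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        # a tracker, pose estimator, etc. registers interest in a topic
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        # deliver the message to every component subscribed to this topic;
        # topics with no subscribers are silently dropped
        for callback in self._subscribers[topic]:
            callback(message)
```

With this pattern, a camera node publishes to a topic such as `"frames/cam0"` and any number of downstream algorithms consume it, none of them aware of the others.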
Fast ADMM Algorithm for Distributed Optimization with Adaptive Penalty
We propose new methods to speed up convergence of the Alternating Direction Method of Multipliers (ADMM), a common optimization tool in the context of large-scale and distributed learning. The proposed method accelerates convergence by automatically deciding the constraint penalty needed for parameter consensus in each iteration. In addition, we propose an extension of the method that adaptively determines the maximum number of iterations over which the penalty is updated. We show that this approach effectively leads to an adaptive, dynamic network topology underlying the distributed optimization. The utility of the new penalty update schemes is demonstrated on both synthetic and real data, including a computer vision application of distributed structure from motion.
Attentive monitoring of multiple video streams driven by a Bayesian foraging strategy
In this paper we consider the problem of deploying attention to subsets of video streams in order to collate the data and information most relevant to a given task. We formalize this monitoring problem as a foraging problem and propose a probabilistic framework that models the observer's attentive behavior as that of a forager. The forager, moment to moment, focuses its attention on the most informative stream/camera, detects interesting objects or activities, or switches to a more profitable stream. The proposed approach is well suited to multi-stream video summarization, and it can also serve as a preliminary step for more sophisticated video surveillance, e.g. activity and behavior analysis. Experimental results on the UCR Videoweb Activities Dataset, a publicly available dataset, illustrate the utility of the proposed technique. (Accepted to IEEE Transactions on Image Processing.)
A novel decentralised system architecture for multi-camera target tracking
Target tracking in a multi-camera system is an active and challenging research area that, in many systems, requires video synchronisation and knowledge of the camera set-up and layout. In this paper, a highly flexible, modular, and decentralised system architecture is presented for multi-camera target tracking with relaxed synchronisation constraints among camera views. Moreover, the system does not rely on positional information to handle camera hand-off events. As a practical application, the system can, at any time, automatically select the best available target view, implicitly handling occlusions. Further, to validate the proposed architecture, an extension of the colour-based IMS-SWAD tracker to a multi-camera environment is used. The experimental results show that the tracker can successfully track a chosen target across multiple views, in both indoor and outdoor environments, with non-overlapping and overlapping camera views.
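One way to picture position-free best-view selection (purely hypothetical, not the paper's architecture): each camera node reports only its tracker's confidence score with a timestamp, and the freshest, most confident view wins. Relaxed synchronisation appears as a staleness threshold rather than frame-level alignment:

```python
import time
from dataclasses import dataclass

@dataclass
class ViewReport:
    camera_id: str
    confidence: float  # tracker match score in [0, 1]
    timestamp: float   # node-local capture time, seconds

class BestViewSelector:
    """Selects the best target view from per-camera confidence reports only."""

    def __init__(self, max_age=0.5):
        self.max_age = max_age  # seconds before a report is considered stale
        self.latest = {}        # camera_id -> most recent ViewReport

    def update(self, report):
        self.latest[report.camera_id] = report

    def best_view(self, now=None):
        now = time.time() if now is None else now
        # keep only reasonably fresh reports (relaxed synchronisation)
        fresh = [r for r in self.latest.values()
                 if now - r.timestamp <= self.max_age]
        if not fresh:
            return None  # target currently lost in all views
        return max(fresh, key=lambda r: r.confidence).camera_id
```

Because hand-off decisions depend only on confidence and freshness, no camera positions, overlap maps, or shared clocks beyond coarse agreement are needed.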
Grassmann Averages for Scalable Robust PCA
As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase – “big data” implies “big outliers”. While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA do not scale beyond small-to-medium sized datasets. To address this, we introduce the Grassmann Average (GA), which expresses dimensionality reduction as an average of the subspaces spanned by the data. Because averages can be efficiently computed, we immediately gain scalability. GA is inherently more robust than PCA, but we show that they coincide for Gaussian data. We exploit that averages can be made robust to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. Robustness can be with respect to vectors (subspaces) or elements of vectors; we focus on the latter and use a trimmed average. The resulting Trimmed Grassmann Average (TGA) is particularly appropriate for computer vision because it is robust to pixel outliers. The algorithm has low computational complexity and minimal memory requirements, making it scalable to “big noisy data.” We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie.
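A minimal sketch of the one-dimensional Grassmann Average idea described above: flip each sample to agree in sign with the current direction estimate, then average, optionally with an element-wise trimmed average as in TGA. Implementation details here are simplified and assumed, not the authors' code:

```python
import numpy as np

def grassmann_average(X, iters=50, trim=0.0):
    """Leading 'average subspace' direction of the rows of X (a sketch).

    Each iteration flips every sample to agree in sign with the current
    estimate and averages the flipped samples; trim > 0 replaces the mean
    with an element-wise trimmed mean, as in the Trimmed Grassmann Average.
    """
    rng = np.random.default_rng(0)
    q = rng.standard_normal(X.shape[1])
    q /= np.linalg.norm(q)
    for _ in range(iters):
        signs = np.sign(X @ q)
        signs[signs == 0] = 1.0
        flipped = signs[:, None] * X  # samples aligned with current estimate
        if trim > 0.0:
            lo = int(trim * len(X))  # drop the lo smallest/largest per element
            m = np.mean(np.sort(flipped, axis=0)[lo:len(X) - lo], axis=0)
        else:
            m = flipped.mean(axis=0)
        norm = np.linalg.norm(m)
        if norm == 0.0:
            break
        q = m / norm
    return q
```

The trimming is applied per coordinate, which matches the element-wise (pixel-level) robustness the abstract emphasizes for TGA.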
Web service for automatic generation of 3D models through video
This thesis aims to create a scalable web service capable of generating 3D models from video. In order to improve usability and reduce costs, the video is expected to be acquired through a monocular acquisition system. To make this tool as simple and flexible as possible, a web service is developed; this allows developers to create other applications that can be accessed from less powerful devices, such as smartphones, provided that they have an Internet connection. The major contribution of this thesis is the design of the cloud system architecture, adapting existing 3D reconstruction algorithms to this new setting, since the majority of computer vision algorithms are not distributed.
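Purely as an illustration of such a service's job flow (all names hypothetical, not the thesis's API): a client submits a monocular video, polls a job status, and retrieves the model path once a worker finishes the reconstruction. This keeps the heavy computation in the cloud while smartphones only upload and poll:

```python
import uuid
from dataclasses import dataclass
from typing import Optional

@dataclass
class ReconstructionJob:
    job_id: str
    video_path: str
    status: str = "queued"            # queued -> processing -> done
    model_path: Optional[str] = None  # set once the 3D model is generated

class ReconstructionService:
    """Tracks reconstruction jobs so lightweight clients can poll for results."""

    def __init__(self):
        self._jobs = {}  # job_id -> ReconstructionJob

    def submit(self, video_path):
        # called by the upload endpoint; returns a handle the client polls
        job = ReconstructionJob(job_id=uuid.uuid4().hex, video_path=video_path)
        self._jobs[job.job_id] = job
        return job.job_id

    def status(self, job_id):
        return self._jobs[job_id].status

    def complete(self, job_id, model_path):
        # called by a worker after running the 3D reconstruction pipeline
        job = self._jobs[job_id]
        job.status = "done"
        job.model_path = model_path
```

In a real deployment the job table would live in a shared store and the workers would run the reconstruction pipeline, but the submit/poll/complete contract is the part lightweight clients depend on.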