837 research outputs found

    High-performance computing for vision

    Get PDF
    Vision is a challenging application for high-performance computing (HPC). Many vision tasks have stringent latency and throughput requirements. Further, the vision process has a heterogeneous computational profile. Low-level vision consists of structured computations, with regular data dependencies. The subsequent, higher level operations consist of symbolic computations with irregular data dependencies. Over the years, many approaches to high-speed vision have been pursued. VLSI hardware solutions such as ASIC's and digital signal processors (DSP's) have provided good processing speeds on structured low-level vision tasks. Special purpose systems for vision have also been designed. Currently, there is growing interest in using general purpose parallel systems for vision problems. These systems offer advantages of higher performance, sofavare programmability, generality, and architectural flexibility over the earlier approaches. The choice of low-cost commercial-off-theshelf (COTS) components as building blocks for these systems leads to easy upgradability and increased system life. The main focus of the paper is on effectively using the COTSbased general purpose parallel computing platforms to realize high-speed implementations of vision tasks. Due to the successful use of the COTS-based systems in a variety of high performance applications, it is attractive to consider their use for vision applications as well. However, the irregular data dependencies in vision tasks lead to large communication overheads in the HPC systems. At the University of Southern California, our research efforts have been directed toward designing scalable parallel algorithms for vision tasks on the HPC systems. In our approach, we use the message passing programming model to develop portable code. Our algorithms are specified using C and MPI. In this paper, we summarize our efforts, and illustrate our approach using several example vision tasks. To facilitate the analysis and development of scalable algorithms, a realistic computational model of the parallel system must be used. Several such models have been proposed in the literature. We use the General-purpose Distributed Memory (GDM) model which is a simple but realistic model of state-of-theart parallel machines. Using the GDM model, generic algorithmic techniques such as data remapping, overlapping of communication with computation, message packing, asynchronous execution, and communication scheduling are developed. Using these techniques, we have developed scalable algorithms for many vision tasks. For instance, a scalable algorithm for linear approximation has been developed using the asynchronous execution technique. Using this algorithm, linear feature extraction can be performed in 0.065 s on a 64 node SP-2 for a 512 × 512 image. A serial implementation takes 3.45 s for the same task. Similarly, the communication scheduling and decomposition techniques lead to a scalable algorithm for the line grouping task. We believe that such an algorithmic approach can result in the development of scalable and portable solutions for vision tasks. © 1996 IEEE Publisher Item Identifier S 0018-9219(96)04992-4.published_or_final_versio

    Efficient Implementation of Stochastic Inference on Heterogeneous Clusters and Spiking Neural Networks

    Get PDF
    Neuromorphic computing refers to brain inspired algorithms and architectures. This paradigm of computing can solve complex problems which were not possible with traditional computing methods. This is because such implementations learn to identify the required features and classify them based on its training, akin to how brains function. This task involves performing computation on large quantities of data. With this inspiration, a comprehensive multi-pronged approach is employed to study and efficiently implement neuromorphic inference model using heterogeneous clusters to address the problem using traditional Von Neumann architectures and by developing spiking neural networks (SNN) for native and ultra-low power implementation. In this regard, an extendable high-performance computing (HPC) framework and optimizations are proposed for heterogeneous clusters to modularize complex neuromorphic applications in a distributed manner. To achieve best possible throughput and load balancing for such modularized architectures a set of algorithms are proposed to suggest the optimal mapping of different modules as an asynchronous pipeline to the available cluster resources while considering the complex data dependencies between stages. On the other hand, SNNs are more biologically plausible and can achieve ultra-low power implementation due to its sparse spike based communication, which is possible with emerging non-Von Neumann computing platforms. As a significant progress in this direction, spiking neuron models capable of distributed online learning are proposed. A high performance SNN simulator (SpNSim) is developed for simulation of large scale mixed neuron model networks. An accompanying digital hardware neuron RTL is also proposed for efficient real time implementation of SNNs capable of online learning. Finally, a methodology for mapping probabilistic graphical model to off-the-shelf neurosynaptic processor (IBM TrueNorth) as a stochastic SNN is presented with ultra-low power consumption

    Massively Parallel Simulation of Structured Connectionist Networks: An Interim Report

    Get PDF
    We map structured connectionist models of knowledge representation and reasoning onto existing general purpose massively parallel architectures with the objective of developing and implementing practical, real-time knowledge base systems. Shruti, a connectionist knowledge representation and reasoning system which attempts to model reflexive reasoning, will serve as our representative connectionist model. Efficient simulation systems for shruti are developed on the Connection Machine CM-2 - an SIMD architecture - and on the Connection Machine CM-5 - an MIMD architecture. The resulting simulators are evaluated and tested using large, random knowledge bases with up to half a million rules and facts. Though SIMD simulations on the CM-2 are reasonably fast - requiring a few seconds to tens of seconds for answering simple queries - experiments indicate that MIMD simulations are vastly superior to SIMD simulations and offer hundred- to thousand-fold speedups. This work provides new insights into the simulation of structured connectionist networks on massively parallel machines and is a step toward developing large yet efficient knowledge representation and reasoning systems

    Computer vision algorithms on reconfigurable logic arrays

    Full text link

    A multimodal framework for interactive sonification and sound-based communication

    Get PDF

    Distributed Spectral Graph Methods for Analyzing Large-Scale Unstructured Biomedical Data

    Get PDF
    There is an ever-expanding body of biological data, growing in size and complexity, out- stripping the capabilities of standard database tools or traditional analysis techniques. Such examples include molecular dynamics simulations, drug-target interactions, gene regulatory networks, and high-throughput imaging. Large-scale acquisition and curation biological data has already yielded results in the form of lower costs for genome sequencing and greater cov- erage in databases such as GenBank, and is viewed as the future of biocuration. The “big data” philosophy and its associated paradigms and frameworks have the potential to uncover solutions to problems otherwise intractable with more traditional investigative techniques. Here, we focus on two biological systems whose data form large, undirected graphs. First, we develop a quantitative model of ciliary motion phenotypes, using spectral graph methods for unsupervised latent pattern discovery. Second, we apply similar techniques to identify a mapping between physiochemical structure and odor percept in human olfaction. In both cases, we experienced computational bottlenecks in our statistical machinery, necessitating the creation of a new analysis framework. At the core of this framework is a distributed hierarchical eigensolver, which we compare directly to other popular solvers. We demon- strate its essential role in enabling the discovery of novel ciliary motion phenotypes and in identifying physiochemical-perceptual associations

    A machine learning approach to the unsupervised segmentation of mitochondria in subcellular electron microscopy data

    Get PDF
    Recent advances in cellular and subcellular microscopy demonstrated its potential towards unravelling the mechanisms of various diseases at the molecular level. The biggest challenge in both human- and computer-based visual analysis of micrographs is the variety of nanostructures and mitochondrial morphologies. The state-of-the-art is, however, dominated by supervised manual data annotation and early attempts to automate the segmentation process were based on supervised machine learning techniques which require large datasets for training. Given a minimal number of training sequences or none at all, unsupervised machine learning formulations, such as spectral dimensionality reduction, are known to be superior in detecting salient image structures. This thesis presents three major contributions developed around the spectral clustering framework which is proven to capture perceptual organization features. Firstly, we approach the problem of mitochondria localization. We propose a novel grouping method for the extracted line segments which describes the normal mitochondrial morphology. Experimental findings show that the clusters obtained successfully model the inner mitochondrial membrane folding and therefore can be used as markers for the subsequent segmentation approaches. Secondly, we developed an unsupervised mitochondria segmentation framework. This method follows the evolutional ability of human vision to extrapolate salient membrane structures in a micrograph. Furthermore, we designed robust non-parametric similarity models according to Gestaltic laws of visual segregation. Experiments demonstrate that such models automatically adapt to the statistical structure of the biological domain and return optimal performance in pixel classification tasks under the wide variety of distributional assumptions. The last major contribution addresses the computational complexity of spectral clustering. Here, we introduced a new anticorrelation-based spectral clustering formulation with the objective to improve both: speed and quality of segmentation. The experimental findings showed the applicability of our dimensionality reduction algorithm to very large scale problems as well as asymmetric, dense and non-Euclidean datasets

    Foveated Video Streaming for Cloud Gaming

    Get PDF
    Video gaming is generally a computationally intensive application and to provide a pleasant user experience specialized hardware like Graphic Processing Units may be required. Computational resources and power consumption are constraints which limit visually complex gaming on, for example, laptops, tablets and smart phones. Cloud gaming may be a possible approach towards providing a pleasant gaming experience on thin clients which have limited computational and energy resources. In a cloud gaming architecture, the game-play video is rendered and encoded in the cloud and streamed to a client where it is displayed. User inputs are captured at the client and streamed back to the server, where they are relayed to the game. High quality of experience requires the streamed video to be of high visual quality which translates to substantial downstream bandwidth requirements. The visual perception of the human eye is non-uniform, being maximum along the optical axis of the eye and dropping off rapidly away from it. This phenomenon, called foveation, makes the practice of encoding all areas of a video frame with the same resolution wasteful. In this thesis, foveated video streaming from a cloud gaming server to a cloud gaming client is investigated. A prototype cloud gaming system with foveated video streaming is implemented. The cloud gaming server of the prototype is configured to encode gameplay video in a foveated fashion based on gaze location data provided by the cloud gaming client. The effect of foveated encoding on the output bitrate of the streamed video is investigated. Measurements are performed using games from various genres and with different player points of view to explore changes in video bitrate with different parameters of foveation. Latencies involved in foveated video streaming for cloud gaming, including latency of the eye tracker used in the thesis, are also briefly discussed

    Cloud Computing in VANETs: Architecture, Taxonomy, and Challenges

    Get PDF
    Cloud Computing in VANETs (CC-V) has been investigated into two major themes of research including Vehicular Cloud Computing (VCC) and Vehicle using Cloud (VuC). VCC is the realization of autonomous cloud among vehicles to share their abundant resources. VuC is the efficient usage of conventional cloud by on-road vehicles via a reliable Internet connection. Recently, number of advancements have been made to address the issues and challenges in VCC and VuC. This paper qualitatively reviews CC-V with the emphasis on layered architecture, network component, taxonomy, and future challenges. Specifically, a four-layered architecture for CC-V is proposed including perception, co-ordination, artificial intelligence and smart application layers. Three network component of CC-V namely, vehicle, connection and computation are explored with their cooperative roles. A taxonomy for CC-V is presented considering major themes of research in the area including design of architecture, data dissemination, security, and applications. Related literature on each theme are critically investigated with comparative assessment of recent advances. Finally, some open research challenges are identified as future issues. The challenges are the outcome of the critical and qualitative assessment of literature on CC-V

    A cortical model of object perception based on Bayesian networks and belief propagation.

    Get PDF
    Evidence suggests that high-level feedback plays an important role in visual perception by shaping the response in lower cortical levels (Sillito et al. 2006, Angelucci and Bullier 2003, Bullier 2001, Harrison et al. 2007). A notable example of this is reflected by the retinotopic activation of V1 and V2 neurons in response to illusory contours, such as Kanizsa figures, which has been reported in numerous studies (Maertens et al. 2008, Seghier and Vuilleumier 2006, Halgren et al. 2003, Lee 2003, Lee and Nguyen 2001). The illusory contour activity emerges first in lateral occipital cortex (LOC), then in V2 and finally in V1, strongly suggesting that the response is driven by feedback connections. Generative models and Bayesian belief propagation have been suggested to provide a theoretical framework that can account for feedback connectivity, explain psychophysical and physiological results, and map well onto the hierarchical distributed cortical connectivity (Friston and Kiebel 2009, Dayan et al. 1995, Knill and Richards 1996, Geisler and Kersten 2002, Yuille and Kersten 2006, Deneve 2008a, George and Hawkins 2009, Lee and Mumford 2003, Rao 2006, Litvak and Ullman 2009, Steimer et al. 2009). The present study explores the role of feedback in object perception, taking as a starting point the HMAX model, a biologically inspired hierarchical model of object recognition (Riesenhuber and Poggio 1999, Serre et al. 2007b), and extending it to include feedback connectivity. A Bayesian network that captures the structure and properties of the HMAX model is developed, replacing the classical deterministic view with a probabilistic interpretation. The proposed model approximates the selectivity and invariance operations of the HMAX model using the belief propagation algorithm. Hence, the model not only achieves successful feedforward recognition invariant to position and size, but is also able to reproduce modulatory effects of higher-level feedback, such as illusory contour completion, attention and mental imagery. Overall, the model provides a biophysiologically plausible interpretation, based on state-of-theart probabilistic approaches and supported by current experimental evidence, of the interaction between top-down global feedback and bottom-up local evidence in the context of hierarchical object perception
    corecore