1,259 research outputs found

    Memory and information processing in neuromorphic systems

    A striking difference between brain-inspired neuromorphic processors and current von Neumann processor architectures is the way in which memory and processing are organized. As Information and Communication Technologies continue to address the need for increased computational power by increasing the number of cores within a digital processor, neuromorphic engineers and scientists can complement this approach by building processor architectures where memory is distributed with the processing. In this paper we present a survey of brain-inspired processor architectures that support models of cortical networks and deep neural networks. These architectures range from serial clocked implementations of multi-neuron systems to massively parallel asynchronous ones, and from purely digital systems to mixed analog/digital systems that implement more biologically realistic models of neurons and synapses, together with a suite of adaptation and learning mechanisms analogous to the ones found in biological nervous systems. We describe the advantages of the different approaches being pursued and present the challenges that need to be addressed for building artificial neural processing systems that can display the richness of behaviors seen in biological systems. Comment: Submitted to Proceedings of the IEEE; a review of recently proposed neuromorphic computing platforms and systems.
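
A minimal sketch of the kind of mechanism surveyed above, assuming nothing about any specific platform: a single leaky integrate-and-fire neuron in plain Python/NumPy whose synaptic weights (the distributed "memory") are updated by a spike-driven, Hebbian-style adaptation rule co-located with the membrane dynamics (the "processing"). All parameter values are illustrative.

```python
# Illustrative leaky integrate-and-fire neuron with spike-driven adaptation.
# Memory (weights) and processing (membrane dynamics) live in the same loop,
# mirroring the co-location of memory and computation described in the survey.
import numpy as np

rng = np.random.default_rng(0)
n_inputs, steps, dt = 32, 1000, 1e-3      # inputs, simulation steps, step size (s)
tau_m, v_th, v_reset = 20e-3, 1.0, 0.0    # membrane time constant, threshold, reset
w = rng.uniform(0.0, 0.1, n_inputs)       # synaptic weights = local "memory"
v, lr = 0.0, 1e-3                         # membrane potential, adaptation rate

for t in range(steps):
    pre_spikes = rng.random(n_inputs) < 0.1           # Poisson-like input spikes
    v += dt / tau_m * (-v) + w @ pre_spikes           # leak + weighted synaptic input
    if v >= v_th:                                     # output spike
        v = v_reset
        w += lr * pre_spikes                          # Hebbian-style potentiation
        w *= 0.999                                    # slow homeostatic decay

print("final mean weight:", w.mean())
```

On neuromorphic hardware this update would run asynchronously and in parallel per neuron rather than as a clocked Python loop; the sketch only shows the local, event-driven structure of the computation.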

    From surfaces to objects : Recognizing objects using surface information and object models.

    This thesis describes research on recognizing partially obscured objects using surface information, like Marr's 2½D sketch ([MAR82]), and surface-based geometrical object models. The goal of the recognition process is to produce fully instantiated object hypotheses, with either image evidence for each feature or an explanation for its absence in terms of self-occlusion or external occlusion. The central point of the thesis is that using surface information should be an important part of the image understanding process. This is because surfaces are the features that directly link perception to the objects perceived (for normal "camera-like" sensing) and because surfaces make explicit the information needed to understand and cope with some visual problems (e.g. obscured features). Further, because surfaces are both the data and the model primitive, detailed recognition can be made both simpler and more complete. Recognition input is a surface image, which represents surface orientation and absolute depth. Segmentation criteria are proposed for forming surface patches of constant curvature character, based on surface shape discontinuities, which become labeled segmentation boundaries. Partially obscured object surfaces are reconstructed using stronger surface-based constraints. Surfaces are grouped to form surface clusters, which are 3D identity-independent solids that often correspond to model primitives. These are used here as a context within which to select models and find all object features. True three-dimensional properties of image boundaries, surfaces and surface clusters are directly estimated using the surface data. Models are invoked using a network formulation, where individual nodes represent potential identities for image structures. The links between nodes are defined by generic and structural relationships; they define indirect evidence relationships for an identity. Direct evidence for the identities comes from the data properties. A plausibility computation is defined according to the constraints inherent in the evidence types. When a node acquires sufficient plausibility, the model is invoked for the corresponding image structure.
    Objects are primarily represented using a surface-based geometrical model. Assemblies are formed from subassemblies and surface primitives, which are defined using surface shape and boundaries. Variable affixments between assemblies allow flexibly connected objects. The initial object reference frame is estimated from model-data surface relationships, using correspondences suggested by invocation. With the reference frame, back-facing, tangential, partially self-obscured, totally self-obscured and fully visible image features are deduced. From these, the oriented model is used to find evidence for missing visible model features. If no evidence is found, the program attempts to find evidence that the features are obscured by an unrelated object. Structured objects are constructed using a hierarchical synthesis process. Fully completed hypotheses are verified using both existence and identity constraints based on surface evidence. Each of these processes is defined by its computational constraints and is demonstrated on two test images. These test scenes are interesting because they contain partially and fully obscured object features, a variety of surface and solid types, and flexibly connected objects. All modeled objects were fully identified and analyzed to the level represented in their models and were also acceptably spatially located.
Portions of this work have been reported elsewhere ([FIS83], [FIS85a], [FIS85b], [FIS86]) by the author
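
The invocation network described above lends itself to a small illustration. The following sketch is hypothetical and not the thesis program: it represents candidate identities for image structures as nodes with direct evidence scores, passes indirect evidence along weighted structural links, and invokes a model once plausibility crosses a threshold. The node names, link weights, and threshold are all invented.

```python
# Toy plausibility network for model invocation: direct evidence from data
# properties plus indirect evidence propagated over structural links.
direct_evidence = {"cluster1:mug": 0.6, "cluster1:bowl": 0.3, "surf2:handle": 0.7}
links = {  # (node, supporting node): link weight for a structural relationship
    ("cluster1:mug", "surf2:handle"): 0.5,
    ("cluster1:bowl", "surf2:handle"): 0.1,
}

plausibility = dict(direct_evidence)
for _ in range(10):  # relax until approximately stable
    for node in plausibility:
        indirect = sum(w * plausibility[src]
                       for (dst, src), w in links.items() if dst == node)
        plausibility[node] = min(1.0, direct_evidence[node] + indirect)

invoked = [n for n, p in plausibility.items() if p > 0.8]  # invoke models over threshold
print(plausibility)
print("invoked models:", invoked)
```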

    HIGH QUALITY HUMAN 3D BODY MODELING, TRACKING AND APPLICATION

    Geometric reconstruction of dynamic objects is a fundamental task of computer vision and graphics, and modeling the human body with high fidelity is considered a core part of this problem. Traditional human shape and motion capture techniques require an array of surrounding cameras or require subjects to wear reflective markers, which limits working space and portability. In this dissertation, a complete pipeline is designed, from geometric modeling of the detailed 3D human body and capture of its shape dynamics over time using a flexible setup, to guiding clothes/person re-targeting with such data-driven models. Since the mechanical movement of the human body can be treated as an articulated motion, which readily drives skin animation but makes the reverse problem of recovering parameters from images without manual intervention difficult, we present a novel parametric model, GMM-BlendSCAPE, which jointly takes the linear skinning model and the prior art of BlendSCAPE (Blend Shape Completion and Animation for PEople) into consideration, and we develop a Gaussian Mixture Model (GMM) to infer both body shape and pose from incomplete observations. We show the increased accuracy of joint and skin surface estimation using our model compared to skeleton-based motion tracking. To model the detailed body, we start by capturing high-quality partial 3D scans with a single-view commercial depth camera. Based on GMM-BlendSCAPE, we can then reconstruct multiple complete static models with large pose differences via our novel non-rigid registration algorithm. With vertex correspondences established, these models can be further converted into a personalized drivable template and used for robust pose tracking in a similar GMM framework. Moreover, we design a general-purpose real-time non-rigid deformation algorithm to accelerate this registration. Last but not least, we demonstrate a novel virtual clothes try-on application based on our personalized model, utilizing both image and depth cues to synthesize and re-target clothes for single-view videos of different people.
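
As a rough illustration of the inference idea (not GMM-BlendSCAPE itself), the sketch below fits a Gaussian mixture prior over low-dimensional pose vectors and then selects the sampled pose that best matches a few observed joints while remaining likely under the prior. The dimensions, data, and weighting are assumptions made only for this example.

```python
# Illustrative GMM-prior pose completion from incomplete observations.
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(1)
train_poses = rng.normal(size=(500, 10))          # 500 mocap poses, 10-D pose vectors
prior = GaussianMixture(n_components=5, random_state=0).fit(train_poses)

observed_idx = np.array([0, 2, 5])                # only a few joints were observed
observed_vals = np.array([0.4, -1.0, 0.2])

candidates, _ = prior.sample(2000)                # candidate poses drawn from the prior
data_err = ((candidates[:, observed_idx] - observed_vals) ** 2).sum(axis=1)
score = prior.score_samples(candidates) - 10.0 * data_err  # prior log-lik minus data term
best = candidates[np.argmax(score)]
print("estimated full pose:", np.round(best, 2))
```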

    Detecting irregularity in videos using spatiotemporal volumes.

    Li, Yun. Thesis (M.Phil.)--Chinese University of Hong Kong, 2007. Includes bibliographical references (leaves 68-72). Abstracts in English and Chinese.
    Contents:
    Chapter 1 Introduction: Visual Detection; Irregularity Detection
    Chapter 2 System Overview: Definition of Irregularity; Contributions; Review of Previous Work (Model-based Methods, Statistical Methods); System Outline
    Chapter 3 Background Subtraction: Related Work; Adaptive Mixture Model (Online Model Update, Background Model Estimation, Foreground Segmentation)
    Chapter 4 Feature Extraction: Various Feature Descriptors; Histogram of Oriented Gradients (Feature Descriptor, Feature Merits); Subspace Analysis (Principal Component Analysis, Subspace Projection)
    Chapter 5 Bayesian Probabilistic Inference: Estimation of PDFs (K-Means Clustering, Kernel Density Estimation); MAP Estimation (ML Estimation & MAP Estimation, Detection through MAP); Efficient Implementation (K-D Trees, Nearest Neighbor (NN) Algorithm)
    Chapter 6 Experiments and Conclusion: Experiments (Outdoor Video Surveillance Exp. 1-3, Classroom Monitoring Exp. 4); Algorithm Evaluation; Conclusion
    Bibliography
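
The outline above suggests a pipeline of feature extraction, subspace projection, density estimation, and probabilistic detection. A minimal sketch in that spirit (hypothetical, using random stand-in descriptors instead of real HOG features of spatiotemporal volumes) is shown below; the dimensions, bandwidth, and threshold are assumptions.

```python
# Illustrative irregularity detection: PCA subspace projection + kernel density
# estimate on "normal" training volumes, then flag low-density test volumes.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KernelDensity

rng = np.random.default_rng(0)
train_feats = rng.normal(size=(800, 128))          # stand-in descriptors of normal volumes
test_feats = np.vstack([rng.normal(size=(20, 128)),
                        rng.normal(loc=3.0, size=(5, 128))])  # last 5 are "irregular"

pca = PCA(n_components=16).fit(train_feats)        # subspace projection
kde = KernelDensity(bandwidth=1.0).fit(pca.transform(train_feats))

log_density = kde.score_samples(pca.transform(test_feats))
threshold = np.percentile(kde.score_samples(pca.transform(train_feats)), 1)
print("irregular volumes:", np.where(log_density < threshold)[0])
```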

    Markerless Reconstruction of Human Motion

    Thesis (Ph.D.)--Seoul National University Graduate School, Department of Electrical and Computer Engineering, February 2017. Advisor: 이제희 (Jehee Lee). Markerless human pose recognition using a single depth camera plays an important role in interactive graphics applications and user interface design. Recent pose recognition algorithms have adopted machine learning techniques, utilizing large collections of motion capture data, and their effectiveness is greatly influenced by the diversity and variability of the training data. Many applications have been developed that use the human body as a controller on top of these pose recognition systems, and in many cases using general props helps achieve immersive control of the system. Nevertheless, combined human pose and prop recognition is not yet sufficiently powerful. Moreover, invisible body parts lower the quality of human pose estimation from a single depth camera because of the absence of observed data. In this thesis, we present techniques for manipulating human motion data to enable human pose estimation from a single depth camera. First, we developed a method that resamples a collection of human motion data to improve pose variability and achieve an arbitrary size and level of density in the space of human poses. The space of human poses is high-dimensional, and thus brute-force uniform sampling is intractable. We exploit dimensionality reduction and locally stratified sampling to generate either uniform or application-specifically biased distributions in the space of human poses. Our algorithm learns to recognize challenging poses such as sitting, kneeling, stretching and yoga using a remarkably small amount of training data, and the recognition algorithm can also be steered to maximize its performance for a specific domain of human poses. We demonstrate that our algorithm performs much better than the Kinect SDK for recognizing challenging acrobatic poses, while performing comparably for easy upright standing poses. Second, we detect environmental objects that interact with human beings. We propose a new prop recognition system that can be applied on top of an existing human pose estimation algorithm and enables robust prop estimation together with human poses at the same time. Our work is widely applicable to various types of controller systems that deal with the human pose and additional items simultaneously. Finally, we enhance the pose estimation result. Not all parts of the human body can always be estimated from a single depth image: in some cases body parts are occluded by other body parts, and the estimation system fails. To solve this problem, we construct a neural network model called an autoencoder, trained on a large set of natural pose data, which reconstructs the missing human pose joint parameters as new, corrected joints.
It can be applied to many different human pose estimation systems to improve their performance.
Contents:
1 Introduction
2 Background: Research on Motion Data; Human Pose Estimation; Machine Learning on Human Pose Estimation; Dimension Reduction and Uniform Sampling; Neural Networks on Motion Data
3 Markerless Human Pose Recognition System: System Overview; Preprocessing Data Process; Randomized Decision Tree; Joint Estimation Process
4 Controllable Sampling Data in the Space of Human Poses: Overview; Locally Stratified Sampling; Experimental Results; Discussion
5 Human Pose Estimation with Interacting Prop from Single Depth Image: Introduction; Prop Estimation; Experimental Results; Discussion
6 Enhancing the Estimation of Human Pose from Incomplete Joints: Overview; Method; Experimental Result; Discussion
7 Conclusion
Bibliography; Abstract (in Korean)
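
As an illustrative stand-in for the joint-completion step described above (not the dissertation's actual network), the sketch below trains a small dense autoencoder on complete pose vectors and uses it to fill in zero-masked, occluded joints. The pose dimensionality, architecture, and training data are assumptions.

```python
# Illustrative denoising-style autoencoder that completes occluded pose joints.
import numpy as np
from tensorflow.keras import layers, models

rng = np.random.default_rng(0)
poses = rng.normal(size=(5000, 45))                  # 15 joints x 3D = 45-D pose vectors
mask = rng.random(poses.shape) > 0.2                 # randomly drop ~20% of coordinates
corrupted = poses * mask                             # occluded joints zeroed out

auto = models.Sequential([
    layers.Input(shape=(45,)),
    layers.Dense(64, activation="relu"),
    layers.Dense(16, activation="relu"),             # bottleneck over the pose manifold
    layers.Dense(64, activation="relu"),
    layers.Dense(45),
])
auto.compile(optimizer="adam", loss="mse")
auto.fit(corrupted, poses, epochs=5, batch_size=64, verbose=0)  # map masked -> full pose

test_pose = poses[:1] * np.array([0] * 6 + [1] * 39)  # first two joints missing
print("reconstructed joints:", auto.predict(test_pose, verbose=0)[0, :6])
```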

    Graphical models for visual object recognition and tracking

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2006. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Includes bibliographical references (p. 277-301).
    We develop statistical methods which allow effective visual detection, categorization, and tracking of objects in complex scenes. Such computer vision systems must be robust to wide variations in object appearance, the often small size of training databases, and ambiguities induced by articulated or partially occluded objects. Graphical models provide a powerful framework for encoding the statistical structure of visual scenes, and developing corresponding learning and inference algorithms. In this thesis, we describe several models which integrate graphical representations with nonparametric statistical methods. This approach leads to inference algorithms which tractably recover high-dimensional, continuous object pose variations, and learning procedures which transfer knowledge among related recognition tasks. Motivated by visual tracking problems, we first develop a nonparametric extension of the belief propagation (BP) algorithm. Using Monte Carlo methods, we provide general procedures for recursively updating particle-based approximations of continuous sufficient statistics. Efficient multiscale sampling methods then allow this nonparametric BP algorithm to be flexibly adapted to many different applications. As a particular example, we consider a graphical model describing the hand's three-dimensional (3D) structure, kinematics, and dynamics. This graph encodes global hand pose via the 3D position and orientation of several rigid components, and thus exposes local structure in a high-dimensional articulated model. Applying nonparametric BP, we recover a hand tracking algorithm which is robust to outliers and local visual ambiguities. Via a set of latent occupancy masks, we also extend our approach to consistently infer occlusion events in a distributed fashion.
    In the second half of this thesis, we develop methods for learning hierarchical models of objects, the parts composing them, and the scenes surrounding them. Our approach couples topic models originally developed for text analysis with spatial transformations, and thus consistently accounts for geometric constraints. By building integrated scene models, we may discover contextual relationships, and better exploit partially labeled training images. We first consider images of isolated objects, and show that sharing parts among object categories improves accuracy when learning from few examples. Turning to multiple object scenes, we propose nonparametric models which use Dirichlet processes to automatically learn the number of parts underlying each object category, and objects composing each scene. Adapting these transformed Dirichlet processes to images taken with a binocular stereo camera, we learn integrated, 3D models of object geometry and appearance. This leads to a Monte Carlo algorithm which automatically infers 3D scene structure from the predictable geometry of known object categories. by Erik B. Sudderth. Ph.D.
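
A toy illustration of the particle-based message passing idea (not the thesis code): represent a belief over a one-dimensional variable by samples, push them through an assumed Gaussian pairwise potential to form a message, reweight by the receiving node's local likelihood, and resample. All potentials and observations below are invented for the example.

```python
# Toy nonparametric-BP-style message update between two 1-D nodes u -> v.
import numpy as np

rng = np.random.default_rng(0)
particles_u = rng.normal(loc=1.0, scale=1.0, size=500)   # samples of node u's belief

# Message u -> v: push each particle through the pairwise model x_v ~ N(x_u + 0.5, 0.3^2)
msg_particles = particles_u + 0.5 + rng.normal(scale=0.3, size=particles_u.size)

obs_v = 2.0                                               # noisy observation of node v
weights = np.exp(-0.5 * ((msg_particles - obs_v) / 0.4) ** 2)  # local likelihood
weights /= weights.sum()

# Resample to obtain an equally weighted particle approximation of v's belief
belief_v = rng.choice(msg_particles, size=500, p=weights)
print("posterior mean of node v:", belief_v.mean())
```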

    Teacher Sensemaking in Times of Crisis: A Case Study of the Teaching of High School Ethnic Studies Classes During the COVID-19 Pandemic and Black Lives Matter Protests

    This is a study about secondary ethnic studies classes during the COVID-19 pandemic. In 2020, a novel coronavirus caused dramatic changes in society, and social protests erupted in the United States in response to violence against people of color. This period of dual crises created a collective period of turbulence for educators in the United States as schooling moved to emergency virtual environments. Though the impact of this time is not yet understood, early indicators suggest that existing educational inequalities for students of color will be exacerbated. This study explored ethnic studies teacher sensemaking to understand how teachers adapted their pedagogy during the time of the COVID-19 pandemic and Black Lives Matter protests. Ethnic studies classes provided an important case because ethnic studies tends to adopt culturally relevant and community responsive pedagogies through the study of historically marginalized groups and the deconstruction of race and systems of oppression, which was particularly relevant in the context of the concurrent crises. This study employed a qualitative case study design to investigate the sensemaking strategies of nine high school ethnic studies teachers. It posed the question: How did ethnic studies teachers make sense of teaching and learning in virtual environments for high school students during the time of the COVID-19 pandemic and Black Lives Matter protests? Qualitative interviews formed the primary data collection strategy, and data were analyzed through two cycles of coding. The findings suggest that ethnic studies teachers adjusted their teaching during this time by prioritizing student well-being. The critical dialogic approach privileged in ethnic studies classes meant that teachers were well positioned to incorporate culturally responsive content and utilize digital technology in innovative, humanizing ways. The teachers’ beliefs about teaching and the nature of ethnic studies pedagogy helped them engage in actions that directly addressed students’ social and emotional needs in this novel context. Implications for educators and school leaders are addressed, along with suggestions for future research on critical pedagogy in the virtual environment.

    Visualizing the Motion Flow of Crowds

    In modern cities, massive populations cause problems such as congestion, accidents, violence and crime. Video surveillance systems such as closed-circuit television cameras are widely used by security guards to monitor human behaviors and activities in order to manage, direct, or protect people. Given the quantity and prolonged duration of the recorded videos, a huge amount of human effort is required to examine these recordings and keep track of activities and events. In recent years, new techniques in the computer vision field have reduced the barrier to entry, allowing developers to experiment more with intelligent surveillance video systems. Different from previous research, this dissertation does not address algorithm design concerns related to object detection or object tracking. Instead, it focuses on the technological side, applying data visualization methodologies to build a model for detecting anomalies. It aims to provide an understanding of how to characterize the behavior of pedestrians in video and identify anomalous or abnormal cases using data visualization techniques.
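
A minimal sketch of one way such a motion-flow map could be computed (hypothetical, not the dissertation's system): estimate dense optical flow between consecutive frames with OpenCV, pool the flow magnitude into a coarse grid of cells, and flag cells whose motion deviates strongly from the scene-wide statistics. The frames here are synthetic stand-ins for CCTV video, and the grid size and threshold are assumptions.

```python
# Illustrative crowd motion-flow map with a simple per-cell anomaly flag.
import numpy as np
import cv2

rng = np.random.default_rng(0)
prev = rng.integers(0, 255, (240, 320), dtype=np.uint8)       # stand-in grayscale frames
curr = np.roll(prev, 3, axis=1)                               # fake horizontal motion

flow = cv2.calcOpticalFlowFarneback(prev, curr, None,
                                    0.5, 3, 15, 3, 5, 1.2, 0)  # (H, W, 2) flow vectors
mag = np.linalg.norm(flow, axis=2)

cell = 40                                                      # pool into 40x40-pixel cells
grid = mag.reshape(240 // cell, cell, 320 // cell, cell).mean(axis=(1, 3))
anomalous = np.argwhere(grid > np.median(grid) + 3 * grid.std())
print("motion-flow grid:\n", np.round(grid, 2))
print("anomalous cells:", anomalous)
```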