185 research outputs found

    Child's play: activity recognition for monitoring children's developmental progress with augmented toys

    The way in which infants play with objects can be indicative of their developmental progress and may serve as an early indicator of developmental delays. However, observing children interacting with toys for the purpose of quantitative analysis can be a difficult task. To better quantify how play may serve as an early indicator, researchers have conducted retrospective studies examining the differences in object play behaviors among infants. However, such studies require that researchers repeatedly inspect videos of play, often at speeds much slower than real-time, to indicate points of interest. The research presented in this dissertation examines whether a combination of sensors embedded within toys and automatic pattern recognition of object play behaviors can help expedite this process. For my dissertation, I developed the Child'sPlay system, which uses augmented toys and statistical models to automatically provide quantitative measures of object play interactions, as well as the PlayView interface for viewing annotated play data for later analysis. In this dissertation, I examine the hypothesis that sensors embedded in objects can provide sufficient data for automatic recognition of certain exploratory, relational, and functional object play behaviors in semi-naturalistic environments, and that a continuum of recognition accuracy exists which allows automatic indexing to be useful for retrospective review. I designed several augmented toys and used them to collect object play data from more than fifty play sessions. I conducted pattern recognition experiments over this data to produce statistical models that automatically classify children's object play behaviors. In addition, I conducted a user study with twenty participants to determine whether annotations automatically generated from these models help improve performance in retrospective review tasks. My results indicate that these statistical models increase user performance and decrease perceived effort when combined with the PlayView interface during retrospective review. The presence of high-quality annotations is preferred by users and promotes an increase in the effective retrieval rates of object play behaviors.
    Ph.D. Committee Chair: Starner, Thad E.; Committee Co-Chair: Abowd, Gregory D.; Committee Member: Arriaga, Rosa; Committee Member: Jackson, Melody Moore; Committee Member: Lukowicz, Paul; Committee Member: Rehg, James M

    Learning as a Nonlinear Line of Attraction for Pattern Association, Classification and Recognition

    Development of a mathematical model for learning a nonlinear line of attraction is presented in this dissertation, in contrast to the conventional recurrent neural network model in which the memory is stored in an attractive fixed point at a discrete location in state space. A nonlinear line of attraction is the encapsulation of attractive fixed points scattered in state space as an attractive nonlinear line, describing patterns with similar characteristics as a family of patterns. It is usually imperative to guarantee the convergence of the dynamics of the recurrent network for associative learning and recall. We propose to alter this picture. That is, if the brain remembers by converging to the state representing familiar patterns, it should also diverge from such states when presented with an unknown encoded representation of a visual image. The conception of the dynamics of the nonlinear line attractor network to operate between stable and unstable states is the second contribution in this dissertation research. These criteria can be used to circumvent the plasticity-stability dilemma by using the unstable state as an indicator to create a new line for an unfamiliar pattern. This novel learning strategy utilizes stability (convergence) and instability (divergence) criteria of the designed dynamics to induce self-organizing behavior. The self-organizing behavior of the nonlinear line attractor model can manifest complex dynamics in an unsupervised manner. The third contribution of this dissertation is the introduction of the concept of a manifold of color perception. The fourth contribution of this dissertation is the development of a nonlinear dimensionality reduction technique that embeds a set of related observations into a low-dimensional space using the learned memory matrices of the nonlinear line attractor network. Development of a system for affective states computation is also presented in this dissertation. This system is capable of extracting the user's mental state in real time using a low-cost computer. It is successfully interfaced with an advanced learning environment for human-computer interaction.

    Pattern Recognition

    Pattern recognition is a very wide research field. It involves factors as diverse as sensors, feature extraction, pattern classification, decision fusion, applications and others. The signals processed are commonly one-, two- or three-dimensional; the processing is done in real-time or takes hours and days; some systems look for one narrow object class, while others search huge databases for entries with at least a small amount of similarity. No single person can claim expertise across the whole field, which develops rapidly, updates its paradigms and comprehends several philosophical approaches. This book reflects this diversity by presenting a selection of recent developments within the area of pattern recognition and related fields. It covers theoretical advances in classification and feature extraction as well as application-oriented works. Authors of these 25 works present and advocate recent achievements of their research related to the field of pattern recognition.

    Using computer vision to categorize tyres and estimate the number of visible tyres in tyre stockpile images

    Pressures from environmental agencies contribute to the challenges associated with the disposal of waste tyres, particularly in South Africa. Recycling of waste tyres in South Africa is in its infancy, resulting in the historically undocumented and uncontrolled existence of waste tyre stockpiles across the country. The remote and distant locations of such stockpiles typically complicate the logistics associated with the collection, transport and storage of waste tyres prior to entering the recycling process. In order to optimize the logistics associated with the collection of waste tyres from stockpiles, useful information about such stockpiles would include estimates of the types of tyres as well as the quantity of specific tyre types found in particular stockpiles. This research proposes the use of computer vision for categorizing individual tyres and estimating the number of visible tyres in tyre stockpile images to support the logistics in tyre recycling efforts. The study begins with a broad review of image processing and computer vision algorithms for categorizing and counting objects in images. The bag of visual words (BoVW) model for categorization is tested on two small data sets of tyre tread images using a random sub-sampling holdout method. The categorization results are evaluated using performance metrics for multiclass classifiers, namely the average accuracy, precision, and recall. The results indicated that corner-based local feature detectors combined with speeded up robust features (SURF) descriptors in a BoVW model provide moderately accurate categorization of tyres based on tread images. Two feature extraction methods are proposed for training neural networks (NNs) to estimate tyre counts in tyre stockpiles. Both methods describe images in terms of feature vectors that can be used as input for NNs. The first feature extraction method uses the BoVW model with histograms of oriented gradients (HOG) features collected from overlapping sub-images to create a visual vocabulary and describe the images in terms of their visual word occurrence histogram. The second feature extraction method uses the image gradient magnitude, gradient orientation, and edge orientations of edges detected using the Canny edge detector. A concatenated histogram is constructed from individual histograms of gradient orientations and gradient magnitude. The histograms are then used to train NNs using backpropagation to approximate functions from the feature vectors describing the images to scalar count estimations. The accuracy of visible object count predictions is evaluated using NN evaluation techniques to determine the accuracy of predictions and the generalization ability of the fitted model. The count estimation experiments using the two feature extraction methods for input to NNs showed that fairly accurate count estimations can be obtained and that the fitted model could generalize fairly well to unseen images.
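The concatenated gradient histogram mentioned in the abstract above can be illustrated with a minimal sketch. This is not the author's implementation: the function name `gradient_histogram_features`, the bin counts, the central-difference gradients, and the normalisation by pixel count are all assumptions made for illustration, and the Canny edge-orientation component is omitted.

```python
import math


def _histogram(values, n_bins, upper):
    """Normalised histogram of values in [0, upper], with n_bins equal-width bins."""
    counts = [0] * n_bins
    for v in values:
        # Clamp so that v == upper falls in the last bin; a zero range maps to bin 0.
        idx = min(int(v / upper * n_bins), n_bins - 1) if upper > 0 else 0
        counts[idx] += 1
    total = len(values)
    return [c / total for c in counts] if total else [0.0] * n_bins


def gradient_histogram_features(image, n_orient_bins=8, n_mag_bins=4):
    """Concatenate orientation and magnitude histograms of a grey-scale image.

    `image` is a list of equal-length rows of numbers (hypothetical input format).
    Bin counts are illustrative assumptions, not values from the thesis.
    """
    rows, cols = len(image), len(image[0])
    orientations, magnitudes = [], []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            # Central differences approximate the horizontal and vertical gradients.
            gx = (image[r][c + 1] - image[r][c - 1]) / 2.0
            gy = (image[r + 1][c] - image[r - 1][c]) / 2.0
            magnitudes.append(math.hypot(gx, gy))
            # Unsigned orientation folded into [0, pi).
            orientations.append(math.atan2(gy, gx) % math.pi)
    max_mag = max(magnitudes) if magnitudes else 0.0
    orient_hist = _histogram(orientations, n_orient_bins, math.pi)
    mag_hist = _histogram(magnitudes, n_mag_bins, max_mag)
    return orient_hist + mag_hist  # feature vector, usable as NN input


# Usage: a 5x5 image containing a single vertical step edge.
step_image = [[0, 0, 0, 10, 10] for _ in range(5)]
features = gradient_histogram_features(step_image)
```

A vector of this kind has a fixed length regardless of image size, which is what makes it usable as input to a network that regresses a scalar count.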

    Object Recognition

    Vision-based object recognition tasks are very familiar in our everyday activities, such as driving a car in the correct lane. We do these tasks effortlessly in real-time. In recent decades, with the advancement of computer technology, researchers and application developers have been trying to mimic the human capability of visual recognition. Such a capability would allow machines to free humans from boring or dangerous jobs.

    Proceedings of the Post-Graduate Conference on Robotics and Development of Cognition, 10-12 September 2012, Lausanne, Switzerland

    The aim of the Postgraduate Conference on Robotics and Development of Cognition (RobotDoC-PhD) is to bring together young scientists working on developmental cognitive robotics and its core disciplines. The conference aims to provide both feedback and greater visibility to their research through lively and stimulating discussion among participating PhD students and senior researchers. The conference is open to all PhD students and post-doctoral researchers in the field. The RobotDoC-PhD conference is an initiative of the Marie-Curie Actions ITN RobotDoC and will be organized as a satellite event of the 22nd International Conference on Artificial Neural Networks, ICANN 2012.

    Emotion Recognition for Affective Computing: Computer Vision and Machine Learning Approach

    The purpose of affective computing is to develop reliable and intelligent models that computers can use to interact more naturally with humans. The critical requirements for such models are that they enable computers to recognise, understand and interpret the emotional states expressed by humans. Emotion recognition has been a research topic of interest for decades, not only in relation to developments in the affective computing field but also due to its other potential applications. A particularly challenging problem that has emerged from this body of work, however, is the task of recognising facial expressions and emotions from still images or videos in real-time. This thesis aimed to solve this challenging problem by developing new techniques involving computer vision, machine learning and different levels of information fusion. Firstly, an efficient and effective algorithm was developed to improve the performance of the Viola-Jones algorithm. The proposed method achieved significantly higher detection accuracy (95%) than the standard Viola-Jones method (90%) in face detection from thermal images, while also doubling the detection speed. Secondly, an automatic subsystem for detecting eyeglasses, Shallow-GlassNet, was proposed to address the facial occlusion problem by designing a shallow convolutional neural network capable of detecting eyeglasses rapidly and accurately. Thirdly, a novel neural network model for decision fusion was proposed in order to make use of multiple classifier systems, which can increase the classification accuracy by up to 10%. Finally, a high-speed approach to emotion recognition from videos, called One-Shot Only (OSO), was developed based on a novel spatio-temporal data fusion method for representing video frames. The OSO method tackled video classification as a single image classification problem, which not only made it extremely fast but also reduced the overfitting problem.

    Efficient audio signal processing for embedded systems

    We investigated two design strategies that would allow us to efficiently process audio signals on embedded systems such as mobile phones and portable electronics. In the first strategy, we exploit properties of the human auditory system to process audio signals. We designed a sound enhancement algorithm to make piezoelectric loudspeakers sound "richer" and "fuller," using a combination of bass extension and dynamic range compression. We also developed an audio energy reduction algorithm for loudspeaker power management by suppressing signal energy below the masking threshold. In the second strategy, we use low-power analog circuits to process the signal before digitizing it. We designed an analog front-end for sound detection and implemented it on a field programmable analog array (FPAA). The sound classifier front-end can be used in a wide range of applications because programmable floating-gate transistors are employed to store classifier weights. Moreover, we incorporated a feature selection algorithm to simplify the analog front-end. The machine learning algorithm AdaBoost is used to select the most relevant features for a particular sound detection application. We also designed the circuits to implement the AdaBoost-based analog classifier.
    PhD Committee Chair: Anderson, David; Committee Member: Hasler, Jennifer; Committee Member: Hunt, William; Committee Member: Lanterman, Aaron; Committee Member: Minch, Bradle