21 research outputs found
A Case Review of Speech Recognition Models: Hidden Markov Model
Speech recognition is a rapidly developing technology within artificial intelligence. It is now commercially available through devices such as smartphones and computers. One of the components that allows speech recognition to run on these devices is the Hidden Markov Model (HMM), a statistical model for speech recognition. Applications of HMM to a range of cases show that the model suits many kinds of data. This paper reviews the HMM with the aim of giving an overview and understanding of its performance by summarizing a number of studies that applied it to different data. These applications show how HMM performance has been optimized, and the review of these studies shows that the success rate of HMM in recognizing data reaches 71.43%.
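To make the role of the HMM more concrete, the following minimal Python sketch computes the likelihood of an observation sequence with the forward algorithm, the standard way an HMM scores how well it explains a sequence of acoustic symbols. The two-state toy model, the names `START`, `TRANS`, `EMIT`, and all probabilities are assumptions made up for illustration; they do not come from the reviewed studies.

```python
def forward_likelihood(start, trans, emit, observations):
    """Return P(observations | model) using the forward algorithm.

    start[i]    : probability of starting in hidden state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][o]  : probability that state i emits observation symbol o
    """
    n_states = len(start)
    # alpha[i] = P(first observation, current state = i)
    alpha = [start[i] * emit[i][observations[0]] for i in range(n_states)]
    for obs in observations[1:]:
        # Propagate one step: sum over all predecessor states, then emit obs.
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n_states)) * emit[j][obs]
            for j in range(n_states)
        ]
    # Marginalise over the final hidden state.
    return sum(alpha)


# Hypothetical two-state model with two observation symbols (0 and 1).
START = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]

likelihood = forward_likelihood(START, TRANS, EMIT, [0, 1, 1])
```

In an isolated-word recognizer, one such model would typically be trained per word, and the word whose model gives the highest likelihood for the observed sequence would be selected.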
Articulated motion and deformable objects
This guest editorial introduces the twenty-two papers accepted for this Special Issue on Articulated Motion and Deformable Objects (AMDO). They are grouped into four main categories within the field of AMDO: human motion analysis (action/gesture), human pose estimation, deformable shape segmentation, and face analysis. For each of the four topics, a survey of the recent developments in the field is presented. The accepted papers are briefly introduced in the context of this survey. They contribute novel methods, algorithms with improved performance as measured on benchmarking datasets, as well as two new datasets for hand action detection and human posture analysis. The special issue should be of high relevance to readers interested in AMDO recognition and should promote future research directions in the field.
WATCHING PEOPLE: ALGORITHMS TO STUDY HUMAN MOTION AND ACTIVITIES
Human motion analysis is currently one of the most active research topics in Computer Vision, and it is receiving increasing attention from both the industrial and scientific communities.
The growing interest in human motion analysis is motivated by the increasing number of promising applications, ranging from surveillance, human–computer interaction, virtual reality to healthcare, sports, computer games and video conferencing, just to name a few.
The aim of this thesis is to give an overview of the various tasks involved in visual motion analysis of the human body and to present the issues and possible solutions related to it.
In this thesis, visual motion analysis is categorized into three major areas related to the interpretation of human motion: tracking of human motion using a virtual pan-tilt-zoom (vPTZ) camera, recognition of human motions, and segmentation of human behaviors.
In the field of human motion tracking, a virtual environment for PTZ cameras (vPTZ) is presented to overcome the mechanical limitations of PTZ cameras. The vPTZ is built on equirectangular images acquired by 360° cameras, and it allows not only the development of pedestrian tracking algorithms but also the comparison of their performance. On the basis of this virtual environment, three novel pedestrian tracking algorithms for 360° cameras were developed, two of which adopt a tracking-by-detection approach while the last adopts a Bayesian approach.
The action recognition problem is addressed by an algorithm that represents actions in terms of multinomial distributions of frequent sequential patterns of different lengths. Frequent sequential patterns are series of data descriptors that occur many times in the data. The proposed method learns a codebook of frequent sequential patterns by means of an apriori-like algorithm. An action is then represented with a Bag-of-Frequent-Sequential-Patterns approach.
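The idea of a codebook of frequent sequential patterns can be illustrated with a small Python sketch. It is not the thesis implementation: it assumes that descriptors have already been quantized into discrete symbols, that patterns are contiguous runs of symbols, and that support is the number of training sequences containing a pattern. The function names `mine_codebook` and `bag_of_patterns` are hypothetical.

```python
def count_occurrences(sequence, pattern):
    """Count how often `pattern` (a tuple) occurs contiguously in `sequence`."""
    k = len(pattern)
    return sum(
        1 for i in range(len(sequence) - k + 1)
        if tuple(sequence[i:i + k]) == pattern
    )


def mine_codebook(sequences, min_support, max_length):
    """Level-wise (apriori-like) mining of frequent contiguous patterns."""
    codebook = []
    symbols = sorted({s for seq in sequences for s in seq})
    candidates = [(s,) for s in symbols]
    length = 1
    while candidates and length <= max_length:
        # Keep candidates that appear in at least `min_support` sequences.
        frequent = [
            p for p in candidates
            if sum(count_occurrences(seq, p) > 0 for seq in sequences) >= min_support
        ]
        codebook.extend(frequent)
        # Apriori step: a longer pattern is a candidate only if both its
        # prefix p and its suffix q are already frequent.
        candidates = sorted(
            {p + (q[-1],) for p in frequent for q in frequent if p[1:] == q[:-1]}
        )
        length += 1
    return codebook


def bag_of_patterns(sequence, codebook):
    """Represent a sequence as a normalised histogram over the codebook."""
    counts = [count_occurrences(sequence, p) for p in codebook]
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Toy symbol sequences standing in for quantised frame descriptors.
training = [list("abcab"), list("abcb"), list("cab")]
codebook = mine_codebook(training, min_support=2, max_length=3)
histogram = bag_of_patterns(list("abcab"), codebook)
```

The resulting histogram plays the role of the multinomial distribution over patterns and could be passed to any standard classifier.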
In the last part of this thesis, a methodology is presented for semi-automatically annotating behavioral data given a small set of manually annotated data. The resulting methodology is not only effective in the semi-automated annotation task but can also be used in the presence of abnormal behaviors, as demonstrated empirically by testing the system on data collected from children affected by neuro-developmental disorders.
User-centric anomaly detection in activities of daily living
The current system for providing care to older adults is not sustainable due to its excessive cost. It places an unbearable financial burden on the government and families and pressure on the workforce due to the demand for human carers. Studies have also shown that older adults prefer to be looked after in their homes rather than in a care facility. An automated system of monitoring can provide much-needed support at a lower cost and give peace of mind to relatives.
The focus of the research reported in this thesis is to investigate the concept of abnormality detection in activities of daily living. More precisely, this work is aimed at proposing a dynamic approach for anomaly detection capable of adapting to changes in human behaviour. Abnormalities in daily activities can be an early indication of health decline. Therefore, early detection can inform the families of the need for intervention. Anomalies are often detected by modelling the existing activity data representing the usual behavioural routine of an individual to serve as a baseline model. Subsequent activities deviating from the baseline are then classified as outliers or anomalies. However, existing approaches suffer from a high rate of false prediction due to the static nature and the inability of the approaches to adapt to the changing human behaviour.
The contributions of the research are reported in four main categories. First, a novel ensemble approach termed "Consensus Novelty Detection Ensemble" is proposed. The outlying activities are predicted by computing their normality score using the internal and external consensus vote and the estimated weights of the models in the ensemble. Activities with a score exceeding a threshold estimated using a statistical method based on data distribution are predicted as outliers and vice versa.
Secondly, a similarity measure approach for identifying the likely sources of the ADL anomalies is proposed. While the models can detect anomalous activities, they are unable to identify the source (cause) of the anomaly. Identifying the anomaly source allows for the development of an adaptive system. The approach is based on a pairwise distance measurement of the features extracted from the activity data. Two approaches for performing the similarity measures are presented, namely, One vs One Similarity Measure (OOSM) and One vs All Similarity Measure (OASM). Features of the data with a higher dissimilarity rate are predicted as the source.
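A rough Python sketch of the one-vs-all idea follows. The abstract does not specify the distance or the features, so the per-feature mean absolute distance scaled by the baseline range, the feature names, and the functions `one_vs_all_dissimilarity` and `likely_source` are all assumptions chosen for illustration.

```python
def one_vs_all_dissimilarity(anomaly, baseline):
    """Score each feature of an anomalous activity against all baseline activities.

    anomaly  : dict mapping feature name -> value
    baseline : list of dicts with the same feature names (normal routine)
    """
    scores = {}
    for feature, value in anomaly.items():
        history = [activity[feature] for activity in baseline]
        # Scale by the baseline range so features with different units compare.
        spread = (max(history) - min(history)) or 1.0
        mean_distance = sum(abs(value - h) for h in history) / len(history)
        scores[feature] = mean_distance / spread
    return scores


def likely_source(anomaly, baseline):
    """Return the feature with the highest dissimilarity score."""
    scores = one_vs_all_dissimilarity(anomaly, baseline)
    return max(scores, key=scores.get)


# Hypothetical sleep activity described by duration and start time.
baseline = [
    {"duration_h": 7.5, "start_hour": 22.5},
    {"duration_h": 8.0, "start_hour": 23.0},
    {"duration_h": 7.0, "start_hour": 22.0},
]
anomaly = {"duration_h": 3.0, "start_hour": 22.5}
source = likely_source(anomaly, baseline)
```

In this toy case the sleep started at a usual time but was much shorter than the routine, so the duration feature is flagged as the likely source.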
To make the proposed model adaptive to changes in human behaviour, a novel adaptive approach is proposed based on the concept of forgetting factors. This allows the model to forget (discard) outdated activity data and adapt to the current behavioural patterns by incorporating newly verified data. The data verification can be performed by incorporating human feedback into the system. Two forgetting factor approaches are proposed, namely Forgetting Factor based on Data Ageing (FFDA) and Forgetting Factor based on Data Dissimilarity (FFDD). The data ageing forgetting factor discards old behavioural routines based on the age of the activity data, while the data dissimilarity approach does so by measuring the similarity of the activity data.
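The data-ageing variant can be sketched in Python as an exponential down-weighting of older samples. The abstract does not give the actual weighting rule, so the exponential form, the default values, and the function name `forget_old_data` are assumptions for illustration only.

```python
def forget_old_data(samples, today, forgetting_factor=0.9, min_weight=0.1):
    """Down-weight activity samples by age and drop those that are too old.

    samples : list of (day_recorded, value) pairs
    today   : current day index
    Returns a list of (day_recorded, value, weight) for the retained samples.
    """
    kept = []
    for day, value in samples:
        age = today - day
        # Assumed rule: the weight decays exponentially with the age in days.
        weight = forgetting_factor ** age
        if weight >= min_weight:
            kept.append((day, value, weight))
    return kept


def weighted_baseline(kept):
    """Weighted mean of the retained values, favouring recent behaviour."""
    total_weight = sum(w for _, _, w in kept)
    return sum(v * w for _, v, w in kept) / total_weight


# Hypothetical sleep durations (hours) recorded on days 0, 20 and 29.
samples = [(0, 7.0), (20, 7.5), (29, 6.0)]
kept = forget_old_data(samples, today=30)
baseline_value = weighted_baseline(kept)
```

With these made-up settings the 30-day-old sample falls below the weight cutoff and is forgotten, so the baseline is driven by the two recent samples.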
Lastly, the means of utilising an assistive robot as a communication intermediary is explored for incorporating human feedback into the learning process using hand gestures as a communication modality. Experimental data used for the gesture recognition model is collected using a wearable sensor and a 2D camera. The feasibility of utilising the robotic platform as an exercise coach to encourage physical activity and promote a healthy lifestyle is explored. To this end, an exercise training solution is developed for the robotic platform to coach, motivate and assess older adults in the recommended physical activities.
Acquisition and distribution of synergistic reactive control skills
Learning from demonstration is an efficient way to attain a new skill. In the context of autonomous robots, using a demonstration to teach a robot accelerates the robot learning process significantly. It helps to identify feasible solutions as starting points for future exploration or to avoid actions that lead to failure. But the acquisition of pertinent observations is predicated on first segmenting the data into meaningful sequences. These segments form the basis for learning models capable of recognising future actions and reconstructing the motion to control a robot. Furthermore, learning algorithms for generative models are generally not tuned to produce stable trajectories and suffer from parameter redundancy for high-degree-of-freedom robots.
This thesis addresses these issues by firstly investigating algorithms, based on dynamic programming and mixture models, for segmentation sensitivity and recognition accuracy on human motion capture data sets of repetitive and categorical motion classes. A stability analysis of the non-linear dynamical systems derived from the resultant mixture model representations aims to ensure that any trajectories converge to the intended target motion as observed in the demonstrations. Finally, these concepts are extended to humanoid robots by deploying a factor analyser for each mixture model component and coordinating the structure into a low dimensional representation of the demonstrated trajectories. This representation can be constructed as a correspondence map is learned between the demonstrator and robot for joint space actions.
Applying these algorithms for demonstrating movement skills to robots is a further step towards autonomous incremental robot learning.
The Dollar General: Continuous Custom Gesture Recognition Techniques At Everyday Low Prices
Humans use gestures to emphasize ideas and disseminate information. Their importance is apparent in how we continuously augment social interactions with motion, gesticulating in harmony with nearly every utterance to ensure observers understand what we wish to communicate. Their relevance has not escaped the HCI community's attention. For almost as long as computers have been able to sample human motion at the user interface boundary, software systems have been made to understand gestures as command metaphors. Customization, in particular, has great potential to improve user experience, whereby users map specific gestures to specific software functions. However, custom gesture recognition remains a challenging problem, especially when training data is limited, input is continuous, and designers who wish to use customization in their software are limited by mathematical attainment, machine learning experience, domain knowledge, or a combination thereof. Data collection, filtering, segmentation, pattern matching, synthesis, and rejection analysis are all non-trivial problems a gesture recognition system must solve. To address these issues, we introduce The Dollar General (TDG), a complete pipeline composed of several novel continuous custom gesture recognition techniques. Specifically, TDG comprises an automatic low-pass filter tuner that we use to improve signal quality, a segmenter for identifying gesture candidates in a continuous input stream, a classifier for discriminating gesture candidates from non-gesture motions, and a synthetic data generation module we use to train the classifier. Our system achieves high recognition accuracy with as little as one or two training samples per gesture class, is largely input device agnostic, and does not require advanced mathematical knowledge to understand and implement.
In this dissertation, we motivate the importance of gestures and customization, describe each pipeline component in detail, and introduce strategies for data collection and prototype selection.
Integrating passive ubiquitous surfaces into human-computer interaction
Mobile technologies enable people to interact with computers ubiquitously. This dissertation investigates how ordinary, ubiquitous surfaces can be integrated into human-computer interaction to extend the interaction space beyond the edge of the display. It turns out that acoustic and tactile features generated during an interaction can be combined to identify input events, the user, and the surface. In addition, it is shown that a heterogeneous distribution of different surfaces is particularly suitable for realizing versatile interaction modalities. However, privacy concerns must be considered when selecting sensors, and context can be crucial in determining whether and what interaction to perform.
The Democratization of Artificial Intelligence: Net Politics in the Era of Learning Algorithms
After a long time of neglect, Artificial Intelligence is once again at the center of most of our political, economic, and socio-cultural debates. Recent advances in the field of Artificial Neural Networks have led to a renaissance of dystopian and utopian speculations on an AI-rendered future. Algorithmic technologies are deployed for identifying potential terrorists through vast surveillance networks, for producing sentencing guidelines and recidivism risk profiles in criminal justice systems, for demographic and psychographic targeting of bodies for advertising or propaganda, and more generally for automating the analysis of language, text, and images. Against this background, the aim of this book is to discuss the heterogeneous conditions, implications, and effects of modern AI and Internet technologies in terms of their political dimension: What does it mean to critically investigate efforts of net politics in the age of machine learning algorithms?