A hierarchal framework for recognising activities of daily life
PhD
In today’s working world the elderly who are dependent can sometimes be
neglected by society. Statistically, after toddlers it is the elderly who are observed
to have higher accident rates while performing everyday activities. Alzheimer’s
disease is one of the major impairments that elderly people suffer from, and leads
to the elderly person not being able to live an independent life due to forgetfulness.
One way to support elderly people who aspire to live an independent life and
remain safe in their home is to find out what activities the elderly person is
carrying out at a given time and provide appropriate assistance or institute
safeguards.
The aim of this research is to create improved methods to identify tasks related to
activities of daily life and determine a person’s current intentions and so reason
about that person’s future intentions. A novel hierarchal framework has been
developed, which recognises sensor events and maps them to significant activities
and intentions. As privacy is becoming a growing concern, the monitoring of an
individual’s behaviour can be seen as intrusive. Hence, the monitoring is based
around using simple non-intrusive sensors and tags on everyday objects that are
used to perform daily activities around the home. Specifically, there is no use of
any cameras or visual surveillance equipment, though the techniques developed
are still relevant in such a situation.
Models for task recognition and plan recognition have been developed and tested
on scenarios where the plans can be interwoven. Potential targets are people in the
first stages of Alzheimer’s disease, and the library of kernel plan sequences has been
structured around typical routines used to sustain meaningful activity.
Evaluations have been carried out using volunteers conducting activities of
daily life in an experimental home environment. The results generated from the
sensors have been interpreted and analysis of developed algorithms has been
made. The outcomes and findings of these experiments demonstrate that the
developed hierarchal framework is capable of carrying out activity recognition as well
as intention analysis, e.g. predicting which activity a person is
most likely to carry out next.
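The two levels described in the abstract above (sensor events mapped to tasks, then tasks used to anticipate the next activity) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the object names, the task names, the routines, and the overlap and follower-count scoring are hypothetical and are not the thesis's actual models.

```python
from collections import Counter

# Hypothetical object-to-task associations; the abstract does not list these.
TASK_OBJECTS = {
    "make_tea": {"kettle", "cup", "tea_bag"},
    "make_toast": {"toaster", "bread", "plate"},
}

# Hypothetical routines (ordered task sequences) standing in for a plan library.
ROUTINES = [
    ["make_tea", "make_toast"],
    ["make_tea", "make_toast"],
    ["make_toast", "make_tea"],
]


def recognise_task(events):
    """Return the task whose tagged objects are best covered by the observed events."""
    seen = set(events)
    scores = {
        task: len(seen & objects) / len(objects)
        for task, objects in TASK_OBJECTS.items()
    }
    return max(scores, key=scores.get)


def predict_next(task):
    """Return the task that most often follows `task` in ROUTINES, or None."""
    followers = Counter(
        nxt
        for routine in ROUTINES
        for cur, nxt in zip(routine, routine[1:])
        if cur == task
    )
    return followers.most_common(1)[0][0] if followers else None


current = recognise_task(["kettle", "cup", "bread"])
upcoming = predict_next(current)
```

Keeping recognition and prediction as separate functions mirrors the layered idea in the abstract: the lower level only sees object tags, while the upper level reasons over task sequences.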
Modelling Speech Dynamics with Trajectory-HMMs
Institute for Communicating and Collaborative Systems
The conditional independence assumption imposed by hidden Markov models
(HMMs) makes it difficult to model temporal correlation patterns in human speech.
Traditionally, this limitation is circumvented by appending the first and second-order
regression coefficients to the observation feature vectors. Although this leads to improved
performance in recognition tasks, we argue that a straightforward use of dynamic
features in HMMs will result in an inferior model, due to the incorrect handling
of dynamic constraints. In this thesis I will show that an HMM can be transformed
into a Trajectory-HMM capable of generating smoothed output mean trajectories, by
performing a per-utterance normalisation. The resulting model can be trained by either
maximising model log-likelihood or minimising mean generation errors on the training
data. To combat the exponential growth of paths in searching, the idea of delayed path
merging is proposed and a new time-synchronous decoding algorithm built on the concept
of token-passing is designed for use in the recognition task. The Trajectory-HMM
brings a new way of sharing knowledge between speech recognition and synthesis
components, by tackling both problems in a coherent statistical framework. I evaluated
the Trajectory-HMM on two different speech tasks using the speaker-dependent
MOCHA-TIMIT database. First as a generative model to recover articulatory features
from speech signal, where the Trajectory-HMM was used in a complementary way
to the conventional HMM modelling techniques, within a joint Acoustic-Articulatory
framework. Experiments indicate that the jointly trained acoustic-articulatory models
are more accurate (having a lower Root Mean Square error) than the separately trained
ones, and that Trajectory-HMM training results in greater accuracy compared with
conventional Baum-Welch parameter updating. In addition, the Root Mean Square
(RMS) training objective proves to be consistently better than the Maximum Likelihood
objective. However, experiments on the phone recognition task show that the
MLE-trained Trajectory-HMM, while retaining the attractive properties of a proper
generative model, tends to favour over-smoothed trajectories among competing hypotheses,
and does not perform better than a conventional HMM. We use this to
build an argument that models giving a better fit on training data may suffer a reduction
of discrimination by being too faithful to the training data. Finally, experiments
on using triphone models show that increasing modelling detail is an effective way to
improve modelling performance with little added complexity in training.
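The traditional workaround mentioned in the abstract above, appending first and second-order regression coefficients to each observation vector, can be sketched as follows. This is a minimal illustration, not the thesis's code: the function names, the window size, and the choice to replicate edge frames are assumptions. It uses the widely used regression formula d_t = sum_k k (c_{t+k} - c_{t-k}) / (2 sum_k k^2).

```python
def deltas(frames, window=2):
    """Regression (delta) coefficients for a list of equal-length feature vectors."""
    n = len(frames)
    denom = 2 * sum(k * k for k in range(1, window + 1))
    out = []
    for t in range(n):
        row = []
        for d in range(len(frames[t])):
            num = 0.0
            for k in range(1, window + 1):
                ahead = frames[min(t + k, n - 1)][d]  # replicate last frame at the end
                behind = frames[max(t - k, 0)][d]     # replicate first frame at the start
                num += k * (ahead - behind)
            row.append(num / denom)
        out.append(row)
    return out


def add_dynamic_features(frames, window=2):
    """Append first and second-order regression coefficients to each frame."""
    first = deltas(frames, window)
    second = deltas(first, window)
    return [s + d1 + d2 for s, d1, d2 in zip(frames, first, second)]


static = [[0.0], [1.0], [2.0], [3.0], [4.0]]
augmented = add_dynamic_features(static, window=1)
```

Because each dynamic coefficient is a fixed linear function of neighbouring static frames, treating the augmented vector as an independent observation ignores that constraint, which is the inconsistency the abstract says the Trajectory-HMM addresses.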
Proceedings of ECAI International Workshop on Neural-Symbolic Learning and reasoning NeSy 2006
The Design and Application of an Acoustic Front-End for Use in Speech Interfaces
This thesis describes the design, implementation, and application of an acoustic front-end. Such front-ends constitute the core of automatic speech recognition systems. The front-end whose development is reported here has been designed for speaker-independent large vocabulary recognition. The emphasis of this thesis is more one of design than of application. This work exploits the current state of the art in speech recognition research, for example, the use of Hidden Markov Models. It describes the steps taken to build a speaker-independent large vocabulary system from signal processing, through pattern matching, to language modelling. An acoustic front-end can be considered as a multi-stage process, each stage of which requires the specification of many parameters. Some parameters have fundamental consequences for the ultimate application of the front-end. Therefore, a major part of this thesis is concerned with their analysis and specification. Experiments were carried out to determine the characteristics of individual parameters, the results of which were then used to motivate particular parameter settings. The thesis concludes with some applications that point out not only the power of the resulting acoustic front-end but also its limitations.
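To make the idea of a multi-stage front-end with per-stage parameters concrete, here is a minimal sketch of two common signal-processing stages, pre-emphasis and framing. The abstract does not say which stages or settings the thesis used, so the function names, the pre-emphasis coefficient, and the frame length and shift below are all hypothetical.

```python
def pre_emphasise(samples, coeff=0.97):
    """Apply a first-order high-pass filter: y[n] = x[n] - coeff * x[n-1]."""
    if not samples:
        return []
    return [samples[0]] + [
        samples[i] - coeff * samples[i - 1] for i in range(1, len(samples))
    ]


def frame_signal(samples, frame_len, frame_shift):
    """Split a signal into overlapping frames, dropping any incomplete final frame."""
    last_start = len(samples) - frame_len
    return [
        samples[start:start + frame_len]
        for start in range(0, last_start + 1, frame_shift)
    ]


signal = [float(x) for x in range(10)]
frames = frame_signal(pre_emphasise(signal, coeff=0.5), frame_len=4, frame_shift=2)
```

Even this small pipeline exposes three tunable values (coeff, frame_len, frame_shift), which hints at why parameter analysis takes up a major part of such a design.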
Essential Speech and Language Technology for Dutch: Results by the STEVIN-programme
Computational Linguistics; Germanic Languages; Artificial Intelligence (incl. Robotics); Computing Methodologies
Cruiser and PhoTable: Exploring Tabletop User Interface Software for Digital Photograph Sharing and Story Capture
Digital photography has not only changed the nature of photography and the photographic process, but also the manner in which we share photographs and tell stories about them. Some traditional methods, such as the family photo album or passing around piles of recently developed snapshots, are lost to us unless the digital photos are printed. The current, purely digital, methods of sharing do not provide the same experience as printed photographs, and they do not provide effective face-to-face social interaction around photographs, as experienced during storytelling. Research has found that people are often dissatisfied with sharing photographs in digital form. The recent emergence of the tabletop interface as a viable multi-user direct-touch interactive large horizontal display has provided the hardware that has the potential to improve our collocated activities such as digital photograph sharing. However, while some software to communicate with various tabletop hardware technologies exists, software aspects of tabletop user interfaces are still at an early stage and require careful consideration in order to provide an effective, multi-user immersive interface that arbitrates the social interaction between users, without the necessary computer-human interaction interfering with the social dialogue. This thesis presents PhoTable, a social interface allowing people to effectively share, and tell stories about, recently taken, unsorted digital photographs around an interactive tabletop. In addition, the computer-arbitrated digital interaction allows PhoTable to capture the stories told, and associate them as audio metadata to the appropriate photographs. By leveraging the tabletop interface and providing a highly usable and natural interaction we can enable users to become immersed in their social interaction, telling stories about their photographs, and allow the computer interaction to occur as a side-effect of the social interaction.
Correlating the computer interaction with the corresponding audio allows PhoTable to annotate an automatically created digital photo album with audible stories, which may then be archived. These stories remain useful for future sharing -- both collocated sharing and remote (e.g. via the Internet) -- and also provide a personal memento both of the event depicted in the photograph (e.g. as a reminder) and of the enjoyable photo sharing experience at the tabletop. To provide the necessary software to realise an interface such as PhoTable, this thesis explored the development of Cruiser: an efficient, extensible and reusable software framework for developing tabletop applications. Cruiser contributes a set of programming libraries and the necessary application framework to facilitate the rapid and highly flexible development of new tabletop applications. It uses a plugin architecture that encourages code reuse, stability and easy experimentation, and leverages the dedicated computer graphics hardware and multi-core processors of modern consumer-level systems to provide a responsive and immersive interactive tabletop user interface that is agnostic to the tabletop hardware and operating platform, using efficient, native cross-platform code. Cruiser's flexibility has allowed a variety of novel interactive tabletop applications to be explored by other researchers using the framework, in addition to PhoTable. To evaluate Cruiser and PhoTable, this thesis follows recommended practices for systems evaluation. The design rationale is framed within the above scenario and vision which we explore further, and the resulting design is critically analysed based on user studies, heuristic evaluation and a reflection on how it evolved over time. The effectiveness of Cruiser was evaluated in terms of its ability to realise PhoTable, use of it by others to explore many new tabletop applications, and an analysis of performance and resource usage. 
Usability, learnability and effectiveness of PhoTable were assessed on three levels: careful usability evaluations of elements of the interface; informal observations of usability when Cruiser was available to the public in several exhibitions and demonstrations; and a final evaluation of PhoTable in use for storytelling, where this had the side effect of creating a digital photo album, consisting of the photographs users interacted with on the table and associated audio annotations which PhoTable automatically extracted from the interaction. We conclude that our approach to design has resulted in an effective framework for creating new tabletop interfaces. The parallel goal of exploring the potential for tabletop interaction as a new way to share digital photographs was realised in PhoTable. It is able to support the envisaged goal of an effective interface for telling stories about one's photos. As a serendipitous side-effect, PhoTable was effective in the automatic capture of the stories about individual photographs for future reminiscence and sharing. This work provides foundations for future work in creating new ways to interact at a tabletop and new ways to capture personal stories around digital photographs for sharing and long-term preservation.