
    Accuracy and Timeliness in ML Based Activity Recognition

    While recent Machine Learning (ML) based techniques for activity recognition show great promise, a number of questions remain with respect to the relative merits of these techniques. To provide a better understanding of the relative strengths of contemporary Activity Recognition methods, in this paper we present a comparative analysis of Hidden Markov Model, Bayesian, and Support Vector Machine based human activity recognition models. The study builds on both pre-existing and newly annotated data which includes interleaved activities. Results demonstrate that while Support Vector Machine based techniques perform well across all data sets considered, simple representations of sensor histories regularly outperform more complex count-based models.
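    The contrast between the two representations the abstract compares can be sketched in a few lines. This is an illustrative reconstruction, not the paper's implementation: the sensor names and event stream are invented, and "simple history" is taken to mean a binary did-the-sensor-fire vector versus a per-sensor count vector.

```python
from collections import Counter

# Hypothetical smart-home event stream: each entry is the ID of the
# sensor that fired within an observation window (names are invented).
events = ["kitchen", "fridge", "kitchen", "stove", "kitchen"]
SENSORS = ["kitchen", "fridge", "stove", "bedroom"]

def binary_history(events, sensors):
    """Simple representation: did each sensor fire at all in the window?"""
    fired = set(events)
    return [1 if s in fired else 0 for s in sensors]

def count_features(events, sensors):
    """Count-based representation: how often did each sensor fire?"""
    counts = Counter(events)
    return [counts[s] for s in sensors]

print(binary_history(events, SENSORS))  # [1, 1, 1, 0]
print(count_features(events, SENSORS))  # [3, 1, 1, 0]
```

    Either vector could then be fed to an HMM, Bayesian, or SVM classifier; the abstract's finding is that the simpler binary form is often the stronger feature.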

    A Comparative Study of the Effect of Sensor Noise on Activity Recognition Models

    To provide a better understanding of the relative strengths of Machine Learning based Activity Recognition methods, in this paper we present a comparative analysis of the robustness of three popular methods with respect to sensor noise. Specifically, we evaluate the robustness of Naive Bayes classifier, Support Vector Machine, and Random Forest based activity recognition models in three cases which span sensor errors from dead to poorly calibrated sensors. Test data is partially synthesized from a recently annotated activity recognition corpus which includes both interleaved activities and a range of temporally long and short activities. Results demonstrate that the relative performance advantage of Support Vector Machine classifiers over Naive Bayes classifiers shrinks in noisy sensor conditions, but that overall the Random Forest classifier provides the best activity recognition accuracy across all noise conditions synthesized in the corpus. Moreover, we find that activity recognition is equally robust across classification techniques, with the relative performance of all models holding up under almost all sensor noise conditions considered.
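    The two noise regimes the abstract names (dead sensors and poorly calibrated sensors) can be synthesized straightforwardly. The following is a minimal sketch assuming binary sensor readings; the function names and data are illustrative, not from the paper's corpus tooling.

```python
import random

def inject_dead_sensor(readings, sensor_idx):
    """Dead sensor: the chosen channel reads zero regardless of ground truth."""
    return [[0 if i == sensor_idx else v for i, v in enumerate(row)]
            for row in readings]

def inject_flip_noise(readings, p, seed=0):
    """Poorly calibrated binary sensor: each reading flips with probability p."""
    rng = random.Random(seed)
    return [[v ^ 1 if rng.random() < p else v for v in row]
            for row in readings]

clean = [[1, 0, 1], [0, 1, 1]]  # rows = time steps, columns = sensors
print(inject_dead_sensor(clean, 2))  # [[1, 0, 0], [0, 1, 0]]
```

    Evaluating each classifier on copies of the test set produced at several noise levels is then enough to plot the robustness curves the study describes.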

    Using Generalised Dialogue Models to Constrain Information State Based Dialogue Systems

    While Information State (IS) based techniques show promise in the construction of flexible, knowledge-based dialogue systems, the many declarative rules that are used to encode Dialogue Theories often lead to opaque systems that are difficult to test and potentially unintuitive to users. In this paper, we advocate the application of explicitly defined Generic Dialogue Models (GDMs), encoded as recursive transition networks (RTNs), to the structuring of Information State based dialogue managers. To this end, we review the state of GDM approaches, comparing and contrasting them against the Dialogue Theories that are typically implemented using Information State approaches. Furthermore, to support our approach, we present an extension of the ALPHA (A Language for Programming Hybrid Agents) language, which has been enhanced to support Information State and GDM concepts directly.
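    The core idea of constraining a dialogue manager with a transition network can be shown in miniature. This sketch is invented for illustration (the recursion of full RTNs is omitted, and the states and dialogue-act names are not from the paper): a network licenses which dialogue acts may occur in which state, so any IS rule firing outside the network is flagged.

```python
# A toy Generic Dialogue Model as a flat transition network:
# state -> {dialogue act -> next state}. Names are illustrative.
GDM = {
    "start":    {"greet": "open"},
    "open":     {"ask": "awaiting", "close": "end"},
    "awaiting": {"answer": "open"},
}

def licensed(acts, net, state="start"):
    """Check whether a sequence of dialogue acts is permitted by the network."""
    for act in acts:
        moves = net.get(state, {})
        if act not in moves:
            return False
        state = moves[act]
    return True

print(licensed(["greet", "ask", "answer", "close"], GDM))  # True
print(licensed(["ask"], GDM))                              # False
```

    A full RTN would additionally allow a transition to call a named subnetwork, giving the generalised, reusable dialogue structures the abstract argues for.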

    Show, Prefer and Tell: Incorporating User Preferences into Image Captioning

    Image Captioning (IC) is the task of generating natural language descriptions for images. Models encode the image using a convolutional neural network (CNN) and generate the caption via a recurrent model or a multi-modal transformer. Success is measured by the similarity between generated captions and human-written “ground-truth” captions, using the CIDEr [14], SPICE [1] and METEOR [2] metrics. While incremental gains have been made on these metrics, there is a lack of focus on end-user opinions on the amount of content in captions. Studies with blind and low-vision participants have found that lack of detail is a problem [6, 13, 17], and that the preferred amount of content varies between individuals [13], as do individual opinions on the trade-off between correctness and adding additional content with lower confidence [9]. We propose a more user-centered approach with an adjustable amount of content based on the number of regions to describe.
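    The user-facing control the abstract proposes amounts to choosing how many image regions feed the captioner. The sketch below is a guess at that selection step, not the paper's model: the region names and detection scores are invented, and the heuristic of keeping the k most confident regions is an assumption.

```python
# Hypothetical detected regions: (label, detector confidence).
regions = [("dog", 0.95), ("frisbee", 0.88), ("grass", 0.60), ("tree", 0.41)]

def select_regions(regions, k):
    """Keep the k most confident regions; k is the user's content preference."""
    return [name for name, score in sorted(regions, key=lambda r: -r[1])[:k]]

print(select_regions(regions, 2))  # ['dog', 'frisbee']
```

    A user preferring terse captions sets a small k; one preferring rich detail sets a larger k, accepting lower-confidence content in exchange.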

    Formalising control in robust spoken dialogue systems

    The spoken language interface is now becoming an increasingly serious research topic with application to a wide range of highly engineered systems. Such systems not only include innocuous human-computer interactions, but also encompass shared-control safety-critical devices such as automotive vehicles and robotic systems. Spoken Dialogue Systems (SDS) are the language architecture used to provide linguistic interaction in these applications, but they have to date been notoriously difficult to engineer in a robust and safe manner. In this paper we report on our efforts to improve the safety and overall usability of dialogue-enabled applications through the employment of formal methods in SDS development and testing. Specifically, we use Communicating Sequential Processes (CSP) as the basis of a new approach to the specification, design and verification of dialogue manager control. Moreover, to support this approach, we introduce FDMSC – the Formal Dialogue Management for Shared Control toolkit – and illustrate its use in the construction of formal methods based spoken dialogue systems.
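    The kind of guarantee a CSP verification provides can be illustrated with a toy safety check, in the spirit of trace refinement but not using the FDMSC toolkit or a real CSP checker: all the states, dialogue acts, and the property below are invented. We enumerate every bounded trace of a small dialogue-manager transition system and confirm the unsafe ordering never occurs.

```python
# Toy dialogue-manager transition system for a shared-control device:
# state -> {dialogue act -> next state}. Names are illustrative.
LTS = {
    "idle":    {"request": "pending"},
    "pending": {"confirm": "armed", "cancel": "idle"},
    "armed":   {"actuate": "idle"},
}

def traces(state, depth):
    """Yield every trace of dialogue acts of length <= depth from `state`."""
    yield ()
    if depth == 0:
        return
    for act, nxt in LTS.get(state, {}).items():
        for t in traces(nxt, depth - 1):
            yield (act,) + t

def safe(trace):
    """Safety property: 'actuate' never occurs without a fresh prior 'confirm'."""
    confirmed = False
    for act in trace:
        if act == "confirm":
            confirmed = True
        elif act == "actuate":
            if not confirmed:
                return False
            confirmed = False  # each actuation needs its own confirmation
    return True

assert all(safe(t) for t in traces("idle", 6))
print("safety property holds on all traces up to depth 6")
```

    A real CSP development would state the property as a specification process and let a refinement checker explore the unbounded state space; the exhaustive bounded search here is only a stand-in for that idea.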

    Language-Driven Region Pointer Advancement for Controllable Image Captioning

    Controllable Image Captioning is a recent sub-field in the multi-modal task of Image Captioning wherein constraints are placed on which regions in an image should be described in the generated natural language caption. This puts a stronger focus on producing more detailed descriptions, and opens the door for more end-user control over results. A vital component of the Controllable Image Captioning architecture is the mechanism that decides the timing of attending to each region through the advancement of a region pointer. In this paper, we propose a novel method for predicting the timing of region pointer advancement by treating the advancement step as a natural part of the language structure via a NEXT-token, motivated by a strong correlation to the sentence structure in the training data. We find that our timing agrees with the ground-truth timing in the Flickr30k Entities test data with a precision of 86.55% and a recall of 97.92%. Our model implementing this technique improves the state of the art on standard captioning metrics while additionally demonstrating a considerably larger effective vocabulary size.
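    The NEXT-token idea reduces, at the data level, to marking where the region pointer should advance inside the caption's token sequence. This sketch shows only that linearisation step; the caption chunking and the marker spelling are invented for illustration, and the actual model learns to emit the marker rather than having it inserted by rule at test time.

```python
# Caption tokens grouped by the image region each chunk is grounded in
# (chunking invented for illustration).
chunks = [["a", "dog"], ["catches"], ["a", "frisbee"]]

def linearise_with_next(chunks, marker="<NEXT>"):
    """Flatten region-grounded chunks, inserting the pointer-advance marker between them."""
    out = []
    for i, chunk in enumerate(chunks):
        out.extend(chunk)
        if i < len(chunks) - 1:
            out.append(marker)
    return out

print(linearise_with_next(chunks))
# ['a', 'dog', '<NEXT>', 'catches', '<NEXT>', 'a', 'frisbee']
```

    Training on sequences of this shape lets the decoder predict pointer advancement as just another vocabulary item, which is what ties the timing to sentence structure.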

    Evaluation of a Substitution Method for Idiom Transformation in Statistical Machine Translation

    We evaluate a substitution-based technique for improving Statistical Machine Translation performance on idiomatic multiword expressions. The method operates by performing substitution on the original idiom with its literal meaning before translation, with a second substitution step replacing literal meanings with idioms following translation. We detail our approach, outline our implementation and provide an evaluation of the method for the language pair English/Brazilian-Portuguese. Our results show improvements in translation accuracy on sentences containing either morphosyntactically constrained or unconstrained idioms. We discuss the consequences of our results and outline potential extensions to this process.
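    The two substitution steps wrap around an unchanged translation engine, which a short sketch makes concrete. The idiom tables below contain one invented example pair each and the exact-string replacement is a simplification; the paper's method handles morphosyntactic variation that plain `str.replace` cannot.

```python
# Source-side table: English idiom -> literal meaning (example entry invented).
SRC_IDIOMS = {"kick the bucket": "die"}
# Target-side table: literal Portuguese -> Brazilian-Portuguese idiom.
TGT_IDIOMS = {"morrer": "bater as botas"}

def pre_substitute(sentence, idioms=SRC_IDIOMS):
    """Step 1: replace source idioms with their literal meaning before MT."""
    for idiom, literal in idioms.items():
        sentence = sentence.replace(idiom, literal)
    return sentence

def post_substitute(sentence, idioms=TGT_IDIOMS):
    """Step 2: after MT, replace literal target phrases with target idioms."""
    for literal, idiom in idioms.items():
        sentence = sentence.replace(literal, idiom)
    return sentence

print(pre_substitute("he will kick the bucket"))  # 'he will die'
print(post_substitute("ele vai morrer"))          # 'ele vai bater as botas'
```

    Because the SMT system only ever sees literal language, it avoids translating the idiom word by word; the fluency of the output then rests on the coverage of the two idiom tables.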