28 research outputs found

    Deliverable D1.4 Visual, text and audio information analysis for hypervideo, final release

    The technologies included in the first release of the WP1 multimedia analysis tools were extensively evaluated, using content from the LinkedTV scenarios and through participation in international benchmarking activities. On this basis, concrete decisions were made regarding the appropriateness and importance of each individual method or combination of methods. Combined with an updated list of information needs for each scenario, these decisions led to a new set of analysis requirements to be addressed by the final release of the WP1 analysis techniques. To this end, coordinated efforts in three directions, namely (a) improving a number of methods in terms of accuracy and time efficiency, (b) developing new technologies and (c) defining synergies between methods to obtain new types of information via multimodal processing, resulted in the final set of multimedia analysis methods for video hyperlinking. Moreover, the developed analysis modules have been integrated into a web-based infrastructure, allowing the fully automatic linking of the multitude of WP1 technologies and the overall LinkedTV platform.

    Automatic assessment of motivational interview with diabetes patients

    Diabetes costs the UK NHS £10 billion each year, and the cost pressure is projected to get worse. Motivational Interviewing (MI) is a goal-driven clinical conversation that seeks to reduce this cost by encouraging patients to take ownership of day-to-day monitoring and medication; its effectiveness is commonly evaluated against the Motivational Interviewing Treatment Integrity (MITI) manual. Unfortunately, measuring clinicians’ MI performance is costly, requiring expert human instructors to ensure adherence to MITI. Although it is desirable to assess MI in an automated fashion, many challenges remain due to its complexity. In this thesis, an automatic system was developed to assess clinicians’ adherence to the MITI criteria using different spoken language techniques. The system tackled the challenges using automatic speech recognition (ASR), speaker diarisation, topic modelling and clinicians’ behaviour code identification. For ASR, only 8 hours of in-domain MI data are available for training. Experiments with different open-source datasets, for example WSJCAM0 and AMI, are presented. I have explored adaptive training of the ASR system and also the best training criterion and neural network structure. Over 45 minutes of MI testing data, the best ASR system achieves a word error rate of 43.59%. The i-vector based diarisation system achieves an F-measure of 0.822. The MITI behaviour code classification system with manual transcriptions achieves an accuracy of 78% for Non Question/Question classification, 80% for Open Question/Closed Question classification and 78% for MI Adherence/MI Non-Adherence classification. Topic modelling was applied to track whether the conversation segments were related to ‘diabetes’ or not, on manual transcriptions as well as ASR outputs. The fully automatic assessment system achieves an Assessment Error Rate of 22.54%.
This is the first system that targets the full automation of MI assessment with reasonable performance. In addition, the error analysis from each step can guide future research in this area towards further improvement and optimisation.
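
The abstract above reports ASR quality as a word error rate. As a minimal sketch of how that metric is usually defined (word-level edit distance divided by the number of reference words), and not the thesis's own scoring tool, the following may help. The function name `word_error_rate` and the example sentences are hypothetical, and whitespace tokenisation is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return (substitutions + deletions + insertions) / reference length.

    Assumption: words are separated by whitespace and compared exactly
    (no case folding or punctuation normalisation).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: one substitution and one deletion over six words.
wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
```

Because insertions count as errors, this ratio can exceed 1.0 when the hypothesis is much longer than the reference.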

    Preface


    Speech Recognition

    Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and methods for speech feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes.

    Reading comprehension processes and strategies in L1 and L2 in Malaysian primary and secondary schools

    This study is set in the context of the acknowledged debate, highlighted by Lunzer and Gardner's Schools Council project (1979), concerning the theoretical issue of whether reading comprehension is a unitary competence or consists of identifiable discrete subskills. This long-standing polarised theoretical debate can be traced as far back as the sixties and seventies in the positions taken by reading experts such as Spache and Spache (1969), Davis (1971) and Thorndike (1973). Spache and Spache and Thorndike concluded that reading comprehension was a unitary competence, not consisting of separate skills that can be practised in isolation. On the other hand, Davis viewed reading comprehension as composed of separate identifiable skills and abilities. The polarised arguments pose a question as to the nature of reading comprehension. Is there such a thing as discrete reading comprehension subskills that can be built up hierarchically and can promote the understanding of texts? With this question in mind, the study set out to test whether reading comprehension is a unitary competence or one that can be broken down into separate subskills. The research involved rigorous testing with a series of reading comprehension tests in two languages, using four texts taken from the work of Lunzer and Gardner (1979). The texts were modified to suit the socio-cultural context of the students. All of the chosen texts were translated into Bahasa (L1), which is the mother tongue of the students. In principle, the focus of the study in Part I is on replicating the work of Lunzer and Gardner (1979) in some selected Malaysian primary and secondary schools. It seeks to establish whether the main hypothesis holds, namely that reading comprehension is unitary in nature and cannot be broken down into a number of distinct subskills. A selected 300 primary school pupils aged 12 were required to read and answer four comprehension tests written in L1.
Another selected 150 secondary school students aged 15 were required to perform the same tasks on material written in L2. Each test has about 30 comprehension questions, which are divided into eight categories of subskills. The two groups produced a total of 1,636 valid comprehension tests, which were marked rigorously. Factor analysis of the data yielded a number of important findings concerning whether reading comprehension subskills are unitary or hierarchic in nature. These findings may suggest some recommendations for improving reading for learning across the Malaysian primary and secondary school curriculum. The five chapters of Part I discuss the background information which led to the testing of the 450 students, the related literature review, the chosen research design and analysis, the findings and the research implications for the Part II study. The study reported in Part II is an extension of the work done in Part I: the remaining five chapters cover the justification for conducting the in-depth interviews, the review of the related literature, the design of the interview, the findings and the educational implications of the study. This part explores the reading comprehension strategies that were used by the students in answering the comprehension questions. The second study was conducted during the summer of 1994. A total of 16 students aged 15 were selected from several secondary schools in Johor Bahru, the state capital of Johor, Malaysia … … The thesis ends with a discussion of the implications of both studies, especially for the reading curriculum, instruction, pedagogy, classroom practice and future research.