A systematic review of speech recognition technology in health care
BACKGROUND To undertake a systematic review of existing literature relating to speech recognition technology and its application within health care. METHODS A systematic review of existing literature from 2000 was undertaken. Inclusion criteria were: all papers that referred to speech recognition (SR) in health care settings, used by health professionals (allied health, medicine, nursing, technical or support staff), with an evaluation or patient or staff outcomes. Experimental and non-experimental designs were considered. Six databases (Ebscohost including CINAHL, EMBASE, MEDLINE including the Cochrane Database of Systematic Reviews, OVID Technologies, PreMEDLINE, PsycINFO) were searched by a qualified health librarian trained in systematic review searches, initially capturing 1,730 references. Fourteen studies met the inclusion criteria and were retained. RESULTS The heterogeneity of the studies made comparative analysis and synthesis of the data challenging, resulting in a narrative presentation of the results. SR, although not as accurate as human transcription, does deliver reduced turnaround times for reporting and cost-effective reporting, although the evidence of improved workflow processes is equivocal. CONCLUSIONS SR systems have substantial benefits and should be considered in light of the cost and selection of the SR system, training requirements, length of the transcription task, potential use of macros and templates, the presence of accented voices or experienced and inexperienced typists, and workflow patterns. Funding for this study was provided by the University of Western Sydney.
NICTA is funded by the Australian Government through the Department of
Communications and the Australian Research Council through the ICT
Centre of Excellence Program. NICTA is also funded and supported by the
Australian Capital Territory, the New South Wales, Queensland and Victorian
Governments, the Australian National University, the University of New South
Wales, the University of Melbourne, the University of Queensland, the
University of Sydney, Griffith University, Queensland University of
Technology, Monash University and other university partners.
Special Issue on the AMCIS 2001 Workshops: Speech Enabled Information Systems: The Next Frontier
Speech technologies are coming of age. They are applied in an increasing number of mobile, call-center, home and office settings. They challenge the established Graphical User Interface metaphor and promise to fundamentally alter the way humans conceptualize and interact with computers. This change leads to new requirements for the development of information systems. It also provides new research issues and opportunities for the academic community. In this article, the main elements of speech technologies will be presented and their applications will be discussed. The article does not focus on technical aspects of speech technologies but is concerned with the business aspects of applying such technologies. The article is based on a workshop at the Americas Conference on Information Systems 2001 in Boston.
Vocal Access to a Newspaper Archive: Design Issues and Preliminary Investigation
This paper presents the design and the current prototype implementation of an
interactive vocal Information Retrieval system that can be used to access
articles of a large newspaper archive using a telephone. The results of
preliminary investigation into the feasibility of such a system are also
presented.
Why has (reasonably accurate) Automatic Speech Recognition been so hard to achieve?
Hidden Markov models (HMMs) have been successfully applied to automatic
speech recognition for more than 35 years in spite of the fact that a key HMM
assumption -- the statistical independence of frames -- is obviously violated
by speech data. In fact, this data/model mismatch has inspired many attempts to
modify or replace HMMs with alternative models that are better able to take
into account the statistical dependence of frames. However, it is fair to say
that in 2010 the HMM is the consensus model of choice for speech recognition
and that HMMs are at the heart of both commercially available products and
contemporary research systems. In this paper we present a preliminary
exploration aimed at understanding how speech data depart from HMMs and what
effect this departure has on the accuracy of HMM-based speech recognition. Our
analysis uses standard diagnostic tools from the field of statistics --
hypothesis testing, simulation and resampling -- which are rarely used in the
field of speech recognition. Our main result, obtained by novel manipulations
of real and resampled data, demonstrates that real data have statistical
dependency and that this dependency is responsible for significant numbers of
recognition errors. We also demonstrate, using simulation and resampling, that
if we 'remove' the statistical dependency from data, then the resulting
recognition error rates become negligible. Taken together, these results
suggest that a better understanding of the structure of the statistical
dependency in speech data is a crucial first step towards improving HMM-based
speech recognition.
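The idea of removing statistical dependency by resampling can be illustrated with a minimal sketch. It is not the authors' procedure, which the abstract does not describe. It assumes each frame already carries a state label, and it replaces every frame with a random draw from the pool of frames sharing that label. The function name, the scalar frames and the labels "a" and "b" are hypothetical.

```python
import random
from collections import defaultdict


def resample_frames_by_state(frames, states, seed=0):
    """Replace each frame with a random draw (with replacement) from the
    pool of frames that share its state label.

    The state sequence and the per-state pools are kept, but which frame
    follows which is randomized, so temporal dependence between frames
    within a state is broken.
    """
    if len(frames) != len(states):
        raise ValueError("frames and states must have the same length")
    rng = random.Random(seed)
    # Group frames by their (assumed) state label.
    pools = defaultdict(list)
    for frame, state in zip(frames, states):
        pools[state].append(frame)
    # Draw one replacement per position from the matching pool.
    return [rng.choice(pools[state]) for state in states]


# Toy data: scalar "frames" aligned to two hypothetical states.
frames = [0.1, 0.2, 0.3, 1.1, 1.2, 1.3]
states = ["a", "a", "a", "b", "b", "b"]
resampled = resample_frames_by_state(frames, states)
```

In this toy setup the resampled sequence has the same length and state sequence as the original, while each frame is now an independent draw given its state, which is the condition the HMM assumes.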
Power, Performance, and Perception (P3): Integrating Usability Metrics and Technology Acceptance Determinants to Validate a New Model for Predicting System Usage
Currently, there are two distinct approaches to assist information technology managers in the successful implementation of office automation software. The first approach resides within the field of usability engineering, while the second approach is derived from the discipline of management information systems (MIS). However, neither approach has successfully produced conclusive evidence that explains what characteristics facilitate system use as well as influence user acceptance of the system. This study reports on the validity of a new model, entitled the Power, Performance, Perception (P3) model, that links the constructs of usability engineering to user acceptance. Additionally, speech recognition software (SRS) was used in an experimental setting to validate the P3 model. This research also examined the viability of employing SRS in an Air Force office environment. The results of this study failed to validate the P3 model. However, an alternate model for predicting user acceptance, the Usability Acceptance Model, did emerge from the research, which showed that the usability metric of user satisfaction can explain 53% of the variance of user intention to use a new technology. Additionally, the results of this study indicate that users in a typical Air Force office environment would utilize SRS for text processing.
Spoken query processing for interactive information retrieval
It has long been recognised that interactivity improves the effectiveness of information retrieval systems. Speech is the most natural and interactive medium of communication and recent progress in speech recognition is making it possible to build systems that interact with the user via speech. However, given the typical length of queries submitted to information retrieval systems, it is easy to imagine that the effects of word recognition errors in spoken queries must be severely destructive on the system's effectiveness. The experimental work reported in this paper shows that the use of classical information retrieval techniques for spoken query processing is robust to considerably high levels of word recognition errors, in particular for long queries. Moreover, in the case of short queries, both standard relevance feedback and pseudo relevance feedback can be effectively employed to improve the effectiveness of spoken query processing.
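The abstract does not say how the level of word recognition errors was measured. A common choice is word error rate (WER), the word-level edit distance between the reference query and the recognized query divided by the reference length. The sketch below assumes that metric; the function name and the example queries are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance (substitutions, insertions, deletions)
    divided by the number of words in the reference."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One substituted word out of five reference words.
wer = word_error_rate(
    "find articles about speech recognition",
    "find article about speech recognition",
)
```

A single misrecognized word costs a short query a much larger share of its terms than a long one, which is in line with the abstract's observation that long queries are more robust.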