1,301 research outputs found
Keskusteluavustimen kehittäminen kuulovammaisia varten automaattista puheentunnistusta käyttäen
Understanding and participating in conversations has been reported as one of the biggest challenges hearing impaired people face in their daily lives. These communication problems have been shown to have wide-ranging negative consequences, affecting their quality of life and the opportunities available to them in education and employment.
A conversational assistance application was investigated to alleviate these problems. The application uses automatic speech recognition technology to provide real-time speech-to-text transcriptions to the user, with the goal of helping deaf and hard of hearing persons in conversational situations. To validate the method and investigate its usefulness, a prototype application was developed for testing purposes using open-source software. A user test was designed and performed with test participants representing the target user group.
The results indicate that the Conversation Assistant method is valid, meaning it can help the hearing impaired to follow and participate in conversational situations. Speech recognition accuracy, especially in noisy environments, was identified as the primary target for further development for increased usefulness of the application. Conversely, recognition speed was deemed to be sufficient and already surpass the transcription speed of human transcribers.Keskustelupuheen ymmärtäminen ja keskusteluihin osallistuminen on raportoitu yhdeksi suurimmista haasteista, joita kuulovammaiset kohtaavat jokapäiväisessä elämässään. Näillä viestintäongelmilla on osoitettu olevan laaja-alaisia negatiivisia vaikutuksia, jotka heijastuvat elämänlaatuun ja heikentävät kuulovammaisten yhdenvertaisia osallistumismahdollisuuksia opiskeluun ja työelämään.
Työssä kehitettiin ja arvioitiin apusovellusta keskustelupuheen ymmärtämisen ja keskusteluihin osallistumisen helpottamiseksi. Sovellus käyttää automaattista puheentunnistusta reaaliaikaiseen puheen tekstittämiseen kuuroja ja huonokuuloisia varten. Menetelmän toimivuuden vahvistamiseksi ja sen hyödyllisyyden tutkimiseksi siitä kehitettiin prototyyppisovellus käyttäjätestausta varten avointa lähdekoodia hyödyntäen. Testaamista varten suunniteltiin ja toteutettiin käyttäjäkoe sovelluksen kohderyhmää edustavilla koekäyttäjillä.
Saadut tulokset viittaavat siihen, että työssä esitetty Keskusteluavustin on toimiva ja hyödyllinen apuväline huonokuuloisille ja kuuroille. Puheentunnistustarkkuus erityisesti meluisissa olosuhteissa osoittautui ensisijaiseksi kehityskohteeksi apusovelluksen hyödyllisyyden lisäämiseksi. Puheentunnistuksen nopeus arvioitiin puolestaan jo riittävän nopeaksi, ylittäen selkeästi kirjoitustulkkien kirjoitusnopeuden
Grounding robot motion in natural language and visual perception
The current state of the art in military and first responder ground robots involves heavy physical and cognitive burdens on the human operator while taking little to no advantage of the potential autonomy of robotic technology. The robots currently in use are rugged remote-controlled vehicles. Their interaction modalities, usually utilizing a game controller connected to a computer, require a dedicated operator who has limited capacity for other tasks.
I present research which aims to ease these burdens by incorporating multiple modes of robotic sensing into a system which allows humans to interact with robots through a natural-language interface. I conduct this research on a custom-built six-wheeled mobile robot.
First I present a unified framework which supports grounding natural-language semantics in robotic driving. This framework supports learning the meanings of nouns and prepositions from sentential descriptions of paths driven by the robot, as well as using such meanings to both generate a sentential description of a path and perform automated driving of a path specified in natural language. One limitation of this framework is that it requires as input the locations of the (initially nameless) objects in the floor plan.
Next I present a method to automatically detect, localize, and label objects in the robot’s environment using only the robot’s video feed and corresponding odometry. This method produces a map of the robot’s environment in which objects are differentiated by abstract class labels.
Finally, I present work that unifies the previous two approaches. This method detects, localizes, and labels objects, as the previous method does. However, this new method integrates natural-language descriptions to learn actual object names, rather than abstract labels
Becoming Human with Humanoid
Nowadays, our expectations of robots have been significantly increases. The robot, which was initially only doing simple jobs, is now expected to be smarter and more dynamic. People want a robot that resembles a human (humanoid) has and has emotional intelligence that can perform action-reaction interactions. This book consists of two sections. The first section focuses on emotional intelligence, while the second section discusses the control of robotics. The contents of the book reveal the outcomes of research conducted by scholars in robotics fields to accommodate needs of society and industry
Recommended from our members
Facilitating file retrieval on resource limited devices
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The rapid development of mobile technologies has facilitated users to generate and store files on mobile devices. However, it has become a challenging issue for users to search efficiently and effectively for files of interest in a mobile environment that involves a large number of mobile nodes. In this thesis, file management and retrieval alternatives have been investigated to propose a feasible framework that can be employed on resource-limited devices without altering their operating systems. The file annotation and retrieval framework (FARM) proposed in the thesis automatically annotates the files with their basic file attributes by extracting them from the underlying operating system of the device. The framework is implemented in the JME platform as a case study. This framework provides a variety of features for managing the metadata and file search features on the device itself and on other devices in a networked environment. FARM not only automates the file-search process but also provides accurate results as demonstrated by the experimental analysis.
In order to facilitate a file search and take advantage of the Semantic Web Technologies, the SemFARM framework is proposed which utilizes the knowledge of a generic ontology. The generic ontology defines the most common keywords that can be used as the metadata of stored files. This provides semantic-based file search capabilities on low-end devices where the search keywords are enriched with additional knowledge extracted from the defined ontology. The existing frameworks annotate image files only, while SemFARM can be used to annotate all types of files.
Semantic heterogeneity is a challenging issue and necessitates extensive research to accomplish the aim of a semantic web. For this reason, significant research efforts have been made in recent years by proposing an enormous number of ontology alignment systems to deal with ontology heterogeneities.
In the process of aligning different ontologies, it is essential to encompass their semantic, structural or any system-specific measures in mapping decisions to produce more accurate alignments. The proposed solution, in this thesis, for ontology alignment presents a structural matcher, which computes the similarity between the super-classes, sub-classes and properties of two entities from different ontologies that require aligning. The proposed alignment system (OARS)
uses Rough Sets to aggregate the results obtained from various matchers in order to deal with uncertainties during the mapping process of entities. The OARS uses a combinational approach by using a string-based and linguistic-based matcher, in addition to structural-matcher for computing the overall similarity between two entities. The performance of the OARS is evaluated in comparison with existing state of the art alignment systems in terms of precision and recall. The performance tests are performed by using benchmark ontologies and the results show significant improvements, specifically in terms of recall on all groups of test ontologies. There is no such existing framework, which can use alignments for file search on mobile devices.
The ontology alignment paradigm is integrated in the SemFARM to further enhance the file search features of the framework as it utilises the knowledge of more than one ontology in order to perform a search query. The experimental evaluations show that it performs better in terms of precision and recall where more than one ontology is available when searching for a required file.Education Commission of Pakistan and the University of Engineering & Technology, Peshawa
Efficient machine learning: models and accelerations
One of the key enablers of the recent unprecedented success of machine learning is the adoption of very large models. Modern machine learning models typically consist of multiple cascaded layers such as deep neural networks, and at least millions to hundreds of millions of parameters (i.e., weights) for the entire model. The larger-scale model tend to enable the extraction of more complex high-level features, and therefore, lead to a significant improvement of the overall accuracy. On the other side, the layered deep structure and large model sizes also demand to increase computational capability and memory requirements. In order to achieve higher scalability, performance, and energy efficiency for deep learning systems, two orthogonal research and development trends have attracted enormous interests. The first trend is the acceleration while the second is the model compression. The underlying goal of these two trends is the high quality of the models to provides accurate predictions. In this thesis, we address these two problems and utilize different computing paradigms to solve real-life deep learning problems.
To explore in these two domains, this thesis first presents the cogent confabulation network for sentence completion problem. We use Chinese language as a case study to describe our exploration of the cogent confabulation based text recognition models. The exploration and optimization of the cogent confabulation based models have been conducted through various comparisons. The optimized network offered a better accuracy performance for the sentence completion. To accelerate the sentence completion problem in a multi-processing system, we propose a parallel framework for the confabulation recall algorithm. The parallel implementation reduce runtime, improve the recall accuracy by breaking the fixed evaluation order and introducing more generalization, and maintain a balanced progress in status update among all neurons. A lexicon scheduling algorithm is presented to further improve the model performance.
As deep neural networks have been proven effective to solve many real-life applications, and they are deployed on low-power devices, we then investigated the acceleration for the neural network inference using a hardware-friendly computing paradigm, stochastic computing. It is an approximate computing paradigm which requires small hardware footprint and achieves high energy efficiency. Applying this stochastic computing to deep convolutional neural networks, we design the functional hardware blocks and optimize them jointly to minimize the accuracy loss due to the approximation. The synthesis results show that the proposed design achieves the remarkable low hardware cost and power/energy consumption.
Modern neural networks usually imply a huge amount of parameters which cannot be fit into embedded devices. Compression of the deep learning models together with acceleration attracts our attention. We introduce the structured matrices based neural network to address this problem. Circulant matrix is one of the structured matrices, where a matrix can be represented using a single vector, so that the matrix is compressed. We further investigate a more flexible structure based on circulant matrix, called block-circulant matrix. It partitions a matrix into several smaller blocks and makes each submatrix is circulant. The compression ratio is controllable. With the help of Fourier Transform based equivalent computation, the inference of the deep neural network can be accelerated energy efficiently on the FPGAs. We also offer the optimization for the training algorithm for block circulant matrices based neural networks to obtain a high accuracy after compression
Natural Language Processing in-and-for Design Research
We review the scholarly contributions that utilise Natural Language
Processing (NLP) methods to support the design process. Using a heuristic
approach, we collected 223 articles published in 32 journals and within the
period 1991-present. We present state-of-the-art NLP in-and-for design research
by reviewing these articles according to the type of natural language text
sources: internal reports, design concepts, discourse transcripts, technical
publications, consumer opinions, and others. Upon summarizing and identifying
the gaps in these contributions, we utilise an existing design innovation
framework to identify the applications that are currently being supported by
NLP. We then propose a few methodological and theoretical directions for future
NLP in-and-for design research
- …