20,290 research outputs found
An efficient implementation of lattice-ladder multilayer perceptrons in field programmable gate arrays
The implementation efficiency of electronic systems is a combination of conflicting requirements, as increasing volumes of computations, accelerating the exchange of data, at the same time increasing energy consumption forcing the researchers not only to optimize the algorithm, but also to quickly implement in a specialized hardware. Therefore in this work, the problem of efficient and straightforward implementation of operating in a real-time electronic intelligent systems on field-programmable gate array (FPGA) is tackled. The object of research is specialized FPGA intellectual property (IP) cores that operate in a real-time. In the thesis the following main aspects of the research object are investigated: implementation criteria and techniques.
The aim of the thesis is to optimize the FPGA implementation process of selected class dynamic artificial neural networks. In order to solve stated problem and reach the goal following main tasks of the thesis are formulated: rationalize the selection of a class of Lattice-Ladder Multi-Layer Perceptron (LLMLP) and its electronic intelligent system test-bed â a speaker dependent Lithuanian speech recognizer, to be created and investigated; develop dedicated technique for implementation of LLMLP class on FPGA that is based on specialized efficiency criteria for a circuitry synthesis; develop and experimentally affirm the efficiency of optimized FPGA IP cores used in
Lithuanian speech recognizer.
The dissertation contains: introduction, four chapters and general conclusions. The first chapter reveals the fundamental knowledge on computer-aideddesign, artificial neural networks and speech recognition implementation on FPGA. In the second chapter the efficiency criteria and technique of LLMLP IP cores implementation are proposed in order to make multi-objective optimization of throughput, LLMLP complexity and resource utilization. The data flow graphs are applied for optimization of LLMLP computations. The optimized neuron processing element is proposed. The IP cores for features extraction and comparison are developed for Lithuanian speech recognizer and analyzed in third chapter. The fourth chapter is devoted for experimental verification of developed numerous LLMLP IP cores. The experiments of isolated word recognition accuracy and speed for different speakers, signal to noise ratios, features extraction and accelerated comparison methods were performed.
The main results of the thesis were published in 12 scientific publications: eight of them were printed in peer-reviewed scientific journals, four of them in a Thomson Reuters Web of Science database, four articles â in conference proceedings. The results were presented in 17 scientific conferences
An analysis of the application of AI to the development of intelligent aids for flight crew tasks
This report presents the results of a study aimed at developing a basis for applying artificial intelligence to the flight deck environment of commercial transport aircraft. In particular, the study was comprised of four tasks: (1) analysis of flight crew tasks, (2) survey of the state-of-the-art of relevant artificial intelligence areas, (3) identification of human factors issues relevant to intelligent cockpit aids, and (4) identification of artificial intelligence areas requiring further research
Paralinguistic Privacy Protection at the Edge
Voice user interfaces and digital assistants are rapidly entering our lives
and becoming singular touch points spanning our devices. These always-on
services capture and transmit our audio data to powerful cloud services for
further processing and subsequent actions. Our voices and raw audio signals
collected through these devices contain a host of sensitive paralinguistic
information that is transmitted to service providers regardless of deliberate
or false triggers. As our emotional patterns and sensitive attributes like our
identity, gender, mental well-being, are easily inferred using deep acoustic
models, we encounter a new generation of privacy risks by using these services.
One approach to mitigate the risk of paralinguistic-based privacy breaches is
to exploit a combination of cloud-based processing with privacy-preserving,
on-device paralinguistic information learning and filtering before transmitting
voice data. In this paper we introduce EDGY, a configurable, lightweight,
disentangled representation learning framework that transforms and filters
high-dimensional voice data to identify and contain sensitive attributes at the
edge prior to offloading to the cloud. We evaluate EDGY's on-device performance
and explore optimization techniques, including model quantization and knowledge
distillation, to enable private, accurate and efficient representation learning
on resource-constrained devices. Our results show that EDGY runs in tens of
milliseconds with 0.2% relative improvement in ABX score or minimal performance
penalties in learning linguistic representations from raw voice signals, using
a CPU and a single-core ARM processor without specialized hardware.Comment: 14 pages, 7 figures. arXiv admin note: text overlap with
arXiv:2007.1506
Deep Spoken Keyword Spotting:An Overview
Spoken keyword spotting (KWS) deals with the identification of keywords in
audio streams and has become a fast-growing technology thanks to the paradigm
shift introduced by deep learning a few years ago. This has allowed the rapid
embedding of deep KWS in a myriad of small electronic devices with different
purposes like the activation of voice assistants. Prospects suggest a sustained
growth in terms of social use of this technology. Thus, it is not surprising
that deep KWS has become a hot research topic among speech scientists, who
constantly look for KWS performance improvement and computational complexity
reduction. This context motivates this paper, in which we conduct a literature
review into deep spoken KWS to assist practitioners and researchers who are
interested in this technology. Specifically, this overview has a comprehensive
nature by covering a thorough analysis of deep KWS systems (which includes
speech features, acoustic modeling and posterior handling), robustness methods,
applications, datasets, evaluation metrics, performance of deep KWS systems and
audio-visual KWS. The analysis performed in this paper allows us to identify a
number of directions for future research, including directions adopted from
automatic speech recognition research and directions that are unique to the
problem of spoken KWS
- âŚ