33 research outputs found
A Novel Video/Photo Recorder Using an Online Motion Sensor-Triggered Embedded System
In recent years embedded systems have gained more importance. These systems are especially dedicated to specific tasks which are handled by highly optimized solutions. One of the interesting areas of embedded systems use is multi-media. Producing, processing, streaming various multimedia types and interacting with the physical environment is very common today. Similar to these studies, controlling and observing the specified area by multi-media tools are the necessities for many reasons such as security. This paper presents a method of video and photo recording of any moving object by using open source operation system (Raspbian- a distribution of Linux) and software (Python β a high-level programming language). The system is triggered by a motion sensor and it collects visual data from a specified area for limited duration. The collected data is published on internet via dedicated web site. Keywords: Photo and Video Recorder, Open Source Platforms, Embedded System
Enabling mobile microinteractions
While much attention has been paid to the usability of desktop computers, mobile com- puters are quickly becoming the dominant platform. Because mobile computers may be used in nearly any situation--including while the user is actually in motion, or performing other tasks--interfaces designed for stationary use may be inappropriate, and alternative interfaces should be considered.
In this dissertation I consider the idea of microinteractions--interactions with a device that take less than four seconds to initiate and complete. Microinteractions are desirable because they may minimize interruption; that is, they allow for a tiny burst of interaction with a device so that the user can quickly return to the task at hand.
My research concentrates on methods for applying microinteractions through wrist- based interaction. I consider two modalities for this interaction: touchscreens and motion- based gestures. In the case of touchscreens, I consider the interface implications of making touchscreen watches usable with the finger, instead of the usual stylus, and investigate users' performance with a round touchscreen. For gesture-based interaction, I present a tool, MAGIC, for designing gesture-based interactive system, and detail the evaluation of the tool.Ph.D.Committee Chair: Starner, Thad; Committee Member: Abowd, Gregory; Committee Member: Isbell, Charles; Committee Member: Landay, james; Committee Member: McIntyre, Blai
Speech Recognition
Chapters in the first part of the book cover all the essential speech processing techniques for building robust, automatic speech recognition systems: the representation for speech signals and the methods for speech-features extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition for speaker identification and tracking, for prosody modeling in emotion-detection systems and in other speech processing applications that are able to operate in real-world environments, like mobile communication services and smart homes
Deep Learning for Distant Speech Recognition
Deep learning is an emerging technology that is considered one of the most
promising directions for reaching higher levels of artificial intelligence.
Among the other achievements, building computers that understand speech
represents a crucial leap towards intelligent machines. Despite the great
efforts of the past decades, however, a natural and robust human-machine speech
interaction still appears to be out of reach, especially when users interact
with a distant microphone in noisy and reverberant environments. The latter
disturbances severely hamper the intelligibility of a speech signal, making
Distant Speech Recognition (DSR) one of the major open challenges in the field.
This thesis addresses the latter scenario and proposes some novel techniques,
architectures, and algorithms to improve the robustness of distant-talking
acoustic models. We first elaborate on methodologies for realistic data
contamination, with a particular emphasis on DNN training with simulated data.
We then investigate on approaches for better exploiting speech contexts,
proposing some original methodologies for both feed-forward and recurrent
neural networks. Lastly, inspired by the idea that cooperation across different
DNNs could be the key for counteracting the harmful effects of noise and
reverberation, we propose a novel deep learning paradigm called network of deep
neural networks. The analysis of the original concepts were based on extensive
experimental validations conducted on both real and simulated data, considering
different corpora, microphone configurations, environments, noisy conditions,
and ASR tasks.Comment: PhD Thesis Unitn, 201