EYECOM: an innovative approach for computer interaction

Author: Mazhar Anam
Publication venue: Marshall Digital Scholar
Publication date: 01/01/2020

The world is innovating rapidly, and people need to interact with technology continuously. Unfortunately, paralyzed people have few promising options for interacting with machines such as laptops, smartphones, and tablets. The few commercial solutions, such as Google Glass, are costly and cannot be afforded by every paralyzed person. To this end, the thesis proposes a retina-controlled device called EYECOM. The device is built from off-the-shelf, cost-effective yet robust IoT components (Arduino microcontrollers, XBee wireless sensors, IR diodes, and an accelerometer). It can easily be mounted on a pair of glasses, and the paralyzed person wearing it can interact with the machine using simple head movements and eye blinks. The IR sensor is located in front of the eye and illuminates the eye region. The eye reflects the IR light, which is measured as an electrical signal; when the eyelid closes, the light reflected from the eye surface is disrupted, and the change in the reflected value is recorded. To let the paralyzed person move the cursor on the computer screen, an accelerometer is used. The accelerometer is a small device, about the size of a phalanx of the human thumb. It operates on the principle of axis-based motion sensing and can be worn as a ring by a paralyzed person. A microcontroller processes the inputs from the IR sensors and the accelerometer and transmits them wirelessly via an XBee radio to another microcontroller attached to the computer. Using the proposed algorithm, the microcontroller attached to the computer moves the cursor on the screen when it receives the signals and facilitates actions ranging from opening a document to operating word-to-speech software.
EYECOM has features that can help paralyzed persons continue contributing to the technological world and become an active part of society. As a result, they will be able to perform a number of tasks without depending on others, from reading a newspaper on the computer to activating word-to-voice software.
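
The signal path described in this abstract (a change in reflected IR marks a blink, accelerometer tilt moves the cursor) can be sketched roughly as follows. This is a minimal illustration and not the thesis's algorithm: the function names, the thresholds, the dead zone, and the gain are all hypothetical, since the abstract gives no numeric values.

```python
from typing import List, Tuple

# All constants below are assumptions for illustration only.
BLINK_DEVIATION = 0.30   # fractional change from baseline treated as "eyelid closed"
MIN_BLINK_SAMPLES = 3    # consecutive deviating samples needed for a deliberate blink
DEAD_ZONE_G = 0.15       # tilt (in g) ignored so small tremors do not move the cursor
GAIN = 40.0              # pixels per update per g of tilt


def detect_blinks(ir_samples: List[float], baseline: float) -> List[int]:
    """Return the start index of every run of samples that deviates from the
    open-eye baseline for at least MIN_BLINK_SAMPLES consecutive readings."""
    limit = BLINK_DEVIATION * baseline
    blinks: List[int] = []
    run_start = None
    for i, value in enumerate(ir_samples):
        if abs(value - baseline) > limit:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= MIN_BLINK_SAMPLES:
                blinks.append(run_start)
            run_start = None
    # Close a run that is still open when the samples end.
    if run_start is not None and len(ir_samples) - run_start >= MIN_BLINK_SAMPLES:
        blinks.append(run_start)
    return blinks


def cursor_step(ax: float, ay: float) -> Tuple[int, int]:
    """Map accelerometer tilt on two axes to a cursor displacement in pixels."""
    def axis(a: float) -> int:
        return 0 if abs(a) < DEAD_ZONE_G else round(GAIN * a)
    return axis(ax), axis(ay)
```

Requiring several consecutive deviating samples is one plausible way to separate deliberate blinks from brief involuntary ones; the real device may use a different rule.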

Development of a text reading system on video images

Author: Merino Gracia Carlos
Publication date: 01/01/2015

Since the early days of computer science, researchers have sought to devise a machine that could automatically read text to help people with visual impairments. The problem of extracting and recognising text in document images has been largely resolved, but reading text from images of natural scenes remains a challenge. Scene text can present uneven lighting, complex backgrounds or perspective and lens distortion; it usually appears as short sentences or isolated words and shows a very diverse set of typefaces. However, video sequences of natural scenes provide a temporal redundancy that can be exploited to compensate for some of these deficiencies. Here we present a complete end-to-end, real-time scene text reading system on video images based on perspective-aware text tracking. The main contribution of this work is a system that automatically detects, recognises and tracks text in videos of natural scenes in real time. The focus of our method is on large text found in outdoor environments, such as shop signs, street names and billboards. We introduce novel efficient techniques for text detection, text aggregation and text perspective estimation. Furthermore, we propose using a set of Unscented Kalman Filters (UKF) to maintain each text region's identity and to continuously track the homography transformation of the text into a fronto-parallel view, thereby being resilient to erratic camera motion and wide baseline changes in orientation. The orientation of each text line is estimated using a method that relies on the geometry of the characters themselves to estimate a rectifying homography. This is done irrespective of the view of the text over a large range of orientations. We also demonstrate a wearable head-mounted device for text reading that encases a camera for image acquisition and a pair of headphones for synthesized speech output. Our system is designed for continuous and unsupervised operation over long periods of time. It is completely automatic and features quick failure recovery and interactive text reading. It is also highly parallelised in order to maximize the usage of available processing power and to achieve real-time operation. We show comparative results that improve on the current state of the art when correcting perspective deformation of scene text. The end-to-end system performance is demonstrated on sequences recorded in outdoor scenarios. Finally, we also release a dataset of text tracking videos along with the annotated ground truth of text regions.
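
To make the idea of a rectifying homography concrete, the sketch below applies a 3x3 homography to the corner points of a text region, which is the basic operation behind mapping text into a fronto-parallel view. It is only an illustration under stated assumptions: the function names and the example matrix are hypothetical, and the estimation of the homography and the UKF tracking described in the abstract are not shown.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
Matrix3 = Sequence[Sequence[float]]


def apply_homography(h: Matrix3, point: Point) -> Point:
    """Map one image point through a 3x3 homography using homogeneous coordinates."""
    x, y = point
    xp = h[0][0] * x + h[0][1] * y + h[0][2]
    yp = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    # Dividing by w converts back from homogeneous to image coordinates.
    return xp / w, yp / w


def rectify_quad(h: Matrix3, corners: List[Point]) -> List[Point]:
    """Apply the homography to every corner of a text region."""
    return [apply_homography(h, p) for p in corners]


# Hypothetical matrix with a small perspective term in the bottom row.
EXAMPLE_H = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.001, 0.0, 1.0],
]
```

With an identity matrix the points are returned unchanged; the non-zero entry in the bottom row of `EXAMPLE_H` is what introduces (or, for a rectifying matrix, removes) perspective foreshortening.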

The selection and evaluation of a sensory technology for interaction in a warehouse environment

Author: Greyling Jean
Zadeh Seyed Amirsaleh Saleh
Publication venue: Faculty of Business and Economic Sciences
Publication date: 01/01/2016

In recent years, Human-Computer Interaction (HCI) has become a significant part of modern life because it has improved human performance in completing daily tasks with computerised systems. The increase in the variety of bio-sensing and wearable technologies on the market has propelled designers towards designing more efficient, effective and fully natural User Interfaces (UI), such as the Brain-Computer Interface (BCI) and the Muscle-Computer Interface (MCI). BCI and MCI have been used for various purposes, such as controlling wheelchairs, piloting drones, providing alphanumeric inputs into a system and improving sports performance. Workers in a warehouse environment experience various challenges. Because they often have to carry objects (referred to as hands-full), it is difficult for them to interact with traditional devices. Noise undeniably exists in some industrial environments and is known to be a major cause of communication problems. This has reduced the popularity of verbal interfaces for computer applications such as Warehouse Management Systems. Another factor that affects the performance of workers is action slips caused by a lack of concentration during, for example, routine picking activities. This can have a negative impact on job performance and cause a worker to execute a task incorrectly in a warehouse environment. This research project investigated the current challenges workers experience in a warehouse environment and the technologies utilised in this environment. The latest automation and identification systems and technologies are identified and discussed, specifically the technologies which have addressed known problems. Sensory technologies were identified that enable interaction between a human and a computerised warehouse environment. Biological and natural behaviours of humans which are applicable in the interaction with a computerised environment were described and discussed.
The interactive behaviours included vision, hearing, speech production and physiological movement; other natural human behaviours, such as paying attention, action slips and counting items, were also investigated. A number of modern sensory technologies, devices and techniques for HCI were identified with the aim of selecting and evaluating an appropriate sensory technology for MCI. MCI technologies enable a computer system to recognise hand and other gestures of a user, creating a means of direct interaction between a user and a computer, as they are able to detect specific features extracted from a specific biological or physiological activity. Thereafter, Machine Learning (ML) is applied in order to train a computer system to detect these features and convert them into input for a computer interface. An application of biomedical signals (bio-signals) in HCI using a MYO Armband for MCI is presented. An MCI prototype (MCIp) was developed and implemented to allow a user to provide input to an HCI in hands-free and hands-full situations. The MCIp was designed and developed to recognise the hand-finger gestures of a person when both hands are free or when holding an object, such as a cardboard box. The MCIp applies an Artificial Neural Network (ANN) to classify features extracted from the surface electromyography signals acquired by the MYO Armband around the forearm muscle. Using the ANN, the MCIp classified gestures with an accuracy of 34.87% in the hands-free situation. Furthermore, the MCIp enabled users to provide numeric inputs in the hands-full situation with an accuracy of 59.7% after a training session of only 10 seconds per gesture. The results were obtained using eight participants. No similar experimentation with the MYO Armband had been found in the literature at the time this document was submitted.
Based on this novel experimentation, the main contribution of this research study is a suggestion that the MYO Armband, a commercially available muscle-sensing device, has potential as an MCI to recognise finger gestures in hands-free and hands-full situations. An accurate MCI can increase the efficiency and effectiveness of an HCI tool when it is applied to different applications in a warehouse where noise and hands-full activities pose a challenge. Future work to improve its accuracy is proposed.
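
The abstract mentions features extracted from surface electromyography signals and fed to an ANN, without naming the features. As a rough sketch only, the code below computes two commonly used time-domain features, mean absolute value and root mean square, for each channel of a short window. The choice of features and the function name are assumptions, not the study's method, and the ANN itself is not shown.

```python
import math
from typing import List


def emg_features(window: List[List[float]]) -> List[float]:
    """Compute per-channel mean absolute value (MAV) and root mean square (RMS).

    `window` is a list of samples; each sample holds one value per EMG channel.
    The result is laid out as [mav_ch0, rms_ch0, mav_ch1, rms_ch1, ...] and
    could serve as the input vector of a classifier.
    """
    if not window:
        raise ValueError("window must contain at least one sample")
    n = len(window)
    channels = len(window[0])
    features: List[float] = []
    for c in range(channels):
        column = [sample[c] for sample in window]
        mav = sum(abs(v) for v in column) / n
        rms = math.sqrt(sum(v * v for v in column) / n)
        features.extend([mav, rms])
    return features
```

For an armband with eight channels this yields a 16-value feature vector per window; a classifier would then be trained on such vectors labelled by gesture.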

Speech Recognition

Publication venue: IntechOpen
Publication date: 20/04/2021

Chapters in the first part of the book cover all the essential speech processing techniques for building robust automatic speech recognition systems: the representation of speech signals and the methods for speech-feature extraction, acoustic and language modeling, efficient algorithms for searching the hypothesis space, and multimodal approaches to speech recognition. The last part of the book is devoted to other speech processing applications that can use the information from automatic speech recognition: speaker identification and tracking, prosody modeling in emotion-detection systems, and other applications that are able to operate in real-world environments, such as mobile communication services and smart homes.

NASA Tech Briefs, August 2010

Topics covered include: Technology Focus: Mechanical Components; Electronics/Computers; Software; Materials; Mechanics/Machinery; Manufacturing; Bio-Medical; Physical Sciences; Information Sciences; and Books and Reports

A navigation and object location device for the blind

Author: Caperna Steve
Cheng Christopher
Cho Junghee
Fan Victoria
Luthra Avishkar
O'Leary Brendan
Sheng Jansen
Stearns Lee
Sun Andrew
Tessler Roni
Wong Paul
Yeh Jimmy
Publication date: 01/05/2009

Gemstone Team Vision. Team Vision's goal is to create a navigation system for the blind. To achieve this, we took a multi-pronged approach. First, through surveys, we assessed the needs of the blind community and developed a system around those needs. Then, using recent technology, we combined a global positioning system (GPS), an inertial navigation unit (INU), computer vision algorithms, and audio and haptic interfaces into one system. The GPS and INU work together to provide walking directions from building to building when outdoors, and the computer vision algorithms identify and locate objects such as signs and landmarks, both indoors and outdoors. The speech-based interface ties the GPS, INU, and computer vision algorithms together into an interactive audio-based navigation device. Finally, the haptic interface provides an alternative, intuitive directional guidance system. The resulting system guides users to specified buildings and to important objects such as cellular telephones, wallets, or even restroom or exit signs.

Emerging ExG-based NUI Inputs in Extended Realities : A Bottom-up Survey

Author: Chatzopoulos Dimitris
Hui Pan
Lee Lik-Hang
Shatilov Kirill A.
Publication date: 01/01/2021

Incremental and quantitative improvements of two-way interactions with extended realities (XR) are contributing toward a qualitative leap into a state of XR ecosystems being efficient, user-friendly, and widely adopted. However, there are multiple barriers on the way toward the omnipresence of XR, among them the computational and power limitations of portable hardware, the social acceptance of novel interaction protocols, and the usability and efficiency of interfaces. In this article, we overview and analyse novel natural user interfaces based on sensing electrical bio-signals that can be leveraged to tackle the challenges of XR input interactions. Electroencephalography-based brain-machine interfaces that enable thought-only hands-free interaction, myoelectric input methods that track body gestures employing electromyography, and gaze-tracking electrooculography input interfaces are examples of electrical bio-signal sensing technologies united under the collective concept of ExG. ExG signal acquisition modalities provide a way to interact with computing systems using natural intuitive actions, enriching interactions with XR. This survey provides a bottom-up overview starting from (i) underlying biological aspects and signal acquisition techniques, (ii) ExG hardware solutions, (iii) ExG-enabled applications, (iv) discussion on social acceptance of such applications and technologies, as well as (v) research challenges, application directions, and open problems, evidencing the benefits that ExG-based Natural User Interface inputs can introduce to the area of XR. Peer reviewed.

Helsingin yliopiston digitaalinen arkisto

Emerging ExG-based NUI Inputs in Extended Realities : A Bottom-up Survey

Author: Chatzopoulos Dimitris
Hui Pan
Lee Lik-Hang
Shatilov Kirill A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 01/07/2021

Low vision assistance with mobile devices

Author: Stump Mark
Publication venue: RIT Scholar Works
Publication date: 01/01/2011

Low vision affects many people, both young and old. Low vision conditions can range from near- and far-sightedness to conditions such as blind spots and tunnel vision. With the growing popularity of mobile devices such as smartphones, there is a large opportunity to use these multipurpose devices to provide low vision assistance. Furthermore, Google's Android operating system provides a robust environment for applications in various fields, including low vision assistance. The objective of this thesis research is to develop a system for low vision assistance that displays important information at the preferred location of the user's visual field. To that end, a first release of a prototype blind spot/tunnel vision assistance system was created and demonstrated on an Android smartphone. Various algorithms for face detection and face tracking were implemented on the Android platform, and their performance was assessed with regard to metrics such as throughput and battery usage. Specifically, Viola-Jones, Support Vector Machines, and a color-based method from Pai et al. were used for face detection. Template matching, CAMShift, and Lucas-Kanade methods were used for face tracking. It was found that face detection and tracking could be executed successfully within acceptable bounds of time and battery usage, and in some cases ran faster than a comparable cloud-based system that offloads the algorithms would take to complete execution.

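Keskusteluavustimen kehittäminen kuulovammaisia varten automaattista puheentunnistusta käyttäen (Development of a conversation assistant for the hearing impaired using automatic speech recognition)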

Author: Lukkarila Juri
Publication date: 11/12/2017

Understanding and participating in conversations has been reported as one of the biggest challenges hearing impaired people face in their daily lives. These communication problems have been shown to have wide-ranging negative consequences, affecting their quality of life and the opportunities available to them in education and employment. A conversational assistance application was investigated to alleviate these problems. The application uses automatic speech recognition technology to provide real-time speech-to-text transcriptions to the user, with the goal of helping deaf and hard of hearing persons in conversational situations. To validate the method and investigate its usefulness, a prototype application was developed for testing purposes using open-source software. A user test was designed and performed with test participants representing the target user group. The results indicate that the Conversation Assistant method is valid, meaning it can help the hearing impaired to follow and participate in conversational situations. Speech recognition accuracy, especially in noisy environments, was identified as the primary target for further development to increase the usefulness of the application. Recognition speed, by contrast, was deemed sufficient and already clearly surpasses the transcription speed of human transcribers.