    Voice Command Controller

    Signal processing technology has developed rapidly and has attracted interest from scientists and engineers around the world over the last decade. Speech synthesis and speech recognition are particular topics in the field that have been widely used and developed in many different areas such as business, control, education and entertainment. The project's main objective is to study and develop an application program with the Speech SDK through the design and implementation of a tele-control system based on two commercial products: the Carrier-Current Transceiver (LM 1893) from National Semiconductor and the Speech development kit (Speech SDK4.0) from Microsoft Corporation. The project is suitable for restricted areas where space, wiring, decoration and signal interference are concerns. The Speech SDK is an interesting and useful tool for developing voice application programs. In this project, the user can use voice commands to interact with the control program and control a remote device. In conjunction with hardware modification, extra functions can be added to the program, such as camera control, video capture and position control buttons on the environment map, making the project suitable for security purposes.

    Multimodal access to social media services

    Integrated master's thesis. Informatics and Computing Engineering. Faculdade de Engenharia, Universidade do Porto, Microsoft Language Development Center. 201

    Information Technologies for the Healthcare Delivery System

    That modern healthcare requires information technology to be efficient and fully effective is evident if one spends any time observing the delivery of institutional health care. Consider the observation of a practitioner of the discipline, David M. Eddy, MD, PhD, voiced in "Clinical Decision Making," JAMA 263:1265-75, 1990: ". . . All confirm what would be expected from common sense: The complexity of modern medicine exceeds the inherent limitations of the unaided human mind." The goal of this thesis is to identify the technological factors that are required to enable a fully sufficient application of information technology (IT) to the modern institutional practice of medicine. Perhaps the epitome of healthcare IT is the fully integrated, fully electronic patient medical record. Although in 1991 the Institute of Medicine called for such a record to be standard technology by 2001, it has still not materialized. The author will argue that some of the technology and standards that are prerequisites for this achievement have now arrived, while others are still evolving to fully sufficient levels. The paper will concentrate primarily on the health care system in the United States, although much of its content applies to a large degree around the world. The paper will illustrate certain of these prerequisite IT factors by discussing the actual installation of a major health care computer system at the University of Rochester Medical Center (URMC) in Rochester, New York. This system is a Picture Archiving and Communications System (PACS). As the name implies, PACS is a system for capturing health care images in digital format, storing them and communicating them to users throughout the enterprise.

    Chinese Text Entry with Mobile Devices

    Efficient methods for entering text are essential for using computers and modern mobile phones. About one fifth of the world's population, or over one billion people, speak some variety of Chinese as their native language. As a logosyllabic language, Chinese poses unique challenges for text entry. Many Chinese characters are complex in structure and homophonic with other characters. With keyboards and other key-based input devices, the normal approach is pinyin input, where each Chinese character is entered using its pinyin code, which consists of several letters of the Roman alphabet. Because of homophony, the intended character must then be chosen from a list of possible candidates, which makes the input process more complicated than in languages written in the Roman alphabet. Moreover, the many varieties of the language spoken in different parts of China have to be taken into account. All of these factors bring new challenges to the design and evaluation of Chinese text entry methods in computing systems. The overall objective of this dissertation is to improve the user experience of Chinese text entry on mobile phones and other mobile devices. To achieve this goal, the author explores new interaction solutions and patterns of user behavior in the Chinese text entry process using empirical studies and performance modeling. The work covers four means of Chinese text entry on mobile devices: Chinese handwriting recognition, indirect Chinese text entry with a rotator, Mandarin dictation, and Chinese pinyin input methods with a 12-key keypad. New design solutions are proposed for Chinese handwriting recognition and for pinyin input with a rotator, and empirical studies showed that users accepted them well. A Mandarin short message dictation application for mobile phones is also presented, with two associated studies on human factors. Two studies were also carried out on Chinese pinyin input methods based on the 12-key keypad. The first, a comparative study of five phrasal pinyin input methods, led to design guidelines for phrasal input (the entry of multi-character units), a technique that can speed up text entry. The second produced a predictive model of users' error-free entry speeds. Based on the conclusions of these studies, several additional research questions were identified for future work. For example, more efficient methods are needed for the target selection process, in which a character must be chosen from several candidates. Further design work and studies are also required on stroke-based methods and on soft keyboards designed specifically for Chinese.
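
    The candidate-selection problem described in this abstract can be illustrated with a minimal sketch. It is not the dissertation's method: the three-syllable lexicon is a hypothetical toy, tones are ignored, and the function names `to_digits` and `candidates` are assumptions made for illustration. The keypad letter groups follow the standard phone layout.

```python
# Letter groups of the standard 12-key phone keypad.
KEYPAD = {
    "2": "abc", "3": "def", "4": "ghi", "5": "jkl",
    "6": "mno", "7": "pqrs", "8": "tuv", "9": "wxyz",
}
LETTER_TO_DIGIT = {ch: d for d, letters in KEYPAD.items() for ch in letters}

# Hypothetical toy lexicon (tones ignored): pinyin syllable -> homophonic characters.
LEXICON = {
    "ma": ["妈", "马", "吗"],
    "na": ["那", "拿"],
    "shi": ["是", "十", "时", "事"],
}


def to_digits(pinyin: str) -> str:
    """Translate a pinyin syllable into the key sequence that types it."""
    return "".join(LETTER_TO_DIGIT[ch] for ch in pinyin)


def candidates(digits: str) -> list[tuple[str, str]]:
    """Return every (pinyin, character) pair whose pinyin matches the key sequence."""
    return [
        (syllable, char)
        for syllable, chars in LEXICON.items()
        if to_digits(syllable) == digits
        for char in chars
    ]


if __name__ == "__main__":
    # "62" is ambiguous twice over: it matches both "ma" and "na",
    # and each syllable has several homophonic characters.
    for syllable, char in candidates("62"):
        print(syllable, char)
```

    Even in this toy setting, one two-key sequence yields five candidates, which is why the target selection step dominates the design of such input methods.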

    How Machines Learn: Where Do Companies Get Data for Machine Learning and What Licenses Do They Need?

    Machine learning services ingest customer data in order to provide refined, customized services. Machine learning algorithms are increasingly prominent in multiple sectors within the software-as-a-service industry, including online advertising, health diagnostics, and travel. However, very little has been written on the rights a company utilizing machine learning needs to obtain in order to use customer data to improve its own products or services. Machine learning encompasses multiple types of data use and analysis, including (a) supervised machine learning algorithms, which take specific data provided in a tagged and classified format to deliver specific predictable output; and (b) unsupervised machine learning algorithms, where untagged data is processed in order to look for patterns and correlations without a specified output. This Article introduces the reader to the types of data use involved in various machine learning models, the level of data retention normally required for each model, and the risks of using personal information or re-identifiable data in connection with machine learning. The paper also discusses the type of license a commercial provider and consumer would need to enter into for various types of machine learning software. Finally, the paper proposes best practices for ensuring adequate rights are obtained through legal agreements so that machines may self-improve and innovate.

    Multi-modal post-editing of machine translation

    As machine translation (MT) quality continues to improve, more and more translators switch from traditional translation from scratch to post-editing (PE) of MT output, which has been shown to save time and reduce errors. Instead of mainly generating text, translators are now asked to correct errors within otherwise helpful translation proposals, where repetitive MT errors make the process tiresome, while hard-to-spot errors make PE a cognitively demanding activity. Our contribution is three-fold. First, we explore whether interaction modalities other than mouse and keyboard could support PE well by creating and testing the MMPE translation environment. MMPE allows translators to cross out or hand-write text, drag and drop words for reordering, use spoken commands or hand gestures to manipulate text, or combine any of these input modalities. Second, our interviews revealed that translators see value in automatically receiving additional translation support when a high cognitive load (CL) is detected during PE. We therefore developed a sensor framework that uses a wide range of physiological and behavioral data to estimate perceived CL, and tested it in three studies, showing that multi-modal eye, heart, and skin measures can be used to make translation environments cognition-aware. Third, we present two multi-encoder Transformer architectures for automatic post-editing (APE) and discuss how these can adapt MT output to a domain and thereby avoid the need to correct repetitive MT errors. Deutsche Forschungsgemeinschaft (DFG), project MMP

    Quantitative determinants of prefabs: A corpus-based, experimental study of multiword units in the lexicon

    In recent years many researchers have been rethinking the 'Words and Rules' model of syntax (Pinker 1999), instead arguing that language processing relies on a large number of preassembled multiword units, or 'prefabs' (Bolinger 1976). A usage-based perspective predicts that linguistic units, including prefabs, arise via repeated use, and prefabs should thus be associated with the frequency with which words co-occur (Langacker 1987). Indeed, in several recent experiments, corpus analysis is found to be associated with behavioral measures for multiword sequences (Kapatsinski and Radicke 2009, Ellis and Simpson-Vlach 2009). This dissertation supplements such findings with two new psycholinguistic investigations of prefabs. Study 1 revisits a dictation experiment by Schmitt et al. (2004), in which participants are asked to listen to stretches of speech and repeat the input verbatim, after performing a distractor task intended to encourage reliance on prefabs. I describe the results of an updated experiment which demonstrates that participants are less likely to interrupt or partially alter high-frequency multiword sequences. Although the original study by Schmitt et al. (2004) reported null findings, the revised methodology suggests that frequency indeed plays a role in the creation of prefabs. Study 2 investigates the distribution of affix positioning errors (he go aheads) which give evidence that some multiword sequences (e.g., go ahead) are retrieved from memory as a unit. As part of this study, I describe a novel methodology which elicits the errors of interest in an experimental setting. Errors evincing holistic retrieval are induced more often among multiword sequences that are high in Mutual Dependency, a corpus measure that weighs a sequence's frequency against the frequencies of its component words. Follow-up analyses indicate that sequence frequency is positively associated with affix errors, but only if component-word frequencies are included as variables in the model. In sum, the studies in this dissertation provide evidence that prefabricated, multiword units are associated with high frequency of a sequence, in addition to statistical measures that take component words' frequency into account. These findings provide further support for a usage-based model of the lexicon, in which linguistic units are both gradient and changeable with experience.
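
    The abstract does not give a formula for Mutual Dependency. The sketch below assumes the commonly cited definition, MD = log2(P(w1 w2)^2 / (P(w1) P(w2))), estimated from raw corpus counts; the dissertation's exact formulation and estimation procedure may differ, and the function name and toy corpus are illustrative only.

```python
import math
from collections import Counter


def mutual_dependency(tokens: list[str], w1: str, w2: str) -> float:
    """Assumed definition: log2(P(w1 w2)^2 / (P(w1) * P(w2))).

    Probabilities are maximum-likelihood estimates from the token list.
    Returns -inf when the bigram never occurs.
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_unigrams = len(tokens)
    n_bigrams = max(len(tokens) - 1, 1)

    p_xy = bigrams[(w1, w2)] / n_bigrams
    if p_xy == 0:
        return float("-inf")
    p_x = unigrams[w1] / n_unigrams
    p_y = unigrams[w2] / n_unigrams
    return math.log2(p_xy ** 2 / (p_x * p_y))


if __name__ == "__main__":
    corpus = "go ahead and go ahead then go home".split()
    # The sequence frequency is squared in the numerator, so a frequent
    # bigram scores higher than a rare one with similar component words.
    print(mutual_dependency(corpus, "go", "ahead"))
    print(mutual_dependency(corpus, "go", "home"))
```

    Under this definition the score rises with the frequency of the sequence itself and falls with the frequencies of its component words, which matches the abstract's description of the measure.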