10 research outputs found

    Segmentierung medizinischer Bilddaten und bildgestützte intraoperative Navigation

    The development of algorithms for the automatic or semi-automatic processing of medical image data has gained increasing importance in recent years. One reason is the steady improvement of medical imaging modalities, which can capture the human body virtually in ever finer detail. Another is improved computer hardware, which allows data volumes that sometimes reach the gigabyte range to be processed algorithmically in a reasonable amount of time. The aim of this habilitation thesis is the development and evaluation of algorithms for medical image processing. The thesis consists of a series of publications grouped into three overarching topic areas:
    - Segmentation of medical image data using template-based algorithms
    - Experimental evaluation of open-source segmentation methods under clinical conditions
    - Navigation to support intraoperative therapies
    In the area of segmentation of medical image data using template-based algorithms, several graph-based algorithms in 2D and 3D were developed that construct a directed graph by means of a template. These include an algorithm for segmenting vertebrae in 2D and 3D. A rectangular template is used in 2D and a cube-shaped template in 3D to construct the graph and compute the segmentation result. In addition, a graph-based segmentation of the prostate gland using a spherical template is presented, which automatically determines the boundaries between the prostate gland and the surrounding organs. Building on the template-based algorithms, an interactive segmentation algorithm that shows the user the segmentation result in real time was designed and implemented. The algorithm uses the various templates for segmentation but requires only a single seed point from the user. In a further approach, the user can refine the segmentation interactively with additional seed points. This makes it possible to bring a semi-automatic segmentation to a satisfactory result even in difficult cases.
    In the area of evaluation of open-source segmentation methods under clinical conditions, several freely available segmentation algorithms were tested on patient data from clinical routine. This included evaluating the semi-automatic segmentation of brain tumors, for example pituitary adenomas and glioblastomas, with the freely available open-source platform 3D Slicer. The evaluation showed how a purely manual slice-by-slice measurement of tumor volume can be supported and accelerated in practice. Furthermore, the segmentation of language pathways in medical images of brain tumor patients was evaluated on different platforms.
    In the area of navigation to support intraoperative therapies, software modules were developed to accompany intraoperative interventions in different phases of a treatment (therapy planning, execution, follow-up). This includes the first integration of the OpenIGTLink network protocol into the medical prototyping platform MeVisLab, which was evaluated with an NDI navigation system. In addition, the design and implementation of a medical software prototype to support intraoperative gynecological brachytherapy was presented, likewise for the first time. The software prototype also contained a module for enhanced visualization in MR-guided interstitial gynecological brachytherapy, which among other things made it possible to register a gynecological brachytherapy instrument to a patient's intraoperative dataset. The individual modules led to the presentation of a comprehensive image-guided system for gynecological brachytherapy in a multimodal operating room. This system covers the pre-, intra- and postoperative treatment phases of an interstitial gynecological brachytherapy

    Human perception-oriented segmentation for triangle meshes

    Mesh segmentation is an important topic in computer graphics, in particular in geometric computing. This is because mesh segmentation techniques find many applications in movies, computer animation, virtual reality, mesh compression, and games. In fact, triangle meshes are widely used in interactive applications, so their segmentation into meaningful parts (also called human-perceptual segmentation, perceptive segmentation, or meaningful segmentation) is often seen as a way of speeding up user interaction, detecting collisions between these mesh-covered objects in a 3D scene, and animating one or more meaningful parts (e.g., the head of a humanoid) independently of the other parts of a given object. However, no known technique is capable of correctly segmenting any mesh into meaningful parts. Some techniques are better suited to non-freeform objects (e.g., quadric mechanical parts), while others perform better in the domain of freeform objects. Only recently have some techniques been developed for the entire universe of objects and shapes. Worse still, most segmentation techniques are not entirely automated, in the sense that almost all of them require some sort of prerequisites and user assistance. In summary, these three challenges, related to perceptual proximity, generality, and automation, are at the core of the work described in this thesis.
    To address these challenges, we have developed the first contour-based mesh segmentation algorithm to be found in the literature, which is inspired by the edge-based segmentation techniques used in image analysis, as opposed to region-based segmentation techniques. Its leading idea is to first find the contour of each region, and then to identify and collect all of its inner triangles. The mesh regions found in this way correspond to ups and downs, which need not be strictly convex or strictly concave, respectively. These regions, called relaxedly convex regions (or saliences) and relaxedly concave regions (or recesses), produce segmentations that are less sensitive to noise and, at the same time, more intuitive from the human point of view; hence the name human perception-oriented (HPO) segmentation. Moreover, unlike the current state of the art in mesh segmentation, the existence of these relaxed regions makes the algorithm suited to both non-freeform and freeform objects.
    In this thesis, we have also tackled a fourth challenge, which concerns the fusion of mesh segmentation and multiresolution. A plethora of segmentation techniques, as well as a number of multiresolution techniques, already exist in the literature for triangle meshes. However, it is not so common to find algorithms and data structures that fuse these two concepts, multiresolution and segmentation, into a symbiotic multiresolution scheme for both plain and segmented meshes, where a plain mesh is understood as a mesh with a single segment. We therefore introduce such a novel multiresolution segmentation scheme, called the extended Ghost Cell (xGC) scheme. This scheme preserves the shape of the meshes in both global and local terms, i.e., mesh segments and their boundaries, as well as creases and apices, are preserved regardless of the level of resolution used for simplification or refinement of the mesh. Moreover, unlike other segmentation schemes, it allows adjacent segments to differ by two or more resolution levels. This is particularly useful in computer animation, mesh compression and transmission, geometric computing, scientific visualization, and computer graphics. In short, this thesis presents a fully automatic, general, and human perception-oriented scheme that symbiotically integrates the concepts of mesh segmentation and multiresolution
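
    The second step of the contour-based idea described in this abstract (collecting the inner triangles once the region contours are known) can be illustrated with a small flood fill over triangle adjacency. This is only a sketch under stated assumptions, not the thesis's algorithm: the contour edges are assumed to be given as input, and the names `edge_key` and `regions_from_contours` are hypothetical.

```python
from collections import defaultdict, deque


def edge_key(a, b):
    """Return an undirected edge as a sorted vertex-index pair."""
    return (a, b) if a < b else (b, a)


def regions_from_contours(triangles, contour_edges):
    """Group triangle indices into regions separated by contour edges.

    Assumption: contour detection has already happened elsewhere; this
    function only collects the triangles enclosed by the given contours.
    """
    contours = {edge_key(a, b) for a, b in contour_edges}

    # Map every undirected edge to the triangles that share it.
    edge_to_tris = defaultdict(list)
    for t, (a, b, c) in enumerate(triangles):
        for e in (edge_key(a, b), edge_key(b, c), edge_key(c, a)):
            edge_to_tris[e].append(t)

    label = [-1] * len(triangles)
    regions = []
    for seed in range(len(triangles)):
        if label[seed] != -1:
            continue
        label[seed] = len(regions)
        queue = deque([seed])
        members = []
        while queue:
            t = queue.popleft()
            members.append(t)
            a, b, c = triangles[t]
            for e in (edge_key(a, b), edge_key(b, c), edge_key(c, a)):
                if e in contours:
                    continue  # never grow a region across a contour edge
                for n in edge_to_tris[e]:
                    if label[n] == -1:
                        label[n] = label[seed]
                        queue.append(n)
        regions.append(sorted(members))
    return regions


# A strip of four triangles; the contour edge (2, 3) cuts it in the middle.
triangles = [(0, 1, 2), (1, 3, 2), (2, 3, 4), (3, 5, 4)]
regions = regions_from_contours(triangles, contour_edges=[(2, 3)])
```

    With the contour edge in place the strip splits into two regions of two triangles each; without any contour edges the whole strip forms a single region.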

    Calculating Sparse and Dense Correspondences for Near-Isometric Shapes

    Comparing and analysing digital models are basic techniques of geometric shape processing. These techniques have a variety of applications, such as extracting the domain knowledge contained in the growing number of digital models to simplify shape modelling. Another example application is the analysis of real-world objects, which itself has a variety of applications, such as medical examinations, medical and agricultural research, and infrastructure maintenance. As methods to digitize physical objects mature, any advances in the analysis of digital shapes lead to progress in the analysis of real-world objects. Global shape properties, like volume and surface area, are simple to compare but contain only very limited information. Much more information is contained in local shape differences, such as where and how a plant grew. Unfortunately, computing local shape differences is hard because it requires knowledge of corresponding point pairs, i.e. points on both shapes that correspond to each other. This article-based thesis (cumulative dissertation) discusses several recent publications on the computation of corresponding points:
    - Geodesic distances between points, i.e. distances along the surface, are fundamental for several shape processing tasks as well as several shape matching techniques. Chapter 3 introduces and analyses fast and accurate bounds on geodesic distances.
    - When building a shape space on a set of shapes, misaligned correspondences lead to points moving along the surfaces and finally to a larger shape space. Chapter 4 shows that this also works the other way around, that is, good correspondences are obtained by optimizing them to generate a compact shape space.
    - Representing correspondences with a “functional map” has a variety of advantages. Chapter 5 shows that representing the correspondence map as an alignment of Green’s functions of the Laplace operator has similar advantages, but is much less dependent on the number of eigenvectors used for the computations.
    - Quadratic assignment problems were recently shown to reliably yield sparse correspondences. Chapter 6 compares state-of-the-art convex relaxations from graphics and vision with methods from discrete optimization on typical quadratic assignment problems emerging in shape matching
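
    To make the notion of bounds on geodesic distances concrete, the sketch below computes the two most elementary ones for a mesh: the straight-line (Euclidean) distance is a lower bound, and the shortest path along mesh edges (Dijkstra) is an upper bound, because every edge path lies on the surface. These are textbook bounds chosen for illustration, not the bounds introduced in the thesis; the function name `geodesic_bounds` and the example mesh are hypothetical.

```python
import heapq
import math


def geodesic_bounds(vertices, edges, source, target):
    """Return (lower, upper) bounds on the geodesic distance source -> target.

    lower: Euclidean distance between the two vertices.
    upper: length of the shortest path along mesh edges (Dijkstra);
           math.inf if the target is unreachable.
    """
    lower = math.dist(vertices[source], vertices[target])

    adjacency = {i: [] for i in range(len(vertices))}
    for a, b in edges:
        w = math.dist(vertices[a], vertices[b])
        adjacency[a].append((b, w))
        adjacency[b].append((a, w))

    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, v = heapq.heappop(heap)
        if v == target:
            return lower, d
        if d > dist.get(v, math.inf):
            continue  # stale heap entry
        for n, w in adjacency[v]:
            nd = d + w
            if nd < dist.get(n, math.inf):
                dist[n] = nd
                heapq.heappush(heap, (nd, n))
    return lower, math.inf


# Flat unit square split along the diagonal 1-3; query the other diagonal 0-2.
vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
edges = [(0, 1), (1, 2), (2, 3), (3, 0), (1, 3)]
lower, upper = geodesic_bounds(vertices, edges, source=0, target=2)
```

    On this flat example the true geodesic distance equals the lower bound, while the edge path has to go around two sides of the square, which shows how loose the edge-path bound can be on coarse meshes.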

    Advanced Technologies for the Optimization of Internal Combustion Engines

    This Special Issue brings together recent findings on advanced technologies for the optimization of internal combustion engines, to help the scientific community direct its efforts towards the development of higher-power engines with lower fuel consumption and pollutant emissions

    Pattern search for the visualization of scalar, vector, and line fields

    The main topic of this thesis is pattern search in data sets for the purpose of visual data analysis. Given a reference pattern, pattern search aims to discover similar occurrences in a data set with invariance to translation, rotation, and scaling. To address this problem, we developed algorithms dealing with different types of data: scalar fields, vector fields, and line fields. For scalar fields, we use the SIFT algorithm (Scale-Invariant Feature Transform) to find a sparse sampling of prominent features in the data with invariance to translation, rotation, and scaling. The user can then define a pattern as a set of SIFT features, e.g. by brushing a region of interest. Finally, we locate and rank matching patterns in the entire data set. Due to the sparsity and accuracy of SIFT features, we achieve fast and memory-saving pattern queries in large-scale scalar fields. For vector fields, we propose a hashing strategy in scale space to accelerate the convolution-based pattern query. We encode the local flow behavior in scale space using a sequence of hierarchical base descriptors, which are pre-computed and hashed into a number of hash tables. This ensures fast fetching of similar occurrences in the flow and requires only a constant number of table lookups. For line fields, we present a streamline segmentation algorithm that splits long streamlines into globally consistent segments, which provides similar segmentations for similar flow structures. This makes it possible to isolate a pattern from long and dense streamlines, so that our patterns can be defined sparsely and have a significant extent, i.e., they are integration-based and not local. This allows for greater flexibility in defining features of interest. For user-defined patterns of curve segments, our algorithm finds similar ones that are invariant to similarity transformations. Additionally, we present a method for shape recovery from multiple views. This semi-automatic method fits a template mesh to high-resolution normal data. In contrast to existing 3D reconstruction approaches, we shorten the data acquisition time by omitting the structured-light scanning step that obtains low-frequency 3D information
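
    What invariance to similarity transformations means for curve segments can be illustrated with a small 2D example: two polylines are compared after removing translation (centering), scale (unit norm), and rotation (closed-form optimal alignment using complex numbers). This is a generic Procrustes-style sketch, not the matching method of the thesis; it assumes both curves are sampled with the same number of points in corresponding order, and the names `normalize` and `similarity_distance` are hypothetical.

```python
import math


def normalize(points):
    """Center a 2D polyline at its centroid and scale it to unit norm."""
    z = [complex(x, y) for x, y in points]
    centroid = sum(z) / len(z)
    z = [p - centroid for p in z]
    norm = math.sqrt(sum(abs(p) ** 2 for p in z))
    if norm == 0.0:
        raise ValueError("degenerate curve: all points coincide")
    return [p / norm for p in z]


def similarity_distance(curve_a, curve_b):
    """Distance between two curves modulo translation, scale, and rotation.

    For normalized curves a and b, the squared residual after the best
    rotation is 2 - 2 * |sum(conj(a_k) * b_k)|.
    """
    if len(curve_a) != len(curve_b):
        raise ValueError("curves must have the same number of samples")
    a, b = normalize(curve_a), normalize(curve_b)
    corr = sum(p.conjugate() * q for p, q in zip(a, b))
    return math.sqrt(max(0.0, 2.0 - 2.0 * abs(corr)))


def transform(points, angle, scale, shift):
    """Apply a rotation, uniform scaling, and translation to a polyline."""
    c, s = math.cos(angle), math.sin(angle)
    return [(scale * (c * x - s * y) + shift[0],
             scale * (s * x + c * y) + shift[1]) for x, y in points]


l_shape = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
moved = transform(l_shape, angle=0.7, scale=3.0, shift=(5.0, -2.0))
line = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]

d_same = similarity_distance(l_shape, moved)   # close to 0
d_other = similarity_distance(l_shape, line)   # clearly larger than 0
```

    The transformed copy of the L-shaped curve yields a distance near zero, while the straight line does not, so a threshold on this distance could serve as a simple match criterion in such a toy setting.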

    A Silent-Speech Interface using Electro-Optical Stomatography

    Speech technology is a major and growing industry that enriches the lives of technologically minded people in a number of ways. Many potential users are, however, excluded: namely, all speakers who can produce speech only with difficulty or not at all. Silent-Speech Interfaces offer a way to communicate with a machine through a convenient speech-recognition interface without the need for acoustic speech. They can also potentially provide a full replacement voice by synthesizing the intended utterances that are only silently articulated by the user. To that end, the speech movements need to be captured and mapped to either text or acoustic speech. This dissertation proposes a new Silent-Speech Interface based on a newly developed measurement technology called Electro-Optical Stomatography and a novel parametric vocal tract model that facilitates real-time speech synthesis based on the measured data. The hardware was used to conduct command word recognition studies reaching state-of-the-art intra- and inter-individual performance. Furthermore, a study on using the hardware to control the vocal tract model in a direct articulation-to-speech synthesis loop was also completed. While the intelligibility of synthesized vowels was high, the intelligibility of consonants and connected speech was quite poor. Promising ways to improve the system are discussed in the outlook.

    Contents:
    Statement of authorship
    Abstract
    List of Figures
    List of Tables
    Acronyms
    1. Introduction
        1.1. The concept of a Silent-Speech Interface
        1.2. Structure of this work
    2. Fundamentals of phonetics
        2.1. Components of the human speech production system
        2.2. Vowel sounds
        2.3. Consonantal sounds
        2.4. Acoustic properties of speech sounds
        2.5. Coarticulation
        2.6. Phonotactics
        2.7. Summary and implications for the design of a Silent-Speech Interface (SSI)
    3. Articulatory data acquisition techniques in Silent-Speech Interfaces
        3.1. Introduction
        3.2. Scope of the literature review
        3.3. Video Recordings
        3.4. Ultrasonography
        3.5. Electromyography
        3.6. Permanent-Magnetic Articulography
        3.7. Electromagnetic Articulography
        3.8. Radio waves
        3.9. Palatography
        3.10. Conclusion and Discussion
    4. Electro-Optical Stomatography
        4.1. Contact sensors
        4.2. Optical distance sensors
        4.3. Lip sensor
        4.4. Sensor Unit
        4.5. Control Unit
        4.6. Software
    5. Articulation-to-Text
        5.1. Introduction
        5.2. Command word recognition pilot study
        5.3. Command word recognition small-scale study
    6. Articulation-to-Speech
        6.1. Introduction
        6.2. Articulatory synthesis
        6.3. The six point vocal tract model
        6.4. Objective evaluation of the vocal tract model
        6.5. Perceptual evaluation of the vocal tract model
        6.6. Direct synthesis using EOS to control the vocal tract model
        6.7. Pitch and voicing
    7. Summary and outlook
        7.1. Summary of the contributions
        7.2. Outlook
    A. Overview of the International Phonetic Alphabet
    B. Mathematical proofs and derivations
        B.1. Combinatoric calculations illustrating the reduction of possible syllables using phonotactics
        B.2. Signal Averaging
        B.3. Effect of the contact sensor area on the conductance
        B.4. Calculation of the forward current for the OP280V diode
    C. Schematics and layouts
        C.1. Schematics of the control unit
        C.2. Layout of the control unit
        C.3. Bill of materials of the control unit
        C.4. Schematics of the sensor unit
        C.5. Layout of the sensor unit
        C.6. Bill of materials of the sensor unit
    D. Sensor unit assembly
    E. Firmware flow and data protocol
    F. Palate file format
    G. Supplemental material regarding the vocal tract model
    H. Articulation-to-Speech: Optimal hyperparameters
    Bibliography

    Représentations cartographiques intermédiaires : comment covisualiser une carte et une orthophotographie pour naviguer entre abstraction et réalisme ?

    Topographic maps and orthoimages are the two representations of a territory most often presented together to users through interactive tools such as magnifiers, sliders, swipes or linked views. They offer complementary views: the topographic map is the archetype of abstraction, a product of the abstraction steps used in map design, whereas the orthoimage is perceived as photorealistic. To covisualize the two efficiently, we advise against searching for a single ideal graphic mix and instead propose a cartographic continuum of in-between representations that mix topographic data and orthoimagery. Our objective is to provide interactive tools that let the user choose an intermediate position within the continuum by controlling the levels of realism and abstraction. Our approach rests on three principles: first, local adaptation of the symbolisation of vector data to preserve the readability of each in-between representation; second, graphic transitions that establish continuity between these representations; and third, control over the realism level, with symbolisations synchronized so that the hybrid visualisations remain visually consistent. We propose a design method that combines elementary symbolisation building blocks. The first type interpolates SLD symbolisation parameters such as color, opacity or texture (procedural, natural or mixed) between two given symbolisations. The second analyses the graphic context of the objects to be highlighted in order to define a suitable, readable local symbolisation. These blocks are combined for each theme and coordinated across themes. For these design steps, we provide parameter guidelines based on the results of our user test, which estimated the realism and abstraction levels of cartographic symbolisations. Finally, we implement the method in the GeOxygene research platform as a prototype tool for browsing the in-between representations from abstraction to realism through an interactive slider.
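
The interpolation of symbolisation parameters mentioned in this abstract can be illustrated with a minimal sketch. It is not the thesis's method or the GeOxygene API: the `Symbolisation` class, its two fields and the `interpolate` function are hypothetical names, and plain linear interpolation in RGB is an assumption (the work may interpolate in another colour space or along a non-linear curve).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Symbolisation:
    """Hypothetical holder for two SLD-like parameters (not the SLD schema)."""
    fill_rgb: Tuple[int, int, int]  # fill colour, each channel in 0..255
    opacity: float                  # 0.0 = transparent, 1.0 = opaque


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def interpolate(start: Symbolisation, end: Symbolisation, t: float) -> Symbolisation:
    """Return the in-between symbolisation at slider position t.

    t = 0 gives `start` (e.g. the abstract map style) and t = 1 gives `end`
    (e.g. a style close to the orthoimage). Linear RGB blending is assumed.
    """
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    rgb = tuple(round(lerp(a, b, t)) for a, b in zip(start.fill_rgb, end.fill_rgb))
    return Symbolisation(rgb, lerp(start.opacity, end.opacity, t))


# Example: halfway between a transparent black fill and an opaque brown fill.
map_style = Symbolisation((0, 0, 0), 0.0)
photo_style = Symbolisation((200, 100, 50), 1.0)
halfway = interpolate(map_style, photo_style, 0.5)
```

A single slider value `t` shared by all themes is one simple way to keep the symbolisations synchronized along the continuum; per-theme curves would be an equally plausible design.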

    Proceedings of the ECCOMAS Thematic Conference on Multibody Dynamics 2015

    This volume contains the full papers accepted for presentation at the ECCOMAS Thematic Conference on Multibody Dynamics 2015, held in the Barcelona School of Industrial Engineering, Universitat Politècnica de Catalunya, on June 29 - July 2, 2015. The ECCOMAS Thematic Conference on Multibody Dynamics is an international meeting held once every two years in a European country. Continuing the very successful series of past conferences organized in Lisbon (2003), Madrid (2005), Milan (2007), Warsaw (2009), Brussels (2011) and Zagreb (2013), this edition will once again serve as a meeting point for international researchers, scientists and experts from academia, research laboratories and industry working in the area of multibody dynamics. Applications are related to many fields of contemporary engineering, such as vehicle and railway systems, aeronautical and space vehicles, robotic manipulators, mechatronic and autonomous systems, smart structures, biomechanical systems and nanotechnologies. The topics of the conference include, but are not restricted to: ● Formulations and Numerical Methods ● Efficient Methods and Real-Time Applications ● Flexible Multibody Dynamics ● Contact Dynamics and Constraints ● Multiphysics and Coupled Problems ● Control and Optimization ● Software Development and Computer Technology ● Aerospace and Maritime Applications ● Biomechanics ● Railroad Vehicle Dynamics ● Road Vehicle Dynamics ● Robotics ● Benchmark Problems. Postprint (published version)

    Low Latency Rendering with Dataflow Architectures

    The research presented in this thesis concerns latency in VR and synthetic environments. Latency is the end-to-end delay experienced by the user of an interactive computer system, between their physical actions and the perceived response to these actions. Latency is a product of the various processing, transport and buffering delays present in any current computer system. For many computer-mediated applications, latency can be distracting, but it is not critical to the utility of the application. Synthetic environments, on the other hand, attempt to facilitate direct interaction with a digitised world. Direct interaction here implies the formation of a sensorimotor loop between the user and the digitised world - that is, the user makes predictions about how their actions affect the world, and sees these predictions realised. By facilitating the formation of this loop, the synthetic environment allows users to directly sense the digitised world, rather than the interface, and induces perceptions, such as that of the digital world existing as a distinct physical place. This has many applications for knowledge transfer and efficient interaction through the use of enhanced communication cues. The complication is that the formation of the sensorimotor loop that underpins this is highly dependent on the fidelity of the virtual stimuli, including latency. The main research questions we ask are how the characteristics of dataflow computing can be leveraged to improve the temporal fidelity of the visual stimuli, and what implications this has for other aspects of the fidelity. Secondarily, we ask what effects latency itself has on user interaction. We test the effects of latency on physical interaction at levels previously hypothesized but unexplored. We also test for a previously unconsidered effect of latency on higher-level cognitive functions. 
To do this, we create prototype image generators for interactive systems and virtual reality, using dataflow computing platforms. We integrate these into real interactive systems to gain practical experience of the real perceptible benefits of alternative rendering approaches, and of the implications when they are subject to the constraints of real systems. We quantify the differences between our systems and traditional systems using latency and objective image fidelity measures. We use our novel systems to perform user studies into the effects of latency. Our high-performance apparatuses allow experimentation at latencies lower than previously tested in comparable studies. The low-latency apparatuses are designed to minimise what is currently the largest delay in traditional rendering pipelines, and we find that the approach is successful in this respect. Our 3D low-latency apparatus achieves lower latencies and higher fidelities than traditional systems. The conditions under which it can do this are highly constrained, however. We do not foresee dataflow computing shouldering the bulk of the rendering workload in the future, but rather facilitating the augmentation of the traditional pipeline with a very high-speed local loop. This may be an image distortion stage or otherwise. Our latency experiments revealed that many predictions about the effects of low latency should be re-evaluated and that experimenting in this range requires great care.

    Multibody dynamics 2015

    This volume contains the full papers accepted for presentation at the ECCOMAS Thematic Conference on Multibody Dynamics 2015, held in the Barcelona School of Industrial Engineering, Universitat Politècnica de Catalunya, on June 29 - July 2, 2015. The ECCOMAS Thematic Conference on Multibody Dynamics is an international meeting held once every two years in a European country. Continuing the very successful series of past conferences organized in Lisbon (2003), Madrid (2005), Milan (2007), Warsaw (2009), Brussels (2011) and Zagreb (2013), this edition will once again serve as a meeting point for international researchers, scientists and experts from academia, research laboratories and industry working in the area of multibody dynamics. Applications are related to many fields of contemporary engineering, such as vehicle and railway systems, aeronautical and space vehicles, robotic manipulators, mechatronic and autonomous systems, smart structures, biomechanical systems and nanotechnologies. The topics of the conference include, but are not restricted to: Formulations and Numerical Methods, Efficient Methods and Real-Time Applications, Flexible Multibody Dynamics, Contact Dynamics and Constraints, Multiphysics and Coupled Problems, Control and Optimization, Software Development and Computer Technology, Aerospace and Maritime Applications, Biomechanics, Railroad Vehicle Dynamics, Road Vehicle Dynamics, Robotics, Benchmark Problems. The conference is organized by the Department of Mechanical Engineering of the Universitat Politècnica de Catalunya (UPC) in Barcelona. 
The organizers would like to thank the authors for submitting their contributions, the keynote lecturers for accepting the invitation and for the quality of their talks, the awards and scientific committees for their support in organizing the conference, and finally the topic organizers for reviewing all extended abstracts and selecting the award nominees. Postprint (published version)