10 research outputs found

    HiFECap: Monocular High-Fidelity and Expressive Capture of Human Performances

    Monocular 3D human performance capture is indispensable for many applications in computer graphics and vision for enabling immersive experiences. However, detailed capture of humans requires tracking of multiple aspects, including the skeletal pose, the dynamic surface, which includes clothing, hand gestures as well as facial expressions. No existing monocular method allows joint tracking of all these components. To this end, we propose HiFECap, a new neural human performance capture approach, which simultaneously captures human pose, clothing, facial expression, and hands just from a single RGB video. We demonstrate that our proposed network architecture, the carefully designed training strategy, and the tight integration of parametric face and hand models to a template mesh enable the capture of all these individual aspects. Importantly, our method also captures high-frequency details, such as deforming wrinkles on the clothes, better than the previous works. Furthermore, we show that HiFECap outperforms the state-of-the-art human performance capture approaches qualitatively and quantitatively while for the first time capturing all aspects of the human.

    3D Graphics Kit -- Zrahpics

    Applied Advanced Error Control Coding for General Purpose Representation and Association Machine Systems

    The General-Purpose Representation and Association Machine (GPRAM) is proposed as a system that focuses on variation and flexibility in computation rather than on precision and speed. A GPRAM system has a vague representation and no predefined tasks. Drawing on lessons from error control coding, neuroscience, and the human visual system, we investigate several types of error control codes, including Hamming codes and Low-Density Parity-Check (LDPC) codes, and extend them in different directions. In conventional error control codes, only the XOR logic gate is used to connect different nodes. Inspired by bio-systems and Turbo codes, we propose and study non-linear codes with expanded operations, such as codes that include AND and OR gates, which raise the problem of mismatched prior probabilities. The discussion of the critical challenges in designing codes and iterative decoding for non-equiprobable symbols may pave the way for a more comprehensive understanding of bio-signal processing. The limitation of the XOR operation in iterative decoding with non-equiprobable symbols is described; it can potentially be resolved by applying a quasi-XOR operation and an intermediate transformation layer. Codes constructed for non-equiprobable symbols with the former approach do not achieve satisfactory error correction capability. Probabilistic messages for the sum-product algorithm using XOR, AND, and OR operations with non-equiprobable symbols are also computed. The primary motivation for constructing these codes is to establish the GPRAM system rather than to conduct error control coding per se. The GPRAM system is fundamentally developed by applying various operations with a substantial over-complete basis. The system is capable of continuously achieving better and simpler approximations for complex tasks. Approaches to decoding LDPC codes with non-equiprobable binary symbols are discussed because of the aforementioned prior-probability mismatch. The traditional Tanner graph must be modified because the messages passed from check nodes to information bits differ from those passed to parity-check bits. In other words, the messages passed along the two directions are identical in a conventional Tanner graph, whereas in our case the messages along the forward and backward directions differ. A method for optimizing the signal constellation is described that maximizes the channel mutual information. A simple Image Processing Unit (IPU) structure, which takes images as input, is proposed for the GPRAM system. The IPU consists of a randomly constructed LDPC code, an iterative decoder, a switch, and a scaling and decision device. The quality of the input images is severely degraded to mimic the visual information variability (VIV) experienced in human visual systems. The IPU is capable of (a) reliably recognizing digits from images of extremely poor quality; (b) achieving hyper-acuity performance similar to that of the human visual system; and (c) significantly improving the recognition rate by applying a randomly constructed LDPC code that is not specifically optimized for the task.
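
The abstract's remark that conventional error control codes connect nodes with XOR alone can be illustrated with a textbook Hamming(7,4) code. The sketch below is not taken from the thesis: the function names `hamming74_encode` and `hamming74_decode` are hypothetical, equiprobable bits are assumed, and none of the AND/OR or quasi-XOR extensions described above are modelled.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword using XOR parity checks only."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit and return the 4 data bits."""
    c = list(codeword)
    # Each syndrome bit is the XOR of the positions covered by one parity check.
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # Read as a binary number, the syndrome is the 1-based error position (0 = no error).
    position = s1 + 2 * s2 + 4 * s4
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]
```

Flipping any single bit of `hamming74_encode([1, 0, 1, 1])` and passing the result to `hamming74_decode` should recover the original four bits, since the syndrome points at the flipped position.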

    Eastern Progress - 02 Feb 1984

    Projection-based Spatial Augmented Reality for Interactive Visual Guidance in Surgery

    Ph.D. (Doctor of Philosophy)

    Multi-Scale Integral Invariants for Robust Character Extraction from Irregular Polygon Mesh Data

    Hundreds of thousands of ancient documents in cuneiform script are held in museums, and more are found every day in archaeological excavations. Evaluating these documents is essential for understanding the origins of culture, legislation, and religion. Cuneiform is a handwritten script that was used throughout the ancient Near East in the millennia before the birth of Christ. Its name derives from the wedge-shaped impressions made by a stylus in the soft writing material, clay. Producing hand drawings and transcriptions of these clay tablets is a tedious task that calls for support from automated, computer-based methods. The goal of this work is the precise extraction of characters with variable shapes in 3D. The decisive steps for feature extraction from 2D manifolds in 3D are edge detection and segmentation. Robust techniques in signal processing and shape matching use integral invariants in 2D for this purpose. In recent work, integral invariants are roughly estimated in order to find a few salient features with which broken 3D objects can be reassembled. With the aim of determining the 3D shapes of characters exactly, the processing pipeline known from image processing and pattern recognition was adapted to 3D models. These models consist of millions of measurement points acquired with optical 3D scanners. The points approximate manifolds by an irregular triangle mesh. Different types of integral invariant filters at multiple scales lead to different high-dimensional feature spaces. Convolutions and combined metrics are applied to the feature spaces to determine connected components. These components represent the characters more precisely than the measurement resolution. In parallel with the design of the algorithms, the properties of the different integral invariants are analyzed. The interpretation of the filter results is of great use for determining robust curvature measures and for segmentation. The extraction of cuneiform characters is completed by a Voronoi-based computation of minimal, normalizable vector representations. This representation is an important foundation for paleography. Further abstraction and normalization of the representation lead to character recognition. Embedding the algorithms in the newly designed, multi-layered GigaMesh software framework permits a wide variety of applications. The algorithms use memory efficiently, and the processing pipeline is parallelized. The configurable pipeline has only one relevant parameter, namely the maximum size of the expected features. The presented methods were tested on hundreds of cuneiform tablets as well as on other real and synthetic objects. Representative results and estimates of the algorithms' computational cost and accuracy are shown. An outlook on future extensions and on integral invariants in higher dimensions is given.

    Hybrid Visual-Inertial/Magnetic 3D Pose Estimation for Tracking Poorly-Textured/Textureless Symmetrical Objects

    The focus of this research is to develop a visual 3D pose estimation method that can be used for many purposes, including but not limited to an autonomous visual inspection support system. The work overcomes a fundamental problem of region-based pose estimation when tracking poorly textured or textureless symmetrical objects: many different poses produce the same projected shape. The work improves the existing state-of-the-art region-based method, Pixel-Wise Posterior 3D Pose estimation (PWP3D), by incorporating an inertial/magnetic orientation estimate. For this purpose, an inertial/magnetic orientation estimator formulated as a full optimisation problem is first proposed. The proposed method, referred to as NAG-AHRS, aims to deal better with non-Gaussian noise and the non-linear model. NAG-AHRS is then analysed by comparing its output with that of a motion capture system and benchmarked against five state-of-the-art inertial/magnetic orientation estimators. The experiments show that NAG-AHRS outperformed the other benchmarked algorithms. Furthermore, NAG-AHRS facilitates integration with visual-only pose estimation to develop a hybrid visual-inertial/magnetic pose estimation. In contrast with common visual-inertial integration methods, which are dominated by the Kalman filtering framework, the proposed method integrates visual and inertial/magnetic measurements as a single optimisation problem. The selected optimisation method is Nesterov's Accelerated Gradient (NAG) descent, hence the proposed method is referred to as PWP3Di-NAG. The PWP3Di-NAG algorithm is then validated by comparing its output with the reference pose provided by an Aruco marker; at the same time, it is benchmarked against the original PWP3D algorithm. The validation demonstrated some significant performance improvements. Moreover, integrating visual and inertial measurements as a single optimisation problem requires transforming the inertial/magnetic measurements into the object reference frame. This transformation requires an initialisation stage that accurately estimates the initial pose of the object. A novel framework for this purpose, which combines region-based and edge-based pose estimation in a particle filtering framework, is also proposed. The validation shows that the proposed framework is able to estimate the pose of an object with low pose estimation errors.
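
For readers unfamiliar with the optimiser named in this abstract, the following is a minimal sketch of Nesterov's Accelerated Gradient on a toy two-variable quadratic. It is not the PWP3Di-NAG or NAG-AHRS cost function, which the abstract does not specify; the objective, the function names `toy_gradient` and `nag_minimize`, and the step size and momentum values are all assumptions chosen for illustration.

```python
def toy_gradient(x, y):
    """Gradient of the assumed toy objective f(x, y) = (x - 3)^2 + 10 * (y + 1)^2."""
    return 2.0 * (x - 3.0), 20.0 * (y + 1.0)


def nag_minimize(grad_fn, start, lr=0.04, momentum=0.9, steps=300):
    """Run Nesterov's Accelerated Gradient descent from `start` for `steps` iterations."""
    x, y = start
    vx, vy = 0.0, 0.0
    for _ in range(steps):
        # Evaluate the gradient at the look-ahead point, not at the current iterate.
        gx, gy = grad_fn(x + momentum * vx, y + momentum * vy)
        # Update the velocity, then move the iterate along it.
        vx = momentum * vx - lr * gx
        vy = momentum * vy - lr * gy
        x, y = x + vx, y + vy
    return x, y
```

Calling `nag_minimize(toy_gradient, (0.0, 0.0))` should approach the minimiser of the toy objective at (3, -1). The look-ahead gradient evaluation is what distinguishes NAG from classical momentum.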

    Struggling Readers Learning with Graphic-Rich Digital Science Text: Effects of a Highlight & Animate Feature and Manipulable Graphics.

    Technology offers the promise of 'leveling the playing field' for struggling readers. That is, instructional support features within digital texts may enable all readers to learn. This quasi-experimental study examined the effects on learning of two support features, which offered unique opportunities to interact with text. The Highlight & Animate Feature highlighted an important idea in prose, while simultaneously animating its representation in an adjacent graphic. It invited readers to integrate ideas depicted in graphics and prose, using each one to interpret the other. The Manipulable Graphics had parts that the reader could operate to discover relationships among phenomena. It invited readers to test or refine the ideas that they brought to, or gleaned from, the text. Use of these support features was compulsory. Twenty fifth-grade struggling readers read a graphic-rich digital science text in a clinical interview setting, under one of two conditions: using either the Highlight & Animate Feature or the Manipulable Graphics. Participants in both conditions made statistically significant gains on a multiple-choice measure of knowledge of the topic of the text. While there were no significant differences by condition in the amount of knowledge gained, there were significant differences in the quality of knowledge expressed. Transcripts revealed that understandings about light and vision, expressed by those who used the Highlight & Animate Feature, were more often conceptually and linguistically 'complete.' That is, their understandings included both a description of phenomena and an explanation of underlying scientific principles, which participants articulated using the vocabulary of the text. This finding may be attributed to the multiple opportunities to integrate graphics (depicting the behavior of phenomena) and prose (providing the scientific explanation of those phenomena), which characterized the Highlight & Animate condition. Those who used the Manipulable Graphics were more likely to express complete understandings when they were able to structure a systematic investigation of the graphic and when the graphic was designed to confront their own naïve conceptions about light and vision. The Manipulable Graphics also provided a foothold for those who entered the study with very little prior knowledge of the topic.
    Ph.D., Education, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/60875/1/ndefranc_1.pd