Expressive movement generation with machine learning
Movement is an essential aspect of our lives. Not only do we move to interact with our physical environment, but we also express ourselves and communicate with others through our movements. In an increasingly computerized world where various technologies and devices surround us, our movements are essential parts of our interaction with and consumption of computational devices and artifacts. In this context, incorporating an understanding of our movements within the design of the technologies surrounding us can significantly improve our daily experiences. This need has given rise to the field of movement computing: developing computational models of movement that can perceive, manipulate, and generate movements. In this thesis, we contribute to the field of movement computing by building machine-learning-based solutions for automatic movement generation. In particular, we focus on using machine learning techniques and motion capture data to create controllable, generative movement models. We also contribute datasets, tools, and libraries developed during our research. We start by reviewing work on building automatic movement generation systems using machine learning techniques and motion capture data. Our review covers background topics such as high-level movement characterization, training data, feature representation, machine learning models, and evaluation methods. Building on our literature review, we present WalkNet, an interactive, neural-network-based controller for agent walking movement. The expressivity of virtual, animated agents plays an essential role in their believability. Therefore, WalkNet integrates control of the expressive qualities of movement with the goal-oriented behaviour of an animated virtual agent. It allows us to control the generation in real time based on the valence and arousal levels of affect, the walking direction, and the mover's movement signature.
Following WalkNet, we look at controlling movement generation using more complex stimuli such as music represented by audio signals (i.e., non-symbolic music). Music-driven dance generation involves a highly non-linear mapping between temporally dense stimuli (i.e., the audio signal) and movements, which makes it a more challenging movement modelling problem. To this end, we present GrooveNet, a real-time machine learning model for music-driven dance generation
Structured meshes: composition and remeshing guided by the Curve-Skeleton
Virtual sculpting is currently a broadly used modeling metaphor with rising
popularity, especially in the entertainment industry. While this approach
unleashes the artists' inspiration and creativity and leads to wonderfully
detailed and artistic 3D models, it has the purely technical side effect
of producing highly irregular meshes that are not optimal for subsequent
processing. Converting an unstructured mesh into a more regular and
structured model in an automatic way is a challenging task and still an
open problem.
Since structured meshes are useful in different applications, it is of
interest to be able to guarantee this property also in part-based modeling
scenarios, which aim to build digital objects by composition instead of
modeling them from scratch.
This thesis presents methods for obtaining structured meshes in two
different ways. The first is a coarse quad layout computation method
which starts from a triangle mesh and the curve-skeleton of the shape. The
second approach allows complex shapes to be built by procedural composition
of PAMs. Since both quad layouts and PAMs exploit their global
structure, similarities between the two are discussed, especially how their
structure corresponds to the curve-skeleton describing the topology
of the shape being represented. Since both of the presented methods rely on
the information provided by the skeleton, the difficulties of using
automatically extracted curve-skeletons without processing are discussed, and an
interactive tool for user-assisted processing is presented
Motion capture data processing, retrieval and recognition.
Character animation plays an essential role in feature films and computer games. Manual creation of character animation by animators is both tedious and inefficient, so motion capture (MoCap) techniques have been developed and have become the most popular method for creating realistic character animation. However, commercial MoCap systems are expensive, and the capturing process itself usually requires an indoor studio environment. Procedural animation, in turn, often lacks extensive user control during the generation process. Therefore, efficiently and effectively reusing MoCap data can bring significant benefits, which has motivated wider research into machine-learning-based MoCap data processing. A typical workflow for MoCap data reuse can be divided into three stages: data capture, data management and data reuse. There are still many challenges at each stage. For instance, data capture and management often suffer from data quality problems. Efficient and effective retrieval methods are also in demand because of the large amount of data being used. In addition, classification and understanding of actions form the fundamental basis of data reuse. This thesis proposes to use machine learning on MoCap data for reuse purposes, for which a framework for motion capture data processing is designed. The modular design of this framework enables motion data refinement, retrieval and recognition. The first part of this thesis reviews the methods used in existing motion capture processing approaches in the literature and briefly introduces the relevant machine learning methods used in this framework. In general, the frameworks related to refinement, retrieval and recognition are discussed. A motion refinement algorithm based on dictionary learning is then presented, in which kinematic structural and temporal information is exploited.
The designed optimization method and data preprocessing technique ensure that the recovered result is smooth. After that, a motion refinement algorithm based on matrix completion is presented, in which the low-rank property and spatio-temporal information are exploited. Such a model does not require training data to be prepared. The designed optimization method outperforms existing approaches in both effectiveness and efficiency. A motion retrieval method based on multi-view feature selection is also proposed, in which the intrinsic relations between visual words in each motion feature subspace are discovered as a means of improving retrieval performance. A provisional trace-ratio objective function and an iterative optimization method are also included. A motion data clustering method based on non-negative matrix factorization is proposed for recognition purposes, aimed at large-scale unsupervised and semi-supervised problems. In addition, deep learning models are used for motion data recognition, e.g. 2D gait recognition and 3D MoCap recognition. To sum up, research on motion data refinement, retrieval and recognition is presented in this thesis with the aim of tackling the major challenges in motion reuse. The proposed motion refinement methods aim to provide high-quality, clean motion data for downstream applications. The designed multi-view feature selection algorithm aims to improve motion retrieval performance. The proposed motion recognition methods are equally essential for motion understanding. A collection of publications by the author of this thesis is listed in the publications section
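To make the matrix-completion idea in the abstract above concrete, here is a minimal sketch, not the thesis's algorithm. It assumes the motion is stored as a frames-by-coordinates matrix with `None` marking missing marker values, fixes the rank at 1 to stay short, and ignores the spatio-temporal terms the abstract mentions. The function name `complete_rank1` and the toy data are hypothetical.

```python
def complete_rank1(matrix, iterations=200):
    """Fill None entries of `matrix` with a rank-1 least-squares fit.

    `matrix` is a list of rows (frames x coordinates); None marks a gap.
    Illustrative only: real MoCap data would need a higher rank.
    """
    rows, cols = len(matrix), len(matrix[0])
    u = [1.0] * rows  # one factor per frame
    v = [1.0] * cols  # one factor per coordinate
    for _ in range(iterations):
        # Fix v and solve the least-squares problem for each u[i],
        # using only the observed entries of row i.
        for i in range(rows):
            obs = [j for j in range(cols) if matrix[i][j] is not None]
            den = sum(v[j] ** 2 for j in obs)
            if den > 0:
                u[i] = sum(matrix[i][j] * v[j] for j in obs) / den
        # Fix u and solve for each v[j] in the same way.
        for j in range(cols):
            obs = [i for i in range(rows) if matrix[i][j] is not None]
            den = sum(u[i] ** 2 for i in obs)
            if den > 0:
                v[j] = sum(matrix[i][j] * u[i] for i in obs) / den
    # Keep observed values and fill gaps with the low-rank estimate.
    return [[matrix[i][j] if matrix[i][j] is not None else u[i] * v[j]
             for j in range(cols)] for i in range(rows)]


# Toy data: an exactly rank-1 matrix with two entries removed.
data = [[1.0, 0.5, 2.0],
        [2.0, 1.0, None],
        [3.0, 1.5, 6.0],
        [None, 2.0, 8.0]]
filled = complete_rank1(data)
```

Because the fit uses only the observed entries, no separate training set is needed, which matches the abstract's remark that such a model does not require training data.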
Towards a framework for socially interactive robots
250 p. In recent decades, research in the field of social robotics has grown considerably. The development of different types of robots and their roles within society are gradually expanding. Robots endowed with social skills are intended for a range of applications: for example, as interactive teachers and educational assistants, to support diabetes management in children, to help elderly people with special needs, as interactive actors in theatre, or even as assistants in hotels and shopping centres. The RSAIT research team has been working in several areas of robotics, in particular control architectures, robot exploration and navigation, machine learning and computer vision. The work presented here aims to add a new layer to that earlier development: the human-robot interaction layer, which focuses on the social capabilities a robot should display when interacting with people, such as expressing and perceiving emotions, sustaining high-level dialogue, learning models of other agents, establishing and maintaining social relationships, using natural means of communication (gaze, gestures, etc.), showing a distinctive personality and character, and learning social competencies. In this doctoral thesis, we try to make our small contribution to the basic questions that arise when we think about social robots: (1) How do we humans communicate with (or operate) social robots? and (2) How do social robots act with us? Along those lines, the work was carried out in two phases: in the first, we focused on exploring, from a practical point of view, several ways in which humans communicate with robots in a natural manner.
In the second, we investigated how social robots should act with the user. Regarding the first phase, we developed three natural user interfaces intended to make interaction with social robots more natural. To test these interfaces, two applications with different uses were developed: guide robots and a humanoid robot control system for entertainment purposes. Working on those applications allowed us to equip our robots with some basic skills, such as navigation, inter-robot communication, and speech recognition and understanding. In the second phase, we focused on identifying and developing the basic behaviour modules that this type of robot needs in order to be socially believable and trustworthy while acting as a social agent. A framework for socially interactive robots has been developed that allows robots to express different types of emotions and to display natural, human-like body language according to the task at hand and the environmental conditions. The different stages of development of our social robots have been validated through public performances. Exposing our robots to the public in those performances has become an essential tool for qualitatively measuring the social acceptance of the prototypes we are developing. In the same way that robots need a physical body to interact with the environment and become intelligent, social robots need to participate socially in the real tasks for which they have been developed in order to improve their sociability.
Proof of Concept For the Use of Motion Capture Technology In Athletic Pedagogy
Visualization has long been an important method for conveying complex information. Where information transfer using written and spoken means might amount to 200-250 words per minute, visual media can often convey information at many times this rate. This makes visualization a potentially important tool for education. Athletic instruction, in particular, can involve communication about complex human movement that is not easily conveyed with written or spoken descriptions. Video-based instruction can be problematic since video data can contain too much information, making it more difficult for a student to absorb what is cognitively necessary. The lesson is to present the learner with what is needed and no more. We present a novel use of motion capture animation as an educational tool for teaching athletic movements. The advantage of motion capture is its ability to accurately represent real human motion in a minimalist context which removes the extraneous information normally found in video. Motion capture animation displays only motion information, not additional information about the motion context. Producing an “automated coach” would be too large and difficult a problem to solve within the scope of a Master's thesis, but we can perform initial steps, including producing a useful software tool which performs data analysis on two motion datasets. We believe such a tool would be beneficial to a human coach as an analysis tool, and the work would provide some useful understanding of the next important steps towards perhaps someday producing an automated coach
Electronic Imaging & the Visual Arts. EVA 2017 Florence
This publication follows the yearly editions of EVA Florence. It presents the state of the art in the application of technologies, digital technologies in particular, to cultural heritage, together with the most recent research results in this area. The information technologies of interest for cultural heritage that are covered include multimedia systems, databases, data protection, access to digital content, and virtual galleries. Particular attention is given to digital images (Electronic Imaging & the Visual Arts) in relation to cultural institutions (museums, libraries, palaces and monuments, archaeological sites). The international conference includes the following sessions: Strategic Issues; New Sciences and Culture Developments and Applications; New Technical Developments & Applications; Museums - Virtual Galleries and Related Initiatives; Art and Humanities Ecosystem & Applications; Access to the Culture Information. Two workshops address Innovation and Enterprise, and cloud systems connected to culture (eCulture Cloud) in the smart cities context. The most recent national and international research results in the area of technologies and cultural heritage are reported, along with experimental demonstrations of the activities developed
Of assembling small sculptures and disassembling large geometry
This thesis describes the research results and contributions that have been achieved
during the author’s doctoral work. It is divided into two independent parts, each
of which is devoted to a particular research aspect.
The first part covers the true-to-detail creation of digital pieces of art, so-called
relief sculptures, from given 3D models. The main goal is to limit the depth of the
contained objects with respect to a certain perspective without compromising the
initial three-dimensional impression. Here, the preservation of significant features
and especially their sharpness is crucial. Therefore, it is necessary to overemphasize
fine surface details to ensure their perceptibility in the flatter relief.
Our developments are aimed at improving the flexibility and user-friendliness
during the generation process. The main focus is on providing real-time solutions
with intuitive usability that make it possible to create precise, lifelike and
aesthetic results. These goals are reached by a GPU implementation, the use of
efficient filtering techniques, and the replacement of user defined parameters by
adaptive values. Our methods are capable of processing dynamic scenes and allow
the generation of seamless artistic reliefs which can be composed of multiple
elements.
The second part addresses the analysis of repetitive structures, so-called symmetries,
within very large data sets. The automatic recognition of components
and their patterns is a complex correspondence problem which has numerous applications
ranging from information visualization through compression to automatic
scene understanding. Recent algorithms reach their limits with a growing amount
of data, since their runtimes rise quadratically. Our aim is to make even massive
data sets manageable. Therefore, it is necessary to abstract features and to develop
a suitable, low-dimensional descriptor which ensures an efficient, robust, and targeted
search. A simple inspection of the proximity within the descriptor space
helps to significantly reduce the number of necessary pairwise comparisons. Our
method scales quasi-linearly and allows a rapid analysis of data sets which could
not be handled by prior approaches because of their size.
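The idea of inspecting proximity in descriptor space to avoid comparing every pair can be illustrated with a minimal sketch. This is not the thesis's descriptor or search structure: it assumes descriptors are already available as short tuples of floats and uses a plain uniform grid, so that only descriptors in the same or adjacent cells are ever compared. The function name `candidate_pairs` and the sample points are hypothetical.

```python
from collections import defaultdict
from itertools import product
from math import dist, floor


def candidate_pairs(descriptors, radius):
    """Return index pairs (i, j), i < j, whose descriptors lie within `radius`."""
    # Hash each descriptor into a grid cell with side length `radius`.
    grid = defaultdict(list)
    for idx, d in enumerate(descriptors):
        cell = tuple(floor(x / radius) for x in d)
        grid[cell].append(idx)

    dim = len(descriptors[0]) if descriptors else 0
    # Two points within `radius` differ by at most one cell per axis,
    # so checking the 3^dim surrounding cells is sufficient.
    offsets = list(product((-1, 0, 1), repeat=dim))
    pairs = set()
    for cell, members in grid.items():
        for off in offsets:
            neighbour = tuple(c + o for c, o in zip(cell, off))
            for i in members:
                for j in grid.get(neighbour, ()):
                    if i < j and dist(descriptors[i], descriptors[j]) <= radius:
                        pairs.add((i, j))
    return pairs


points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.05, 5.0), (9.0, 1.0)]
print(sorted(candidate_pairs(points, 0.2)))  # [(0, 1), (2, 3)]
```

The number of neighbouring cells grows as 3^dim, which is one reason a low-dimensional descriptor matters for this kind of search; with sparsely populated cells, the work per descriptor stays small instead of growing with the size of the whole data set.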