    Automatic Sign Dance Synthesis from Gesture-based Sign Language

    Automatic dance synthesis has become increasingly popular due to growing demand in computer games and animation. Existing research generates dance motions without much consideration for the context of the music. In reality, professional dancers create choreography according to the lyrics and musical features. In this research, we focus on a particular genre of dance known as sign dance, which combines gesture-based sign language with full-body dance motion. We propose a system to automatically generate sign dance from a piece of music and its corresponding sign gesture. The core of the system is a Sign Dance Model trained by multiple regression analysis to represent the correlations between sign dance and sign gesture/music, as well as a set of objective functions to evaluate the quality of the sign dance. Our system can be applied to music visualization, allowing people with hearing difficulties to understand and enjoy music.

    MotionScript: Natural Language Descriptions for Expressive 3D Human Motions

    This paper proposes MotionScript, a motion-to-text conversion algorithm and natural language representation for human body motions. MotionScript aims to describe movements in greater detail and with more accuracy than previous natural language approaches. Many motion datasets describe relatively objective and simple actions with little variation in the way they are expressed (e.g. sitting, walking, dribbling a ball). But for expressive actions that contain a diversity of movements within the class (e.g. being sad, dancing), or for actions outside the domain of standard motion capture datasets (e.g. stylistic walking, sign language), more specific and granular natural language descriptions are needed. Our proposed MotionScript descriptions differ from existing natural language representations in that they provide direct descriptions in natural language instead of simple action labels or high-level human captions. To the best of our knowledge, this is the first attempt at translating 3D motions to natural language descriptions without requiring training data. Our experiments show that when MotionScript representations are used in a text-to-motion neural task, body movements are more accurately reconstructed, and large language models can be used to generate unseen complex motions.

    This article surveys the state of so-called topic theory today. It charts its development through two generations of topic theorists. The first is constructed around three influential texts: Leonard Ratner's seminal book that established the discipline in its own right, Classic music: expression, form and style (1980); Wye Allanbrook's Rhythmic gesture in Mozart: Le nozze di Figaro and Don Giovanni (1983); and Kofi Agawu's Playing with signs: a semiotic interpretation of classical music (1991). The second comprises significant advances in topic theory essayed through two further pairs of texts: Robert Hatten's Musical meaning in Beethoven: markedness, correlation, and interpretation (1994) and Interpreting musical gestures, topics, and tropes: Mozart, Beethoven, Schubert (2004); and Raymond Monelle's Linguistics and semiotics in music (1992) and The sense of music: semiotic essays (2000). Topic theory's role as the soft hermeneutic sub-field of music semiotics (relative to the harder, formalist practices of Nattiez's neutral level analysis) is portrayed here as navigating a number of treacherous polemical paths. These wend their way between referential style (expression) and structural syntax (form); historical reconstruction and hermeneutic construction; and heightened sensitivity to social meanings and imposed acts of creative interpretation. This existence of topic theory in a continuous dialogue between structural formalism and the semantics of expressive discourse is held responsible for its marginal position both to the dominant strains of contemporary postmodern musicology and to the dying embers of formalist analysis. The failure of topic theory to strike a fashionable text-context balance thus highlights why musicology continues to view semiotics with scepticism. Ratner presents his thesaurus of style labels, somewhat dubiously, as the historically authentic ready-to-hand materials (types and styles) of eighteenth-century expressive musical rhetoric. But it is Agawu's combination of this universe of topics with a Schenker-influenced beginning-middle-end paradigm that establishes the hallmark of first-generation topic theory on which the first half of this paper focuses. Agawu's delicate equation between extroversive and introversive semiosis is essayed as a pivotal turning point in topic theory's ability to transcend the mere passive ascription of rhetorical labels. Out of this equation, expressive meanings can arise as much from the non-congruence as from the congruence of signs and structure. Hatten's critique of Agawu for neglecting the full interpretative consequences of his signifieds is the springboard for his more hermeneutically replete brand of topic theory and the emergence of the second-generation topic theorists. Hatten's use of troping (a kind of musical metaphor) is one of many interpretative tools that are responsible for broadening the arena of topic theory; some of his others are expressive genres, emergent meanings and markedness theory. These are deployed across a variety of musical parameters as Hatten's attention increasingly turns to the prototypicality of topics in their euphoric and dysphoric states. Hatten's interpretative work is shown to transcend historical reconstruction to comprise creative interpretation built on a much broader definition of expressive gestures, of which topics are only a constituent part. The article concludes with Monelle's exposé of the dubious historical underpinnings of Ratner's topic theory foundations. This does not render this vibrant branch of semiotics redundant but, on the contrary, charts its future direction as one calling out for far deeper historical investigation and cultural criticism. Monelle's enlightening forays into the more replete expressive meanings of such topics as the horse and pianto make this point abundantly clear. The future of topics today, if not musicology in general, is one of cultural criticism.

    How2Sign: A large-scale multimodal dataset for continuous American sign language

    One of the factors that have hindered progress in the areas of sign language recognition, translation, and production is the absence of large annotated datasets. Towards this end, we introduce How2Sign, a multimodal and multiview continuous American Sign Language (ASL) dataset, consisting of a parallel corpus of more than 80 hours of sign language videos and a set of corresponding modalities including speech, English transcripts, and depth. A three-hour subset was further recorded in the Panoptic studio enabling detailed 3D pose estimation. To evaluate the potential of How2Sign for real-world impact, we conduct a study with ASL signers and show that synthesized videos using our dataset can indeed be understood. The study further gives insights on challenges that computer vision should address in order to make progress in this field. Dataset website: http://how2sign.github.io/ This work received funding from Facebook through gifts to CMU and UPC; through projects TEC2016-75976-R, TIN2015-65316-P, SEV-2015-0493 and PID2019-107255GB-C22 of the Spanish Government and 2017-SGR-1414 of Generalitat de Catalunya. This work used XSEDE’s “Bridges” system at the Pittsburgh Supercomputing Center (NSF award ACI-1445606). Amanda Duarte has received support from la Caixa Foundation (ID 100010434) under the fellowship code LCF/BQ/IN18/11660029. 
Shruti Palaskar was supported by the Facebook Fellowship program.

    Design and semantics of form and movement (DeSForM 2006)

    Design and Semantics of Form and Movement (DeSForM) grew from applied research exploring emerging design methods and practices to support new-generation product and interface design. The products and interfaces are concerned with the context of ubiquitous computing and ambient technologies and the need for greater empathy in the pre-programmed behaviour of the ‘machines’ that populate our lives. Such explorative research in the CfDR has been led by Young, supported by Kyffin, Visiting Professor from Philips Design, and sponsored by Philips Design over a period of four years (research funding £87k). DeSForM1 was the first of a series of three conferences that enable the presentation and debate of international work within this field:
    • 1st European conference on Design and Semantics of Form and Movement (DeSForM1), Baltic, Gateshead, 2005, Feijs L., Kyffin S. & Young R.A. eds.
    • 2nd European conference on Design and Semantics of Form and Movement (DeSForM2), Evoluon, Eindhoven, 2006, Feijs L., Kyffin S. & Young R.A. eds.
    • 3rd European conference on Design and Semantics of Form and Movement (DeSForM3), New Design School Building, Newcastle, 2007, Feijs L., Kyffin S. & Young R.A. eds.
    Philips sponsorship of practice-based enquiry led to research by three teams of research students over three years and ongoing sponsorship of research through the Northumbria University Design and Innovation Laboratory (nuDIL). Young has been invited onto the steering panel of the UK Thinking Digital Conference concerning the latest developments in digital and media technologies. Informed by this research is the work of PhD student Yukie Nakano, who examines new technologies in relation to eco-design textiles.

    Neural Sign Reenactor: Deep Photorealistic Sign Language Retargeting

    In this paper, we introduce a neural rendering pipeline for transferring the facial expressions, head pose, and body movements of one person in a source video to another in a target video. We apply our method to the challenging case of Sign Language videos: given a source video of a sign language user, we can faithfully transfer the performed manual (e.g., handshape, palm orientation, movement, location) and non-manual (e.g., eye gaze, facial expressions, mouth patterns, head, and body movements) signs to a target video in a photo-realistic manner. Our method can be used for Sign Language Anonymization, Sign Language Production (synthesis module), as well as for reenacting other types of full body activities (dancing, acting performance, exercising, etc.). We conduct detailed qualitative and quantitative evaluations and comparisons, which demonstrate the particularly promising and realistic results that we obtain and the advantages of our method over existing approaches. Comment: Accepted at AI4CC Workshop at CVPR 202