Graphic Synchronization with AudioWAV Files
This project concerns an application that combines computer graphics with audio. Its objective is to create an application that synchronizes the movement of a graphic with the tunes in an audio file, with emphasis on recognizing specific tunes through the graphic's movements. The application is meant to help users identify and distinguish the various tunes in an audio file, specifically in the WAV file format. Its main features are that 1) it plays audio in the WAV file format and 2) it displays a computer-graphic 'stickman' that moves according to the tunes. The scope of study covers how tune recognition works and how it can be synchronized with graphic motion; it focuses on the graphic visualization of audio. For this project I follow the waterfall methodology, but I have adjusted its phases to suit my project tasks, which consist of six main processes. The findings address 1) the underlying concepts of integrating graphics and audio in a single application, 2) extracting specific tunes from a WAV audio file, and 3) synchronizing the graphic element with the audio beat or tune. By using this graphic-audio combination, the project is able to visualize and differentiate the tunes specifically and accordingly
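The tune-extraction step described above can be sketched in Python. This is a minimal illustration, not the project's implementation: the frame size, threshold factor, and function names are assumptions. The idea is to compute a short-time RMS energy envelope over the decoded WAV samples and flag frames whose energy spikes above the average, giving the graphic a beat signal to move on.

```python
import math

def rms_envelope(samples, rate, frame_ms=50):
    """Split decoded audio samples into fixed frames and compute RMS energy per frame."""
    n = max(1, int(rate * frame_ms / 1000))
    return [math.sqrt(sum(s * s for s in samples[i:i + n]) / len(samples[i:i + n]))
            for i in range(0, len(samples), n)]

def detect_beats(samples, rate, frame_ms=50, factor=2.0):
    """Return indices of frames whose energy exceeds factor * mean energy."""
    env = rms_envelope(samples, rate, frame_ms)
    mean = sum(env) / len(env)
    return [i for i, e in enumerate(env) if e > factor * mean]
```

A loud burst in otherwise quiet audio then shows up as a single flagged frame, which the stickman animation can key its movement on.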
A Framework for Designing 3d Virtual Environments
The design and development of virtual environments can be supported by tools and frameworks that save time on technical aspects and let designers focus on content. In this paper we present an academic framework that provides several levels of abstraction to ease this work. It includes state-of-the-art components that we devised or integrated from open-source solutions to address specific problems. Its architecture is modular and customizable, and the code is open-source.
Presenting in Virtual Worlds: An Architecture for a 3D Anthropomorphic Presenter
Multiparty-interaction technology is changing entertainment, education, and training. Deployed examples of such technology include embodied agents and robots that act as a museum guide, a news presenter, a teacher, a receptionist, or someone trying to sell you insurance, homes, or tickets. In all these cases, the embodied agent needs to explain and describe. This article describes the design of a 3D virtual presenter that uses different output channels (including speech and animation of posture, pointing, and involuntary movements) to present and explain. The behavior is scripted and synchronized with a 2D display containing associated text and regions (slides, drawings, and paintings) at which the presenter can point. This article is part of a special issue on interactive entertainment
Performance of grassed swale as stormwater quantity control in lowland area
A grassed swale is a vegetated open channel designed to attenuate stormwater through infiltration and to convey runoff into nearby water bodies, thus reducing peak flows and mitigating the causes of flooding. UTHM is flood-prone because it sits in a lowland area with a high groundwater level and low infiltration rates. The aim of this study is to assess the performance of grassed swales as a stormwater quantity control at UTHM. Flow depths and velocities of the swales were measured by the Six-Tenths Depth Method shortly after a rainfall event. Flow discharges of the swales (Qswale) were evaluated by the Mean-Section Method to determine the variation of Manning's roughness coefficients (ncalculate), which range between 0.075 and 0.122 due to tall grass and channel irregularity. Based on the values of Qswale between sections of the swales, the flow attenuation reaches up to 54%. For flow conveyance, Qswale was determined by Manning's equation in two forms, Qcalculate (evaluated using ncalculate) and Qdesign (evaluated using the roughness coefficient recommended by MSMA, ndesign), and compared with the flow discharges of the drainage areas (Qpeak), evaluated by the Rational Method with a 10-year ARI. At each study site, Qdesign exceeds Qpeak by up to 59%, whereas Qcalculate exceeds Qpeak only at certain sites, by up to 14%. Qdesign also exceeds Qcalculate by up to 52%, which shows that the roughness coefficients considered in MSMA provide a better swale performance. The study also found that the characteristics of the studied swales are comparable to the design considerations in MSMA. Based on these findings, grassed swales have the potential to collect, attenuate, and convey stormwater, and are suitable as one of the best management practices for preventing flash floods on the UTHM campus
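The two discharge formulas the study relies on can be written out explicitly. Below is a small Python sketch under standard SI conventions; the function and variable names are ours for illustration, not MSMA's notation:

```python
def manning_discharge(n, area, hydraulic_radius, slope):
    """Manning's equation (SI): Q = (1/n) * A * R^(2/3) * S^(1/2), in m^3/s.
    n: roughness coefficient; area: flow area (m^2);
    hydraulic_radius: A / wetted perimeter (m); slope: channel slope (m/m)."""
    return (1.0 / n) * area * hydraulic_radius ** (2.0 / 3.0) * slope ** 0.5

def rational_peak_flow(c, intensity_mm_hr, area_ha):
    """Rational Method: Qpeak = C * i * A / 360, with i in mm/hr and
    A in hectares, giving Qpeak in m^3/s."""
    return c * intensity_mm_hr * area_ha / 360.0
```

With these, Qcalculate and Qdesign differ only in the roughness coefficient passed as n (ncalculate versus MSMA's ndesign), which is why the 52% gap between them isolates the effect of the roughness assumption.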
Realistic Lip Syncing for Virtual Character Using Common Viseme Set
Speech is one of the most important methods of interaction between humans, so most avatar research focuses on this area. Creating animated speech requires a facial model capable of representing the myriad shapes the human face assumes during speech, together with a method to produce the correct shape at the correct time. One of the main challenges is creating precise lip movements for the avatar and synchronizing them with recorded audio. This paper proposes a new lip-synchronization algorithm for realistic applications, which can generate facial movements synchronized with audio produced from natural speech or through a text-to-speech engine. The method requires an animator to construct animations using a canonical set of visemes for all pairwise combinations of a reduced phoneme set. These animations are then stitched together smoothly to construct the final animation
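The phoneme-to-viseme mapping step the abstract describes can be sketched roughly as follows. The table here is a hypothetical reduced set, not the paper's canonical one; the point is that timed phonemes collapse to viseme keyframes, with consecutive identical visemes merged so the stitched animation stays smooth.

```python
# Hypothetical reduced phoneme-to-viseme table (illustrative only).
PHONEME_TO_VISEME = {
    "p": "MBP", "b": "MBP", "m": "MBP",
    "f": "FV", "v": "FV",
    "aa": "AH", "iy": "EE", "uw": "OO",
    "sil": "REST",
}

def viseme_track(phonemes):
    """Map (phoneme, duration) pairs to (viseme, start, end) keyframes,
    merging consecutive identical visemes into one longer keyframe."""
    track, t = [], 0.0
    for ph, dur in phonemes:
        v = PHONEME_TO_VISEME.get(ph, "REST")
        if track and track[-1][0] == v:
            track[-1] = (v, track[-1][1], t + dur)  # extend previous keyframe
        else:
            track.append((v, t, t + dur))
        t += dur
    return track
```

For example, "p" followed by "b" would yield one MBP keyframe spanning both phonemes rather than two back-to-back identical poses.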
Enhancing Expressiveness of Speech through Animated Avatars for Instant Messaging and Mobile Phones
This thesis aims to create a chat program that allows users to communicate via an animated avatar that provides believable lip-synchronization and expressive emotion. Currently, many avatars do not attempt lip-synchronization; those that do are poorly synchronized and have little or no emotional expression. Most avatars with lip-synch use realistic-looking 3D models or stylized renderings of complex models. This work instead uses images rendered in a cartoon style and lip-synchronization rules based on traditional animation. The cartoon style, as opposed to a more realistic look, makes the mouth motion more believable and the characters more appealing. The cartoon look and image-based animation (as opposed to a graphic model animated by manipulating a skeleton or wireframe) also allow for fewer key frames, resulting in faster animation with more room for expressiveness. When text is entered into the program, the Festival Text-to-Speech engine creates a speech file and extracts phoneme and phoneme-duration data. Believable and fluid lip-synchronization is then achieved by means of a number of phoneme-to-image rules. Alternatively, phoneme and phoneme-duration data can be obtained for speech dictated into a microphone using Microsoft SAPI and the CSLU Toolkit. Once lip synchronization is complete, rules for non-verbal animation are added. Emotions are appended to the animation of speech in two ways: automatically, by recognition of key words and punctuation, or deliberately, by user-defined tags. Additionally, rules are defined for idle-time animation. Preliminary results indicate that the animated avatar program offers an improvement over currently available software. It aids the understandability of speech, combines easily recognizable and expressive emotions with speech, and successfully enhances overall enjoyment of the chat experience.
Applications for the program include use in cell phones for the deaf or hearing impaired, instant messaging, video conferencing, instructional software, and speech and animation synthesis
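The automatic keyword-and-punctuation route for appending emotions can be sketched as a simple rule table. The keywords and emotion labels below are illustrative assumptions rather than the thesis's actual rules; keyword matches take precedence over punctuation, and anything unmatched stays neutral.

```python
# Hypothetical keyword-to-emotion rules; the real system's table would be richer.
EMOTION_KEYWORDS = {"great": "happy", "sad": "sad", "wow": "surprised"}

def tag_emotion(text):
    """Pick an emotion tag for a chat message: keywords first, then
    sentence-final punctuation, otherwise neutral."""
    lowered = text.lower()
    for word, emotion in EMOTION_KEYWORDS.items():
        if word in lowered:
            return emotion
    if text.endswith("!"):
        return "excited"
    if text.endswith("?"):
        return "curious"
    return "neutral"
```

The returned tag would then select which emotional variant of the avatar's key frames to blend with the lip-sync animation, alongside any user-defined tags that override it.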
Reviving Static Charts into Live Charts
Data charts are prevalent across various fields due to their efficacy in
conveying complex data relationships. However, static charts may sometimes
struggle to engage readers and efficiently present intricate information,
potentially resulting in limited understanding. We introduce "Live Charts," a
new format of presentation that decomposes complex information within a chart
and explains the information pieces sequentially through rich animations and
accompanying audio narration. We propose an automated approach to revive static
charts into Live Charts. Our method integrates GNN-based techniques to analyze
the chart components and extract data from the charts. We then adopt large
language models to generate appropriate animated visuals along with a
voice-over to produce Live Charts from static ones. We conducted a thorough
evaluation of our approach, covering model performance, use cases, a
crowd-sourced user study, and expert interviews. The results demonstrate that
Live Charts offer a multi-sensory experience in which readers can follow the
information and understand the data insights better. We analyze the benefits
and drawbacks of Live Charts over static charts as a new information
consumption experience