The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using the virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is an
ambiguous task that needs proper guidance and directions. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our presented study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive and allows users,
scholars, and entrepreneurs to get an in-depth understanding of the Metaverse
ecosystem to find their opportunities and potentials for contribution.
Ensuring Access to Safe and Nutritious Food for All Through the Transformation of Food Systems
One Small Step for Generative AI, One Giant Leap for AGI: A Complete Survey on ChatGPT in AIGC Era
OpenAI has recently released GPT-4 (a.k.a. ChatGPT plus), which is
demonstrated to be one small step for generative AI (GAI), but one giant leap
for artificial general intelligence (AGI). Since its official release in
November 2022, ChatGPT has quickly attracted numerous users with extensive
media coverage. Such unprecedented attention has also motivated numerous
researchers to investigate ChatGPT from various aspects. According to Google
scholar, there are more than 500 articles with ChatGPT in their titles or
mentioning it in their abstracts. Considering this, a review is urgently
needed, and our work fills this gap. Overall, this work is the first to survey
ChatGPT with a comprehensive review of its underlying technology, applications,
and challenges. Moreover, we present an outlook on how ChatGPT might evolve to
realize general-purpose AIGC (a.k.a. AI-generated content), which will be a
significant milestone for the development of AGI.
Comment: A Survey on ChatGPT and GPT-4, 29 pages. Feedback is appreciated
Audio-Visual Automatic Speech Recognition Towards Education for Disabilities
Education is a fundamental right that enriches everyone’s life. However, physically challenged people are often debarred from the general and advanced education system. An Audio-Visual Automatic Speech Recognition (AV-ASR) based system can improve the education of physically challenged people by providing hands-free computing: they can communicate with the learning system through AV-ASR. However, it is challenging to trace the lips correctly for the visual modality. Thus, this paper addresses the appearance-based visual feature along with the co-occurrence statistical measure for visual speech recognition. Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) and the Grey-Level Co-occurrence Matrix (GLCM) are proposed for visual speech information. The experimental results show that the proposed system achieves 76.60% accuracy for visual speech and 96.00% accuracy for audio speech recognition.
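As a rough illustration of the co-occurrence statistic named in the abstract above, the sketch below builds a grey-level co-occurrence matrix for a tiny image and derives one texture feature (contrast) from it. The abstract does not state the pixel offset, the number of grey levels, or which statistics are extracted, so the horizontal offset `(dx=1, dy=0)`, the 4-level toy image, and the choice of contrast are all assumptions made for this example. LBP-TOP is not shown.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often grey level i is followed by grey level j at offset (dx, dy)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Skip pairs whose second pixel falls outside the image.
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
    return counts


def normalise(counts):
    """Turn raw co-occurrence counts into joint probabilities."""
    total = sum(sum(row) for row in counts)
    return [[value / total for value in row] for row in counts]


def contrast(p):
    """Contrast feature: sum over (i, j) of p[i][j] * (i - j)^2."""
    n = len(p)
    return sum(p[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))


# Toy 4x4 image with grey levels 0..3 (illustrative data, not from the paper).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]

counts = glcm(image, levels=4)
feature = contrast(normalise(counts))
print(counts)
print(feature)
```

In a lip-reading setting the same computation would be applied to the mouth region of each frame, with the resulting statistics used as visual features; how the paper combines them with LBP-TOP is not described in the abstract.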
Perfect is the enemy of test oracle
Automation of test oracles is one of the most challenging facets of software
testing, but has received less attention than automated test
input generation. Test oracles rely on a ground-truth that can distinguish
between the correct and buggy behavior to determine whether a test fails
(detects a bug) or passes. What makes the oracle problem challenging and
undecidable is the assumption that the ground-truth should know the exact
expected, correct, or buggy behavior. However, we argue that one can still
build an accurate oracle without knowing the exact correct or buggy behavior,
knowing only how these two might differ. This paper presents SEER, a learning-based
approach that in the absence of test assertions or other types of oracle, can
determine whether a unit test passes or fails on a given method under test
(MUT). To build the ground-truth, SEER jointly embeds unit tests and the
implementation of MUTs into a unified vector space, in such a way that the
neural representation of a test is similar to that of the MUTs it passes on,
but dissimilar to that of the MUTs it fails on. The classifier built on top of
this vector representation serves as the oracle, generating "fail" labels when
test inputs detect a bug in the MUT and "pass" labels otherwise. Our extensive
experiments on applying SEER to more than 5K unit tests from a diverse set of
open-source Java projects show that the produced oracle is (1) effective in
predicting the fail or pass labels, achieving an overall accuracy, precision,
recall, and F1 measure of 93%, 86%, 94%, and 90%, respectively, (2) generalizable,
predicting the labels for the unit tests of projects that were not in the training
or validation set with negligible performance drop, and (3) efficient, detecting
the existence of bugs in only 6.5 milliseconds on average.
Comment: Published in ESEC/FSE 202
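The reported precision, recall, and F1 figures can be cross-checked with the standard definition of F1 as the harmonic mean of precision and recall. The short sketch below does only that arithmetic; the function name `f1_score` is chosen for this example and is not taken from the paper.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Values as reported in the abstract.
reported_precision = 0.86
reported_recall = 0.94

f1 = f1_score(reported_precision, reported_recall)
print(f"F1 = {f1:.3f}")
```

The result rounds to 90%, which agrees with the F1 value stated in the abstract.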
Being Comes from Not-being: Open-vocabulary Text-to-Motion Generation with Wordless Training
Text-to-motion generation is an emerging and challenging problem, which aims
to synthesize motion with the same semantics as the input text. However, due to
the lack of diverse labeled training data, most approaches either are limited to
specific types of text annotations or require online optimizations to cater to
the texts during inference at the cost of efficiency and stability. In this
paper, we investigate offline open-vocabulary text-to-motion generation in a
zero-shot learning manner that neither requires paired training data nor extra
online optimization to adapt for unseen texts. Inspired by the prompt learning
in NLP, we pretrain a motion generator that learns to reconstruct the full
motion from the masked motion. During inference, instead of changing the motion
generator, our method reformulates the input text into a masked motion as the
prompt for the motion generator to "reconstruct" the motion. In constructing
the prompt, the unmasked poses of the prompt are synthesized by a text-to-pose
generator. To supervise the optimization of the text-to-pose generator, we
propose the first text-pose alignment model for measuring the alignment between
texts and 3D poses. And to prevent the pose generator from overfitting to
limited training texts, we further propose a novel wordless training mechanism
that optimizes the text-to-pose generator without any training texts. The
comprehensive experimental results show that our method obtains a significant
improvement against the baseline methods. The code is available at
https://github.com/junfanlin/oohmg
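The "masked motion as prompt" idea described above can be pictured with a small data-structure sketch: a motion is a sequence of frames, a few frames hold poses (in the paper these come from the text-to-pose generator), and every other frame is masked for the pretrained motion generator to fill in. The abstract does not say how many frames are unmasked, where they sit, or how a pose is encoded, so the function `build_masked_prompt`, the two-value poses, and the frame indices below are hypothetical and are not taken from the authors' code.

```python
def build_masked_prompt(num_frames, key_poses):
    """Place known poses at given frame indices and mask every other frame.

    key_poses maps a frame index to a pose (a list of floats). Returns the
    frame list, with None for masked frames, and a boolean mask where True
    means "to be reconstructed by the motion generator".
    """
    frames = [None] * num_frames
    mask = [True] * num_frames
    for index, pose in key_poses.items():
        if not 0 <= index < num_frames:
            raise IndexError(f"frame index {index} is outside the motion")
        frames[index] = list(pose)
        mask[index] = False
    return frames, mask


# Hypothetical example: a 6-frame motion with poses given for the first and
# last frames only. The pose values are placeholders.
frames, mask = build_masked_prompt(6, {0: [0.0, 0.0], 5: [1.0, 0.5]})
print(mask)
```

A real pipeline would pass `frames` and `mask` to the motion generator, which is what was pretrained to reconstruct full motions from masked ones.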
Message Journal, Issue 5: COVID-19 SPECIAL ISSUE Capturing visual insights, thoughts and reflections on 2020/21 and beyond...
If there is a theme running through the Message Covid-19 special issue, it is one of caring. Of our own and others’ resilience and wellbeing, of friendship and community, of students, practitioners and their futures, of social justice, equality and of doing the right thing. The veins of designing with care run through the edition, wide and deep. It captures, not designers as heroes, but those with humble views, exposing the need to understand a diversity of perspectives when trying to comprehend the complexity that Covid-19 continues to generate.
As graphic designers, illustrators and visual communicators, contributors have created, documented, written, visualised, reflected, shared, connected and co-created, designed for good causes and re-defined what it is to be a student, an academic and a designer during the pandemic. This poignant period in time has driven us, through isolation, towards new rules of living, and new ways of working; to see and map the world in a different light. A light that is uncertain, disjointed, and constantly being redefined.
This Message issue captures responses from the graphic communication design community in their raw state, to allow contributors to communicate their experiences through both their written and visual voice. Thus, the reader can discern as much from the words as the design and visualisations.
Through this issue a substantial number of contributions have focused on personal reflection, isolation, fear, anxiety and wellbeing, as well as reaching out to community, making connections and collaborating. This was not surprising in a world in which connection with others has often been remote, and where ‘normal’ social structures of support and care have been broken down. We also gain insight into those who are using graphic communication design to inspire and capture new ways of teaching and learning, developing themselves as designers, educators, and activists, responding to social justice and doing good; gaining greater insight into society, government actions and conspiracy.
Introduction: Victoria Squire
- Coping with Covid: Community, connection and collaboration: James Alexander & Carole Evans, Meg Davies, Matthew Frame, Chae Ho Lee, Alma Hoffmann, Holly K. Kaufman-Hill, Joshua Korenblat, Warren Lehrer, Christine Lhowe, Sara Nesteruk, Cat Normoyle & Jessica Teague, Kyuha Shim.
- Coping with Covid: Isolation, wellbeing and hope: Sadia Abdisalam, Tom Ayling, Jessica Barness, Megan Culliford, Stephanie Cunningham, Sofija Gvozdeva, Hedzlynn Kamaruzzaman, Merle Karp, Erica V. P. Lewis, Kelly Salchow Macarthur, Steven McCarthy, Shelly Mayers, Elizabeth Shefrin, Angelica Sibrian, David Smart, Ane Thon Knutsen, Isobel Thomas, Darryl Westley.
- Coping with Covid: Pedagogy, teaching and learning: Bernard J Canniffe, Subir Dey, Aaron Ganci, Elizabeth Herrmann, John Kilburn, Paul Nini, Emily Osborne, Gianni Sinni & Irene Sgarro, Dave Wood, Helena Gregory, Colin Raeburn & Jackie Malcolm.
- Coping with Covid: Social justice, activism and doing good: Class Action Collective, Xinyi Li, Matt Soar, Junie Tang, Lisa Winstanley.
- Coping with Covid: Society, control and conspiracy: Diana Bîrhală, Maria Borțoi, Patti Capaldi, Tânia A. Cardoso, Peter Gibbons, Bianca Milea, Rebecca Tegtmeyer, Danne Wo
Learning disentangled speech representations
A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time such as speaker identity, spoken content, and speaker prosody.
The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and be independent of the task at hand. The learned representations should also be able to answer counter-factual questions.
In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks.
This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically.
Building body identities - exploring the world of female bodybuilders
This thesis explores how female bodybuilders seek to develop and maintain a viable sense of self despite being stigmatized by the gendered foundations of what Erving Goffman (1983) refers to as the 'interaction order'; the unavoidable presentational context in which identities are forged during the course of social life. Placed in the context of an overview of the historical treatment of women's bodies, and a concern with the development of bodybuilding as a specific form of body modification, the research draws upon a unique two-year ethnographic study based in the South of England, complemented by interviews with twenty-six female bodybuilders, all of whom live in the U.K. By mapping these extraordinary women's lives, the research illuminates the pivotal spaces and essential lived experiences that make up the female bodybuilder. Whilst the women appear to be embarking on an 'empowering' radical body project for themselves, the consequences of their activity remain culturally ambivalent. This research exposes the 'Janus-faced' nature of female bodybuilding, exploring the ways in which the women negotiate, accommodate and resist pressures to engage in more orthodox and feminine activities and appearances.