Security and Privacy Problems in Voice Assistant Applications: A Survey
Voice assistant applications have become ubiquitous nowadays. Two models that
provide the two most important functions for real-life applications (e.g.,
Google Home, Amazon Alexa, Siri, etc.) are Automatic Speech Recognition (ASR)
models and Speaker Identification (SI) models. According to recent studies,
security and privacy threats have also emerged with the rapid development of
the Internet of Things (IoT). The security issues researched include attack
techniques toward machine learning models and other hardware components widely
used in voice assistant applications. The privacy issues include technical
information theft and policy-level privacy breaches. Voice assistant
applications take a steadily growing market share every year, but their privacy
and security issues continue to cause huge economic losses and endanger
users' sensitive personal information. Thus, it is important to have a
comprehensive survey to outline the categorization of the current research
regarding the security and privacy problems of voice assistant applications.
This paper summarizes and assesses five kinds of security attacks and three
types of privacy threats in the papers published in the top-tier conferences of
the cyber security and voice domains. Comment: 5 figure
The Metaverse: Survey, Trends, Novel Pipeline Ecosystem & Future Directions
The Metaverse offers a second world beyond reality, where boundaries are
non-existent, and possibilities are endless through engagement and immersive
experiences using virtual reality (VR) technology. Many disciplines can
benefit from the advancement of the Metaverse when accurately developed,
including the fields of technology, gaming, education, art, and culture.
Nevertheless, developing the Metaverse environment to its full potential is an
ambiguous task that needs proper guidance and directions. Existing surveys on
the Metaverse focus only on a specific aspect and discipline of the Metaverse
and lack a holistic view of the entire process. To this end, a more holistic,
multi-disciplinary, in-depth, and academic and industry-oriented review is
required to provide a thorough study of the Metaverse development pipeline. To
address these issues, we present in this survey a novel multi-layered pipeline
ecosystem composed of (1) the Metaverse computing, networking, communications
and hardware infrastructure, (2) environment digitization, and (3) user
interactions. For every layer, we discuss the components that detail the steps
of its development. Also, for each of these components, we examine the impact
of a set of enabling technologies and empowering domains (e.g., Artificial
Intelligence, Security & Privacy, Blockchain, Business, Ethics, and Social) on
its advancement. In addition, we explain the importance of these technologies
to support decentralization, interoperability, user experiences, interactions,
and monetization. Our study highlights the existing challenges for
each component, followed by research directions and potential solutions. To the
best of our knowledge, this survey is the most comprehensive to date and allows users,
scholars, and entrepreneurs to gain an in-depth understanding of the Metaverse
ecosystem and find opportunities to contribute.
Sign Language Translation from Instructional Videos
Advances in automatic sign language translation (SLT) to spoken languages
have been mostly benchmarked with datasets of limited size and restricted
domains. Our work advances the state of the art by providing the first baseline
results on How2Sign, a large and broad dataset.
We train a Transformer over I3D video features, using reduced BLEU as a
reference metric for validation, instead of the widely used BLEU score. We
report a result of 8.03 on the BLEU score, and publish the first open-source
implementation of its kind to promote further advances. Comment: Paper accepted at WiCV @CVPR2
The place where curses are manufactured: four poets of the Vietnam War
The Vietnam War was unique among American wars. To pinpoint its uniqueness, it was necessary to look for a non-American voice that would enable me to articulate its distinctiveness and explore the American character as observed by an Asian. Takeshi Kaiko proved to be most helpful. From his novel, Into a Black Sun, I was able to establish a working pair of 'bookends' from which to approach the poetry of Walter McDonald, Bruce Weigl, Basil T. Paquet and Steve Mason. Chapter One is devoted to those seemingly mismatched 'bookends,' Walt Whitman and General William C. Westmoreland, and their respective anthropocentric and technocentric visions of progress and the peculiarly American concept of the "open road" as they manifest themselves in Vietnam. In Chapter Two, I analyze the war poems of Walter McDonald. As a pilot, writing primarily about flying, his poetry manifests General Westmoreland's technocentric vision of the 'road' as determined by and manifest through technology. Chapter Three focuses on the poems of Bruce Weigl. The poems analyzed portray the literal and metaphorical descent from the technocentric, 'numbed' distance of aerial warfare to the world of ground warfare, and the initiation of a 'fucking new guy,' who discovers the contours of the self's interior through a set of experiences that lead from aerial insertion into the jungle to the degradation of burning human feces. Chapter Four, devoted to the thirteen poems of Basil T. Paquet, focuses on the continuation of the descent begun in Chapter Two. In his capacity as a medic, Paquet's entire body of poems details his quotidian tasks, which entail tending the maimed, the mortally wounded and the dead. The final chapter deals with Steve Mason's Johnny's Song, and his depiction of the plight of Vietnam veterans back in "The World" who are still trapped inside the interior landscape of their individual "ghettoes" of the soul created by their war-time experiences.
Augmented classification for electrical coil winding defects
A green revolution has accelerated over recent decades with a view to replacing existing transportation power solutions through the adoption of greener electrical alternatives. In parallel, the digitisation of manufacturing has enabled progress in the tracking and traceability of processes and improvements in fault detection and classification. This paper explores electrical machine manufacture and the challenges faced in identifying failure modes during this life cycle through the demonstration of state-of-the-art machine vision methods for the classification of electrical coil winding defects. We demonstrate how recent generative adversarial networks can be used to augment training of these models to further improve their accuracy for this challenging task. Our approach utilises pre-processing and dimensionality reduction to boost performance of the model from a standard convolutional neural network (CNN), leading to a significant increase in accuracy.
Strategies for Early Learners
Welcome to learning about how to effectively plan curriculum for young children. This textbook will address:
• Developing curriculum through the planning cycle
• Theories that inform what we know about how children learn and the best ways for teachers to support learning
• The three components of developmentally appropriate practice
• Importance and value of play and intentional teaching
• Different models of curriculum
• Process of lesson planning (documenting planned experiences for children)
• Physical, temporal, and social environments that set the stage for children’s learning
• Appropriate guidance techniques to support children’s behaviors as their self-regulation abilities mature
• Planning for preschool-aged children in specific domains, including
  o Physical development
  o Language and literacy
  o Math
  o Science
  o Creative (the visual and performing arts)
  o Diversity (social science and history)
  o Health and safety
• Making children’s learning visible through documentation and assessment
Learning disentangled speech representations
A variety of informational factors are contained within the speech signal and a single short recording of speech reveals much more than the spoken words. The best method to extract and represent informational factors from the speech signal ultimately depends on which informational factors are desired and how they will be used. In addition, sometimes methods will capture more than one informational factor at the same time such as speaker identity, spoken content, and speaker prosody.
The goal of this dissertation is to explore different ways to deconstruct the speech signal into abstract representations that can be learned and later reused in various speech technology tasks. This task of deconstructing, also known as disentanglement, is a form of distributed representation learning. As a general approach to disentanglement, there are some guiding principles that elaborate what a learned representation should contain as well as how it should function. In particular, learned representations should contain all of the requisite information in a more compact manner, be interpretable, remove nuisance factors of irrelevant information, be useful in downstream tasks, and be independent of the task at hand. The learned representations should also be able to answer counter-factual questions.
In some cases, learned speech representations can be re-assembled in different ways according to the requirements of downstream applications. For example, in a voice conversion task, the speech content is retained while the speaker identity is changed. And in a content-privacy task, some targeted content may be concealed without affecting how surrounding words sound. While there is no single-best method to disentangle all types of factors, some end-to-end approaches demonstrate a promising degree of generalization to diverse speech tasks.
This thesis explores a variety of use-cases for disentangled representations including phone recognition, speaker diarization, linguistic code-switching, voice conversion, and content-based privacy masking. Speech representations can also be utilised for automatically assessing the quality and authenticity of speech, such as automatic MOS ratings or detecting deep fakes. The meaning of the term "disentanglement" is not well defined in previous work, and it has acquired several meanings depending on the domain (e.g. image vs. speech). Sometimes the term "disentanglement" is used interchangeably with the term "factorization". This thesis proposes that disentanglement of speech is distinct, and offers a viewpoint of disentanglement that can be considered both theoretically and practically
Building body identities - exploring the world of female bodybuilders
This thesis explores how female bodybuilders seek to develop and maintain a viable sense of self despite being stigmatized by the gendered foundations of what Erving Goffman (1983) refers to as the 'interaction order'; the unavoidable presentational context in which identities are forged during the course of social life. Placed in the context of an overview of the historical treatment of women's bodies, and a concern with the development of bodybuilding as a specific form of body modification, the research draws upon a unique two-year ethnographic study based in the South of England, complemented by interviews with twenty-six female bodybuilders, all of whom live in the U.K. By mapping these extraordinary women's lives, the research illuminates the pivotal spaces and essential lived experiences that make up the female bodybuilder. Whilst the women appear to be embarking on an 'empowering' radical body project for themselves, the consequences of their activity remain culturally ambivalent. This research exposes the 'Janus-faced' nature of female bodybuilding, exploring the ways in which the women negotiate, accommodate and resist pressures to engage in more orthodox and feminine activities and appearances.
Co-design As Healing: Exploring The Experiences Of Participants Facing Mental Health Problems
This thesis is an exploration of the healing role of co-design in mental health. Although co-design projects conducted within mental health settings are rising, existing literature tends to focus on the object of design and its outcomes while the experiences of participants per se remain largely unexplored. The guiding research question of this study is not how we design things that improve mental health, but how co-designing, as an act, might do so.
The thesis presents two projects that were organized in collaboration with the mental health charity Islington Mind and the Psychosis Therapy Project (PTP) in London.
The project at Islington Mind used a structured design process inviting participants to design for wellbeing. A case study analysis provides insights on how participants were impacted, summarizing key challenges and opportunities.
The design at PTP worked towards creating a collective brief in an emergent fashion, finally culminating in a board game. The experiences of participants were explored through Interpretative Phenomenological Analysis (IPA), using semi-structured interview data. The analysis served to identify key themes characterising the experience of co-design such as contributing, connecting, thinking and intentioning. In addition, a mixed-methods analysis of questionnaires and interview data exploring participants' wellbeing showed that all participants who engaged fairly consistently in the project improved after the project ended, although some participants' scores returned to baseline six months later.
Reflecting on both projects, an approach to facilitation within mental health is outlined, detailing how the dimensions of weaving and layered participation, nurturing mattering and facilitating attitudes interlace. This contribution raises awareness of tacit dimensions in the practice of facilitation, articulating the nuances of how to encourage and sustain meaningful and ethical engagement and offering insights into a range of tools. It highlights the importance of remaining reflexive in relation to attitudes and emotions and discusses practical methodological and ethical challenges and ways to resolve them which can be of benefit to researchers embarking on a similar journey.
The thesis also offers detailed insights on how methodologies from different fields were integrated into a whole, arguing for transparency and reflexivity about epistemological assumptions, and how underlying paradigms shift in an interdisciplinary context.
Based on the overall findings, the thesis makes a case for considering design as healing (or a designerly way of healing), highlighting implications at a systems, social and individual level. It makes an original contribution to our understanding of design, highlighting its healing character, and proposes a new way to support mental health. The participants in this study not only increased their own wellbeing through co-designing, but were also empowered and contributed towards healing the world. Hence, the thesis argues for a unique, holistic perspective of design and mental health, recognizing the interconnectedness of the individual, social and systemic dimensions of the healing processes that are ignited.
Data-to-text generation with neural planning
In this thesis, we consider the task of data-to-text generation, which takes non-linguistic
structures as input and produces textual output. The inputs can take the form of
database tables, spreadsheets, charts, and so on. The main application of data-to-text
generation is to present information in a textual format which makes it accessible to
a layperson who may otherwise find it problematic to understand numerical figures.
The task can also automate routine document generation jobs, thus improving human
efficiency. We focus on generating long-form text, i.e., documents with multiple paragraphs. Recent approaches to data-to-text generation have adopted the very successful
encoder-decoder architecture or its variants. These models generate fluent (but often
imprecise) text and perform quite poorly at selecting appropriate content and ordering
it coherently. This thesis focuses on overcoming these issues by integrating content
planning with neural models. We hypothesize data-to-text generation will benefit from
explicit planning, which manifests itself in (a) micro planning, (b) latent entity planning, and (c) macro planning. Throughout this thesis, we assume the inputs to our
generator are tables (with records) in the sports domain, and the outputs are summaries
describing what happened in the game (e.g., who won/lost, ..., scored, etc.).
We first describe our work on integrating fine-grained or micro plans with data-to-text generation. As part of this, we generate a micro plan highlighting which records
should be mentioned and in which order, and then generate the document while taking
the micro plan into account.
We then show how data-to-text generation can benefit from higher level latent entity planning. Here, we make use of entity-specific representations which are dynamically updated. The text is generated conditioned on entity representations and the
records corresponding to the entities by using hierarchical attention at each time step.
We then combine planning with the high level organization of entities, events, and
their interactions. Such coarse-grained macro plans are learnt from data and given
as input to the generator. Finally, we present work on making macro plans latent
while incrementally generating a document paragraph by paragraph. We infer latent
plans sequentially with a structured variational model while interleaving the steps of
planning and generation. Text is generated by conditioning on previous variational
decisions and previously generated text.
Overall, our results show that planning makes data-to-text generation more interpretable, improves the factuality and coherence of the generated documents, and reduces redundancy in the output document.