Low Frequency Ultrasonic Voice Activity Detection using Convolutional Neural Networks
Low frequency ultrasonic mouth state detection uses reflected audio chirps from the face in the region of the mouth to determine lip state, whether open, closed or partially open.
The chirps lie in a frequency range just above the threshold of human hearing and are thus both inaudible and unaffected by interfering speech, yet they can be produced and sensed using inexpensive equipment.
To determine mouth open or closed state, and hence form a measure of voice activity detection, this recently invented technique relies upon the difference in the reflected chirp caused by resonances introduced by the open or partially open mouth cavity.
Voice activity is then inferred from lip state through patterns of mouth movement, in a similar way to video-based lip-reading technologies.
This paper introduces a new metric based on spectrogram features extracted from the reflected chirp, with a convolutional neural network classification back-end, that yields excellent performance without needing the periodic resetting of the template closed-mouth reflection required by the original technique.
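As a rough illustration of the spectrogram features mentioned in the abstract above, the following sketch synthesizes a linear chirp and computes a short-time power spectrogram with NumPy. The 20 to 24 kHz band, the 96 kHz sample rate, the 20 ms duration, and the frame sizes are all assumptions chosen for illustration; the abstract does not state the paper's actual parameters, and the CNN back-end is not shown.

```python
import numpy as np

def linear_chirp(f0_hz, f1_hz, duration_s, fs_hz):
    """Return a linear sine sweep from f0_hz to f1_hz."""
    n = int(round(duration_s * fs_hz))
    t = np.arange(n) / fs_hz
    # Phase is the integral of the instantaneous frequency f0 + (f1 - f0) * t / T.
    phase = 2 * np.pi * (f0_hz * t + 0.5 * (f1_hz - f0_hz) / duration_s * t**2)
    return np.sin(phase)

def power_spectrogram(x, frame_len=256, hop=128):
    """Hann-windowed short-time power spectrum, shape (n_frames, frame_len // 2 + 1)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    frames = np.stack(
        [x[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

# Hypothetical parameters (not taken from the paper).
FS_HZ = 96_000
FRAME_LEN = 256
chirp = linear_chirp(20_000, 24_000, 0.02, FS_HZ)
spec = power_spectrogram(chirp, frame_len=FRAME_LEN, hop=128)

# Frequency of the strongest bin in each frame; it should rise along the sweep.
peak_hz = np.argmax(spec, axis=1) * FS_HZ / FRAME_LEN
```

In a real system the input would be the microphone recording of the reflected chirp rather than the synthesized signal, and an array like `spec` would be the kind of 2-D input a convolutional classifier could consume.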
Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time
Automatic speech recognition (ASR) systems have been shown to be vulnerable
to adversarial examples (AEs). Recent successes all assume that users will not
notice or disrupt the attack process despite the existence of music/noise-like
sounds and spontaneous responses from voice assistants. Nonetheless, in
practical user-present scenarios, user awareness may nullify existing attack
attempts that launch unexpected sounds or ASR usage. In this paper, we seek to
bridge the gap in existing research and extend the attack to user-present
scenarios. We propose VRIFLE, an inaudible adversarial perturbation (IAP)
attack via ultrasound delivery that can manipulate ASRs as a user speaks. The
inherent differences between audible sounds and ultrasounds make IAP delivery
face unprecedented challenges such as distortion, noise, and instability. In
this regard, we design a novel ultrasonic transformation model to enhance the
crafted perturbation to be physically effective and even survive long-distance
delivery. We further improve VRIFLE's robustness by adopting a series of
augmentations on user and real-world variations during the generation process.
In this way, VRIFLE features an effective real-time manipulation of the ASR
output from different distances and under any speech of users, with an
alter-and-mute strategy that suppresses the impact of user disruption. Our
extensive experiments in both digital and physical worlds verify VRIFLE's
effectiveness under various configurations, robustness against six kinds of
defenses, and universality in a targeted manner. We also show that VRIFLE can
be delivered with a portable attack device and even everyday-life loudspeakers. Comment: Accepted by NDSS Symposium 202
Recognition of activities of daily living
Activities of daily living (ADL) are things we normally do in daily living, including any daily activity such as feeding ourselves, bathing, dressing, grooming, work, homemaking, and leisure. The ability or inability to perform ADLs can be used as a very practical measure of human capability in many types of disorder and disability. Oftentimes in a health care facility, with the help of observations by nurses and self-reporting by residents, professional staff manually collect ADL data and enter data into the system.
Technologies in smart homes can provide some solutions for detecting and monitoring a resident’s ADL. Typically multiple sensors are deployed, such as surveillance cameras in the smart home environment and contact sensors affixed to the resident’s body. Note that these traditional technologies incur costly and laborious sensor deployment, and contact sensors are uncomfortable and inconvenient to wear.
This work presents a novel system facilitated via mobile devices to collect and analyze mobile data pertaining to the human users’ ADL. By employing only one smartphone, this system, named the ADL recognition system, significantly reduces set-up costs and saves manpower.
It encapsulates rather sophisticated technologies under the hood, such as an agent-based information management platform integrating both the mobile end and the cloud, observer patterns, and a time-series based motion analysis mechanism over sensory data. As a single-point deployment system, the ADL recognition system provides further benefits that enable the replay of users’ daily ADL routines, in addition to the timely assessment of their life habits.
SmartMirror: A Glance into the Future
In today's society, information is available to us at a glance through our phones, our laptops, our desktops, and more. But an extra level of interaction is required in order to access the information. As technology grows, it should move further and further away from the traditional style of interaction with devices. In the past, information was relayed through paper, then through computers, and today through our phones and multiple other mediums. Technology should become more integrated into our lives: more seamless and more invisible. We hope to push the envelope further, into the future. We propose a new, simple way of connecting with your morning newspaper. We present our idea, the SmartMirror: information at a glance. Our system aims to deliver your information quickly and comfortably, with a new modern aesthetic. While modern appliances require input through modules such as keyboards or touch screens, we hope to follow a model that can function purely on voice and gesture. We seek to deliver your information during your morning routine and throughout the day, when taking out your phone is not always possible. This will cater to a larger audience base, as the average consumer nowadays hopes to accomplish tasks with minimal active interaction with their adopted technology. This idea has many future applications, such as integration with new virtual or augmented reality devices, or simplifying consumer personal media sources.
FIRE AND LIFE SAFETY ANALYSIS BONDERSON ENGINEERING PROJECTS CENTER
A Fire and Life Safety Analysis was performed as one of the requirements for the Master of Science Degree in Fire Protection Engineering from California Polytechnic State University San Luis Obispo. The Fire and Life Safety Analysis consists of a prescriptive analysis as well as a performance-based analysis. These analyses were performed on the Bonderson Engineering Projects Center, which is part of Cal Poly San Luis Obispo. The prescriptive analysis consisted of the following four parts: Egress Analysis and Design, Fire Detection and Alarm Systems, Water-based Fire Suppression, and Structural Fire Protection. The purpose of the prescriptive analysis was to determine whether the Bonderson Engineering Projects Center adhered to the codes and standards applicable to the building. The prescriptive analysis was performed using primarily the 2013 edition of the California Building Code (CBC) along with the 2013 editions of NFPA codes. The egress analysis and design met most of the code requirements. One requirement that the Bonderson Engineering Projects Center did not meet was door swing direction. Room 104 (see Appendix A for building layout) was originally an office classification, but since construction it has been utilized as an assembly space. The decreased occupant load factor resulted in a new occupant load which is greater than 50 persons. Per CBC 1008.1.2, exit doors must swing in the direction of egress travel where serving a room or area containing an occupant load of 50 or more persons, which the building does not adhere to. The fire detection and alarm systems analysis was performed primarily utilizing NFPA 72. The building had multiple shortcomings with regard to spacing gaps of the detection devices. These shortcomings were found on the first and second floors, including the lobby, robotics room, project integration room, and computer cluster room. The water-based fire suppression system analysis was performed primarily utilizing NFPA 13 and NFPA 25.
The water supply and sprinkler system are acceptable. The structural fire protection analysis was performed primarily utilizing the CBC. The main shortcoming discovered was in relation to the atrium. The building must have a 1-hour fire barrier separating atrium spaces from adjacent spaces, or it must provide an acceptable smoke control system. The building provides neither of these provisions. The performance-based analysis was performed in order to ascertain the ability of the building's occupants to evacuate safely in the event of a fire. Two separate fire scenarios were evaluated using Fire Dynamics Simulator (FDS) and Pathfinder. Tenability criteria were determined and used in conjunction with FDS in order to determine the available safe egress time (ASET). This was compared against the required safe egress time (RSET), which was determined using Pathfinder. The RSET was greater than the ASET, meaning occupants would not be able to safely evacuate the building in the event of an emergency.
Sonic stuff : objects and objectiles
PhD Thesis. This thesis investigates the role of objects in creative practice as alluring and evocative
materials that disrupt compositional intentions and trajectories. This research does not
begin from music as a cultural text but rather from the deeper experiences of sound as
resistant materials that animate experiential space with their own styles of atmosphere,
ambience and inaudible-audible signatures. Working across and often at the peripheries
of the theoretical disciplines of object orientated ontology and process philosophy I
address the philosophical issue of how sounds and objects possess the potential to
unsettle, agitate and reconfigure networks of relation.
Practice has informed a hybridisation of concepts derived from various disciplines,
which are held together by threads of fictionalised prose that contribute alternative
insights into the field of studio-based composition. This research employs a
phenomenological method of reduction and at times an object orientated approach in
theorising the autonomous life of sounds and objects. Dense descriptions of
experiences, observations, thoughts and poetics form the basis for developing an
informed creative treatise. Deviating descriptions of sensuous experiences are
deployed throughout this research in order to find personal and meaningful ways of
articulating sonic encounter.
What are the multiple contours of Sonic Stuff? Is there an identity of sonic potential?
What tensions/relations occur between the composer, studio and sonic object? In what
form does Sonic Stuff reveal and characterise experiential time and space? What do the
concepts of the withdrawn and revealed afford to an understanding of sonic objects and
sound in-itself?
Forensic and Automatic Speaker Recognition System
Automatic Speaker Recognition (ASR) systems have emerged as an important means of confirming identity in many businesses, e-commerce applications, forensics, and law enforcement. Specialists trained in forensic recognition can perform this task far better by examining a set of acoustic, prosodic, and semantic attributes, an approach that has been referred to as structured listening. Algorithm-based systems for forensic speaker recognition have been developed by physical scientists and forensic linguists to reduce the probability of contextual bias or a preconceived reading of a reference model when comparing an unknown audio sample against a suspect. Many researchers continue to develop automatic algorithms in signal processing and machine learning, aiming for a level of performance at which the automatic system identifies the speaker as well as human listeners do. In this paper, I examine the literature on the identification of speakers by machines and humans, emphasizing the key technical trends that have emerged in automatic speaker recognition over the last decade. I focus on many aspects of automatic speaker recognition (ASR) systems, including speaker-specific features, speaker models, standard assessment data sets, and performance metrics.
Military Maintenance Hangar
This report describes the building code compliance of a military aircraft hangar. The facility is evaluated for both prescriptive code compliance and performance-based compliance, using design fires analyzed with Fire Dynamics Simulator (FDS) and occupant egress analyzed with Pathfinder software, as well as hand calculations in Excel.
The prescriptive code analysis is based upon a combination of Unified Facilities Criteria, International Building Code, and NFPA 101 codes and standards. Prescriptive compliance was checked against the building's nonseparated occupancies of S-1 and B. The building was evaluated for construction type IIB. The separations, fire ratings, egress component sizing and spacing, fire alarm, and fire suppression systems were all evaluated.
The building is compliant with all the prescriptive standards. All the building systems and construction details were prescriptively compliant with the building codes of record.
The performance-based design has the objective of verifying that the as-designed building configuration and occupancy will provide an environment for the occupants that is reasonably safe from fire. The objectives were compared to tenability criteria limits of 1,400 ppm of CO concentration, a temperature limit of 60°C, and a visibility limit of 10 meters. The building egress time was evaluated with a stairwell inaccessible due to the fire being near the stair door on the 2nd floor of the facility. The hangar areas of the building were evaluated for asset protection of the aircraft housed within. The asset protection analysis involved verifying when the alarm and suppression systems responded to the fire and determining the highest heat release rate and flame height reached by a pool fire during the 65 seconds between ignition and the aircraft silhouette being covered by high expansion foam.
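As a minimal sketch of how tenability limits like those above can be turned into an ASET estimate, the code below scans a time series of conditions and returns the first time any limit is violated. The three limits come from the abstract; the `Sample` type, the function names, the inclusive treatment of the limits, and the example trace are hypothetical and are not taken from the report or from FDS output.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Tenability limits quoted in the abstract.
CO_LIMIT_PPM = 1400.0
TEMP_LIMIT_C = 60.0
VISIBILITY_LIMIT_M = 10.0

@dataclass(frozen=True)
class Sample:
    """One reading of conditions along the egress path (hypothetical structure)."""
    time_s: float
    co_ppm: float
    temp_c: float
    visibility_m: float

def is_tenable(s: Sample) -> bool:
    # Assumption: conditions exactly at a limit still count as tenable.
    return (
        s.co_ppm <= CO_LIMIT_PPM
        and s.temp_c <= TEMP_LIMIT_C
        and s.visibility_m >= VISIBILITY_LIMIT_M
    )

def first_untenable_time(samples: Iterable[Sample]) -> Optional[float]:
    """Return the earliest time at which any limit is violated, or None."""
    for s in sorted(samples, key=lambda s: s.time_s):
        if not is_tenable(s):
            return s.time_s
    return None

# Made-up trace for illustration only; these are not results from the report.
example_trace = [
    Sample(0.0, 0.0, 20.0, 30.0),
    Sample(100.0, 200.0, 35.0, 18.0),
    Sample(200.0, 600.0, 50.0, 9.0),   # visibility drops below 10 m first
    Sample(300.0, 1500.0, 70.0, 4.0),
]
example_aset_s = first_untenable_time(example_trace)
```

In this made-up trace visibility is the governing criterion, which is why the returned time is that of the third sample even though CO and temperature are still within limits there.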
The final analysis from FDS shows an ASET of 330 seconds. The final RSET analysis, using Pathfinder with an assumed premovement time of 162 seconds and an egress time of 186 seconds, indicates an RSET of 348 seconds. The building therefore does not provide an environment where the ASET is greater than the RSET. The analysis in this report is very conservative in nature and does not account for occupant reactions to the fire beyond egressing, such as closing the door of the room of origin or pulling a pull station before the sprinkler sets off the alarm. Because of this, the ASET and RSET values are very conservative in their application, and any adjustment to the modeling will lead to a greater difference between the ASET and RSET. The performance-based design meets the goals presented, since the scenario evaluated is the worst case and the ASET is within about 5% of the RSET. With a reevaluation of the modeling that uses less conservative tenability criteria and takes into account the reactions of trained facility occupants, it is possible for the ASET to become greater than the RSET. This should be evaluated in the future.
The asset protection analysis shows that an evaluated maximum fuel spill of 30 gallons can generate a 48 MW fire with a 10 m flame height. The 10 m flame height will impinge upon the fuselage height of approximately 1 m and under wing height of approximately 2.5 m. The aircraft will sustain damage during the fire.
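The ASET and RSET comparison above can be reproduced with a few lines of arithmetic. This sketch uses only the figures quoted in the abstract; the function and variable names are illustrative, and treating RSET as the plain sum of premovement and egress time is an assumption that happens to match the reported 348 seconds.

```python
def required_safe_egress_time(premovement_s: float, egress_s: float) -> float:
    """RSET taken as premovement time plus egress (movement) time."""
    return premovement_s + egress_s

ASET_S = 330.0          # from the FDS analysis, as reported
PREMOVEMENT_S = 162.0   # assumed premovement time, as reported
EGRESS_S = 186.0        # egress time, as reported

rset_s = required_safe_egress_time(PREMOVEMENT_S, EGRESS_S)

# A negative margin means occupants need more time than is available.
margin_s = ASET_S - rset_s

# Shortfall of ASET expressed as a percentage of RSET.
shortfall_pct = 100.0 * (rset_s - ASET_S) / rset_s

print(f"RSET = {rset_s:.0f} s, margin = {margin_s:.0f} s, "
      f"shortfall = {shortfall_pct:.1f}% of RSET")
```

With these inputs the margin is negative by 18 seconds, a shortfall of a little over 5% of the RSET, which matches the abstract's statement that the ASET does not exceed the RSET.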