
    Low Frequency Ultrasonic Voice Activity Detection using Convolutional Neural Networks

    Low frequency ultrasonic mouth state detection uses reflected audio chirps from the face in the region of the mouth to determine lip state, whether open, closed or partially open. The chirps are located in a frequency range just above the threshold of human hearing and are thus both inaudible and unaffected by interfering speech, yet can be produced and sensed using inexpensive equipment. To determine whether the mouth is open or closed, and hence form a measure of voice activity detection, this recently invented technique relies upon the difference in the reflected chirp caused by resonances introduced by the open or partially open mouth cavity. Voice activity is then inferred from lip state through patterns of mouth movement, in a similar way to video-based lip-reading technologies. This paper introduces a new metric based on spectrogram features extracted from the reflected chirp, with a convolutional neural network classification back-end, that yields excellent performance without needing the periodic resetting of the template closed-mouth reflection required by the original technique.
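    As a rough illustration of the probe signal described in this abstract, the sketch below generates a linear chirp just above the audible range. The abstract does not state the chirp shape, band, duration, or sample rate, so the linear sweep, the 20 kHz to 24 kHz band, the 10 ms duration, and the 96 kHz sample rate are all assumptions, and the function names are hypothetical. The spectrogram features and the neural network back-end are not covered here.

```python
import math


def linear_chirp(f_start_hz, f_end_hz, duration_s, sample_rate_hz):
    """Return samples of a linear chirp sweeping f_start_hz -> f_end_hz.

    All parameter values used below are illustrative assumptions,
    not values taken from the paper.
    """
    n_samples = int(round(duration_s * sample_rate_hz))
    sweep_rate = (f_end_hz - f_start_hz) / duration_s  # Hz per second
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        # Phase is the integral of the instantaneous frequency f(t).
        phase = 2.0 * math.pi * (f_start_hz * t + 0.5 * sweep_rate * t * t)
        samples.append(math.sin(phase))
    return samples


def instantaneous_frequency(f_start_hz, f_end_hz, duration_s, t):
    """Instantaneous frequency of the linear chirp at time t (seconds)."""
    return f_start_hz + (f_end_hz - f_start_hz) * t / duration_s


# Assumed parameters: 20-24 kHz sweep, 10 ms long, sampled at 96 kHz.
chirp = linear_chirp(20_000.0, 24_000.0, 0.01, 96_000.0)
mid_freq = instantaneous_frequency(20_000.0, 24_000.0, 0.01, 0.005)
```

    The sample rate is chosen so that the top of the assumed band stays below the Nyquist frequency (48 kHz here); a real system would be constrained by what its loudspeaker and microphone can reproduce.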

    Inaudible Adversarial Perturbation: Manipulating the Recognition of User Speech in Real Time

    Automatic speech recognition (ASR) systems have been shown to be vulnerable to adversarial examples (AEs). Recent successful attacks all assume that users will not notice or disrupt the attack process despite the existence of music/noise-like sounds and spontaneous responses from voice assistants. Nonetheless, in practical user-present scenarios, user awareness may nullify existing attack attempts that launch unexpected sounds or ASR usage. In this paper, we seek to bridge the gap in existing research and extend the attack to user-present scenarios. We propose VRIFLE, an inaudible adversarial perturbation (IAP) attack via ultrasound delivery that can manipulate ASRs as a user speaks. The inherent differences between audible sounds and ultrasounds make IAP delivery face unprecedented challenges such as distortion, noise, and instability. In this regard, we design a novel ultrasonic transformation model to enhance the crafted perturbation to be physically effective and even survive long-distance delivery. We further enable VRIFLE's robustness by adopting a series of augmentations on user and real-world variations during the generation process. In this way, VRIFLE features an effective real-time manipulation of the ASR output from different distances and under any speech of users, with an alter-and-mute strategy that suppresses the impact of user disruption. Our extensive experiments in both digital and physical worlds verify VRIFLE's effectiveness under various configurations, robustness against six kinds of defenses, and universality in a targeted manner. We also show that VRIFLE can be delivered with a portable attack device and even everyday-life loudspeakers. Comment: Accepted by NDSS Symposium 202

    Recognition of activities of daily living

    Activities of daily living (ADL) are things we normally do in daily living, including any daily activity such as feeding ourselves, bathing, dressing, grooming, work, homemaking, and leisure. The ability or inability to perform ADLs can be used as a very practical measure of human capability in many types of disorder and disability. Often in a health care facility, professional staff collect ADL data manually, relying on observations by nurses and self-reporting by residents, and enter the data into the system. Smart home technologies can provide some solutions for detecting and monitoring a resident’s ADL. Typically multiple sensors are deployed, such as surveillance cameras in the smart home environment and contact sensors affixed to the resident’s body. These traditional technologies incur costly and laborious sensor deployment, and body-worn contact sensors are uncomfortable and inconvenient. This work presents a novel system, facilitated via mobile devices, to collect and analyze mobile data pertaining to the human users’ ADL. By employing only one smart phone, this system, named the ADL recognition system, significantly reduces set-up costs and saves manpower. It encapsulates rather sophisticated technologies under the hood, such as an agent-based information management platform integrating both the mobile end and the cloud, observer patterns, and a time-series based motion analysis mechanism over sensory data. As a single-point deployment system, the ADL recognition system provides further benefits that enable the replay of users’ daily ADL routines, in addition to the timely assessment of their life habits.

    SmartMirror: A Glance into the Future

    In today's society, information is available to us at a glance through our phones, our laptops, our desktops, and more. But an extra level of interaction is required in order to access the information. As technology grows, it should move further and further away from the traditional style of interaction with devices. In the past, information was relayed through paper, then through computers, and in today's day and age, through our phones and multiple other mediums. Technology should become more integrated into our lives - more seamless and more invisible. We hope to push the envelope further, into the future. We propose a new, simple way of connecting with your morning newspaper. We present our idea, the SmartMirror: information at a glance. Our system aims to deliver your information quickly and comfortably, with a new modern aesthetic. While modern appliances require input through modules such as keyboards or touch screens, we hope to follow a model that can function purely on voice and gesture. We seek to deliver your information during your morning routine and throughout the day, when taking out your phone is not always possible. This will cater to a larger audience base, as the average consumer nowadays hopes to accomplish tasks with minimal active interaction with their adopted technology. This idea has many future applications, such as integration with new virtual or augmented reality devices, or simplifying consumer personal media sources.

    FIRE AND LIFE SAFETY ANALYSIS BONDERSON ENGINEERING PROJECTS CENTER

    A Fire and Life Safety Analysis was performed as one of the requirements for the Master of Science Degree in Fire Protection Engineering from California Polytechnic State University San Luis Obispo. The Fire and Life Safety Analysis consists of a prescriptive analysis as well as a performance-based analysis. These analyses were performed on the Bonderson Engineering Projects Center, which is part of Cal Poly San Luis Obispo. The prescriptive analysis consisted of the following four parts: Egress Analysis and Design, Fire Detection and Alarm Systems, Water-based Fire Suppression, and Structural Fire Protection. The purpose of the prescriptive analysis was to determine whether the Bonderson Engineering Projects Center adhered to the codes and standards applicable to the building. The prescriptive analysis was performed using primarily the 2013 edition of the California Building Code (CBC) along with the 2013 editions of NFPA codes. The egress analysis and design met most of the code requirements. One requirement that the Bonderson Engineering Projects Center did not meet was door swing direction. Room 104 (See Appendix A for building layout) was originally an office classification, but since construction has been utilized as an assembly space. The decreased occupant load factor resulted in a new occupant load which is greater than 50 persons. Per CBC 1008.1.2, exit doors must swing in the direction of egress travel where serving a room or area containing an occupant load of 50 or more persons, which the building does not adhere to. The fire detection and alarm systems analysis was performed primarily utilizing NFPA 72. The building had multiple shortcomings with regard to spacing gaps of the detection devices. These shortcomings were found on the first and second floors, including the lobby, robotics room, project integration room and computer cluster room. The water-based fire suppression system analysis was performed primarily utilizing NFPA 13 and NFPA 25. The water supply and sprinkler system are acceptable. The structural fire protection analysis was performed primarily utilizing the CBC. The main shortcoming discovered was in relation to the atrium. The building must have a 1 hour fire barrier separating atrium spaces from adjacent spaces or it must provide an acceptable smoke control system. The building provides neither of these provisions. The performance-based analysis was performed in order to ascertain the ability of the occupants of the building to evacuate safely in the event of a fire. Two separate fire scenarios were evaluated using Fire Dynamics Simulator (FDS) and Pathfinder. Tenability criteria were determined and used in conjunction with FDS in order to determine the available safe egress time (ASET). This was compared against the required safe egress time (RSET), which was determined using Pathfinder. The RSET was greater than the ASET, meaning occupants would not be able to safely evacuate the building in the event of an emergency.

    Sonic stuff : objects and objectiles

    PhD Thesis. This thesis investigates the role of objects in creative practice as alluring and evocative materials that disrupt compositional intentions and trajectories. This research does not begin from music as a cultural text but rather from the deeper experiences of sound as resistant materials that animate experiential space with their own styles of atmosphere, ambience and inaudible-audible signatures. Working across and often at the peripheries of the theoretical disciplines of object orientated ontology and process philosophy I address the philosophical issue of how sounds and objects possess the potential to unsettle, agitate and reconfigure networks of relation. Practice has informed a hybridisation of concepts derived from various disciplines, which are held together by threads of fictionalised prose that contribute alternative insights into the field of studio-based composition. This research employs a phenomenological method of reduction and at times an object orientated approach in theorising the autonomous life of sounds and objects. Dense descriptions of experiences, observations, thoughts and poetics form the basis for developing an informed creative treatise. Deviating descriptions of sensuous experiences are deployed throughout this research in order to find personal and meaningful ways of articulating sonic encounter. What are the multiple contours of Sonic Stuff? Is there an identity of sonic potential? What tensions/relations occur between the composer, studio and sonic object? In what form does Sonic Stuff reveal and characterise experiential time and space? What do the concepts of the withdrawn and revealed afford an understanding of sonic objects and sound in-itself?

    Forensic and Automatic Speaker Recognition System

    Automatic Speaker Recognition (ASR) systems have emerged as an important means of confirming identity in many businesses and e-commerce applications, as well as in forensics and law enforcement. Specialists trained in forensic recognition can perform this task far better by examining a set of acoustic, prosodic, and semantic attributes, an approach referred to as structured listening. Algorithm-based systems for forensic speaker recognition have been developed by physical scientists and forensic linguists to reduce the probability of contextual bias or a preconceived reading of a reference model when comparing an unknown audio sample with a suspect. Many researchers continue to develop automatic algorithms in signal processing and machine learning so that improved performance can reliably establish the speaker’s identity, to the point where the automatic system performs on par with human listeners. In this paper, I examine the literature on the identification of speakers by machines and humans, emphasizing the key technical trends that have emerged in automatic speaker recognition over the last decade. I focus on many aspects of automatic speaker recognition (ASR) systems, including speaker-specific features, speaker models, standard assessment data sets, and performance metrics.

    Military Maintenance Hangar

    This report describes the building code compliance of a military aircraft hangar. The facility is evaluated both for prescriptive code compliance and for performance-based compliance, using design fires analyzed with the Fire Dynamics Simulator (FDS) program and occupant egress analyzed with Pathfinder software as well as hand calculations in Excel. The prescriptive code analysis is based upon a combination of Unified Facilities Criteria, International Building Code, and NFPA 101 codes and standards. Prescriptive compliance was checked against the building's nonseparated occupancies of S-1 and B. The building was evaluated for building construction type IIB. The separations, fire ratings, egress component sizing and spacing, fire alarm, and fire suppression systems were all evaluated. The building is compliant with all the prescriptive standards. All the building systems and construction details were prescriptively compliant with the building codes of record. The performance-based design has the objective of verifying that the as-designed building configuration and occupancy will provide the occupants with an environment that is reasonably safe from fire. The objectives were compared to tenability criteria limits of 1,400 ppm of CO concentration, a temperature limit of 60°C, and a visibility limit of 10 meters. The building egress time was evaluated with a stairwell inaccessible due to the fire being near the stair door on the 2nd floor of the facility. The hangar areas of the building were evaluated for asset protection of the aircraft housed within. The asset protection analysis involved verifying when the alarm and suppression systems responded to the fire and determining the highest heat release rate and flame height reached by a pool fire during the 65 seconds between ignition and the aircraft silhouette being covered by high expansion foam. The final analysis from FDS shows an ASET of 330 seconds. The final analysis of RSET utilizing Pathfinder, with an assumed premovement time of 162 seconds and an egress time of 186 seconds, indicates an RSET of 348 seconds. The building does not provide an environment where the ASET is greater than the RSET. The analysis in this report is very conservative in nature and does not account for occupant reactions to the fire beyond egressing, such as closing the door of the room of origin or pulling a pull station before the sprinkler sets off the alarm. Because of this, the ASET and RSET values are very conservative in their application, and any adjustment to the modeling will lead to a greater difference between the ASET and RSET. The performance-based design is considered to meet the goals presented, since this is the worst-case scenario and it shows the ASET to be within 5% of the RSET. With a reevaluation of the modeling utilizing less conservative tenability criteria and taking into account the reactions of trained facility occupants, it is possible for the ASET to become greater than the RSET. This should be evaluated in the future. The asset protection analysis shows that an evaluated maximum fuel spill of 30 gallons can generate a 48 MW fire with a 10 m flame height. The 10 m flame height will impinge upon the fuselage height of approximately 1 m and under-wing height of approximately 2.5 m. The aircraft will sustain damage during the fire.
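    The ASET/RSET comparison quoted in this abstract can be reproduced with a few lines of arithmetic. This is a minimal sketch that uses only the figures stated above (ASET of 330 s, premovement time of 162 s, egress time of 186 s) and assumes, as the abstract's numbers imply, that RSET is the sum of the premovement and egress times. The function names are hypothetical and are not part of FDS or Pathfinder.

```python
def rset_seconds(premovement_s, egress_s):
    """Required safe egress time as premovement time plus egress time.

    Assumption: no separate detection/alarm term is added, which matches
    the two components listed in the abstract.
    """
    return premovement_s + egress_s


def egress_margin(aset_s, rset_s):
    """Positive margin means occupants are out before conditions become untenable."""
    return aset_s - rset_s


aset = 330.0                       # from the FDS analysis in the abstract
rset = rset_seconds(162.0, 186.0)  # Pathfinder egress time plus assumed premovement time
margin = egress_margin(aset, rset)
shortfall_pct = 100.0 * (rset - aset) / rset  # shortfall relative to RSET

print(f"RSET = {rset:.0f} s, margin = {margin:.0f} s, shortfall = {shortfall_pct:.1f}% of RSET")
```

    A negative margin corresponds to the abstract's statement that the ASET is not greater than the RSET. The shortfall here is expressed relative to the RSET; the abstract does not say which baseline its percentage uses.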