1,583 research outputs found

    Smart speaker design and implementation with biometric authentication and advanced voice interaction capability

    Advancements in semiconductor technology have reduced the dimensions and cost of chipsets while improving their performance and capacity. In addition, advances in AI frameworks and libraries make it possible to accommodate more AI at the resource-constrained edge of consumer IoT devices. Sensors are now an integral part of our environment and provide continuous data streams for building intelligent applications. An example is a smart home scenario with multiple interconnected devices. In such smart environments, for convenience and quick access to web-based services and personal information such as calendars, notes, emails, reminders, and banking, users link third-party skills or skills from the Amazon store to their smart speakers. Also, in current smart home scenarios, several smart home products such as smart security cameras, video doorbells, smart plugs, smart carbon monoxide monitors, and smart door locks are interlinked to a modern smart speaker by means of custom skill addition. Since smart speakers are linked to such services and devices via the smart speaker user's account, they can be used through voice commands by anyone with physical access to the smart speaker. When that happens, the user's data privacy, home security, and other aspects are compromised. The recently launched Tensor Cam's AI Camera, Toshiba's Symbio, and Facebook's Portal are camera-enabled smart speakers with AI functionalities. Although they are camera-enabled, they do not have an authentication scheme beyond calling out the wake word. This paper provides an overview of the cybersecurity risks faced by smart speaker users due to the lack of an authentication scheme and discusses the development of a state-of-the-art camera-enabled, microphone array-based modern Alexa smart speaker prototype to address these risks.

    On the design and implementation of a high definition multi-view intelligent video surveillance system

    This paper proposes a distributed architecture for a high definition (HD) multi-view video surveillance system. It adopts a modular design in which multiple intelligent Internet Protocol (IP)-based video surveillance cameras are connected to a local video server. Each server is equipped with storage and optional graphics processing units (GPUs) to support high-level video analytics and processing algorithms, such as real-time decoding and tracking, for the captured video. The servers are connected to the IP network to support distributed processing and remote data access. The DSP-based surveillance camera is equipped with real-time algorithms for streaming compressed video to the server and performing simple video analytics functions. We also developed video analytics algorithms for security monitoring. Both a publicly available data set and real video data captured in indoor and outdoor scenarios are used to validate our algorithms. Experimental results show that our distributed system can support real-time video applications at high definition resolution.

    Localization of sound sources : a systematic review

    Sound localization is a vast field of research that is used in many applications, including communication, radar, medical aid, and speech enhancement, to name but a few. Many different methods have been presented in this field in recent times. Various types of microphone arrays serve the purpose of sensing the incoming sound. This paper presents an overview of the importance of sound localization in different applications, along with the use and limitations of ad-hoc microphones compared with other microphones. Certain approaches for overcoming these limitations are also presented. Some of the existing methods for sound localization using microphone arrays in the recent literature are explained in detail. Existing methods are studied comparatively, along with the factors that influence the choice of one method over the others. This review is done in order to form a basis for choosing the best-fit method for our use.

    A Cost Shared Quantization Algorithm and its Implementation for Multi-Standard Video CODECS

    The current trend of digital convergence creates the need for a video encoder and decoder system, known as a codec for short, that supports multiple video standards on a single platform. In a modern video codec, quantization is a key unit used for video compression. In this thesis, a generalized quantization algorithm and hardware implementation is presented to compute quantized coefficients for six different video codecs, including the newly developing codec High Efficiency Video Coding (HEVC). HEVC, the successor to H.264/MPEG-4 AVC, aims to substantially improve coding efficiency compared to AVC High Profile. The thesis presents a high-performance, circuit-shared architecture that can perform the quantization operation for HEVC, H.264/AVC, AVS, VC-1, MPEG-2/4 and Motion JPEG (MJPEG). Since HEVC is still in the drafting stage, the architecture was designed in such a way that any final changes can be accommodated into the design. The proposed quantizer architecture is completely division free, as the division operation is replaced by multiplication, shift and addition operations. The design was implemented on FPGA and later synthesized in CMOS 0.18 μm technology. The results show that the proposed design satisfies the requirements of all codecs, with a maximum decoding capability of 60 fps for 1080p HD video at 187.3 MHz on a Xilinx Virtex4 LX60 FPGA. The scheme is also suitable for low-cost implementation in modern multi-codec systems.
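
The abstract above says division is replaced by multiplication, shift and addition. A minimal Python sketch of that general idea follows. It is not the thesis's architecture: the function names, the 16-bit shift and the round-to-nearest offset are all assumptions chosen for illustration, and the one division in `make_multiplier` stands in for a precomputed lookup table.

```python
def make_multiplier(qstep, shift=16):
    """Precompute an integer reciprocal, round(2**shift / qstep).

    Hypothetical helper: the division here would happen offline
    (for example when building a table), not per coefficient.
    """
    return ((1 << shift) + qstep // 2) // qstep


def quantize(coef, mult, shift=16, rounding=None):
    """Approximate round(coef / qstep) with multiply, add and shift only."""
    if rounding is None:
        rounding = 1 << (shift - 1)  # assumed offset: round to nearest
    magnitude = (abs(coef) * mult + rounding) >> shift
    return -magnitude if coef < 0 else magnitude


# Example with an assumed quantization step of 8.
mult8 = make_multiplier(8)
levels = [quantize(c, mult8) for c in (100, -100, 7, 3)]
```

Real codecs typically use standard-specific multiplier tables and rounding offsets, so the constants here should be read as placeholders.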

    Functional requirements document for the Earth Observing System Data and Information System (EOSDIS) Scientific Computing Facilities (SCF) of the NASA/MSFC Earth Science and Applications Division, 1992

    Five scientists at MSFC/ESAD have EOS SCF investigator status. Each SCF has unique tasks which require the establishment of a computing facility dedicated to accomplishing those tasks. An SCF Working Group was established at ESAD with the charter of defining the computing requirements of the individual SCFs and recommending options for meeting these requirements. The primary goal of the working group was to determine which computing needs can be satisfied using either shared resources or separate but compatible resources, and which needs require unique individual resources. The requirements investigated included CPU-intensive vector and scalar processing, visualization, data storage, connectivity, and I/O peripherals. A review of computer industry directions and a market survey of computing hardware provided information regarding important industry standards and candidate computing platforms. It was determined that the total SCF computing requirements might be most effectively met using a hierarchy consisting of shared and individual resources. This hierarchy is composed of five major system types: (1) a supercomputer class vector processor; (2) a high-end scalar multiprocessor workstation; (3) a file server; (4) a few medium- to high-end visualization workstations; and (5) several low- to medium-range personal graphics workstations. Specific recommendations for meeting the needs of each of these types are presented.

    Comparative analysis of DIRAC PRO-VC-2, H.264 AVC and AVS CHINA-P7

    A video codec compresses the input video source to reduce storage and transmission bandwidth requirements while maintaining quality. It is an essential technology for applications such as digital television, DVD-Video, mobile TV, videoconferencing and internet video streaming, to name a few. Different video codecs are used in the industry today, and understanding their operation in order to target particular video applications is the key to optimization. The latest advanced video codec standards have become very important in the multimedia industries because they provide cost-effective encoding and decoding of video and contribute to high compression and efficiency. Currently, H.264 AVC, AVS, and DIRAC are used in the industry to compress video. The H.264 codec standard was developed by the ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC Moving Picture Experts Group (MPEG). Audio-video coding standard (AVS) is a working group for audio and video coding standards in China. VC-2, also known as Dirac Pro and developed by the BBC, is a royalty-free technology that anyone can use and has been standardized through the SMPTE as VC-2. H.264 AVC, Dirac Pro, Dirac and AVS-P2 are dedicated to high definition video, while AVS-P7 is dedicated to mobile video. Out of the many standards, this work performs a comparative analysis of the H.264 AVC, DIRAC PRO/SMPTE-VC-2 and AVS-P7 standards in the low bitrate region and the high bitrate region. Bitrate control and constant QP are the methods employed for the analysis. Evaluation parameters such as compression ratio, PSNR and SSIM are used for quality comparison. Depending on the target application and available bitrate, an order of performance is given to show the preferred codec.
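
Two of the evaluation parameters named in the abstract above, compression ratio and PSNR, have short standard definitions. The sketch below computes them in Python. The function names and the 8-bit peak value of 255 are assumptions for illustration, not details from the work. SSIM is left out because it needs windowed local statistics.

```python
import math


def mse(reference, decoded):
    """Mean squared error between two equal-length sample sequences."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    return sum((a - b) ** 2 for a, b in zip(reference, decoded)) / len(reference)


def psnr(reference, decoded, peak=255):
    """PSNR in dB: 10 * log10(peak**2 / MSE). Assumes 8-bit samples by default."""
    err = mse(reference, decoded)
    if err == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / err)


def compression_ratio(raw_bytes, compressed_bytes):
    """Uncompressed size divided by compressed size."""
    return raw_bytes / compressed_bytes


# Toy example on a flattened list of pixel values (hypothetical data).
original = [10, 20, 30, 40]
degraded = [12, 18, 30, 40]
quality_db = psnr(original, degraded)
ratio = compression_ratio(1000, 100)
```

In practice PSNR is usually computed per frame (often on the luma plane) and averaged over the sequence, but the per-frame formula is the one shown.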

    Human-centered 2D/3D Video Content Analysis and Description

    In this paper, we propose a way of using the AudioVisual Description Profile (AVDP) of the MPEG-7 standard for stereo video and multichannel audio content description. Our aim is to provide a means of using AVDP such that 3D video and audio content can be correctly and consistently described. Since AVDP semantics do not include ways of dealing with 3D audiovisual content, a new semantic framework within AVDP is proposed, and examples of using AVDP to describe the results of analysis algorithms on stereo video and multichannel audio content are presented.