
    DWT-DCT-Based Data Hiding for Speech Bandwidth Extension

    The limited narrowband frequency range of about 300-3400 Hz used in telephone network channels results in less intelligible, poor-quality telephony speech. To address this drawback, a novel robust speech bandwidth extension using Discrete Wavelet Transform-Discrete Cosine Transform based data hiding (DWTDCTBDH) is proposed. In this technique, the missing speech information is embedded in the narrowband speech signal. The embedded information is reliably recovered at the receiver to generate wideband speech of considerably better quality. The robustness of the proposed method to quantization and channel noise is confirmed by a mean square error test. The improvement in the quality of the reconstructed wideband speech over conventional methods is further verified by subjective listening and objective tests.
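    The abstract describes the embedding pipeline only at a high level; the sketch below shows one plausible, simplified realisation and is not the authors' exact scheme. It assumes NumPy, PyWavelets, and SciPy, applies a single-level DWT to a narrowband frame followed by a DCT of the detail coefficients, and hides a bitstream (for example, quantised high-band parameters) by quantization index modulation of selected DCT coefficients. Wavelet choice, step size, and coefficient selection are illustrative assumptions.

```python
# Hedged sketch of DWT-DCT data hiding for speech bandwidth extension.
# Assumptions (not from the paper): single-level 'db4' DWT, parity-based QIM
# on the detail-band DCT coefficients, payload = quantised high-band parameters.
import numpy as np
import pywt
from scipy.fft import dct, idct

def embed(nb_frame, payload_bits, step=0.05):
    """Hide payload_bits in a narrowband speech frame via DWT -> DCT -> QIM."""
    approx, detail = pywt.dwt(nb_frame, 'db4')        # 1-level DWT
    coeffs = dct(detail, norm='ortho')                 # DCT of the detail band
    for i, bit in enumerate(payload_bits):             # one coefficient per bit
        q = np.round(coeffs[i] / step)
        if int(q) % 2 != bit:                          # force parity to encode the bit
            q += 1
        coeffs[i] = q * step
    detail_mod = idct(coeffs, norm='ortho')
    return pywt.idwt(approx, detail_mod, 'db4')        # watermarked narrowband frame

def extract(wm_frame, n_bits, step=0.05):
    """Recover the hidden bits at the receiver."""
    _, detail = pywt.dwt(wm_frame, 'db4')
    coeffs = dct(detail, norm='ortho')
    return [int(np.round(coeffs[i] / step)) % 2 for i in range(n_bits)]
```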

    Multiuser MIMO-OFDM for Next-Generation Wireless Systems

    This overview portrays the 40-year evolution of orthogonal frequency division multiplexing (OFDM) research. Combining powerful multicarrier OFDM arrangements with multiple-input multiple-output (MIMO) systems has numerous benefits, which are detailed in this treatise. We continue by highlighting the limitations of conventional detection and channel estimation techniques designed for multiuser MIMO-OFDM systems in the so-called rank-deficient scenarios, where the number of users supported or the number of transmit antennas employed exceeds the number of receiver antennas. This is often encountered in practice, unless we limit the number of users granted access in the base station's or radio port's coverage area. Following a historical perspective on the associated design problems and their state-of-the-art solutions, the second half of this treatise details a range of classic multiuser detectors (MUDs) designed for MIMO-OFDM systems and characterizes their achievable performance. A further section identifies novel cutting-edge genetic algorithm (GA)-aided detector solutions, which have found numerous applications in wireless communications in recent years. In an effort to stimulate the cross-pollination of ideas across the machine learning, optimization, signal processing, and wireless communications research communities, we review the broadly applicable principles of various GA-assisted optimization techniques, which were recently proposed also for employment in multiuser MIMO-OFDM. In order to stimulate new research, we demonstrate that the family of GA-aided MUDs is capable of achieving near-optimum performance at a significantly lower computational complexity than that imposed by their optimum maximum-likelihood (ML) MUD counterparts. The paper concludes by outlining a range of future research options that may find their way into next-generation wireless systems.
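    The GA-aided MUD idea lends itself to a compact illustration. The minimal sketch below (an assumption-laden simplification, not the detectors studied in the treatise) evolves a population of candidate symbol vectors for one subcarrier, using the ML metric ||y - Hx||^2 as the fitness. The BPSK alphabet, population size, mutation rate, and selection rule are illustrative choices.

```python
# Hedged sketch: GA-aided multiuser detection for one MIMO-OFDM subcarrier.
# Fitness = maximum-likelihood metric ||y - H x||^2 over BPSK symbol vectors.
import numpy as np

def ga_mud(H, y, n_users, pop=40, gens=60, p_mut=0.05, seed=None):
    rng = np.random.default_rng(seed)
    alphabet = np.array([-1.0, 1.0])                       # BPSK per user (assumption)
    population = rng.choice(alphabet, size=(pop, n_users))

    def fitness(x):                                        # lower is better
        return np.linalg.norm(y - H @ x) ** 2

    for _ in range(gens):
        scores = np.array([fitness(ind) for ind in population])
        order = np.argsort(scores)
        parents = population[order[: pop // 2]]            # truncation selection
        # uniform crossover between random parent pairs
        idx = rng.integers(0, len(parents), size=(pop, 2))
        mask = rng.random((pop, n_users)) < 0.5
        children = np.where(mask, parents[idx[:, 0]], parents[idx[:, 1]])
        # mutation: flip a symbol with probability p_mut
        flip = rng.random((pop, n_users)) < p_mut
        population = np.where(flip, -children, children)
        population[0] = parents[0]                         # elitism: keep the best
    scores = np.array([fitness(ind) for ind in population])
    return population[np.argmin(scores)]                   # detected symbol vector
```

    The complexity advantage over exhaustive ML search comes from evaluating only pop x gens candidates instead of all 2^n_users symbol combinations.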

    Study to determine potential flight applications and human factors design guidelines for voice recognition and synthesis systems

    A study was conducted to determine potential commercial aircraft flight deck applications and implementation guidelines for voice recognition and synthesis. First, a survey of voice recognition and synthesis technology was undertaken to develop a working knowledge base. Then, numerous potential aircraft and simulator flight deck voice applications were identified, and each proposed application was rated against a number of criteria to obtain an overall payoff rating. The potential voice recognition applications fell into five general categories: programming, interrogation, data entry, switch and mode selection, and continuous/time-critical action control. The ratings of the first three categories showed the most promise of benefiting flight deck operations. Possible applications of voice synthesis systems were categorized as automatic or pilot-selectable, and many were rated as potentially beneficial. In addition, voice system implementation guidelines and pertinent performance criteria are proposed. Finally, the findings of this study are compared with those of a recent NASA study of a 1995 transport concept.

    A DRAM-Based Processing-in-Memory Microarchitecture for Memory-Intensive Machine Learning Applications

    Ph.D. dissertation, Seoul National University, Graduate School of Convergence Science and Technology (Intelligent Convergence Systems), February 2022. Advisor: Jung Ho Ahn.
    Recently, as research on neural networks has gained significant traction, a number of memory-intensive neural network models such as recurrent neural network (RNN) models and recommendation models have been introduced to process various tasks. RNN models and recommendation models spend most of their execution time on matrix-vector multiplication (MV-mul) and on embedding layers, respectively. A fundamental primitive of embedding layers, tensor gather-and-reduction (GnR), gathers embedding vectors and then reduces them to a new embedding vector. Because the matrices in RNNs and the embedding tables in recommendation models have poor reusability, and their ever-increasing sizes are too large to fit in the on-chip storage of devices, the performance and energy efficiency of MV-mul and GnR are determined by those of main-memory DRAM. Computing these operations within DRAM therefore draws significant attention. In this dissertation, we first propose a main-memory architecture called MViD, which performs MV-mul by placing MAC units inside DRAM banks. For higher computational efficiency, we use a sparse matrix format and exploit quantization. Because of the limited power budget of DRAM devices, we implement the MAC units only in a portion of the DRAM banks. We architect MViD to slow down or pause MV-mul so that memory requests from processors can be served concurrently while the power budget is satisfied. Our results show that MViD provides 7.2× higher throughput than the baseline system with four DRAM ranks (performing MV-mul in a chip multiprocessor) while running Deep Speech 2 inference together with a memory-intensive workload. We then propose TRiM, an NDP architecture for accelerating recommendation systems. Based on the observation that the DRAM datapath has a hierarchical tree structure, TRiM augments it with "in-DRAM" reduction units at the DDR4/5 rank, bank-group, and bank levels. We modify the DRAM interface to deliver commands effectively to multiple reduction units running in parallel, and we propose a host-side architecture with hot embedding-vector replication to alleviate the load imbalance that arises across the reduction units. An optimal TRiM design based on DDR5 achieves speedups of up to 7.7× and 3.9×, and reduces the energy consumption of embedding-vector gather and reduction by 55% and 50%, over the baseline and a state-of-the-art NDP architecture, respectively, with a minimal area overhead equivalent to 2.66% of the DRAM chip.
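    GnR is simple to state but dominates the memory traffic of recommendation inference. The minimal sketch below shows the host-side reference semantics that an in-DRAM reduction unit would accelerate; the table shape, lookup indices, and optional per-index weights are illustrative assumptions rather than the dissertation's configuration.

```python
# Hedged sketch: reference semantics of tensor gather-and-reduction (GnR),
# the embedding-layer primitive that an NDP design such as TRiM offloads
# to in-DRAM reduction units.
import numpy as np

def gnr(table, indices, weights=None):
    """Gather rows of an embedding table and reduce them to one vector.

    table:   (num_entries, dim) embedding table resident in main memory
    indices: ids of the embedding vectors to gather (one sparse feature)
    weights: optional per-id weights (None -> plain sum)
    """
    gathered = table[indices]                  # the memory-bound gather
    if weights is not None:
        gathered = gathered * weights[:, None]
    return gathered.sum(axis=0)                # the reduction

# Illustrative use: a 100k-entry, 64-dimensional table and one multi-hot lookup.
table = np.random.default_rng(0).standard_normal((100_000, 64)).astype(np.float32)
out = gnr(table, np.array([3, 17, 42, 99_999]))
```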

    Pervasive Computing: Embedding the Public Sphere


    Extrinsic Channel-Like Fingerprint Embedding for Transmitter Authentication in Wireless Systems

    We present a physical-layer fingerprint-embedding scheme for wireless signals, focusing on multiple-input multiple-output (MIMO) and orthogonal frequency division multiplexing (OFDM) transmissions, in which the fingerprint signal conveys a low-capacity communication suitable for authenticating the transmission and further facilitating secure communications. Our system embeds the fingerprint message into the noise subspace of the channel estimates obtained by the receiver, using a number of signal spreading techniques. When channel state information is known and leveraged by the transmitter, the performance of the fingerprint embedding can be improved; when it is not known, blind spreading techniques are applied. The fingerprint message is visible only to aware receivers that explicitly perform detection of the signal, but is invisible to receivers employing typical channel equalization. A taxonomy of overlay designs is discussed, and these designs are explored experimentally using time-varying channel state information (CSI) recorded from IEEE 802.16e Mobile WiMAX base stations. The performance of the fingerprint signal as received by a WiMAX subscriber is demonstrated using CSI measurements derived from the downlink signal. Detection performance for the digital fingerprint message in time-varying channel conditions is also presented via simulation.
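    The core idea of hiding the fingerprint in the noise subspace of the channel estimate can be illustrated compactly. The sketch below is an assumption-laden simplification, not the thesis design: it spreads a BPSK fingerprint symbol onto the left singular vectors of an estimated MIMO channel associated with its zero (or smallest) singular values, so a conventional equalizer operating in the signal subspace is largely undisturbed, while an aware receiver projects onto the noise subspace and correlates with the spreading code. Dimensions, power split, and the detection rule are illustrative assumptions.

```python
# Hedged sketch: embedding a fingerprint bit into the noise subspace of a
# MIMO channel estimate (no channel noise modeled, dimensions illustrative).
import numpy as np

rng = np.random.default_rng(1)
n_rx, n_tx = 4, 2                               # more receive than transmit antennas
H = (rng.standard_normal((n_rx, n_tx)) + 1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2)

# SVD of the channel estimate: U[:, :n_tx] spans the signal subspace,
# U[:, n_tx:] spans the noise subspace available for the fingerprint overlay.
U, s, Vh = np.linalg.svd(H)
noise_basis = U[:, n_tx:]                        # (n_rx, n_rx - n_tx)

fingerprint_bit = 1                              # BPSK fingerprint symbol
alpha = 0.1                                      # low embedding power (assumption)
code = rng.standard_normal(noise_basis.shape[1]) # low-rate spreading code
spread = noise_basis @ code                      # lies entirely in the noise subspace

x = rng.choice([-1.0, 1.0], size=n_tx)           # ordinary data symbols
y = H @ x + alpha * fingerprint_bit * spread     # received vector (noise omitted)

# Aware receiver: project onto the noise subspace and correlate with the code;
# the data term H @ x vanishes because it lies in the signal subspace.
z = noise_basis.conj().T @ y
detected_bit = int(np.sign(np.real(code @ z)))
print(detected_bit)                              # recovers the embedded bit
```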

    An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

    Speech enhancement and speech separation are two related tasks whose purpose is to extract one or more target speech signals, respectively, from a mixture of sounds generated by several sources. Traditionally, these tasks have been tackled using signal processing and machine learning techniques applied to the available acoustic signals. Since the visual aspect of speech is essentially unaffected by the acoustic environment, visual information from the target speakers, such as lip movements and facial expressions, has also been used in speech enhancement and speech separation systems. To efficiently fuse acoustic and visual information, researchers have exploited the flexibility of data-driven approaches, specifically deep learning, achieving strong performance. The steady stream of newly proposed techniques for extracting features and fusing multimodal information has highlighted the need for an overview that comprehensively describes and discusses audio-visual speech enhancement and separation based on deep learning. In this paper, we provide a systematic survey of this research topic, focusing on the main elements that characterise the systems in the literature: acoustic features; visual features; deep learning methods; fusion techniques; training targets; and objective functions. In addition, we review deep-learning-based methods for speech reconstruction from silent videos and audio-visual sound source separation for non-speech signals, since these methods can be more or less directly applied to audio-visual speech enhancement and separation. Finally, we survey commonly employed audio-visual speech datasets, given their central role in the development of data-driven approaches, and evaluation methods, because they are generally used to compare different systems and determine their performance.
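    As a concrete, hedged illustration of the fusion step around which the survey organises its taxonomy, the sketch below builds a toy early-fusion enhancement network in PyTorch: per-frame audio features (a noisy magnitude spectrogram) and frame-aligned visual features (for example, lip-region embeddings) are concatenated and mapped to a time-frequency mask. Layer sizes, feature choices, and the masking target are illustrative assumptions, not any specific system from the survey.

```python
# Hedged sketch: early (concatenation) fusion of audio and visual features for
# mask-based speech enhancement. Feature sizes and layers are illustrative only.
import torch
import torch.nn as nn

class AVFusionEnhancer(nn.Module):
    def __init__(self, n_freq=257, n_visual=128, hidden=256):
        super().__init__()
        self.rnn = nn.GRU(n_freq + n_visual, hidden, batch_first=True)
        self.mask = nn.Sequential(nn.Linear(hidden, n_freq), nn.Sigmoid())

    def forward(self, noisy_mag, visual_emb):
        # noisy_mag:  (batch, frames, n_freq)   noisy magnitude spectrogram
        # visual_emb: (batch, frames, n_visual) frame-aligned lip embeddings
        fused = torch.cat([noisy_mag, visual_emb], dim=-1)   # early fusion
        h, _ = self.rnn(fused)
        return self.mask(h) * noisy_mag                      # masked magnitudes

# Illustrative forward pass with random tensors.
model = AVFusionEnhancer()
enhanced = model(torch.rand(2, 100, 257), torch.rand(2, 100, 128))
print(enhanced.shape)   # torch.Size([2, 100, 257])
```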