Search CORE

865 research outputs found

Motion correlation based low complexity and low power schemes for video codec

Author: Liu Chen
Publication venue
Publication date: 01/01/2012
Field of study

制度:新 ; 報告番号:甲3750号 ; 学位の種類:博士(工学) ; 授与年月日:2012/11/19 ; 早大学位記番号:新6121Waseda Universit

Waseda University Repository

The Berlin Brain-Computer Interface: Progress Beyond Communication and Control

Author: Acqualagna Laura
Blankertz Benjamin
Curio Gabriel
Dähne Sven
Haufe Stefan
Müller Klaus-Robert
Schultze-Kraft Matthias
Sturm Irene
Ušćumlic Marija
Wenzel Markus
Publication venue
Publication date: 21/11/2016
Field of study

The combined effect of fundamental results about neurocognitive processes and advancements in decoding mental states from ongoing brain signals has brought forth a whole range of potential neurotechnological applications. In this article, we review our developments in this area and put them into perspective. These examples cover a wide range of maturity levels with respect to their applicability. While we assume we are still a long way away from integrating Brain-Computer Interface (BCI) technology in general interaction with computers, or from implementing neurotechnological measures in safety-critical workplaces, results have already now been obtained involving a BCI as research tool. In this article, we discuss the reasons why, in some of the prospective application domains, considerable effort is still required to make the systems ready to deal with the full complexity of the real world.EC/FP7/611570/EU/Symbiotic Mind Computer Interaction for Information Seeking/MindSeeEC/FP7/625991/EU/Hyperscanning 2.0 Analyses of Multimodal Neuroimaging Data: Concept, Methods and Applications/HYPERSCANNING 2.0DFG, 103586207, GRK 1589: Verarbeitung sensorischer Informationen in neuronalen Systeme

DepositOnce

EIE: Efficient Inference Engine on Compressed Deep Neural Network

Author: Dally William J.
Han Song
Horowitz Mark A.
Liu Xingyu
Mao Huizi
Pedram Ardavan
Pu Jing
Publication venue
Publication date: 03/05/2016
Field of study

State-of-the-art deep neural networks (DNNs) have hundreds of millions of connections and are both computationally and memory intensive, making them difficult to deploy on embedded systems with limited hardware resources and power budgets. While custom hardware helps the computation, fetching weights from DRAM is two orders of magnitude more expensive than ALU operations, and dominates the required power. Previously proposed 'Deep Compression' makes it possible to fit large DNNs (AlexNet and VGGNet) fully in on-chip SRAM. This compression is achieved by pruning the redundant connections and having multiple connections share the same weight. We propose an energy efficient inference engine (EIE) that performs inference on this compressed network model and accelerates the resulting sparse matrix-vector multiplication with weight sharing. Going from DRAM to SRAM gives EIE 120x energy saving; Exploiting sparsity saves 10x; Weight sharing gives 8x; Skipping zero activations from ReLU saves another 3x. Evaluated on nine DNN benchmarks, EIE is 189x and 13x faster when compared to CPU and GPU implementations of the same DNN without compression. EIE has a processing power of 102GOPS/s working directly on a compressed network, corresponding to 3TOPS/s on an uncompressed network, and processes FC layers of AlexNet at 1.88x10^4 frames/sec with a power dissipation of only 600mW. It is 24,000x and 3,400x more energy efficient than a CPU and GPU respectively. Compared with DaDianNao, EIE has 2.9x, 19x and 3x better throughput, energy efficiency and area efficiency.Comment: External Links: TheNextPlatform: http://goo.gl/f7qX0L ; O'Reilly: https://goo.gl/Id1HNT ; Hacker News: https://goo.gl/KM72SV ; Embedded-vision: http://goo.gl/joQNg8 ; Talk at NVIDIA GTC'16: http://goo.gl/6wJYvn ; Talk at Embedded Vision Summit: https://goo.gl/7abFNe ; Talk at Stanford University: https://goo.gl/6lwuer. Published as a conference paper in ISCA 201

arXiv.org e-Print Archive

Crossref

C-RAN CoMP Methods for MPR Receivers

Author: Raposo Tiago Miguel de Góis
Publication venue
Publication date: 01/01/2018
Field of study

The growth in mobile network traffic due to the increase in MTC (Machine Type Communication) applications, brings along a series of new challenges in traffic routing and management. The goals are to have effective resolution times (less delay), low energy consuption (given that wide sensor networks which are included in the MTC category, are built to last years with respect to their battery consuption) and extremely reliable communication (low Packet Error Rates), following the fifth generation (5G) mobile network demands. In order to deal with this type of dense traffic, several uplink strategies can be devised, where diversity variables like space (several Base Stations deployed), time (number of retransmissions of a given packet per user) and power spreading (power value diversity at the receiver, introducing the concept of SIC and Power-NOMA) have to be handled carefully to fulfill the requirements demanded in Ultra-Reliable Low-Latency Communication (URLLC). This thesis, besides being restricted in terms of transmission power and processing of a User Equipment (UE), works on top of an Iterative Block Decision Feedback Equalization Reciever that allows Multi Packet Reception to deal with the diversity types mentioned earlier. The results of this thesis explore the possibility of fragmenting the processing capabilities in an integrated cloud network (C-RAN) environment through an SINR estimation at the receiver to better understand how and where we can break and distribute our processing needs in order to handle near Base Station users and cell-edge users, the latters being the hardest to deal with in dense networks like the ones deployed in a MTC environment

Repositório da Universidade Nova de Lisboa

CASPR: Judiciously Using the Cloud for Wide-Area Packet Recovery

Author: Byers John W.
Dogar Fahad R.
Doucette Cody
Haq Osama
Publication venue
Publication date: 01/01/2019
Field of study

We revisit a classic networking problem -- how to recover from lost packets in the best-effort Internet. We propose CASPR, a system that judiciously leverages the cloud to recover from lost or delayed packets. CASPR supplements and protects best-effort connections by sending a small number of coded packets along the highly reliable but expensive cloud paths. When receivers detect packet loss, they recover packets with the help of the nearby data center, not the sender, thus providing quick and reliable packet recovery for latency-sensitive applications. Using a prototype implementation and its deployment on the public cloud and the PlanetLab testbed, we quantify the benefits of CASPR in providing fast, cost effective packet recovery. Using controlled experiments, we also explore how these benefits translate into improvements up and down the network stack

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)

Taming and Leveraging Directionality and Blockage in Millimeter Wave Communications

Author: Bao Jingchao
Publication venue: TRACE: Tennessee Research and Creative Exchange
Publication date: 01/12/2018
Field of study

To cope with the challenge for high-rate data transmission, Millimeter Wave(mmWave) is one potential solution. The short wavelength unlatched the era of directional mobile communication. The semi-optical communication requires revolutionary thinking. To assist the research and evaluate various algorithms, we build a motion-sensitive mmWave testbed with two degrees of freedom for environmental sensing and general wireless communication.The first part of this thesis contains two approaches to maintain the connection in mmWave mobile communication. The first one seeks to solve the beam tracking problem using motion sensor within the mobile device. A tracking algorithm is given and integrated into the tracking protocol. Detailed experiments and numerical simulations compared several compensation schemes with optical benchmark and demonstrated the efficiency of overhead reduction. The second strategy attempts to mitigate intermittent connections during roaming is multi-connectivity. Taking advantage of properties of rateless erasure code, a fountain code type multi-connectivity mechanism is proposed to increase the link reliability with simplified backhaul mechanism. The simulation demonstrates the efficiency and robustness of our system design with a multi-link channel record.The second topic in this thesis explores various techniques in blockage mitigation. A fast hear-beat like channel with heavy blockage loss is identified in the mmWave Unmanned Aerial Vehicle (UAV) communication experiment due to the propeller blockage. These blockage patterns are detected through Holm\u27s procedure as a problem of multi-time series edge detection. To reduce the blockage effect, an adaptive modulation and coding scheme is designed. The simulation results show that it could greatly improve the throughput given appropriately predicted patterns. The last but not the least, the blockage of directional communication also appears as a blessing because the geometrical information and blockage event of ancillary signal paths can be utilized to predict the blockage timing for the current transmission path. A geometrical model and prediction algorithm are derived to resolve the blockage time and initiate active handovers. An experiment provides solid proof of multi-paths properties and the numeral simulation demonstrates the efficiency of the proposed algorithm

University of Tennessee, Knoxville: Trace

Energy-Efficient Computing for Mobile Signal Processing

Author: Seo Sangwon
Publication venue
Publication date: 01/01/2011
Field of study

Mobile devices have rapidly proliferated, and deployment of handheld devices continues to increase at a spectacular rate. As today's devices not only support advanced signal processing of wireless communication data but also provide rich sets of applications, contemporary mobile computing requires both demanding computation and efficiency. Most mobile processors combine general-purpose processors, digital signal processors, and hardwired application-specific integrated circuits to satisfy their high-performance and low-power requirements. However, such a heterogeneous platform is inefficient in area, power and programmability. Improving the efficiency of programmable mobile systems is a critical challenge and an active area of computer systems research. SIMD (single instruction multiple data) architectures are very effective for data-level-parallelism intense algorithms in mobile signal processing. However, new characteristics of advanced wireless/multimedia algorithms require architectural re-evaluation to achieve better energy efficiency. Therefore, fourth generation wireless protocol and high definition mobile video algorithms are analyzed to enhance a wide-SIMD architecture. The key enhancements include 1) programmable crossbar to support complex data alignment, 2) SIMD partitioning to support fine-grain SIMD computation, and 3) fused operation to support accelerating frequently used instruction pairs. Near-threshold computation has been attractive in low-power architecture research because it balances performance and power. To further improve energy efficiency in mobile computing, near-threshold computation is applied to a wide SIMD architecture. This proposed near-threshold wide SIMD architecture-Diet SODA-presents interesting architectural design decisions such as 1) very wide SIMD datapath to compensate for degraded performance induced by near-threshold computation and 2) scatter-gather data prefetcher to exploit large latency gap between memory and the SIMD datapath. Although near-threshold computation provides excellent energy efficiency, it suffers from increased delay variations. A systematic study of delay variations in near-threshold computing is performed and simple techniques-structural duplication and voltage/frequency margining-are explored to tolerate and mitigate the delay variations in near-threshold wide SIMD architectures. This dissertation analyzes representative wireless/multimedia mobile signal processing algorithms, proposes an energy-efficient programmable platform, and evaluates performance and power. A main theme of this dissertation is that the performance and efficiency of programmable embedded systems can be significantly improved with a combination of parallel SIMD and near-threshold computations.Ph.D.Electrical EngineeringUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/86356/1/swseo_1.pd

Deep Blue Documents at the University of Michigan

Recommended from our members

A Dynamic Reconfiguration Framework to Maximize Performance/Power in Asymmetric Multicore Processors

Author: Annamalai Arunachalam
Publication venue: ScholarWorks@UMass Amherst
Publication date: 01/01/2013
Field of study

Recent trends in technology scaling have shifted the processing paradigm to multicores. Depending on the characteristics of the cores, the multicores can be either symmetric or asymmetric. Prior research has shown that Asymmetric Multicore Processors (AMPs) outperform their symmetric (SMP) counterparts within a given resource and power budget. But, due to the heterogeneity in core-types and time-varying workload behavior, thread-to-core assignment is always a challenge in AMPs. As the computational requirements vary significantly across different applications and with time, there is a need to dynamically allocate appropriate computational resources on demand to suit the applications’ current needs, in order to maximize the performance and minimize the energy consumption. Performance/power of the applications could be further increased by dynamically adapting the voltage and frequency of the cores to better fit the changing characteristics of the workloads. Not only can a core be forced to a low power mode when its activity level is low, but the power saved by doing so could be opportunistically re-budgeted to the other cores to boost the overall system throughput. To this end, we propose a novel solution that seamlessly combines heterogeneity with a Dynamic Reconfiguration Framework (DRF). The proposed dynamic reconfiguration framework is equipped with Dynamic Resource Allocation (DRA) and Voltage/Frequency Adaptation (DVFA) capabilities to adapt the core resources and operating conditions at runtime to the changing demands of the applications. As a proof of concept, we illustrate our proposed approach using a dual-core AMP and demonstrate significant performance/power benefits over various baselines

ScholarWorks@UMass Amherst

Cross-Layer Optimization for Power-Efficient and Robust Digital Circuits and Systems

Author: Huang Yanxiang
Publication venue
Publication date: 15/09/2017
Field of study

With the increasing digital services demand, performance and power-efficiency become vital requirements for digital circuits and systems. However, the enabling CMOS technology scaling has been facing significant challenges of device uncertainties, such as process, voltage, and temperature variations. To ensure system reliability, worst-case corner assumptions are usually made in each design level. However, the over-pessimistic worst-case margin leads to unnecessary power waste and performance loss as high as 2.2x. Since optimizations are traditionally confined to each specific level, those safe margins can hardly be properly exploited. To tackle the challenge, it is therefore advised in this Ph.D. thesis to perform a cross-layer optimization for digital signal processing circuits and systems, to achieve a global balance of power consumption and output quality. To conclude, the traditional over-pessimistic worst-case approach leads to huge power waste. In contrast, the adaptive voltage scaling approach saves power (25% for the CORDIC application) by providing a just-needed supply voltage. The power saving is maximized (46% for CORDIC) when a more aggressive voltage over-scaling scheme is applied. These sparsely occurred circuit errors produced by aggressive voltage over-scaling are mitigated by higher level error resilient designs. For functions like FFT and CORDIC, smart error mitigation schemes were proposed to enhance reliability (soft-errors and timing-errors, respectively). Applications like Massive MIMO systems are robust against lower level errors, thanks to the intrinsically redundant antennas. This property makes it applicable to embrace digital hardware that trades quality for power savings.Comment: 190 page

arXiv.org e-Print Archive

Lirias

Efficient Learning Machines

Author: Awad Mariette
Khanna Rahul
Publication venue: 'Springer Science and Business Media LLC'
Publication date
Field of study

Computer scienc

OAPEN Library