
    Scalable parallel communications

    Coarse-grain parallelism in networking (that is, the use of multiple protocol processors running replicated software sending over several physical channels) can be used to provide gigabit communications for a single application. Since parallel network performance is highly dependent on real-world factors such as hardware properties (e.g., memory speeds and cache hit rates), operating system overhead (e.g., interrupt handling), and protocol performance (e.g., the effect of timeouts), we have performed detailed simulation studies of both a bus-based multiprocessor workstation node (based on the Sun Galaxy MP multiprocessor) and a distributed-memory parallel computer node (based on the Touchstone DELTA) to evaluate the behavior of coarse-grain parallelism. Our results indicate: (1) coarse-grain parallelism can deliver multiple 100 Mbps channels with currently available hardware platforms and existing networking protocols (such as Transmission Control Protocol/Internet Protocol (TCP/IP) and parallel Fiber Distributed Data Interface (FDDI) rings); (2) scale-up is near-linear in n, the number of protocol processors and channels (for small n and up to a few hundred Mbps); and (3) since these results are based on existing hardware without specialized devices (except perhaps for some simple modifications of the FDDI boards), this is a low-cost solution to providing multiple 100 Mbps channels on current machines. In addition, from both the performance analysis and the properties of these architectures, we conclude: (1) multiple processors providing identical services, together with space-division multiplexing of the physical channels, can provide better reliability than monolithic approaches (as well as graceful degradation and low-cost load balancing); (2) coarse-grain parallelism supports running several transport protocols in parallel to provide different types of service (for example, one TCP handles small messages for many users while other TCPs running in parallel provide high-bandwidth service to a single application); and (3) coarse-grain parallelism will be able to incorporate many future improvements from related work (e.g., reduced data movement, fast TCP, fine-grain parallelism), also with near-linear speed-ups.
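
    As a rough illustration of the approach described above, the sketch below stripes a single application message across several parallel TCP connections, with one thread per channel standing in for a replicated protocol processor. This is a minimal Python sketch: the channel addresses and the simple byte-striping scheme are assumptions for illustration, not details taken from the paper.

    # Coarse-grain parallel send: one message is striped across n independent
    # TCP connections, each driven by its own "protocol processor" (a thread
    # here). The addresses are placeholders for the parallel physical channels.
    import socket
    import threading

    CHANNELS = [("10.0.0.1", 9000), ("10.0.0.2", 9000), ("10.0.0.3", 9000)]

    def send_stripe(addr, chunk):
        # Each thread runs the full TCP stack for its own channel, so
        # protocol processing scales with the number of channels.
        with socket.create_connection(addr) as s:
            s.sendall(chunk)

    def parallel_send(message: bytes):
        n = len(CHANNELS)
        stride = -(-len(message) // n)  # ceiling division: bytes per channel
        threads = [
            threading.Thread(target=send_stripe,
                             args=(addr, message[i * stride:(i + 1) * stride]))
            for i, addr in enumerate(CHANNELS)
        ]
        for t in threads:
            t.start()
        for t in threads:
            t.join()  # all stripes sent; the receiver reassembles by index

    parallel_send(b"x" * 3_000_000)  # roughly 1 MB per channel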

    Near-Memory Address Translation

    Memory and logic integration on the same chip is becoming increasingly cost-effective, creating the opportunity to offload data-intensive functionality to processing units placed inside memory chips. The introduction of memory-side processing units (MPUs) into conventional systems faces virtual memory as the first big showstopper: without efficient hardware support for address translation, MPUs have highly limited applicability. Unfortunately, conventional translation mechanisms fall short of providing fast translations as contemporary memories exceed the reach of TLBs, making expensive page walks common. In this paper, we are the first to show that the historically important flexibility to map any virtual page to any page frame is unnecessary in today's servers. We find that while limiting the associativity of the virtual-to-physical mapping incurs no penalty, it can break the translate-then-fetch serialization if combined with careful data placement in the MPU's memory, allowing translation and data fetch to proceed independently and in parallel. We propose the Distributed Inverted Page Table (DIPTA), a near-memory structure in which the smallest memory partition keeps the translation information for its data share, ensuring that the translation completes together with the data fetch. DIPTA completely eliminates the performance overhead of translation, achieving speedups of up to 3.81x and 2.13x over conventional translation using 4KB and 1GB pages, respectively. Comment: 15 pages, 9 figures
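
    The key mechanism, limiting the associativity of the virtual-to-physical mapping so a page's candidate locations are known from the virtual address alone, can be illustrated with a toy set-associative lookup. The sketch below is a software model under assumed parameters (page size, set count, associativity); it is not DIPTA's actual hardware organization.

    # Toy set-associative virtual-to-physical mapping in the spirit of DIPTA.
    # Because the set index is a pure function of the virtual address, the
    # memory partition holding the set can be accessed for translation and
    # data at the same time, breaking the translate-then-fetch serialization.
    PAGE_BITS = 12        # assumed 4 KB pages
    NUM_SETS = 1 << 16    # assumed number of sets of page frames
    WAYS = 4              # assumed associativity of the mapping

    def set_index(vaddr: int) -> int:
        # Low-order virtual page number bits select the set directly.
        return (vaddr >> PAGE_BITS) % NUM_SETS

    # Each set keeps up to WAYS (tag, frame) entries co-located with its data.
    dipta = {s: [] for s in range(NUM_SETS)}

    def map_page(vpn: int, frame: int):
        entries = dipta[vpn % NUM_SETS]
        assert len(entries) < WAYS, "set full: eviction not modeled"
        entries.append((vpn, frame))

    def translate(vaddr: int):
        vpn = vaddr >> PAGE_BITS
        for tag, frame in dipta[set_index(vaddr)]:
            if tag == vpn:  # way comparison
                return (frame << PAGE_BITS) | (vaddr & ((1 << PAGE_BITS) - 1))
        return None         # miss: page fault handling not modeled

    Because the candidate set is fixed by the address alone, a hardware implementation can overlap the way comparison with the data fetch itself, which is the parallelism the paper exploits.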

    Designing a VM-level vertical scalability service in current cloud platforms: A new hope for wearable computers

    Public clouds are becoming ripe for enterprise adoption. Many companies, including large enterprises, increasingly rely on public clouds as a substitute for, or a supplement to, their own computing infrastructures. Cloud storage services have attracted over 625 million users, but other cloud services, such as the computing service, have not yet attracted end users' interest, for both economic and technical reasons. Cloud service providers offer horizontal scalability to make their services scalable and economical for enterprises, yet their computing services remain uneconomical for individual users due to the lack of vertical scalability. Moreover, current virtualization technologies and operating systems, specifically the guest operating systems installed on virtual machines, do not support the concept of vertical scalability. In addition, network remote-access protocols are meant to administer remote machines; they cannot run non-administrative tasks, such as playing demanding games or watching high-quality videos remotely, in a way that makes users feel as if they were sitting locally at their personal machines. Meanwhile, the industry is still unable to make efficient wearable computers a reality because the limited size of wearable devices makes it infeasible to fit sufficiently powerful processors and large enough disks. This paper aims to highlight the need for a vertical scalability service and to design the appropriate cloud, virtualization-layer, and operating system services to incorporate vertical scalability into current cloud platforms, in a way that makes it economically and technically efficient for end users to use cloud virtual machines as if they were their personal laptops. Through these services, the cloud takes wearable computing to the next stage and makes wearable computers a reality.
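
    To make vertical scalability concrete, the sketch below live-resizes a running VM's vCPU and memory allocation using the libvirt Python bindings. libvirt is an assumed stand-in (the paper does not prescribe an API), and, as the paper points out, the guest operating system must support CPU and memory hotplug for such live changes to take effect.

    # Minimal sketch of VM-level vertical scaling via libvirt. The domain
    # name and sizes are illustrative; error handling is kept minimal.
    import libvirt

    def scale_up(domain_name: str, vcpus: int, memory_kib: int):
        conn = libvirt.open("qemu:///system")
        try:
            dom = conn.lookupByName(domain_name)
            # Apply the new allocation to the running guest.
            dom.setVcpusFlags(vcpus, libvirt.VIR_DOMAIN_AFFECT_LIVE)
            dom.setMemoryFlags(memory_kib, libvirt.VIR_DOMAIN_AFFECT_LIVE)
        finally:
            conn.close()

    # e.g. grow the VM backing a thin wearable client to 8 vCPUs / 16 GiB:
    scale_up("wearable-vm", vcpus=8, memory_kib=16 * 1024 * 1024)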

    Differentiation of Normal Cognition and Early Dementia using fNIRS

    This study aimed to assess the effectiveness of functional near-infrared spectroscopy in differentiating normal cognition from early dementia. To date, only pen-and-paper tests are used to screen for dementia; these are time-consuming, uneconomical (the services of a psychiatrist or psychologist are costly), and purely behavioural assessments. Deploying functional near-infrared spectroscopy could not only enable the study of functional connectivity but also provide objective confirmation of a dementia diagnosis. To observe the difference between the brain signals of normally aging individuals and early dementia patients, tasks activating working memory were designed. A total of 10 subjects (3 healthy controls and 7 early dementia patients), screened using the Mini Mental Status Examination and Clinical Dementia Rating, underwent three levels of sequencing tasks and three categories of verbal fluency tasks while their brain signals were measured. The findings showed that the activation level of healthy controls is higher than that of early dementia patients (sequencing tasks – level 1: 0.08 vs 0.04 mM⋅mm, level 2: 0.07 vs 0.06 mM⋅mm, level 3: 0.05 vs 0.04 mM⋅mm; verbal fluency tasks – 0.2 vs 0.1 mM⋅mm). This activation was found in the left and right prefrontal cortex. In addition, more complex activation patterns were observed during the verbal fluency task, as it tests not only working memory but also verbal and executive control abilities. The sample size is not yet sufficient to draw firm conclusions, but data collection is ongoing. Once data collection is complete and the sample is large enough, the role of functional near-infrared spectroscopy in dementia diagnosis can be validated and the study concluded.
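
    For reference, the group means quoted above can be tabulated and compared directly; the short sketch below does just that. The values are the ones reported in the abstract, while the table structure itself is only illustrative and is not the study's analysis pipeline.

    # Mean task-evoked activation (mM*mm): healthy controls vs early dementia.
    activation = {
        "sequencing level 1": {"control": 0.08, "dementia": 0.04},
        "sequencing level 2": {"control": 0.07, "dementia": 0.06},
        "sequencing level 3": {"control": 0.05, "dementia": 0.04},
        "verbal fluency":     {"control": 0.20, "dementia": 0.10},
    }

    for task, g in activation.items():
        diff = g["control"] - g["dementia"]
        print(f"{task}: control {g['control']:.2f} vs "
              f"dementia {g['dementia']:.2f} mM*mm (diff {diff:+.2f})")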

    TechNews digests: Jan - Nov 2008

    TechNews is a technology news and analysis service aimed at anyone in the education sector keen to stay informed about technology developments, trends, and issues. TechNews focuses on emerging technologies and other technology news. The TechNews service published digests from September 2004 to May 2010, combining analysis pieces and news, with issues appearing every two to three months.

    Short Block-length Codes for Ultra-Reliable Low-Latency Communications

    This paper reviews state-of-the-art channel coding techniques for ultra-reliable low-latency communication (URLLC). The stringent requirements of URLLC services, such as ultra-high reliability and low latency, have made it the most challenging feature of fifth-generation (5G) mobile systems. The problem is even more challenging for services beyond the 5G promise, such as tele-surgery and factory automation, which require latencies below 1 ms and failure rates as low as 10^{-9}. The very low latency requirements of URLLC do not allow traditional approaches, such as re-transmission, to be used to increase reliability. On the other hand, to guarantee the delay requirements, the block length must be small, so conventional channel codes, originally designed and optimised for moderate-to-long block lengths, show notable deficiencies for short blocks. This paper provides an overview of channel coding techniques for short block lengths and compares them in terms of performance and complexity. Several important research directions are identified and discussed in more detail, with several possible solutions. Comment: Accepted for publication in IEEE Communications Magazine
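
    Although not spelled out in the abstract, the short-blocklength penalty this survey addresses is commonly quantified with the Polyanskiy-Poor-Verdu normal approximation; the sketch below evaluates it for a real AWGN channel. The formula is standard in the finite-blocklength literature and is an illustration here, not a result taken from this particular paper.

    # Normal approximation to the maximal coding rate at finite blocklength:
    #   R(n, eps) ~ C - sqrt(V/n) * Qinv(eps) + log2(n) / (2n)
    import math
    from statistics import NormalDist

    def max_rate(n: int, eps: float, snr: float) -> float:
        """Approximate max rate (bits/channel use) for a real AWGN channel."""
        c = 0.5 * math.log2(1 + snr)                    # channel capacity
        v = (snr * (snr + 2)) / (2 * (snr + 1) ** 2) * math.log2(math.e) ** 2
        q_inv = NormalDist().inv_cdf(1 - eps)           # Qinv(eps)
        return c - math.sqrt(v / n) * q_inv + math.log2(n) / (2 * n)

    # At the URLLC target eps = 10^-9 and SNR = 0 dB, short blocks achieve
    # only a fraction of the asymptotic capacity of 0.5 bits per channel use:
    for n in (128, 512, 2048):
        print(n, round(max_rate(n, eps=1e-9, snr=1.0), 3))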

    Architecture for Cooperative Prefetching in P2P Video-on-Demand System

    Most P2P VoD schemes have focused on service architectures and overlay optimization without considering segment rarity or the performance of prefetching strategies. As a result, they cannot adequately support VCR-oriented service in heterogeneous environments where clients use free VCR controls. Despite the remarkable popularity of VoD systems, no prior work studies the performance gap between different prefetching strategies. In this paper, we analyze and explain the performance of different prefetching strategies. Our analytical characterization brings not only a better understanding of several fundamental trade-offs in prefetching strategies, but also important insights into the design of P2P VoD systems. On the basis of this analysis, we propose a cooperative prefetching strategy called "cooching", in which the segments requested during VCR interactions are prefetched into the session beforehand using information collected through gossip. We evaluate our strategy through extensive simulations. The results indicate that the proposed strategy outperforms existing prefetching mechanisms. Comment: 13 pages, IJCN
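
    The abstract does not spell out the "cooching" scoring rule, so the sketch below shows one plausible gossip-informed policy: score candidate segments by rarity (few copies reported via gossip) and by the estimated likelihood of a VCR jump landing on them, then prefetch the top few. The scoring function is an assumption for illustration, not the paper's exact strategy.

    # Gossip-informed cooperative prefetching: rare segments that are likely
    # VCR jump targets are fetched into the session ahead of time.
    def choose_prefetch(gossip_copies, jump_prob, playing, budget=3):
        """gossip_copies: segment -> copies seen via gossip (rarity signal).
        jump_prob: segment -> estimated probability of a VCR seek to it."""
        candidates = [s for s in gossip_copies if s != playing]
        # Rare segments (few copies) and likely jump targets score highest.
        score = lambda s: jump_prob.get(s, 0.0) / (1 + gossip_copies[s])
        return sorted(candidates, key=score, reverse=True)[:budget]

    # Example: segment 7 is both rare and a popular seek target.
    copies = {5: 12, 6: 9, 7: 1, 8: 4, 9: 2}
    jumps = {6: 0.10, 7: 0.40, 8: 0.15, 9: 0.05}
    print(choose_prefetch(copies, jumps, playing=5))  # -> [7, 8, 9]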