    Memory Systems and Interconnects for Scale-Out Servers

    The information revolution of the last decade has been fueled by the digitization of almost all human activities through a wide range of Internet services. The backbone of this information age is the scale-out datacenter, which must collect, store, and process massive amounts of data. These datacenters distribute vast datasets across a large number of servers, typically into memory-resident shards so as to maintain strict quality-of-service guarantees. While data is driving the skyrocketing demand for scale-out servers, processor and memory manufacturers have reached fundamental efficiency limits and can no longer increase server energy efficiency at a sufficient pace. As a result, energy has emerged as the main obstacle to the scalability of information technology (IT), with huge economic implications. Delivering sustainable IT calls for a paradigm shift in computer system design. As memory has taken a central role in IT infrastructure, memory-centric architectures are required to fully utilize IT's costly memory investment. In response, processor architects are resorting to manycore architectures to leverage the abundant request-level parallelism found in data-centric applications. Manycore processors fully utilize available memory resources, thereby increasing IT efficiency by almost an order of magnitude. Because manycore server chips execute a large number of concurrent requests, they exhibit a high incidence of accesses to the last-level cache for fetching instructions (due to large instruction footprints) and to off-chip memory for accessing dataset objects (due to a lack of temporal reuse in on-chip caches). As a result, on-chip interconnects and the memory system are emerging as major performance and energy-efficiency bottlenecks in servers. This thesis seeks to architect on-chip interconnects and memory systems that are tuned for the requirements of memory-centric scale-out servers.
By studying a wide range of data-centric applications, we uncover phenomena common to them and examine their implications for on-chip network and off-chip memory traffic. Finally, we propose specialized on-chip interconnects and memory systems that leverage common traffic characteristics, thereby improving server throughput and energy efficiency.

    Time-Predictable Communication on a Time-Division Multiplexing Network-on-Chip Multicore

    iOS Technologies & Frameworks

    Apple’s mobile platform, iOS, currently generates the largest amount of revenue of all mobile app stores. The majority of iDevices run the latest major iOS version (iOS 10) due to Apple users’ tendency to update their devices. Consequently, iOS developers are pressured into keeping their apps up to date. The advantages of updating apps include adding new features and adapting apps to the platform’s hardware and software evolution. However, this does not always happen. There are apps, some of them popular (with many users), which receive updates slowly or not at all. The main consequence of developers not keeping up with the latest trends (e.g., user interface or API changes) is the degradation of their apps’ user experience. This subpar user experience leads to a decrease in the number of installs (and sales) and a search for alternatives that have been updated to fully support the latest firmware iteration. We identified a common pattern among ten apps that have subpar reviews on the App Store: excessive battery consumption and a lack of user onboarding were just a few of the issues. Above all, almost all of those apps belong to the top 1% of apps (which generate 94% of the App Store’s revenue), so the lack of focus on the user experience is unfortunate considering their massive user bases. We listed the available resources for those wanting to develop or improve iOS apps. Given these requisites, we studied the possibility of developing, within a timeframe of six months, a mobile app that adopted good engineering practices and, above all, focused on delivering an excellent user experience. The idea was a wish list management app called Snapwish that allows users to take photos of objects they want, create wish lists, and share them with family and friends. The app allows for offline usage, with data syncing automatically (in real time) without user intervention whenever an Internet connection is available.
We tested Snapwish thoroughly to measure the quality of its implementation. Profiling helped confirm that core metrics such as CPU and memory usage, network data requests, and energy consumption were within acceptable values, while unit and user interface tests served to validate our code functionally. Furthermore, our team of five beta testers provided valuable feedback and suggestions. Ultimately, the six-month timeframe proved to be insufficient for a release on the App Store, as Snapwish remains in the late beta stages at the time of writing. This delay is mostly attributed to a lengthy testing process. Thus, we plan on releasing it in the first quarter of 2017.
Nowadays, Apple’s mobile platform, iOS, generates the most revenue among mobile app platforms. Most iOS mobile devices run the latest version (iOS 10), owing to their users’ tendency to update the operating system frequently. Consequently, the platform’s developers are pressured to keep their apps up to date. The advantages of updating include adding new features and adapting apps to the evolution of the platform’s hardware and software. However, this does not always happen. There are many apps, some “popular” (with many installs), whose updates are slow or never arrive. The main consequence of not updating apps to current trends, whether in terms of interaction or in terms of data protection mechanisms, battery consumption, and other aspects, is the degradation of the experience of those who use them and, consequently, a decrease in the number of installs (and sales) and a growing search for alternatives that take these principles into account. A common pattern was identified in ten apps whose App Store ratings are mediocre: excessive battery consumption and a lack of user onboarding were just some of the problems. Above all, almost all of them belong to the 1% of apps that generate 94% of the App Store’s revenue. The lack of focus on user experience is unfortunate considering these apps’ enormous user bases. The resources available to those who want to develop or improve an iOS app were listed. Given these premises, we studied the possibility of developing, within six months, a mobile app that adopts good engineering practices and, above all, focuses on user experience. The idea for the app was a wish list manager called Snapwish, which lets users take photos of objects they want, create lists, and share them with friends and family. In addition, the app allows offline use, and data is synchronized in real time without user intervention when the app has an Internet connection. Our app was tested extensively to measure the quality of its implementation. Profiling helped establish that fundamental metrics such as CPU and memory usage, network data requests, and energy (battery) consumption were within acceptable parameters. In addition, a team of five beta testers contributed valuable comments and suggestions. Ultimately, the six-month timeframe proved insufficient for releasing the app on the App Store. Snapwish remains in an advanced beta stage (at the time of writing this thesis). This delay is mainly attributed to an extensive testing process. Thus, we intend to release the app in the first quarter of 2017.

    Memory abstractions for parallel programming

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 156-163). A memory abstraction is an abstraction layer between the program execution and the memory that provides a different "view" of a memory location depending on the execution context in which the memory access is made. Properly designed memory abstractions help ease the task of parallel programming by mitigating the complexity of synchronization or admitting more efficient use of resources. This dissertation describes five memory abstractions for parallel programming: (i) cactus stacks that interoperate with linear stacks, (ii) efficient reducers, (iii) reducer arrays, (iv) ownership-aware transactions, and (v) location-based memory fences. To demonstrate the utility of memory abstractions, my collaborators and I developed Cilk-M, a dynamically multithreaded concurrency platform that embodies the first three memory abstractions. Many dynamic multithreaded concurrency platforms incorporate cactus stacks to support multiple stack views for all the active children simultaneously. The use of cactus stacks, albeit essential, forces concurrency platforms to trade off between performance, memory consumption, and interoperability with serial code, because cactus stacks are incompatible with linear stacks. This dissertation proposes a new strategy to build a cactus stack using thread-local memory mapping (or TLMM), which enables Cilk-M to satisfy all three criteria simultaneously. A reducer hyperobject allows different branches of a dynamic multithreaded program to maintain coordinated local views of the same nonlocal variable. With reducers, one can use nonlocal variables in a parallel computation without restructuring the code or introducing races. This dissertation introduces memory-mapped reducers, which admit much more efficient access than existing implementations.
When used in large quantities, reducers incur unnecessarily high overhead in execution time and space consumption. This dissertation describes support for reducer arrays, which offer the same functionality as an array of reducers with significantly less overhead. Transactional memory is a high-level synchronization mechanism, designed to be easier to use and more composable than fine-grain locking. This dissertation presents ownership-aware transactions, the first transactional memory design that provides provable safety guarantees for "open-nested" transactions. On architectures that implement memory models weaker than sequential consistency, programs communicating via shared memory must employ memory fences to ensure correct execution. This dissertation examines the concept of location-based memory fences, which, unlike traditional memory fences, incur latency only when synchronization is necessary. by I-Ting Angelina Lee. Ph.D.

    19th SC@RUG 2022 proceedings 2021-2022
