Search CORE

271 research outputs found

Virtualization techniques for memory resource exploitation

Author: Garrido Platero Luis Angel
Publication venue: Universitat Politècnica de Catalunya
Publication date: 26/11/2019

Cloud infrastructures have become indispensable in our daily lives with the rise of cloud-based services offered by companies like Facebook, Google, Amazon and many others. These cloud infrastructures use a large number of servers provisioned with their own computing resources. Each of these servers uses a piece of software, called the Hypervisor ("HV"), that allows it to create multiple virtual instances of the server's physical computing resources and abstract them into "Virtual Machines" (VMs). A VM runs an Operating System (OS), which in turn runs the applications. The VMs within the servers generate varying memory demand behavior. When the demand increases, costly operations such as (virtual) disk accesses and/or VM migrations can occur. As a result, it is necessary to optimize the utilization of the local memory resources within a single computing server. However, pressure on the memory resources can still increase, making it necessary to migrate the VM to a different server with larger memory or to add more memory to the same server. At this point, it is important to consider that some of the servers in the cloud infrastructure might have memory resources that they are not using. Considering the possibility of making more memory available to the servers that need it, new server architectures have been introduced that provide hardware support to enable servers to share their memory capacity. This thesis presents multiple contributions to the memory management problem. First, it addresses the problem of optimizing memory resources in a virtualized server through different types of memory abstractions. Two full contributions, called SmarTmem and CARLEMM, are presented for managing memory automatically within a single virtualized server. In this respect, a third contribution, called CAVMem, is also presented; it provides the foundation for CARLEMM. Second, this thesis presents two contributions for memory capacity aggregation across multiple servers, offering two mechanisms called GV-Tmem and vMCA, the latter being based on GV-Tmem but with significant enhancements. These mechanisms distribute a server's total memory locally within the server and globally across computing servers using a user-space process with high-level memory management policies.

Tesis Doctorals en Xarxa

UPCommons. Portal del coneixement obert de la UPC

Low power architectures for streaming applications

Author: He Y.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2013

Repository TU/e

Pure OAI Repository

FatPaths: Routing in Supercomputers and Data Centers when Shortest Paths Fall Short

Author: Besta Maciej; Cynk Karolina; Di Girolamo Salvatore; Henriksson Erik; Hoefler Torsten; Konieczny Marek; Schneider Marcel; Singla Ankit
Publication date: 01/01/2020

We introduce FatPaths: a simple, generic, and robust routing architecture that enables state-of-the-art low-diameter topologies such as Slim Fly to achieve unprecedented performance. FatPaths targets Ethernet stacks in both HPC supercomputers and cloud data centers and clusters. FatPaths exposes and exploits the rich ("fat") diversity of both minimal and non-minimal paths for high-performance multi-pathing. Moreover, FatPaths uses a redesigned "purified" transport layer that removes virtually all TCP performance issues (e.g., the slow start), and incorporates flowlet switching, a technique used to prevent packet reordering in TCP networks, to enable very simple and effective load balancing. Our design enables recent low-diameter topologies to outperform powerful Clos designs, achieving 15% higher net throughput at 2x lower latency for comparable cost. FatPaths will significantly accelerate Ethernet clusters that form more than 50% of the Top500 list and it may become a standard routing scheme for modern topologies.

arXiv.org e-Print Archive

Repository for Publications and Research Data

Recommended from our members

Intelligent and bandwidth-efficient medium access control protocols for IEEE 802.11p-based Vehicular Ad hoc Networks

Author: Pressas Andreas
Publication date: 08/06/2020

Vehicle-to-Vehicle (V2V) technology aims to enable safer and more sophisticated transportation via the spontaneous formation of Vehicular Ad hoc Networks (VANETs). This type of wireless network allows the exchange of kinematic and other data among vehicles, for the primary purpose of safer and more efficient driving, as well as efficient traffic management and other third-party services. Their infrastructure-less, unbounded nature allows the formation of dense networks that present a channel sharing issue, which is harder to tackle than in conventional WLANs. This thesis focuses on optimising channel access strategies, which is important for the efficient usage of the available wireless bandwidth and the successful deployment of VANETs. To start with, the default channel access control method for V2V is evaluated on hardware by modifying the appropriate Linux wireless interface driver to enable finer on-the-fly control of IEEE 802.11p access control layer parameters. More complex channel sharing scenarios are evaluated via simulations, and findings on the behaviour of the access control mechanism are presented. A complete channel sharing efficiency assessment is conducted, including throughput, fairness and latency measurements. A new IEEE 802.11p-compatible Q-Learning-based access control approach that improves upon the studied protocol is presented. The stations feature algorithms that "learn" how to act optimally in VANETs in order to maximise their achieved packet delivery and minimise bandwidth wastage. The feasibility of using Q-Learning as the basis of self-learning protocols for IEEE 802.11p-based V2V communication access control in dense environments is investigated in terms of parameter tuning, necessary time of exploration, achieving latency requirements, scaling, multi-hop and accommodation of simultaneous applications. Additionally, the novel Collection Contention Estimation (CCE) mechanism for Q-Learning-based access control is presented. By embedding it in the Q-Learning agents, faster convergence, higher throughput, better service separation and short-term fairness are achieved in simulated network deployments. The new insights gained into the network performance of the proposed algorithms can provide precise guidelines for efficient designs of practical, reliable, fair and ultra-low latency V2V communication systems for dense topologies. These results can potentially have an impact across a range of related areas, including various types of wireless networks and resource allocation for these, network protocol and transceiver design, as well as Q-Learning applicability and considerations for correct use.

Sussex Research Online

Robust and secure resource management for automotive cyber-physical systems

Author: Kukkala Vipin Kumar
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2022

2022 Spring. Includes bibliographical references. Modern vehicles are examples of complex cyber-physical systems with tens to hundreds of interconnected Electronic Control Units (ECUs) that manage various vehicular subsystems. With the shift towards autonomous driving, emerging vehicles are being characterized by an increase in the number of hardware ECUs, greater complexity of applications (software), and more sophisticated in-vehicle networks. These advances have resulted in numerous challenges that impact the reliability, security, and real-time performance of these emerging automotive systems. Some of the challenges include coping with computation and communication uncertainties (e.g., jitter), developing robust control software, detecting cyber-attacks, ensuring data integrity, and enabling confidentiality during communication. However, solutions to overcome these challenges incur additional overhead, which can catastrophically delay the execution of real-time automotive tasks and message transfers. Hence, there is a need for a holistic approach to a system-level solution for resource management in automotive cyber-physical systems that enables robust and secure automotive system design while satisfying a diverse set of system-wide constraints. ECUs in vehicles today run a variety of automotive applications ranging from simple vehicle window control to highly complex Advanced Driver Assistance System (ADAS) applications. The aggressive attempts of automakers to make vehicles fully autonomous have increased the complexity and data rate requirements of applications and further led to the adoption of advanced artificial intelligence (AI) based techniques for improved perception and control. Additionally, modern vehicles are becoming increasingly connected with various external systems to realize more robust vehicle autonomy. 
These paradigm shifts have resulted in significant overheads in resource constrained ECUs and increased the complexity of the overall automotive system (including heterogeneous ECUs, network architectures, communication protocols, and applications), which has severe performance and safety implications on modern vehicles. The increased complexity of automotive systems introduces several computation and communication uncertainties in automotive subsystems that can cause delays in applications and messages, resulting in missed real-time deadlines. Missing deadlines for safety-critical automotive applications can be catastrophic, and this problem will be further aggravated in the case of future autonomous vehicles. Additionally, due to the harsh operating conditions (such as high temperatures, vibrations, and electromagnetic interference (EMI)) of automotive embedded systems, there is a significant risk to the integrity of the data that is exchanged between ECUs which can lead to faulty vehicle control. These challenges demand a more reliable design of automotive systems that is resilient to uncertainties and supports data integrity goals. Additionally, the increased connectivity of modern vehicles has made them highly vulnerable to various kinds of sophisticated security attacks. Hence, it is also vital to ensure the security of automotive systems, and it will become crucial as connected and autonomous vehicles become more ubiquitous. However, imposing security mechanisms on the resource constrained automotive systems can result in additional computation and communication overhead, potentially leading to further missed deadlines. Therefore, it is crucial to design techniques that incur very minimal overhead (lightweight) when trying to achieve the above-mentioned goals and ensure the real-time performance of the system. 
We address these issues by designing a holistic resource management framework called ROSETTA that enables robust and secure automotive cyber-physical system design while satisfying a diverse set of constraints related to reliability, security, real-time performance, and energy consumption. To achieve reliability goals, we have developed several techniques for reliability-aware scheduling and multi-level monitoring of signal integrity. To achieve security objectives, we have proposed a lightweight security framework that provides confidentiality and authenticity while meeting both security and real-time constraints. We have also introduced multiple deep learning based intrusion detection systems (IDS) to monitor and detect cyber-attacks in the in-vehicle network. Lastly, we have introduced novel techniques for jitter management and security management and deployed lightweight IDSs on resource constrained automotive ECUs while ensuring the real-time performance of the automotive systems.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Proceedings of the 5th International Workshop on Reconfigurable Communication-centric Systems on Chip 2010 - ReCoSoC'10 - May 17-19, 2010 Karlsruhe, Germany. (KIT Scientific Reports; 7551)

Author: Becker Jürgen; Hübner Michael; Lagadec Loïc; Sander Oliver
Publication venue: KIT Scientific Publishing, Karlsruhe
Publication date: 01/01/2010

ReCoSoC is intended to be a periodic annual meeting to expose and discuss gathered expertise as well as state of the art research around SoC related topics through plenary invited papers and posters. The workshop aims to provide a prospective view of tomorrow's challenges in the multibillion transistor era, taking into account the emerging techniques and architectures exploring the synergy between flexible on-chip communication and system reconfigurability.

KITopen

Design Optimization of IEEE Time-Sensitive Networks (TSN) for Safety-Critical and Real-Time Applications

Author: Gavrilut Voica Maria
Publication venue: DTU Compute
Publication date: 01/01/2018

Online Research Database In Technology

Constraint driven operation assignment for retargetable VLIW compilers

Author: Bekooij M.J.G.
Publication venue: Technische Universiteit Eindhoven
Publication date: 01/01/2004

Many consumer electronics products use processors to process digitized signals. These processors are usually embedded in a system and must meet stringent requirements on computing power, power consumption, and manufacturing cost. Optimizing a processor for a specific task, or a small set of tasks, makes it possible to meet stricter requirements. This specialization leads to a greater diversity of processor types. Automated processor design and programming systems are used in an attempt to keep development costs under control. One way to optimize a processor is to use an incomplete communication network inside it. In addition, it is desirable to use multiple register files in a processor with a large number of parallel functional units. As a consequence of these optimizations, a great deal of help and expertise from the programmer is needed to generate high-quality microcode with the traditional code generation techniques of a compiler. With the code generation method described in this thesis, it is in many cases possible to generate high-quality microcode fully automatically. An incomplete network in the processor makes assigning basic operations to functional units a difficult task for the code generator. An assignment must be made such that, for every operation executed on a functional unit, there is a channel in the processor's network that can be used to send the result to the functional unit executing the operation that consumes that result. This communication channel and the functional unit must also be available at the required time. The proposed code generation method searches for a solution. After each operation assignment decision, it analyzes which future decision options cannot be part of a solution given the decisions already made. These options are removed from the search space, so that other assignment decisions are tried in future decisions. If it is detected that no solution exists given the decisions made, decisions are undone and other options are tried. Removing as many decision options as possible that do not belong to a solution reduces the number of times a decision has to be revisited and the time needed to find a solution. For the problem of assigning operations to functional units, a conflict graph is constructed that represents all options and the combinations of options that are not allowed. Options that certainly do not belong to a solution are found with computationally efficient algorithms. If analysis establishes that two operations must be executed at the same time, an edge is added to the conflict graph. This edge rules out assigning both operations to the same functional unit. If it is established that an operation must be executed on a specific functional unit, this information is used to determine more precisely the time interval in which the operation can be executed. The proposed assignment techniques have been implemented in a prototype code generator, FACTS. This code generator is coupled to the processor synthesis environment AjRT-designer. Coupling FACTS to AjRT-designer makes it possible to reprogram processors that are frozen after synthesis. This environment was used to evaluate the code generation techniques in FACTS on industrially relevant application-domain-specific processor designs. The results show that, in many cases, these techniques can generate microcode that respects the storage capacity of the register files and the available connections in the VLIW processor and meets stringent requirements on computation time.
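
The search strategy outlined in this abstract (decide, prune options that can no longer lead to a solution, undo decisions on a dead end) can be illustrated with a small sketch. It is not the FACTS implementation: the function name `assign_operations`, the input format, and the operation and unit names are assumptions made for illustration, and the sketch models only the rule that two operations joined by a conflict edge may not share a functional unit. Network channels, register files, and timing intervals are left out.

```python
from typing import Dict, Optional, Set, Tuple


def assign_operations(
    options: Dict[str, Set[str]],
    conflicts: Set[Tuple[str, str]],
) -> Optional[Dict[str, str]]:
    """Assign every operation to one functional unit, or return None.

    options:   operation -> units able to execute it (hypothetical format)
    conflicts: edges of the conflict graph; the two operations of an edge
               must run at the same time, so they may not share a unit
    """
    # Adjacency view of the conflict graph.
    neighbours: Dict[str, Set[str]] = {op: set() for op in options}
    for a, b in conflicts:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def search(domains: Dict[str, Set[str]],
               assignment: Dict[str, str]) -> Optional[Dict[str, str]]:
        if len(assignment) == len(domains):
            return assignment
        # Decide next on the operation with the fewest remaining options.
        op = min((o for o in domains if o not in assignment),
                 key=lambda o: (len(domains[o]), o))
        for unit in sorted(domains[op]):
            # Pruning: this unit is no longer an option for any
            # unassigned operation that conflicts with `op`.
            pruned = {o: set(d) for o, d in domains.items()}
            pruned[op] = {unit}
            dead_end = False
            for other in neighbours[op]:
                if other not in assignment:
                    pruned[other].discard(unit)
                    if not pruned[other]:
                        dead_end = True  # `other` has no option left
                        break
            if dead_end:
                continue
            result = search(pruned, {**assignment, op: unit})
            if result is not None:
                return result
        return None  # undo this decision and let the caller try another

    return search({o: set(u) for o, u in options.items()}, {})


if __name__ == "__main__":
    ops = {"add1": {"alu0", "alu1"}, "add2": {"alu0"}, "mul1": {"alu1", "mul0"}}
    edges = {("add1", "add2"), ("add1", "mul1")}
    print(assign_operations(ops, edges))
```

Pruning at decision time means an operation's remaining options never include a unit already taken by a conflicting operation, so no separate consistency check is needed when a unit is chosen.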

Repository TU/e

Pure OAI Repository