Sustainable Edge Computing: Challenges and Future Directions
An increasing amount of data is being injected into the network from IoT
(Internet of Things) applications. Many of these applications, developed to
improve society's quality of life, are latency-critical and inject large
amounts of data into the network. These requirements of IoT applications
have triggered the emergence of the Edge computing paradigm. Currently, data centers
account for between 2% and 3% of global energy use. However, this trend is
difficult to maintain, as bringing computing infrastructures closer to the edge
of the network comes with its own set of challenges for energy efficiency. In
this paper, we propose our approach for the sustainability of future computing
infrastructures to provide (i) an energy-efficient and economically viable
deployment, (ii) a fault-tolerant automated operation, and (iii) a
collaborative resource management to improve resource efficiency. We identify
the main limitations of applying Cloud-based approaches close to the data
sources and present the research challenges to Edge sustainability arising from
these constraints. We propose two-phase immersion cooling, formal modeling,
machine learning, and energy-centric federated management as Edge-enabling
technologies. We present our early results towards the sustainability of an
Edge infrastructure to demonstrate the benefits of our approach for future
computing environments and deployments.
Comment: 26 pages, 16 figures
Energy-aware service provisioning in P2P-assisted cloud ecosystems
Cotutela Universitat Politècnica de Catalunya i Instituto Tecnico de Lisboa
Energy has emerged as a first-class computing resource in modern systems. This trend has primarily led to a strong focus on reducing the energy consumption of data centers, coupled with growing awareness of their adverse environmental impact, and hence on energy management for server-class systems.
In this work, we intend to address the energy-aware service provisioning in P2P-assisted cloud ecosystems, leveraging economics-inspired mechanisms. Toward this goal, we addressed a number of challenges.
To frame an energy-aware service provisioning mechanism in the P2P-assisted cloud, we first need to compare the energy consumption of each individual service in the P2P-cloud and in data centers.
However, while decreasing the energy consumption of cloud services, we may run into performance violations.
Therefore, we formulate a performance-aware energy analysis metric, conceptualized across the service provisioning stack, and leverage this metric to derive an energy analysis framework.
Then, we sketch a framework to analyze energy effectiveness in P2P-cloud and data center platforms and to choose the right service platform according to performance and energy characteristics. This framework maps energy from the hardware-oblivious top level to the particular hardware setting in the bottom layer of the stack.
Afterwards, we introduce an economics-inspired mechanism to increase the energy effectiveness of the P2P-assisted cloud platform, moving toward both greener ICT and ICT for a greener ecosystem.
Architecting Data Centers for High Efficiency and Low Latency
Modern data centers, housing remarkably powerful computational capacity, are built at massive scale and consume a huge amount of energy. The energy consumption of data centers has mushroomed from virtually nothing to about three percent of the global electricity supply in the last decade, and will continue to grow. Unfortunately, a significant fraction of this energy consumption is wasted due to the inefficiency of current data center architectures, and one of the key reasons behind this inefficiency is the stringent response latency requirements of the user-facing services hosted in these data centers, such as web search and social networks. To deliver such low response latency, data center operators often have to overprovision resources to handle high peaks in user load and unexpected load spikes, resulting in low efficiency.
This dissertation investigates data center architecture designs that reconcile high system efficiency and low response latency. To increase the efficiency, we propose techniques that understand both microarchitectural-level resource sharing and system-level resource usage dynamics to enable highly efficient co-locations of latency-critical services and low-priority batch workloads. We investigate the resource sharing on real-system simultaneous multithreading (SMT) processors to enable SMT co-locations by precisely predicting the performance interference. We then leverage historical resource usage patterns to further optimize the task scheduling algorithm and data placement policy to improve the efficiency of workload co-locations. Moreover, we introduce methodologies to better manage the response latency by automatically attributing the source of tail latency to low-level architectural and system configurations in both offline load-testing and online production environments. We design and develop a response latency evaluation framework at microsecond-level precision for data center applications, with which we construct statistical inference procedures to attribute the source of tail latency. Finally, we present an approach that proactively enacts carefully designed causal inference micro-experiments to diagnose the root causes of response latency anomalies, and automatically correct them to reduce the response latency.
PhD dissertation, Computer Science & Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/144144/1/yunqi_1.pd
Energy-Efficient Content Delivery Networks
Internet-scale distributed systems such as content delivery networks (CDNs) operate hundreds of thousands of servers deployed in thousands of data center locations around the globe. Since the energy costs of operating such a large IT infrastructure are a significant fraction of the total operating costs, we argue for redesigning them to incorporate energy optimization as a first-order principle. We focus on CDNs and demonstrate techniques to save energy while meeting client-perceived service level agreements (SLAs) and minimizing impact on hardware reliability.
Servers deployed at individual data centers can be switched off at low load to save energy. We show that it is possible to save energy while providing client-perceived availability and limited impact on hardware reliability. We propose an optimal offline algorithm and an online algorithm to extract energy savings and evaluate them on real production workload traces. Our results show that it is possible to reduce the energy consumption of a CDN by 51% while ensuring a high level of availability and incurring an average of one on-off transition per server per day.
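The abstract's exact on-off algorithms are not given here; as a hedged illustration of the idea, the following sketch (hypothetical per-server capacity, target utilisation, and hysteresis margin) keeps just enough servers on for the current load while damping on-off transitions:

```python
import math

def servers_needed(load, cap_per_server, target_util=0.7):
    """Minimum number of servers serving `load` at or below target utilisation."""
    return max(1, math.ceil(load / (cap_per_server * target_util)))

def online_provision(loads, cap_per_server, margin=1):
    """Walk a load trace; scale up immediately, but only scale down when the
    surplus exceeds `margin` servers (hysteresis limits on-off transitions)."""
    active = servers_needed(loads[0], cap_per_server)
    plan, transitions = [], 0
    for load in loads:
        need = servers_needed(load, cap_per_server)
        if need > active:                # scale up immediately
            transitions += need - active
            active = need
        elif active - need > margin:     # scale down conservatively
            transitions += active - (need + margin)
            active = need + margin
        plan.append(active)
    return plan, transitions
```

The `margin` term trades a little energy for fewer transitions, mirroring the abstract's concern about hardware reliability (roughly one transition per server per day in their results).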
We propose a novel technique called cluster shutdown that switches off an entire cluster of servers, thus saving on both server and cooling power. We present an algorithm for cluster shutdown that is based on realistic power models for servers and cooling equipment and can be implemented as a part of the global load balancer of a CDN. We argue that cluster shutdown has intrinsic architectural advantages over server shutdown techniques in the CDN context, and show that it outperforms server shutdown in a wide range of operating regimes.
To reduce energy costs, we propose a demand-response technique that responds to pricing signals from a smart grid by deferring elastic load. We propose an optimal offline algorithm for demand response and evaluate it on production workloads from a commercial CDN using realistic electricity pricing models. We show that energy cost savings can be achieved with no increase in bandwidth cost.
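As a toy illustration of deferring elastic load toward cheap hours (a greedy sketch with hypothetical prices, not the optimal offline algorithm of the thesis):

```python
def schedule_elastic(jobs, prices):
    """jobs: list of (energy_kwh, deadline_hour); prices: $/kWh for each hour.
    Greedily run each elastic job in its cheapest feasible hour."""
    schedule, cost = [], 0.0
    for energy, deadline in jobs:
        # cheapest hour no later than the job's deadline
        hour = min(range(deadline + 1), key=lambda h: prices[h])
        schedule.append(hour)
        cost += energy * prices[hour]
    return schedule, cost
```

For example, with prices `[0.30, 0.10, 0.20, 0.05]` a 10 kWh job due by hour 2 runs at hour 1, while a 5 kWh job due by hour 3 waits for the cheap hour 3.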
Energy and performance-optimized scheduling of tasks in distributed cloud and edge computing systems
Infrastructure resources in distributed cloud data centers (CDCs) are shared by heterogeneous applications in a high-performance and cost-effective way. Edge computing has emerged as a new paradigm to provide access to computing capacity in end devices, yet it suffers from problems such as load imbalance, long scheduling times, and the limited power of edge nodes. Intelligent task scheduling in CDCs and edge nodes is therefore critically important for constructing energy-efficient cloud and edge computing systems. Current approaches cannot smartly minimize the total cost of CDCs, maximize their profit, and improve the quality of service (QoS) of tasks because of the aperiodic arrival and heterogeneity of tasks. This dissertation proposes a class of energy- and performance-optimized scheduling algorithms built on top of several intelligent optimization algorithms. It comprises two parts: background work (Chapters 3–6) and new contributions (Chapters 7–11).
1) Background work of this dissertation.
Chapter 3 proposes a spatial task scheduling and resource optimization method to minimize the total cost of CDCs where bandwidth prices of Internet service providers, power grid prices, and renewable energy all vary with locations. Chapter 4 presents a geography-aware task scheduling approach by considering spatial variations in CDCs to maximize the profit of their providers by intelligently scheduling tasks. Chapter 5 presents a spatio-temporal task scheduling algorithm to minimize energy cost by scheduling heterogeneous tasks among CDCs while meeting their delay constraints. Chapter 6 gives a temporal scheduling algorithm considering temporal variations of revenue, electricity prices, green energy and prices of public clouds.
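The common thread of these chapters, exploiting spatial price variation, can be illustrated with a deliberately simplified greedy placement (hypothetical prices and capacities; the chapters themselves use far more sophisticated optimization):

```python
def place_tasks(batches, dcs):
    """batches: list of task counts; dcs: dict name -> {"price": cost per task,
    "free": remaining task slots}. Route each batch to the cheapest CDC
    that still has capacity; return the placement and total energy cost."""
    placement, cost = [], 0.0
    for n in batches:
        candidates = sorted((d for d in dcs if dcs[d]["free"] >= n),
                            key=lambda d: dcs[d]["price"])
        if not candidates:
            raise RuntimeError("no CDC can host this batch")
        best = candidates[0]
        dcs[best]["free"] -= n
        placement.append(best)
        cost += n * dcs[best]["price"]
    return placement, cost
```

Once the cheap site fills up, later batches spill over to the more expensive one, which is exactly the spatial trade-off the chapters optimize globally rather than greedily.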
2) Contributions of this dissertation.
Chapter 7 proposes a multi-objective optimization method for CDCs to maximize their profit and minimize the average loss possibility of tasks by determining task allocation among Internet service providers and the task service rates of each CDC. A simulated annealing-based bi-objective differential evolution algorithm is proposed to obtain an approximate Pareto-optimal set, from which a knee solution is selected to schedule tasks in a high-profit and high-QoS way. Chapter 8 formulates a bi-objective constrained optimization problem and designs a novel optimization method to cope with energy cost reduction and QoS improvement. It jointly minimizes both the energy cost of CDCs and the average response time of all tasks by intelligently allocating tasks among CDCs and changing the task service rate of each CDC. Chapter 9 formulates a constrained bi-objective optimization problem for the joint optimization of revenue and energy cost of CDCs. It is solved with an improved multi-objective evolutionary algorithm based on decomposition, which determines a high-quality trade-off between revenue maximization and energy cost minimization by considering CDCs' spatial differences in energy cost while meeting tasks' delay constraints. Chapter 10 proposes a simulated annealing-based bees algorithm to find a close-to-optimal solution; a fine-grained spatial task scheduling algorithm is then designed to minimize the energy cost of CDCs by allocating tasks among multiple green clouds and specifying the running speeds of their servers. Chapter 11 proposes a profit-maximized collaborative computation offloading and resource allocation algorithm to maximize the profit of systems and guarantee that the response time limits of tasks are met in cloud-edge computing systems. The resulting single-objective constrained optimization problem is solved by a proposed simulated annealing-based migrating birds optimization algorithm.
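Several of these chapters hybridise simulated annealing with other metaheuristics. A minimal generic simulated-annealing skeleton, shown here only to fix ideas (not any of the chapter algorithms), looks like:

```python
import math
import random

def anneal(x0, cost, neighbour, t0=1.0, cooling=0.95, steps=500, seed=0):
    """Generic simulated annealing: always accept improving moves, accept
    worse moves with Boltzmann probability exp(-delta / temperature),
    and cool the temperature geometrically."""
    rng = random.Random(seed)
    x, c = x0, cost(x0)
    best, best_c = x, c
    t = t0
    for _ in range(steps):
        y = neighbour(x, rng)
        cy = cost(y)
        if cy <= c or rng.random() < math.exp((c - cy) / max(t, 1e-12)):
            x, c = y, cy
            if c < best_c:
                best, best_c = x, c
        t *= cooling
    return best, best_c
```

A bi-objective cost can be scalarised for such a skeleton, e.g. `cost = lambda s: w1 * loss(s) - w2 * profit(s)`, though the chapters instead maintain approximate Pareto sets.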
This dissertation evaluates these algorithms, models and software with real-life data and proves that they improve the scheduling precision and cost-effectiveness of distributed cloud and edge computing systems.
Performance modelling with adaptive hidden Markov models and discriminatory processor sharing queues
In modern computer systems, workload varies across times and locations. It is important to model the performance of such systems via workload models that are both representative and efficient. For example, model-generated workloads represent realistic system behaviour, especially during peak times, when it is crucial to predict and address performance bottlenecks. In this thesis, we model performance, namely throughput and delay, using adaptive models and discrete queues. Hidden Markov models (HMMs) parsimoniously capture the correlation and burstiness of workloads with spatiotemporal characteristics. By adapting the batch training of standard HMMs to incremental learning, online HMMs act as benchmarks on workloads obtained from live systems (i.e. storage systems and financial markets) and reduce the time complexity of the Baum-Welch algorithm. Similarly, by extending HMM capabilities to train on multiple traces simultaneously, workloads of different types are modelled in parallel by a multi-input HMM. Typically, the HMM-generated traces verify the throughput and burstiness of the real data. Applications of adaptive HMMs include predicting user behaviour in social networks and performance-energy measurements in smartphone applications. Equally important is measuring system delay through response times. For example, workloads such as Internet traffic arriving at routers are affected by queueing delays. To meet quality-of-service needs, queueing delays must be minimised; hence, it is important to model and predict such queueing delays in an efficient and cost-effective manner. Therefore, we propose a class of discrete, processor-sharing queues for approximating queueing delay as response time distributions, which represent service level agreements at specific spatiotemporal levels.
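Baum-Welch training, which the thesis adapts to incremental learning, is built on the forward pass. A textbook scaled forward pass for a discrete HMM (not the thesis's online variant) can be sketched as:

```python
import math

def forward_loglik(obs, pi, A, B):
    """Scaled forward pass. obs: observation indices; pi[i]: initial state
    probabilities; A[i][j]: transition probs; B[i][k]: emission probs.
    Returns log P(obs | model)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    ll = 0.0
    for t in range(1, len(obs)):
        s = sum(alpha)
        ll += math.log(s)                 # accumulate scaling factors
        alpha = [a / s for a in alpha]    # rescale to avoid underflow
        alpha = [sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                 for j in range(n)]
    return ll + math.log(sum(alpha))
```

The rescaling at each step is what keeps long traces numerically stable; incremental variants update the model after each such step rather than after a full batch sweep.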
We adapt discrete queues to model job arrivals with distributions given by a Markov-modulated Poisson process (MMPP) and served under discriminatory processor-sharing scheduling. Further, we propose a dynamic strategy of service allocation to minimise delays in UDP traffic flows whilst maximising a utility function.
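Discriminatory processor sharing can be illustrated with a toy discrete-time simulation (an idealised sketch, not the thesis's analytical model): each job in service receives capacity proportional to its class weight.

```python
def dps_finish_times(jobs, weights, dt=1e-3):
    """jobs: list of (class_id, size, arrival_time); weights: class_id -> weight.
    Discrete-time DPS: unit capacity is split among present jobs in
    proportion to their class weights. Returns each job's completion time."""
    remaining = [size for _, size, _ in jobs]
    done = [None] * len(jobs)
    t = 0.0
    while any(d is None for d in done):
        active = [i for i, d in enumerate(done)
                  if d is None and jobs[i][2] <= t]
        if active:
            total_w = sum(weights[jobs[i][0]] for i in active)
            for i in active:
                remaining[i] -= dt * weights[jobs[i][0]] / total_w
                if remaining[i] <= 0:
                    done[i] = t + dt
        t += dt
    return done
```

With two equal-sized jobs and weights 2:1, the heavy-weight job gets two-thirds of the server while both are present and finishes first, after which the light job runs at full speed.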
Temporal Modelling of Electricity Consumption in Life Cycle Assessment Applied to the ICT Context
Fossil fuels are a scarce energy resource. Since the industrial revolution, mankind has used and abused non-renewable energy sources, which are responsible for major environmental damage. The production of energy is one of the main challenges for global sustainable development.
In our society, we are witnessing exponential growth in the use of Information and Communication Technology (ICT) systems such as the Internet and telephony. ICT development enables the creation and optimization of many smart systems and the pooling of services, and it also helps mitigate climate change. However, because of their electricity consumption, ICT systems are also responsible for a share of greenhouse gas (GHG) emissions: 3% of the total. This fact motivates the sector to change in order to limit its GHG emissions. To properly evaluate and optimize ICT services, it is necessary to use evaluation methods that account for the specificity of these systems. Currently, the methods used to evaluate GHG emissions are not adapted to dynamic systems, which include ICT systems. The variations in electricity production over a day or even a month are not yet taken into account. This problem is far from being restricted to the modelling of GHG emissions; it extends to the overall variation in the production and consumption of electricity. The Life Cycle Assessment (LCA) method provides a useful and complete toolset for analysing environmental impacts but, as with the GHG computation methods, it must be adapted to dynamic problems. In the ICT context, the first step in solving this LCA problem is to be able to model the variations in time of electricity production.
This master's thesis introduces a new way to include the temporal variation of electricity consumption and production in LCA methods. First, it builds a historical hourly database of electricity production and import-export for three Canadian provinces: Alberta, Ontario and Quebec. It then develops a time-dependent model to predict their electricity consumption. The study is carried out in the context of a project implementing a cloud-computing service spanning these provinces. The consumption model then provides information to optimize the best place and time to use ICT services such as Internet messaging or server maintenance. This first-ever introduction of a time parameter allows more precision in LCA data. The disaggregation of electricity inventory flows in LCA refines the computed impacts of electricity production both historically and in real time.
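The kind of optimisation this enables can be sketched as follows, using a hypothetical hourly emission-intensity profile (gCO2-eq/kWh) to pick the cleanest window for a deferrable ICT task such as server maintenance:

```python
def cleanest_window(intensity, duration_h):
    """intensity: hourly emission intensity (gCO2-eq/kWh); return the
    (start_hour, average_intensity) of the lowest-emission contiguous
    window of `duration_h` hours."""
    best_start, best_avg = 0, float("inf")
    for s in range(len(intensity) - duration_h + 1):
        avg = sum(intensity[s:s + duration_h]) / duration_h
        if avg < best_avg:
            best_start, best_avg = s, avg
    return best_start, best_avg
```

The intensity values here are illustrative placeholders; in the thesis they would come from the hourly production and import-export model for each province.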
Some short-term predictions of Quebec's electricity exports and imports were also computed in this thesis. The goal is to foresee and optimize ICT service use in real time. The origin of a kilowatt-hour consumed in Quebec depends on the import-export exchanges with the surrounding regions. These exchanges rely mainly on the price of electricity, the weather, and Quebec's power demand. This makes it possible to plot a time-varying estimate of the environmental consequences of consuming a kilowatt-hour in Quebec, which can then be used to limit the GHG emissions of ICT services such as cloud computing or smart grids. A smart trade-off between electricity consumption and environmental impact will lead to more efficient sustainable development.
MACHS: Mitigating the Achilles Heel of the Cloud through High Availability and Performance-aware Solutions
Cloud computing is continuously growing as a business model for hosting information and communication technology applications. However, many concerns arise regarding the quality of service (QoS) offered by the cloud. One major challenge is the high availability (HA) of cloud-based applications. The key to achieving availability requirements is to develop an approach that is immune to cloud failures while minimizing service level agreement (SLA) violations. To this end, this thesis addresses the HA of cloud-based applications from different perspectives. First, the thesis proposes a component HA-aware scheduler (CHASE) to manage the deployments of carrier-grade cloud applications while maximizing their HA and satisfying their QoS requirements. Second, a Stochastic Petri Net (SPN) model is proposed to capture the stochastic characteristics of cloud services and quantify the expected availability offered by an application deployment. The SPN model is then associated with an extensible policy-driven cloud scoring system that integrates other cloud challenges (i.e. green and cost concerns) with HA objectives. The proposed HA-aware solutions are extended to include a live virtual machine migration model that provides a trade-off between migration time and downtime while maintaining HA objectives. Furthermore, the thesis proposes a generic input template for cloud simulators, GITS, to facilitate the creation of cloud scenarios while ensuring reusability, simplicity, and portability. Finally, an availability-aware CloudSim extension, ACE, is proposed. ACE extends the CloudSim simulator with failure injection, computational paths, repair, failover, load balancing, and other availability-based modules.
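The SPN model itself is too involved for a short sketch, but the steady-state availability arithmetic such a model quantifies can be illustrated with hypothetical component availabilities (series components must all be up; redundant replicas need at least one up):

```python
def series(avails):
    """All components required: availabilities multiply."""
    a = 1.0
    for x in avails:
        a *= x
    return a

def redundant(avail, replicas):
    """At least one of `replicas` identical components must be up."""
    return 1.0 - (1.0 - avail) ** replicas

# Hypothetical deployment: a front end, two redundant app servers, one DB.
deployment = series([0.999, redundant(0.99, 2), 0.9995])
```

Two 99%-available replicas already yield "four nines" for that tier, which is why HA-aware schedulers place redundant components on failure-independent hosts.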
Model-Based Design, Analysis, and Implementations for Power and Energy-Efficient Computing Systems
Modern computing systems are becoming increasingly complex. On one end of
the spectrum, personal computers now commonly support multiple processing
cores, and, on the other end, Internet services routinely employ thousands of
servers in distributed locations to provide the desired service to its users. In
such complex systems, concerns about energy usage and power consumption
are increasingly important. Moreover, growing awareness of environmental
issues has added to the overall complexity by introducing new variables to the
problem. In this regard, the ability to abstractly focus on the relevant details
allows model-based design to help significantly in the analysis and solution of
such problems.
In this dissertation, we explore and analyze model-based design for energy
and power considerations in computing systems. Although the presented techniques
are more generally applicable, we focus their application on large-scale
Internet services operating in U.S. electricity markets. Internet services are becoming
increasingly popular in the ICT ecosystem of today. The physical infrastructure
to support such services is commonly based on a group of cooperative
data centers (DCs) operating in tandem. These DCs are geographically
distributed to provide security and timing guarantees for their customers. To
provide services to millions of customers, DCs employ hundreds of thousands
of servers. These servers consume a large amount of energy that is traditionally
produced by burning coal and employing other environmentally hazardous
methods, such as nuclear and gas power generation plants. This large energy
consumption results in significant and fast-growing financial and environmental
costs. Consequently, for protection of local and global environments, governing
bodies around the globe have begun to introduce legislation to encourage
energy consumers, especially corporate entities, to increase the share of
renewable energy (green energy) in their total energy consumption. However,
in U.S. electricity markets, green energy is usually more expensive than energy
generated from traditional sources like coal or petroleum.
We model the overall problem in three sub-areas and explore different approaches
aimed at reducing the environmental footprint and operating costs
of multi-site Internet services, while honoring the Quality of Service (QoS) constraints
as contracted in service level agreements (SLAs).
Firstly, we model the load distribution among member DCs of a multi-site Internet
service. The use of green energy is optimized considering different factors
such as (a) geographically and temporally variable electricity prices, (b)
the multitude of available energy sources to choose from at each DC, (c) the necessity
to support more than one SLA, and (d) the requirement to offer more
than one service at each DC. Various approaches are presented for solving this
problem and extensive simulations using Google’s setup in North America are
used to evaluate the presented approaches.
Secondly, we explore the area of shaving the peaks in the energy demand of
large electricity consumers, such as DCs, by using a battery-based energy storage
system. The electrical demand of DCs is typically peaky, following the usage
cycles of their customers. The resulting peaks in electrical demand require the development
and maintenance of a costlier energy delivery mechanism, and are
often met using expensive gas or diesel generators which often have a higher
environmental impact. To shave the peak power demand, a battery can be used
which is charged during low load and is discharged during the peak loads.
Since the batteries are costly, we present a scheme to estimate the size of battery
required for any variable electrical load. The electrical load is modeled using
the concept of arrival curves from Network Calculus. Our analysis mechanism
can help determine the appropriate battery size for a given load arrival curve
to reduce the peak.
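As a hedged illustration of battery sizing (a direct trace simulation under simplifying assumptions, not the arrival-curve analysis of the dissertation):

```python
def battery_capacity_needed(load, cap):
    """load: power draw per time step; cap: maximum allowed grid draw.
    The battery discharges when load > cap and recharges (up to full)
    otherwise. Returns the minimum energy capacity, in power-units x steps,
    that keeps the grid draw at or below `cap` for this trace."""
    level = 0.0      # energy drawn so far relative to a full battery
    worst = 0.0      # deepest discharge seen -> required capacity
    for p in load:
        level += p - cap          # > 0 discharges, < 0 recharges
        level = max(level, 0.0)   # cannot charge above full
        worst = max(worst, level)
    return worst
```

For a trace `[5, 5, 9, 9, 5, 9]` with a cap of 7, two consecutive excess steps of 2 each require 4 units; a single low step recharges only 2, so the later peak pushes the deepest discharge back to 4, and that worst case is the capacity needed.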
Thirdly, we present techniques to employ intra-DC scheduling to regulate the
peak power usage of each DC. The model we develop applies equally to
an individual server with multi-/many-core chips and to a complete DC
with an intermix of homogeneous and heterogeneous servers. We evaluate
these approaches on single-core and multi-core chip processors and present the
results.
Overall, our work demonstrates the value of model-based design for intelligent
load distribution across DCs, storage integration, and per DC optimizations
for efficient energy management to reduce operating costs and environmental
footprint for multi-site Internet services
High-Performance Modelling and Simulation for Big Data Applications
This open access book was prepared as a final publication of the COST Action IC1406 "High-Performance Modelling and Simulation for Big Data Applications (cHiPSet)" project. Long considered important pillars of the scientific method, Modelling and Simulation have evolved from traditional discrete numerical methods to complex data-intensive continuous analytical optimisations. Resolution, scale, and accuracy have become essential to predict and analyse natural and complex systems in science and engineering. As their level of abstraction rises to afford a better discernment of the domain at hand, their representation becomes increasingly demanding of computational and data resources. On the other hand, High Performance Computing typically entails the effective use of parallel and distributed processing units coupled with efficient storage, communication and visualisation systems to underpin complex data-intensive applications in distinct scientific and technical domains. A seamless interaction of High Performance Computing with Modelling and Simulation is therefore arguably required in order to store, compute, analyse, and visualise large data sets in science and engineering. Funded by the European Commission, cHiPSet has provided a dynamic trans-European forum for its members and distinguished guests to openly discuss novel perspectives and topics of interest for these two communities. This cHiPSet compendium presents a set of selected case studies related to healthcare, biological data, computational advertising, multimedia, finance, bioinformatics, and telecommunications.