1,031 research outputs found
Toward High-Performance Computing and Big Data Analytics Convergence: The Case of Spark-DIY
Convergence between high-performance computing (HPC) and big data analytics (BDA) is currently an established research area that has spawned new opportunities for unifying the platform layer and data abstractions in these ecosystems. This work presents an architectural model that enables the interoperability of established BDA and HPC execution models, reflecting the key design features that interest both the HPC and BDA communities, and including an abstract data collection and operational model that generates a unified interface for hybrid applications. This architecture can be implemented in different ways depending on the process- and data-centric platforms of choice and the mechanisms put in place to effectively meet the requirements of the architecture. The Spark-DIY platform is introduced in the paper as a prototype implementation of the proposed architecture. It preserves the interfaces and execution environment of the popular BDA platform Apache Spark, making it compatible with any Spark-based application and tool, while providing efficient communication and kernel execution via DIY, a powerful communication pattern library built on top of MPI. Spark-DIY is then analyzed in terms of performance by building a representative use case from the hydrogeology domain, EnKF-HGS. This application is a clear example of how current HPC simulations are evolving toward hybrid HPC-BDA applications, integrating HPC simulations within a BDA environment. This work was supported in part by the Spanish Ministry of Economy, Industry and Competitiveness under Grant TIN2016-79637-P (Toward Unification of HPC and Big Data Paradigms), in part by the Spanish Ministry of Education under Grant FPU15/00422, Training Program for Academic and Teaching Staff Grant, in part by the Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357, and in part by the DOE under Agreement DE-DC000122495, Program Manager Laura Biven.
Trustworthy Knowledge Planes For Federated Distributed Systems
In federated distributed systems, such as the Internet and the public cloud, the constituent systems can differ in their configuration and provisioning, resulting in significant impacts on the performance, robustness, and security of applications. Yet these systems lack support for distinguishing such characteristics, resulting in uninformed service selection and poor inter-operator coordination. This thesis presents the design and implementation of a trustworthy knowledge plane that can determine such characteristics about autonomous networks on the Internet. A knowledge plane collects the state of network devices and participants. Using this state, applications infer whether a network possesses some characteristic of interest. The knowledge plane uses attestation to attribute state descriptions to the principals that generated them, thereby making the results of inference more trustworthy. Trustworthy knowledge planes enable applications to establish stronger assumptions about their network operating environment, resulting in improved robustness and reduced deployment barriers. We have prototyped the knowledge plane and associated devices. Experience with deploying analyses over production networks demonstrates that knowledge planes impose low cost and can scale to support Internet-scale networks.
Prism: Revealing Hidden Functional Clusters from Massive Instances in Cloud Systems
Ensuring the reliability of cloud systems is critical for both cloud vendors
and customers. Cloud systems often rely on virtualization techniques to create
instances of hardware resources, such as virtual machines. However,
virtualization hinders the observability of cloud systems, making it
challenging to diagnose platform-level issues. To improve system observability,
we propose to infer functional clusters of instances, i.e., groups of instances
having similar functionalities. We first conduct a pilot study on a large-scale
cloud system, i.e., Huawei Cloud, demonstrating that instances having similar
functionalities share similar communication and resource usage patterns.
Motivated by these findings, we formulate the identification of functional
clusters as a clustering problem and propose a non-intrusive solution called
Prism. Prism adopts a coarse-to-fine clustering strategy. It first partitions
instances into coarse-grained chunks based on communication patterns. Within
each chunk, Prism further groups instances with similar resource usage patterns
to produce fine-grained functional clusters. Such a design reduces noise in
the data and allows Prism to process massive instances efficiently. We evaluate
Prism on two datasets collected from the real-world production environment of
Huawei Cloud. Our experiments show that Prism achieves a v-measure of ~0.95,
surpassing existing state-of-the-art solutions. Additionally, we illustrate the
integration of Prism within monitoring systems for enhanced cloud reliability
through two real-world use cases. Comment: The paper was accepted by the 38th IEEE/ACM International Conference
on Automated Software Engineering (ASE 2023).
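The coarse-to-fine strategy described in the abstract above can be illustrated with a minimal sketch. The abstract does not say how Prism partitions by communication pattern or how it measures resource-usage similarity, so everything concrete here is an assumption made for illustration: the coarse step uses connected components of a communication graph, the fine step greedily groups instances by Pearson correlation of their usage series, and the names `coarse_chunks`, `fine_clusters`, `coarse_to_fine`, and the `threshold` parameter are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def coarse_chunks(instances, edges):
    """Coarse step (assumed): connected components of the communication graph."""
    parent = {i: i for i in instances}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)

    chunks = defaultdict(list)
    for i in instances:
        chunks[find(i)].append(i)
    return list(chunks.values())


def pearson(x, y):
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def fine_clusters(chunk, usage, threshold=0.9):
    """Fine step (assumed): greedy grouping by correlation with a cluster's first member."""
    clusters = []
    for inst in chunk:
        for cluster in clusters:
            if pearson(usage[inst], usage[cluster[0]]) >= threshold:
                cluster.append(inst)
                break
        else:
            clusters.append([inst])
    return clusters


def coarse_to_fine(instances, edges, usage, threshold=0.9):
    """Run the fine step independently inside every coarse chunk."""
    result = []
    for chunk in coarse_chunks(instances, edges):
        result.extend(fine_clusters(chunk, usage, threshold))
    return result


# Hypothetical toy data: "d" has the same usage as "a" but never talks to it.
instances = ["a", "b", "c", "d"]
edges = [("a", "b"), ("b", "c")]
usage = {"a": [1, 2, 3, 4], "b": [2, 4, 6, 8], "c": [4, 3, 2, 1], "d": [1, 2, 3, 4]}
clusters = coarse_to_fine(instances, edges, usage)
```

In this toy input, the coarse step keeps "d" apart from "a" even though their usage series match, which is the point of partitioning first: resource-usage comparisons only happen inside a chunk, so the pairwise work stays small.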
New Waves of IoT Technologies Research – Transcending Intelligence and Senses at the Edge to Create Multi Experience Environments
The next wave of Internet of Things (IoT) and Industrial Internet of Things (IIoT) brings new technological developments that incorporate radical advances in Artificial Intelligence (AI), edge computing processing, new sensing capabilities, more security protection and autonomous functions, accelerating progress towards the ability of IoT systems to self-develop, self-maintain and self-optimise. The emergence of hyper autonomous IoT applications with enhanced sensing, distributed intelligence, edge processing and connectivity, combined with human augmentation, has the potential to power the transformation and optimisation of industrial sectors and to change the innovation landscape. This chapter reviews the most recent advances in the next wave of the IoT by looking not only at the technology enabling the IoT but also at the platforms and smart data aspects that will bring intelligence, sustainability, dependability and autonomy, and will support human-centric solutions.
Novel processes for smart grid information exchange and knowledge representation using the IEC common information model
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University. The IEC Common Information Model (CIM) is of central importance in enabling smart grid interoperability. Its continual development aims to meet the needs of the smart grid for semantic understanding and knowledge representation for a widening domain of resources and processes. With smart grid evolution, information and data management has become an increasingly pressing issue, not only because far more data is being generated by modern sensing, control and measuring devices but also because information is now becoming recognised as the ‘integral component’ that facilitates the optimal flexibility required of the smart grid. This thesis looks at the impacts of CIM implementation upon the landscape of smart grid issues and presents research from within National Grid contributing to three key areas in support of further CIM deployment. Taking the issue of Enterprise Information Management first, an information management framework is presented for CIM deployment at National Grid. Following this, the development and demonstration of a novel secure cloud computing platform to handle such information is described. Power system application (PSA) models of the grid are partial knowledge representations of a shared reality. To develop the completeness of our understanding of this reality it is necessary to combine these representations. The second research contribution reports on a novel methodology for a CIM-based model repository to align PSA representations and provide a knowledge resource for building utility business intelligence of the grid. The third contribution addresses the need for greater integration of information relating to energy storage, an essential aspect of smart energy management. It presents the strategic rationale for integrated energy modeling and a novel extension to the existing CIM standards for modeling grid-scale energy storage. Significantly, this work has already contributed to a larger body of work on modeling Distributed Energy Resources currently under development at the Electric Power Research Institute (EPRI) in the USA. Dr. Martin Bradley on behalf of National Grid Plc. and the Engineering and Physical Sciences Research Council (EPSRC)
Universal Mobile Service Execution Framework for Device-To-Device Collaborations
There is high demand for effective, high-performance collaboration between mobile devices in places where traditional Internet connections are unavailable, unreliable, or significantly overburdened, such as battlefields, disaster zones, isolated rural areas, or crowded public venues. To enable collaboration among devices in opportunistic networks, code offloading and Remote Method Invocation are the two major mechanisms for ensuring that portions of application code are successfully transmitted to and executed on remote platforms. Although these areas have been actively researched for a decade, limitations in multi-device connectivity, system error handling and cross-platform compatibility prevent these technologies from being broadly applied in the mobile industry.
To address the above problems, we designed and developed UMSEF - a Universal Mobile Service Execution Framework, which is an innovative and radical approach to mobile computing in opportunistic networks. Our solution is built as a component-based mobile middleware architecture that is flexible and adaptive to multiple network topologies, tolerant of network errors and compatible with multiple platforms. We provide an effective algorithm to estimate the resource availability of a device in order to improve performance and energy consumption, and a novel platform for mobile remote method invocation based on declarative annotations over multi-group device networks. Real-world experiments show that our approach not only achieves better performance and energy consumption, but can also be extended to large-scale ubiquitous or IoT systems.
Monitoring in Hybrid Cloud-Edge Environments
The increasing number of mobile and IoT (Internet of Things) devices accessing cloud
services contributes to a surge of requests towards the Cloud and consequently, higher
latencies. This is aggravated by the possible congestion of the communication networks
connecting the end devices and remote cloud datacenters, due to the large data volume
generated at the Edge (e.g. in the domains of smart cities, smart cars, etc.). One solution
for this problem is the creation of hybrid Cloud/Edge execution platforms composed of
computational nodes located in the periphery of the system, near data producers and consumers,
as a way to complement the cloud resources. These edge nodes offer computation
and data storage resources to accommodate local services in order to ensure rapid responses
to clients (enhancing the perceived quality of service) and to filter data, reducing
the traffic volume towards the Cloud. Usually these nodes (e.g. ISP access points and on-premises
servers) are heterogeneous, geographically distributed, and resource-restricted
(including in communication networks), which increases the complexity of managing them.
At the application level, the microservices paradigm, represented by applications composed
of small, loosely coupled services, offers an adequate and flexible solution to design
applications that may explore the limited computational resources in the Edge.
Nevertheless, the inherently difficult management of microservices within such complex
infrastructure demands an agile and lightweight monitoring system that takes into
account the Edge’s limitations, which goes beyond traditional monitoring solutions at the
Cloud. Monitoring in these new domains is not a simple process since it requires supporting
the elasticity of the monitored system, the dynamic deployment of services and,
moreover, doing so without overloading the infrastructure’s resources with its own computational
requirements and generated data. Towards this goal, this dissertation presents
a hybrid monitoring architecture where the heavier (resource-wise) components reside
in the Cloud while the lighter (computationally less demanding) components reside in
the Edge. The architecture provides relevant monitoring functionalities such as metrics’
acquisition, their analysis and mechanisms for real-time alerting. The objective is the efficient use of computational resources in the infrastructure while guaranteeing an agile
delivery of monitoring data where and when it is needed. There has been a significant increase in mobile and
IoT (Internet of Things) devices in emerging areas such as Smart Cities, Smart Cars, etc., which
issue requests to services typically located in the Cloud, often from
remote locations. As a consequence, the latency of processing these requests is expected to rise,
which may be aggravated by congestion of the communication channels
from the periphery to the datacenters. One way to address this problem
is to create hybrid Cloud/Edge systems composed of computational nodes
located at the periphery of the system, close to data producers and consumers,
thereby complementing the computational resources of the Cloud. Edge nodes
not only host data and computations, ensuring faster responses to
clients and better quality of service, but also filter some of the
data, thus avoiding unnecessary data transfers to the core of the system.
However, many of these nodes (e.g. access points, proprietary servers) have
limited capacity, are quite heterogeneous and/or are geographically scattered,
which makes resource management difficult. The microservices paradigm,
represented by applications composed of small services that are decoupled
in their functionality and communicate through messages, provides an adequate solution
for exploiting computational resources at the periphery.
However, the appropriate mapping of microservices onto the infrastructure, besides
being complex, is difficult to manage and requires a lightweight and agile monitoring system that
takes into account the limited capacity of the supporting infrastructure at the periphery. Monitoring
is not a simple process, since it must support the elasticity of the system, taking
deployment adaptations into account, without overloading computational
or network resources. This work presents a hybrid monitoring architecture, with
more complex components in the Cloud and simpler components at the Edge. The
architecture provides important monitoring functionalities, such as the collection of varied metrics, their analysis, and real-time alerts. The objective is to make the most of
computational resources while guaranteeing the delivery of the most relevant data when needed.
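The split between lighter Edge components and heavier Cloud components described in this abstract can be pictured with a minimal sketch. The dissertation's actual components and interfaces are not given here, so `EdgeAgent`, `CloudAnalyzer`, their methods, and the threshold and window parameters are all hypothetical names chosen for illustration: the edge side only buffers samples, raises threshold alerts locally, and forwards compact summaries, while the cloud side keeps history across nodes.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class EdgeAgent:
    """Hypothetical lightweight edge component: buffers samples and alerts locally."""
    node: str
    alert_threshold: float
    window: int = 5
    samples: list = field(default_factory=list)

    def record(self, value):
        """Store a sample; return an alert message if it exceeds the threshold."""
        self.samples.append(value)
        if value > self.alert_threshold:
            return f"ALERT {self.node}: {value} > {self.alert_threshold}"
        return None

    def flush(self):
        """Return a compact summary once the window is full, else None."""
        if len(self.samples) < self.window:
            return None
        summary = {
            "node": self.node,
            "mean": mean(self.samples),
            "max": max(self.samples),
            "count": len(self.samples),
        }
        self.samples.clear()  # raw samples never leave the edge node
        return summary


class CloudAnalyzer:
    """Hypothetical heavier cloud component: stores summaries from all nodes."""

    def __init__(self):
        self.history = {}

    def ingest(self, summary):
        self.history.setdefault(summary["node"], []).append(summary)

    def busiest_node(self):
        """Node with the highest average of its reported window means."""
        return max(self.history, key=lambda n: mean(s["mean"] for s in self.history[n]))


# Hypothetical usage: CPU percentages sampled on one edge node.
agent = EdgeAgent("edge-1", alert_threshold=90.0, window=3)
cloud = CloudAnalyzer()
first_alert = agent.record(50)
second_alert = agent.record(95)
early_summary = agent.flush()
agent.record(40)
summary = agent.flush()
cloud.ingest(summary)
```

Sending one summary per window instead of every raw sample is one way to reduce traffic towards the Cloud while still alerting in real time at the Edge; the window size trades freshness of cloud-side data against bandwidth.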
- …