Pando: Personal Volunteer Computing in Browsers
The widespread ownership and continued growth of personal
electronic devices represent a freely available and largely untapped source of
computing power. To leverage it, we present Pando, a new volunteer computing
tool based on a declarative concurrent programming model and implemented using
JavaScript, WebRTC, and WebSockets. The tool enables a dynamically varying
number of failure-prone personal devices, contributed by volunteers, to
parallelize the application of a function to a stream of values using the
devices' browsers. We show that Pando can provide throughput improvements
over a single personal device on a variety of compute-bound
applications, including animation rendering and image processing. We also show
the flexibility of our approach by deploying Pando on personal devices
connected over a local network, on Grid5000, a France-wide computing grid in a
virtual private network, and on seven PlanetLab nodes distributed across Europe
in a wide-area network.
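
The declarative model described above, applying a function to each value of a stream across failure-prone devices, can be illustrated with a small sketch. The names below (parallelMap, Worker) are hypothetical, not Pando's actual API; the sketch only shows how resubmitting failed values lets a dynamically varying pool of unreliable volunteers still produce one result per input.

```typescript
// Hypothetical sketch: apply a pure function to a stream of values in
// parallel across unreliable workers, resubmitting values whose worker failed.
type Worker<T, R> = (value: T) => Promise<R>;

async function parallelMap<T, R>(
  values: T[],
  workers: Worker<T, R>[],
): Promise<R[]> {
  const results = new Map<number, R>();
  const pending: number[] = values.map((_, i) => i);

  while (pending.length > 0) {
    const batch = pending.splice(0, workers.length);
    await Promise.all(
      batch.map(async (i, k) => {
        try {
          results.set(i, await workers[k % workers.length](values[i]));
        } catch {
          pending.push(i); // device failed or left: resubmit the value
        }
      }),
    );
  }
  return values.map((_, i) => results.get(i)!);
}

// Example: a compute-bound function and one flaky "volunteer browser".
const square: Worker<number, number> = async (x) => x * x;
const flaky: Worker<number, number> = async (x) => {
  if (Math.random() < 0.3) throw new Error("device disconnected");
  return x * x;
};

parallelMap([1, 2, 3, 4, 5], [square, flaky]).then(console.log); // [1, 4, 9, 16, 25]
```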
Cookery: A Framework for Creating Data Processing Pipelines Using Online Services
With the increasing amount of data, the importance of data analysis in various scientific domains has grown. A large share of scientific data has shifted to cloud-based storage, and the cloud offers both storage and computational power. The Cookery framework is a tool developed to build scientific applications using cloud services. In this paper we present the Cookery system and show how it can authenticate with and use standard online third-party services to easily create data-analytics pipelines. Cookery is not limited to standard web services; it can also integrate with the emerging AWS Lambda, part of a new computing paradigm collectively known as serverless computing. The combination of AWS Lambda and Cookery makes it possible for users in many scientific domains, even those without any programming experience, to create data processing pipelines using cloud services in a short time.
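
As an illustration of the pipeline style described above, the following sketch chains small named steps, one of which stands in for a serverless function, into a data-analytics pipeline. The Pipeline class and step names are illustrative assumptions, not Cookery's actual DSL.

```typescript
// Hypothetical illustration of service-backed pipeline composition:
// small named steps chained into a data-analytics pipeline.
type Step<I, O> = (input: I) => Promise<O>;

class Pipeline<I, O> {
  constructor(private run: Step<I, O>) {}
  then<N>(next: Step<O, N>): Pipeline<I, N> {
    return new Pipeline(async (input) => next(await this.run(input)));
  }
  execute(input: I): Promise<O> {
    return this.run(input);
  }
}

// Stand-ins for an authenticated third-party service, a local transform,
// and a Lambda-style cloud function (all simulated here).
const fetchDataset: Step<string, number[]> = async (url) => [3, 1, 4, 1, 5];
const cleanValues: Step<number[], number[]> = async (xs) => xs.filter((x) => x > 1);
const invokeCloudFn: Step<number[], number> = async (xs) => xs.reduce((a, b) => a + b, 0);

new Pipeline(fetchDataset)
  .then(cleanValues)
  .then(invokeCloudFn)
  .execute("https://example.org/data.csv")
  .then((sum) => console.log("pipeline result:", sum)); // 12
```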
Reference Exascale Architecture (Extended Version)
While political commitments for building exascale systems have been made, turning these systems into platforms for a wide range of exascale applications faces several technical, organisational, and skills-related challenges. The key technical challenges concern the availability of data. While the first exascale machines are likely to be built within a single site, the input data is in many cases impossible to store within a single site. Alongside handling extremely large amounts of data, an exascale system has to process data from different sources, support accelerated computing, handle a high volume of requests per day, minimize the size of data flows, and be extensible in terms of continuously growing data volumes as well as increasing numbers of parallel requests. These technical challenges are addressed by the general reference exascale architecture. It is divided into three main blocks: a virtualization layer, a distributed virtual file system, and a manager of computing resources. Its main property is modularity, which is achieved by containerization at two levels: 1) application containers, the containerization of scientific workflows, and 2) micro-infrastructure, the containerization of an extreme-scale, data-oriented service infrastructure. The paper also presents an instantiation of the reference architecture, the architecture of the PROCESS project (PROviding Computing solutions for ExaScale ChallengeS), and discusses its relation to the reference exascale architecture. The PROCESS architecture has been used as an exascale platform within various exascale pilot applications. The paper also presents performance modelling of the exascale platform, together with its validation.
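
For concreteness, the three main blocks and the two containerization levels named above can be sketched as types. The field names below are assumptions made for illustration; the paper, not this sketch, defines the blocks.

```typescript
// Rough type-level sketch of the reference architecture's three blocks.
interface ApplicationContainer {        // level 1: containerized scientific workflow
  workflowId: string;
  image: string;
}

interface MicroInfrastructure {         // level 2: containerized data-service infrastructure
  services: string[];                   // e.g. staging, transfer, metadata services
}

interface ReferenceExascaleArchitecture {
  virtualizationLayer: ApplicationContainer[];
  distributedVirtualFileSystem: { sites: string[] }; // data spread across sites
  computeResourceManager: { schedule(c: ApplicationContainer): void };
  microInfrastructures: MicroInfrastructure[];
}

// A tiny illustrative instance.
const example: ReferenceExascaleArchitecture = {
  virtualizationLayer: [{ workflowId: "imaging-workflow", image: "workflow:1.0" }],
  distributedVirtualFileSystem: { sites: ["site-a", "site-b"] },
  computeResourceManager: { schedule: (c) => console.log("scheduling", c.workflowId) },
  microInfrastructures: [{ services: ["staging", "transfer", "metadata"] }],
};
example.computeResourceManager.schedule(example.virtualizationLayer[0]);
```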
PROCESS Data Infrastructure and Data Services
Due to energy limitations and high operational costs, it is likely that exascale computing will not be achieved by one or two datacentres but will require many more. A simple calculation aggregating the computational power of the 2017 Top500 supercomputers reaches only 418 petaflops. Companies like Rescale, which claims 1.4 exaflops of peak computing power, describe their infrastructure as composed of 8 million servers spread across 30 datacentres. Any proposed solution to exascale computing challenges has to take these facts into consideration and, by design, should aim to support the use of geographically distributed and likely independent datacentres. It should also consider, whenever possible, the co-allocation of storage with computation, as it would take roughly 3 years to transfer 1 exabyte over a dedicated 100 Gb Ethernet connection. This means we have to be smart about managing data that is increasingly geographically dispersed and spread across different administrative domains. As the natural setting of the PROCESS project is to operate within the European research infrastructure and serve the European research communities facing exascale challenges, it is important that the PROCESS architecture and solutions are well positioned within the European computing and data-management landscape, namely PRACE, EGI, and EUDAT. In this paper we propose a scalable and programmable data infrastructure that is easy to deploy and can be tuned to support various data-intensive scientific applications.
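
The transfer-time claim above is easy to check. The following sketch computes the raw line-rate transfer time of 1 exabyte over a 100 Gb/s link, which lands near the abstract's roughly 3-year figure once real-world protocol and scheduling overheads are allowed for.

```typescript
// Back-of-the-envelope check: 1 exabyte over a dedicated 100 Gb/s link.
const exabyteBits = 8e18;          // 1 EB = 10^18 bytes = 8 * 10^18 bits
const linkBitsPerSec = 100e9;      // 100 Gb/s raw line rate
const seconds = exabyteBits / linkBitsPerSec; // 8 * 10^7 s
const years = seconds / (365 * 24 * 3600);

console.log(years.toFixed(2)); // ~2.54 years at raw line rate; overheads
                               // push this toward the abstract's ~3 years
```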
A risk-level assessment system based on the STRIDE/DREAD model for digital data marketplaces
Security is a top concern in digital infrastructure, and there is a basic need to assess the level of security ensured for any given application. To accommodate this requirement, we propose a new risk assessment system. Our system identifies the threats of an application workflow, computes severity weights with a modified Microsoft STRIDE/DREAD model, and estimates the final risk exposure after applying security countermeasures in the available digital infrastructures. This allows potential customers to rank these infrastructures in terms of security for their own specific use cases. We additionally present a method to validate the stability and resolution of our ranking system with respect to subjective choices of the DREAD model's threat-rating parameters. Our results show that our system is stable against the unavoidable subjective choices of the DREAD model parameters for a specific use case, with a rank correlation higher than 0.93 and a normalised mean squared error lower than 0.05.
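
To make the scoring concrete, the following is a minimal sketch of standard DREAD-style rating, where each threat is scored on five factors and infrastructures are ranked by total residual risk. The paper uses a modified STRIDE/DREAD model, so the factors, the 1-10 scale assumed here, and the example numbers are only illustrative.

```typescript
// Minimal sketch of standard DREAD scoring (factors rated 1-10, averaged).
interface DreadRating {
  damage: number;
  reproducibility: number;
  exploitability: number;
  affectedUsers: number;
  discoverability: number;
}

const dreadScore = (r: DreadRating): number =>
  (r.damage + r.reproducibility + r.exploitability +
   r.affectedUsers + r.discoverability) / 5;

// Rank hypothetical infrastructures by total residual risk over their threats.
const infrastructures: Record<string, DreadRating[]> = {
  siteA: [{ damage: 8, reproducibility: 6, exploitability: 7, affectedUsers: 9, discoverability: 4 }],
  siteB: [{ damage: 3, reproducibility: 2, exploitability: 4, affectedUsers: 5, discoverability: 6 }],
};

const ranking = Object.entries(infrastructures)
  .map(([name, threats]) => ({
    name,
    risk: threats.reduce((sum, t) => sum + dreadScore(t), 0),
  }))
  .sort((a, b) => a.risk - b.risk); // lowest risk exposure first

console.log(ranking); // [ { name: 'siteB', risk: 4 }, { name: 'siteA', risk: 6.8 } ]
```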
