Cookery: A Framework for Creating Data Processing Pipeline Using Online Services
With the increasing amount of data, the importance of data analysis in various scientific domains has grown. A large share of scientific data has shifted to cloud-based storage, and the cloud offers both storage and computational power. The Cookery framework is a tool for building scientific applications using cloud services. In this paper we present the Cookery system and show how it can be used to authenticate against and use standard online third-party services to easily create data analytics pipelines. The Cookery framework is not limited to working with standard web services; it can also integrate with the emerging AWS Lambda service, part of a new computing paradigm collectively known as serverless computing. The combination of AWS Lambda and Cookery makes it possible for users in many scientific domains, who do not have any programming experience, to create data processing pipelines using cloud services in a short time.
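Cookery's own DSL is not reproduced in this abstract; the minimal Python sketch below only illustrates the kind of serverless step such a pipeline might invoke, calling an AWS Lambda function through boto3 and fanning it out over a list of inputs. The function name, payload schema, bucket, and object keys are hypothetical.

```python
# Minimal sketch (not Cookery's actual API): invoking an AWS Lambda function
# as one step of a data-processing pipeline via boto3.
import json
import boto3

lambda_client = boto3.client("lambda", region_name="eu-west-1")

def run_step(bucket: str, key: str) -> dict:
    """Invoke a serverless processing step on one input object."""
    response = lambda_client.invoke(
        FunctionName="word-count",  # hypothetical Lambda function
        Payload=json.dumps({"bucket": bucket, "key": key}).encode(),
    )
    return json.loads(response["Payload"].read())

# A toy "pipeline": apply the step to each input object in turn.
results = [run_step("my-data-bucket", key) for key in ["a.txt", "b.txt"]]
```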
Pando: Personal Volunteer Computing in Browsers
The large penetration and continued growth in ownership of personal electronic devices represents a freely available and largely untapped source of computing power. To leverage it, we present Pando, a new volunteer computing tool based on a declarative concurrent programming model and implemented using JavaScript, WebRTC, and WebSockets. This tool enables a dynamically varying number of failure-prone personal devices contributed by volunteers to parallelize the application of a function on a stream of values, by using the devices' browsers. We show that Pando can provide throughput improvements compared to a single personal device on a variety of compute-bound applications, including animation rendering and image processing. We also show the flexibility of our approach by deploying Pando on personal devices connected over a local network, on Grid5000, a France-wide computing grid in a virtual private network, and on seven PlanetLab nodes distributed in a wide-area network over Europe.
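Pando itself is a JavaScript tool that runs in browsers over WebRTC and WebSockets; the Python sketch below only illustrates the underlying programming model, applying a function to a stream of values across a pool of unreliable workers and re-submitting failed or stalled tasks. The worker count, timeout, and retry policy are arbitrary assumptions.

```python
# Conceptual sketch of a fault-tolerant parallel map over a stream of values,
# in the spirit of Pando's declarative concurrent model (not Pando's API).
from concurrent.futures import ThreadPoolExecutor

def render_frame(frame_id: int) -> str:
    """Stand-in for a compute-bound task such as rendering one animation frame."""
    return f"frame {frame_id} rendered"

def fault_tolerant_map(func, values, workers=4, retries=2):
    """Apply func to every value in parallel, re-submitting failed tasks."""
    values = list(values)
    results, attempts_left = {}, {v: retries for v in values}
    pending = list(values)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while pending:
            futures = {pool.submit(func, v): v for v in pending}
            pending = []
            for fut, v in futures.items():
                try:
                    results[v] = fut.result(timeout=30)   # drop stalled workers
                except Exception:
                    if attempts_left[v] > 0:              # retry on another worker
                        attempts_left[v] -= 1
                        pending.append(v)
    return [results.get(v) for v in values]

print(fault_tolerant_map(render_frame, range(8)))
```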
PROCESS Data Infrastructure and Data Services
Due to energy limitations and high operational costs, it is likely that exascale computing will not be achieved by one or two datacentres but will require many more. A simple calculation that aggregates the computational power of the 2017 Top500 supercomputers reaches only 418 petaflops. Rescale, for example, which claims 1.4 exaflops of peak computing power, describes its infrastructure as composed of 8 million servers spread across 30 datacentres. Any proposed solution to the exascale computing challenge has to take these facts into consideration and should, by design, aim to support the use of geographically distributed and likely independent datacentres. It should also consider, whenever possible, co-locating storage with computation, since it would take about 3 years to transfer 1 exabyte over a dedicated 100 Gb Ethernet connection. This means we have to be smart about managing data that is increasingly geographically dispersed and spread across different administrative domains. As the natural setting of the PROCESS project is to operate within the European Research Infrastructure and serve the European research communities facing exascale challenges, it is important that the PROCESS architecture and solutions are well positioned within the European computing and data management landscape, namely PRACE, EGI, and EUDAT. In this paper we propose a scalable and programmable data infrastructure that is easy to deploy and can be tuned to support various data-intensive scientific applications.
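As a quick sanity check of the transfer-time claim, the arithmetic below assumes an ideal, fully saturated 100 Gb/s link with no protocol overhead.

```python
# Back-of-the-envelope check: time to move 1 exabyte over 100 Gb Ethernet.
exabyte_bits = 1e18 * 8              # 1 EB in bits (decimal definition)
link_bits_per_s = 100e9              # 100 Gb/s link, fully saturated
seconds = exabyte_bits / link_bits_per_s
years = seconds / (365 * 24 * 3600)
print(round(years, 2))               # ~2.5 years; roughly 3 with real-world overhead
```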
Reference Exascale Architecture (Extended Version)
While political commitments for building exascale systems have been made, turning these systems into platforms for a wide range of exascale applications faces several technical, organisational, and skills-related challenges. The key technical challenges relate to the availability of data. While the first exascale machines are likely to be built within a single site, the input data is in many cases impossible to store within a single site. Alongside handling extremely large amounts of data, an exascale system has to process data from different sources, support accelerated computing, handle a high volume of requests per day, minimize the size of data flows, and be extensible with respect to both continuously growing data and an increasing number of parallel requests. These technical challenges are addressed by the general reference exascale architecture. It is divided into three main blocks: a virtualization layer, a distributed virtual file system, and a manager of computing resources. Its main property is modularity, which is achieved by containerization at two levels: 1) application containers - containerization of scientific workflows, and 2) micro-infrastructure - containerization of the service-oriented infrastructure for extremely large data. The paper also presents an instantiation of the reference architecture - the architecture of the PROCESS project (PROviding Computing solutions for ExaScale ChallengeS) - and discusses its relation to the reference exascale architecture. The PROCESS architecture has been used as an exascale platform within various exascale pilot applications. The paper also presents performance modelling of the exascale platform, together with its validation.
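The PROCESS implementation itself is not shown in this abstract; the sketch below only illustrates the first containerization level, an "application container" running a single workflow step, using the Python Docker SDK. The image name, command, and mount paths are hypothetical.

```python
# Illustrative sketch (not the PROCESS implementation): run one scientific
# workflow step as an application container via the Docker SDK for Python.
import docker

client = docker.from_env()
logs = client.containers.run(
    image="example/preprocess:latest",                 # hypothetical workflow image
    command=["python", "preprocess.py", "/data"],
    volumes={"/mnt/dvfs/input": {"bind": "/data", "mode": "ro"}},  # data served by a distributed virtual file system
    remove=True,                                       # clean up the container when done
)
print(logs.decode())
```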
Toward Executable Scientific Publications
Reproducibility of experiments is considered one of the main principles of the scientific method. Recent developments in data- and computation-intensive science, i.e. e-Science, and the state of the art in Cloud computing provide the necessary components to preserve data sets and re-run the code and software that create research data. The Executable Paper (EP) concept uses state-of-the-art technology to include data sets, code, and software in the electronic publication such that readers can validate the presented results. In this paper we present how to advance the current state of the art in preserving the data sets, code, and software that create research data, describe the basic components of an execution platform that preserves the long-term compatibility of EPs, and identify a number of issues and challenges in the realization of EPs.
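The execution platform itself is not specified in this abstract; the sketch below merely illustrates the validation idea behind an EP, re-running an archived analysis and comparing the regenerated output against the published value. The script path, file names, published number, and tolerance are all hypothetical.

```python
# Minimal sketch of Executable Paper validation: re-run the preserved analysis
# and check that the recomputed result matches the value reported in the paper.
import json
import subprocess

# Re-execute the archived analysis script against the preserved data set.
subprocess.run(
    ["python", "analysis/run_experiment.py", "--data", "data/raw.csv"],
    check=True,
)

# Compare the freshly computed result with the number printed in the paper.
with open("results/summary.json") as fh:
    recomputed = json.load(fh)["mean_accuracy"]

published = 0.87                     # value stated in the paper (hypothetical)
assert abs(recomputed - published) < 1e-3, "result does not reproduce"
print("published result reproduced")
```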
Observer’s anxiety facilitates magnocellular processing of clear facial threat cues, but impairs parvocellular processing of ambiguous facial threat cues
Facial expression and eye gaze provide a shared signal about threats. While a fear expression with averted gaze clearly points to the source of threat, direct-gaze fear renders the source of threat ambiguous. Separable routes have been proposed to mediate these processes, with preferential attunement of the magnocellular (M) pathway to clear threat, and of the parvocellular (P) pathway to threat ambiguity. Here we investigated how observers' trait anxiety modulates M- and P-pathway processing of clear and ambiguous threat cues. We scanned subjects (N = 108) widely ranging in trait anxiety while they viewed fearful or neutral faces with averted or direct gaze, with the luminance and color of the face stimuli calibrated to selectively engage the M or P pathway. Higher anxiety facilitated processing of clear threat projected to the M pathway, but impaired perception of ambiguous threat projected to the P pathway. Increased right amygdala reactivity was associated with higher anxiety for M-biased averted-gaze fear, while increased left amygdala reactivity was associated with higher anxiety for P-biased direct-gaze fear. This lateralization was more pronounced with higher anxiety. Our findings suggest that trait anxiety differentially affects perception of clear (averted-gaze fear) and ambiguous (direct-gaze fear) facial threat cues via selective engagement of the M and P pathways and lateralized amygdala reactivity.
Neurodynamics and connectivity during facial fear perception: The role of threat exposure and signal congruity
Fearful faces convey threat cues whose meaning is contextualized by eye gaze: while averted gaze is congruent with facial fear (both signal avoidance), direct gaze (an approach signal) is incongruent with it. We have previously shown using fMRI that the amygdala is engaged more strongly by fear with averted gaze during brief exposures. However, the amygdala also responds more to fear with direct gaze during longer exposures. Here we examined previously unexplored brain oscillatory responses to characterize the neurodynamics and connectivity during brief (~250 ms) and longer (~883 ms) exposures of fearful faces with direct or averted eye gaze. We performed two experiments: one replicating the exposure-time-by-gaze-direction interaction in fMRI (N = 23), and another using MEG (N = 60) in which we confirmed greater early phase locking to averted-gaze fear (congruent threat signal) in a network of face processing regions, regardless of exposure duration. Phase locking to direct-gaze fear (incongruent threat signal) then increased significantly at ~350 ms for brief exposures, and at ~700 ms for longer exposures. Our results characterize the stages of congruent and incongruent facial threat signal processing and show that stimulus exposure strongly affects the onset and duration of these stages.