16,565 research outputs found
HotGrid: Graduated Access to Grid-based Science Gateways
We describe the idea of a Science Gateway, an application-specific task wrapped as a web service, and some examples of these that are being implemented on the US TeraGrid cyberinfrastructure. We also describe HotGrid, a means of providing simple, immediate access to the Grid through one of these gateways, which we hope will broaden the use of the Grid, drawing in a wide community of users. The secondary purpose of HotGrid is to acclimate a science community to the concepts of certificate use. Our system provides these weakly authenticated users with immediate power to use the Grid resources for science, but without the dangerous power of running arbitrary code. We describe the implementation of these Science Gateways with the Clarens secure web server
Scalable secure multi-party network vulnerability analysis via symbolic optimization
Threat propagation analysis is a valuable tool in improving the cyber resilience of enterprise networks. As
these networks are interconnected and threats can propagate not only within but also across networks, a holistic view of the entire network can reveal threat propagation trajectories unobservable from within a single enterprise. However, companies are reluctant to share internal vulnerability measurement data as it is highly sensitive and (if leaked) possibly damaging. Secure Multi-Party Computation (MPC) addresses this concern. MPC is a cryptographic technique that allows distrusting parties to compute analytics over their joint data while protecting its confidentiality. In this work we apply MPC to threat propagation analysis on large, federated networks. To address the prohibitively high performance cost of general-purpose MPC we develop two novel applications of optimizations that can be leveraged to execute many relevant graph algorithms under MPC more efficiently: (1) dividing the computation into separate stages such that the first stage is executed privately by each party without MPC and the second stage is an MPC computation dealing with a much smaller shared network, and (2) optimizing the second stage by
treating the execution of the analysis algorithm as a symbolic expression that can be optimized to reduce the number of costly operations and subsequently executed under MPC.We evaluate the scalability of this technique by analyzing the potential for threat propagation on examples of network graphs and propose several directions along which this work can be expanded
Business Process Configuration According to Data Dependency Specification
Configuration techniques have been used in several fields, such as the design of business
process models. Sometimes these models depend on the data dependencies, being easier to describe
what has to be done instead of how. Configuration models enable to use a declarative representation
of business processes, deciding the most appropriate work-flow in each case. Unfortunately,
data dependencies among the activities and how they can affect the correct execution of the process,
has been overlooked in the declarative specifications and configurable systems found in the literature.
In order to find the best process configuration for optimizing the execution time of processes according
to data dependencies, we propose the use of Constraint Programming paradigm with the aim of
obtaining an adaptable imperative model in function of the data dependencies of the activities
described declarative.Ministerio de Ciencia y Tecnología TIN2015-63502-C3-2-RFondo Europeo de Desarrollo Regiona
Distributed Coverage Area Reporting for Wireless Sensor Networks
In order to efficiently deal with subscriptions or other location dependent information, it is key that the wireless sensor network informs the gateways what geographical area is serviced by which gateway. The gateways are then able to e.g. efficiently route subscriptions which are only valid in particular regions of the deployment. \ud
\ud
In our distributed approach of establishing a description of WSN coverage area per gateway, we let nodes keep track of the convex hull of the coverage area. In this way, gateways are efficiently informed of the service areas, while we limit the amount of information each node needs to store, transmit and receive
Deploying Jupyter Notebooks at scale on XSEDE resources for Science Gateways and workshops
Jupyter Notebooks have become a mainstream tool for interactive computing in
every field of science. Jupyter Notebooks are suitable as companion
applications for Science Gateways, providing more flexibility and
post-processing capability to the users. Moreover they are often used in
training events and workshops to provide immediate access to a pre-configured
interactive computing environment. The Jupyter team released the JupyterHub web
application to provide a platform where multiple users can login and access a
Jupyter Notebook environment. When the number of users and memory requirements
are low, it is easy to setup JupyterHub on a single server. However, setup
becomes more complicated when we need to serve Jupyter Notebooks at scale to
tens or hundreds of users. In this paper we will present three strategies for
deploying JupyterHub at scale on XSEDE resources. All options share the
deployment of JupyterHub on a Virtual Machine on XSEDE Jetstream. In the first
scenario, JupyterHub connects to a supercomputer and launches a single node job
on behalf of each user and proxies back the Notebook from the computing node
back to the user's browser. In the second scenario, implemented in the context
of a XSEDE consultation for the IRIS consortium for Seismology, we deploy
Docker in Swarm mode to coordinate many XSEDE Jetstream virtual machines to
provide Notebooks with persistent storage and quota. In the last scenario we
install the Kubernetes containers orchestration framework on Jetstream to
provide a fault-tolerant JupyterHub deployment with a distributed filesystem
and capability to scale to thousands of users. In the conclusion section we
provide a link to step-by-step tutorials complete with all the necessary
commands and configuration files to replicate these deployments.Comment: 7 pages, 3 figures, PEARC '18: Practice and Experience in Advanced
Research Computing, July 22--26, 2018, Pittsburgh, PA, US
Survey and Analysis of Production Distributed Computing Infrastructures
This report has two objectives. First, we describe a set of the production
distributed infrastructures currently available, so that the reader has a basic
understanding of them. This includes explaining why each infrastructure was
created and made available and how it has succeeded and failed. The set is not
complete, but we believe it is representative.
Second, we describe the infrastructures in terms of their use, which is a
combination of how they were designed to be used and how users have found ways
to use them. Applications are often designed and created with specific
infrastructures in mind, with both an appreciation of the existing capabilities
provided by those infrastructures and an anticipation of their future
capabilities. Here, the infrastructures we discuss were often designed and
created with specific applications in mind, or at least specific types of
applications. The reader should understand how the interplay between the
infrastructure providers and the users leads to such usages, which we call
usage modalities. These usage modalities are really abstractions that exist
between the infrastructures and the applications; they influence the
infrastructures by representing the applications, and they influence the ap-
plications by representing the infrastructures
- …