34,962 research outputs found
HPC compact quasi-Newton algorithm for interface problems
In this work we present a robust interface coupling algorithm called Compact Interface quasi-Newton (CIQN). It is designed for computationally intensive applications using an MPI multi-code partitioned scheme. The algorithm allows information from previous time steps to be reused, a feature that has previously been proposed to accelerate convergence. Through algebraic manipulation, efficient usage of the computational resources is achieved by avoiding the construction of dense matrices, reducing every multiplication to a matrix-vector product, and reusing the computationally expensive loops. This leads to a compact version of the original quasi-Newton algorithm. Together with efficient communication, we show efficient scalability up to 4800 cores. Three examples with qualitatively different dynamics are shown to prove that the algorithm can efficiently deal with added-mass instability and two-field coupled problems. We also show how reusing histories and filtering do not necessarily make the scheme more robust and, finally, we prove the necessity of this HPC version of the algorithm. The novelty of this article lies in the HPC-focused implementation of the algorithm, detailing how to fuse and combine the composing blocks to obtain a scalable MPI implementation. Such an implementation is mandatory in large-scale cases, for which the contact surface cannot be stored in a single computational node, or the number of contact nodes is not negligible compared with the size of the domain. © Elsevier. This manuscript version is made available under the CC-BY-NC-ND 4.0 license http://creativecommons.org/licenses/by-nc-nd/4.0/
Comment: 33 pages: 23 manuscript, 10 appendix. 16 figures: 4 manuscript, 12 appendix. 10 tables: 3 manuscript, 7 appendix
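The matrix-free idea behind interface quasi-Newton coupling can be illustrated with a minimal sketch of a classical IQN-ILS-style update, which CIQN compacts further. This is an illustrative toy, not the authors' MPI implementation: the fixed-point operator `H` stands in for the coupled solvers, and the update works only with tall, skinny difference matrices and a small least-squares solve, never a dense Jacobian.

```python
import numpy as np

def iqn_ils_step(x_hist, xt_hist):
    """One interface quasi-Newton (IQN-ILS style) update from iterate history.
    x_hist: interface states x_i; xt_hist: solver outputs H(x_i)."""
    r = [xt - x for x, xt in zip(x_hist, xt_hist)]   # fixed-point residuals
    rk, xtk = r[-1], xt_hist[-1]
    # Difference matrices are n_interface x n_history (tall and skinny);
    # no dense Jacobian is ever formed, only matrix-vector products.
    V = np.column_stack([ri - rk for ri in r[:-1]])
    W = np.column_stack([xti - xtk for xti in xt_hist[:-1]])
    alpha, *_ = np.linalg.lstsq(V, -rk, rcond=None)  # small least-squares solve
    return xtk + W @ alpha                           # next interface state

# Toy linear fixed-point problem x = H(x) with a contracting operator.
rng = np.random.default_rng(0)
A = 0.5 * rng.standard_normal((8, 8)) / 8
b = rng.standard_normal(8)
H = lambda x: A @ x + b

x = np.zeros(8)
xs, xts = [], []
for _ in range(20):
    xs.append(x)
    xts.append(H(x))
    x = iqn_ils_step(xs, xts) if len(xs) > 1 else xts[-1]
print(np.linalg.norm(H(x) - x))  # residual norm after acceleration
```

Reusing the history from earlier iterations (and, in the multi-timestep variants the abstract mentions, from earlier time steps) simply means keeping more columns in `V` and `W`.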
Improving the Performance and Endurance of Persistent Memory with Loose-Ordering Consistency
Persistent memory provides high-performance data persistence at main memory.
Memory writes need to be performed in strict order to satisfy storage
consistency requirements and enable correct recovery from system crashes.
Unfortunately, adhering to such a strict order significantly degrades system
performance and persistent memory endurance. This paper introduces a new
mechanism, Loose-Ordering Consistency (LOC), that satisfies the ordering
requirements at significantly lower performance and endurance loss. LOC
consists of two key techniques. First, Eager Commit eliminates the need to
perform a persistent commit record write within a transaction. We do so by
ensuring that we can determine the status of all committed transactions during
recovery by storing necessary metadata information statically with blocks of
data written to memory. Second, Speculative Persistence relaxes the write
ordering between transactions by allowing writes to be speculatively written to
persistent memory. A speculative write is made visible to software only after
its associated transaction commits. To enable this, our mechanism supports the
tracking of committed transaction ID and multi-versioning in the CPU cache. Our
evaluations show that LOC reduces the average performance overhead of memory
persistence from 66.9% to 34.9% and the memory write traffic overhead from
17.1% to 3.4% on a variety of workloads.
Comment: This paper has been accepted by IEEE Transactions on Parallel and Distributed Systems
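The Eager Commit idea, deciding commit status at recovery time from metadata stored alongside the data blocks rather than from a separate commit-record write, can be sketched in a toy model. Names like `PersistentLog` and `n_blocks` are illustrative stand-ins, not LOC's actual hardware format.

```python
class PersistentLog:
    """Toy persistent memory: each block carries its transaction ID and the
    transaction's total block count, so no separate commit record is needed."""

    def __init__(self):
        self.blocks = []  # simulated persistent medium

    def write_tx(self, txid, payloads, crash_after=None):
        n = len(payloads)
        for i, data in enumerate(payloads):
            if crash_after is not None and i >= crash_after:
                return  # simulate a crash mid-transaction
            # Metadata is stored *with* each data block, not in a commit record.
            self.blocks.append({"txid": txid, "n_blocks": n, "data": data})

def recover(log):
    """A transaction is committed iff all of its declared blocks are present."""
    seen = {}
    for blk in log.blocks:
        seen.setdefault(blk["txid"], []).append(blk)
    return {txid: blks for txid, blks in seen.items()
            if len(blks) == blks[0]["n_blocks"]}

log = PersistentLog()
log.write_tx(1, ["a", "b"])                       # fully persisted
log.write_tx(2, ["c", "d", "e"], crash_after=2)   # crash: only 2 of 3 blocks
committed = recover(log)
print(sorted(committed))  # → [1]: tx 1 commits, incomplete tx 2 is discarded
```

Speculative Persistence then relaxes ordering *between* such transactions: blocks may reach persistent memory early, as long as versioning in the cache keeps them invisible to software until their transaction commits.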
Brain Connectivity Patterns Dissociate Action of Specific Acupressure Treatments in Fatigued Breast Cancer Survivors
Funding: This work was supported by grants R01 CA151445 and 2UL1 TR000433-06 from the National Institutes of Health. The funding source had no role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; preparation, review, or approval of the manuscript; or the decision to submit the manuscript for publication. We thank Dr. Bradley Foerster for expert assistance in the acquisition of 1H-MRS and fMRI data.
Avoiding core's DUE & SDC via acoustic wave detectors and tailored error containment and recovery
The trend of downsizing transistors and scaling operating voltage has made processor chips more sensitive to radiation phenomena, making soft errors an important challenge. New reliability techniques for handling soft errors in the logic and memories, allowing the desired failures-in-time (FIT) target to be met, are key to keep harnessing the benefits of Moore's law. The failure to scale the soft error rate caused by particle strikes may soon limit the total number of cores that one may have running at the same time. This paper proposes a lightweight and scalable architecture to eliminate silent data corruption errors (SDC) and detected unrecoverable errors (DUE) of a core. The architecture uses acoustic wave detectors for error detection. We propose to recover by confining the errors in the cache hierarchy, allowing us to deal with the relatively long detection latencies. Our results show that the proposed mechanism protects the whole core (logic, latches and memory arrays) while incurring a performance overhead as low as 0.60%. © 2014 IEEE.
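The containment idea, holding unverified state inside the cache hierarchy until the detectors' worst-case latency has elapsed without an alarm, can be sketched with a toy delayed-writeback model. The constant and class names here are ours, not the paper's, and the model ignores the recovery path.

```python
from collections import deque

DETECT_LATENCY = 3  # cycles a detector may need to flag a particle strike

class ContainedCache:
    """Toy model: a dirty line may leave the cache (writeback to memory) only
    after the detection window has passed with no alarm, so main memory is
    never polluted by an undetected error."""

    def __init__(self):
        self.memory = {}         # simulated main memory
        self.pending = deque()   # (ready_cycle, addr, value) awaiting clearance
        self.cycle = 0

    def write(self, addr, value):
        self.pending.append((self.cycle + DETECT_LATENCY, addr, value))

    def tick(self, alarm=False):
        self.cycle += 1
        if alarm:
            self.pending.clear()  # error detected: squash all unverified writes
            return
        while self.pending and self.pending[0][0] <= self.cycle:
            _, addr, value = self.pending.popleft()
            self.memory[addr] = value  # past the detection window: safe

c = ContainedCache()
c.write(0x10, 42)
c.tick(); c.tick(); c.tick()   # 0x10 clears the detection window
c.write(0x20, 7)               # still inside its window...
c.tick(alarm=True)             # ...when a soft error is detected: squashed
print(c.memory)                # → {16: 42}: only the verified write survives
```

The point of the long `DETECT_LATENCY` window is exactly the tradeoff the abstract describes: acoustic wave detectors are cheap but slow, so containment must tolerate detection latencies of many cycles.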
The Role of Inter-Controller Traffic for Placement of Distributed SDN Controllers
We consider a distributed Software Defined Networking (SDN) architecture
adopting a cluster of multiple controllers to improve network performance and
reliability. Besides the OpenFlow control traffic exchanged between controllers
and switches, we focus on the control traffic exchanged among the controllers
in the cluster, needed to run coordination and consensus algorithms to keep the
controllers synchronized. We estimate the effect of the inter-controller
communications on the reaction time perceived by the switches depending on the
data-ownership model adopted in the cluster. The model is accurately validated
in an operational Software Defined WAN (SDWAN). We advocate a careful placement of the controllers, which should take into account both of the above kinds of
control traffic. We evaluate, for some real ISP network topologies, the delay tradeoffs for the controller placement problem and we propose a novel
evolutionary algorithm to find the corresponding Pareto frontier. Our work
provides novel quantitative tools to optimize the planning and the design of
the network supporting the control plane of SDN networks, especially when the
network is very large and in-band control plane is adopted. We also show that
for operational distributed controllers (e.g. OpenDaylight and ONOS), the
location of the controller which acts as a leader in the consensus algorithm
has a strong impact on the reactivity perceived by switches.
Comment: 14 pages
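The placement tradeoff between the two kinds of control traffic can be illustrated with a toy two-objective search: minimize the worst switch-to-controller delay and the worst inter-controller delay at the same time. A brute-force enumeration with a Pareto filter over a small ring topology stands in here for the paper's evolutionary algorithm, which is needed when the topology is too large to enumerate.

```python
import itertools

def objectives(S, d):
    """Two objectives for a placement S (set of controller node indices):
    f1 = worst switch-to-nearest-controller delay,
    f2 = worst inter-controller delay (consensus/synchronization traffic)."""
    f1 = max(min(d[v][c] for c in S) for v in range(len(d)))
    f2 = max((d[a][b] for a in S for b in S), default=0)
    return f1, f2

def pareto_front(points):
    """Keep the non-dominated (objectives, placement) pairs."""
    return [p for p in points
            if not any(q != p
                       and q[0][0] <= p[0][0] and q[0][1] <= p[0][1]
                       and (q[0][0] < p[0][0] or q[0][1] < p[0][1])
                       for q in points)]

# Toy 6-node ring topology with hop-count delays.
n = 6
d = [[min(abs(i - j), n - abs(i - j)) for j in range(n)] for i in range(n)]
k = 2  # cluster of two controllers
cands = [(objectives(S, d), S) for S in itertools.combinations(range(n), k)]
front = pareto_front(cands)
print(sorted(front))
```

Even on this ring the conflict is visible: placing the two controllers opposite each other minimizes switch delay but maximizes inter-controller delay, while adjacent controllers do the reverse, which is exactly why the placement must weigh both traffic types.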
A diagnostic model framework for water use in rice-based irrigation systems
Models / Crop-based irrigation / Rice / Water use / Irrigation management / Constraints / Water availability / Water balance / Sensitivity analysis / Irrigation requirements / Ivory Coast / Loka Catchment Area