
    Reusable, Extensible High-Level Data-Distribution Concept

    A framework for high-level specification of data distributions in data-parallel application programs has been conceived. [As used here, "distributions" signifies means of expressing locality (more specifically, locations of specified pieces of data) in a computing system composed of many processor and memory components connected by a network.] Inasmuch as distributions exert a great effect on the performance of application programs, it is important that a distribution strategy be flexible, so that distributions can be adapted to the requirements of those programs. At the same time, for the sake of productivity in programming and execution, it is desirable that users be shielded from such error-prone, tedious details as those of communication and synchronization. As desired, the present framework enables a user to refine a distribution type and adjust it to optimize the performance of an application program, while concealing from the user the low-level details of communication and synchronization. The framework provides for a reusable, extensible data-distribution design, denoted the design pattern, that is independent of a concrete implementation. The design pattern abstracts over coding patterns that have been found to be commonly encountered in both manually and automatically generated distributed parallel programs. The following description of the present framework is necessarily oversimplified to fit within the space available for this article. Distributions are among the elements of a conceptual data-distribution machinery, some of the other elements being denoted domains, index sets, and data collections (see figure). Associated with each domain is one index set and one distribution. A distribution class interface (where "class" is used in the object-oriented-programming sense) includes operations that enable specification of the mapping of an index to a unit of locality. Thus, "Map(Index)" specifies a unit, while "LocalLayout(Index)" specifies the local address within that unit. The distribution class can be extended to enable specification of commonly used distributions or novel user-defined distributions. A data collection can be defined over a domain. The term "data collection" in this context signifies, more specifically, an abstraction of mappings from index sets to variables. Since the index set is distributed, the addresses of the variables are also distributed.
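
    The interface described above lends itself to a small object-oriented sketch. The following Python code is only an illustration of the concepts (the framework itself is not specified in Python, and names such as Distribution, BlockDistribution, Domain, DataCollection, map_index, and local_layout are assumptions for exposition, not the framework's actual API):

```python
# Illustrative sketch of the data-distribution concepts described above.
# All names are assumptions for exposition, not the framework's actual API.

class Distribution:
    """Maps a global index to a unit of locality and a local address."""

    def map_index(self, index):          # analogous to Map(Index)
        raise NotImplementedError

    def local_layout(self, index):       # analogous to LocalLayout(Index)
        raise NotImplementedError


class BlockDistribution(Distribution):
    """Contiguous blocks of indices assigned to consecutive units."""

    def __init__(self, num_indices, num_units):
        self.block = -(-num_indices // num_units)    # ceiling division

    def map_index(self, index):
        return index // self.block                   # owning unit

    def local_layout(self, index):
        return index % self.block                    # address within that unit


class Domain:
    """A domain couples one index set with one distribution."""

    def __init__(self, index_set, distribution):
        self.index_set = index_set
        self.distribution = distribution


class DataCollection:
    """Abstraction of a mapping from a domain's index set to variables."""

    def __init__(self, domain):
        self.domain = domain
        # One local storage dict per unit of locality (stand-in for real memory).
        units = {domain.distribution.map_index(i) for i in domain.index_set}
        self.local = {u: {} for u in units}

    def __setitem__(self, index, value):
        d = self.domain.distribution
        self.local[d.map_index(index)][d.local_layout(index)] = value

    def __getitem__(self, index):
        d = self.domain.distribution
        return self.local[d.map_index(index)][d.local_layout(index)]


# Usage: a block distribution of 10 indices over 4 units.
dom = Domain(range(10), BlockDistribution(10, 4))
data = DataCollection(dom)
for i in dom.index_set:
    data[i] = i * i
print(data[7], dom.distribution.map_index(7))   # -> 49 2
```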

    Implementing Access to Data Distributed on Many Processors

    A reference architecture is defined for an object-oriented implementation of domains, arrays, and distributions written in the programming language Chapel. This technology primarily addresses domains that contain arrays with regular index sets; the low-level implementation details are beyond the scope of this discussion. What is defined is a complete set of object-oriented operators that allows one to perform data distributions for domain arrays involving regular arithmetic index sets. What is unique is that these operators allow arbitrary regions of the arrays to be fragmented and distributed across multiple processors with a single point of access, giving the programmer the illusion that all the elements are collocated on a single processor. Today's massively parallel High Productivity Computing Systems (HPCS) are characterized by a modular structure, with a large number of processing and memory units connected by a high-speed network. Locality of access as well as load balancing are primary concerns in these systems, which are typically used for high-performance scientific computation. Data distributions address these issues by providing a range of methods for spreading large data sets across the components of a system. Over the past two decades, many languages, systems, tools, and libraries have been developed for the support of distributions. Since the performance of data parallel applications is directly influenced by the distribution strategy, users often resort to low-level programming models that allow fine-tuning of the distribution aspects affecting performance but, at the same time, are tedious and error-prone. This technology presents a reusable design of a data-distribution framework for data parallel high-performance applications. Distributions are a means to express locality in systems composed of large numbers of processor and memory components connected by a network. Since distributions have a great effect on the performance of applications, it is important that the distribution strategy be flexible, so its behavior can change depending on the needs of the application. At the same time, high productivity concerns require that the user be shielded from error-prone, tedious details such as communication and synchronization.
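
    As an illustration of the single-point-of-access idea, the following Python sketch (not Chapel, and not the reference architecture itself) fragments a regular index range across several "processors" while routing every element access to the owning fragment; the names and the contiguous partitioning scheme are assumptions for exposition:

```python
# Illustrative Python sketch (not Chapel): a regular index range is fragmented
# across several "processors", but element access is routed transparently so
# the array appears collocated on a single processor.

class DistributedArray:
    def __init__(self, size, num_procs):
        self.size = size
        # Cut the regular index set [0, size) into num_procs contiguous fragments.
        self.bounds = [round(p * size / num_procs) for p in range(num_procs + 1)]
        self.fragments = [
            {"lo": self.bounds[p], "data": [0] * (self.bounds[p + 1] - self.bounds[p])}
            for p in range(num_procs)
        ]

    def _owner(self, i):
        # Locate the fragment ("processor") owning global index i.
        for p in range(len(self.fragments)):
            if self.bounds[p] <= i < self.bounds[p + 1]:
                return p
        raise IndexError(i)

    def __getitem__(self, i):
        p = self._owner(i)
        return self.fragments[p]["data"][i - self.fragments[p]["lo"]]

    def __setitem__(self, i, value):
        p = self._owner(i)
        self.fragments[p]["data"][i - self.fragments[p]["lo"]] = value


# Usage: the caller indexes the array globally, unaware of the fragmentation.
a = DistributedArray(size=12, num_procs=3)
for i in range(12):
    a[i] = 2 * i
print(a[11], a._owner(11))   # -> 22 2
```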

    Outlier Identification in Spatio-Temporal Processes

    This dissertation answers some of the statistical challenges arising in spatio-temporal data from Internet traffic, electricity grids, and climate models. It begins with methodological contributions to the problem of anomaly detection in communication networks. Using electricity consumption patterns for the University of Michigan campus, the well-known spatial prediction method kriging has been adapted for identification of false data injections into the system. Events like Distributed Denial of Service (DDoS) attacks, botnet/malware attacks, and port scanning call for methods that can identify unusual activity in Internet traffic patterns. Storing information on the entire network, though feasible, cannot be done at the time scale at which data arrives. In this work, hashing techniques which can produce summary statistics for the network have been used. The hashed data so obtained indeed preserves the heavy-tailed nature of traffic payloads, thereby providing a platform for the application of extreme value theory (EVT) to identify heavy hitters in volumetric attacks. These methods based on EVT require the estimation of the tail index of a heavy-tailed distribution. The traditional estimator (Hill, 1975) for the tail index tends to be biased in the presence of outliers. To circumvent this issue, a trimmed version of the classic Hill estimator has been proposed and studied from a theoretical perspective. For the Pareto domain of attraction, the optimality and asymptotic normality of the estimator have been established. Additionally, a data-driven strategy to detect the number of extreme outliers in heavy-tailed data has also been presented. The dissertation concludes with the statistical formulation of m-year return levels of extreme climatic events (heat/cold waves). The Generalized Pareto distribution (GPD) serves as a good fit for modeling peaks over a threshold of a distribution. Allowing the parameters of the GPD to vary as a function of covariates such as time of the year, El Niño, and location in the US, extremes of the areal impact of heat waves have been well modeled and inferred.
    Ph.D., Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/145789/1/shrijita_1.pd
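
    For context, the classic Hill estimator and the idea of trimming the largest order statistics can be sketched in a few lines of Python; the trimming scheme shown is only illustrative and is not the dissertation's proposed estimator or its data-driven outlier-detection strategy:

```python
import numpy as np

def hill_estimator(x, k):
    """Classic Hill estimator of the tail index alpha from the k largest observations."""
    xs = np.sort(np.asarray(x, dtype=float))
    n = xs.size
    # Mean log-spacing between the top-k order statistics and the threshold order statistic.
    gamma_hat = np.mean(np.log(xs[n - k:]) - np.log(xs[n - k - 1]))
    return 1.0 / gamma_hat            # alpha = 1 / gamma

def trimmed_hill_estimator(x, k, trim):
    """Illustrative trimmed variant: drop the `trim` largest observations
    (possible outliers) before applying the Hill estimator. Only a sketch of
    the trimming idea, not the dissertation's estimator."""
    xs = np.sort(np.asarray(x, dtype=float))
    return hill_estimator(xs[: xs.size - trim], k)

# Usage on Pareto data with alpha = 2 and a few injected outliers.
rng = np.random.default_rng(0)
clean = (1.0 / rng.uniform(size=5000)) ** 0.5          # Pareto tail with alpha = 2
contaminated = np.concatenate([clean, [1e4, 5e4, 1e5]])
print(hill_estimator(contaminated, k=200))              # biased downward by the outliers
print(trimmed_hill_estimator(contaminated, k=200, trim=3))  # closer to 2
```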

    Non-centralized Control for Flow-based Distribution Networks: A Game-theoretical Insight

    This paper solves a data-driven control problem for a flow-based distribution network with two objectives: resource allocation and a fair distribution of costs. These objectives involve both cooperation and competition. The proposed solution combines a cooperative game approach (either centralized or distributed), which uses the Shapley value to determine a proper partitioning of the system and a fair communication-cost distribution, with a decentralized noncooperative game approach, which computes the Nash equilibrium to achieve the resource-allocation control objective under a non-complete information topology. Furthermore, an invariant-set property is presented and the closed-loop system stability is analyzed for the noncooperative game approach. Another contribution regarding the cooperative game approach is an alternative way to compute the Shapley value for the proposed specific characteristic function. Unlike the classical cooperative-games approach, which has limited applicability due to combinatorial-explosion issues, the alternative method allows calculating the Shapley value in polynomial time and hence can be applied to large-scale problems.
    Generalitat de Catalunya FI 2014; Ministerio de Ciencia y Educación DPI2016-76493-C3-3-R; Ministerio de Ciencia y Educación DPI2008-05818; European FP7-ICT project DYMASO
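
    To illustrate the combinatorial explosion mentioned above, the following Python sketch computes the classical Shapley value by enumerating all coalitions; the characteristic function used here is a placeholder, not the paper's communication-cost game, and the paper's polynomial-time method depends on its own specific characteristic function:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Classical Shapley value:
    phi_i = sum over S subset of N\\{i} of |S|!(n-|S|-1)!/n! * (v(S u {i}) - v(S)).
    Enumerates all 2^(n-1) coalitions per player -- the combinatorial explosion."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                total += weight * (v(frozenset(S) | {i}) - v(frozenset(S)))
        phi[i] = total
    return phi

# Placeholder characteristic function (NOT the paper's game):
# the value of a coalition is the square of its size.
def v(S):
    return len(S) ** 2

print(shapley_values([1, 2, 3], v))   # symmetric game -> equal shares {1: 3.0, 2: 3.0, 3: 3.0}
```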

    Improving the Performance of Low Voltage Networks by an Optimized Unbalance Operation of Three-Phase Distributed Generators

    This work focuses on using the full potential of PV inverters in order to improve the efficiency of low voltage networks. More specifically, it considers the independent per-phase control capability of three-phase four-wire PV inverters, which are able to inject different active and reactive powers in each phase, in order to reduce the system phase unbalance. This new operational procedure is analyzed by formulating an optimization problem that uses an accurate model of European low voltage networks. The paper includes a comprehensive quantitative comparison of the proposed strategy with two state-of-the-art methodologies to highlight the obtained benefits. The results show that the proposed independent per-phase control of three-phase PV inverters considerably improves the network performance, contributing to increased penetration of renewable energy sources.
    Ministerio de Economía y Competitividad ENE2017-84813-R, ENE2014-54115-
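
    As a rough illustration of the per-phase allocation idea (and not the paper's actual optimization model), the following Python sketch allocates a given PV active power across the three phases so that the net phase loads are as balanced as possible, subject to the injections summing to the available power and respecting per-phase limits; all numerical data and limits are assumed:

```python
import numpy as np
from scipy.optimize import minimize

# Minimal sketch of the per-phase allocation idea (NOT the paper's model):
# choose per-phase active-power injections p = (pa, pb, pc) from a three-phase
# four-wire PV inverter so the net phase loads are as balanced as possible.

load = np.array([4.0, 2.5, 1.0])   # per-phase load demand [kW] (assumed data)
p_avail = 3.0                      # total PV active power available [kW] (assumed)
p_phase_max = 2.0                  # per-phase inverter limit [kW] (assumed)

def unbalance(p):
    net = load - p                                # net demand per phase
    return np.sum((net - net.mean()) ** 2)        # spread of the net phase loads

res = minimize(
    unbalance,
    x0=np.full(3, p_avail / 3),                   # start from an equal split
    method="SLSQP",
    bounds=[(0.0, p_phase_max)] * 3,
    constraints=[{"type": "eq", "fun": lambda p: p.sum() - p_avail}],
)
print(np.round(res.x, 3))   # more power is routed to the most heavily loaded phase
```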