213 research outputs found
Maintaining Coherency of Dynamic Data in Cooperating Repositories
In this paper, we consider techniques for disseminating dynamic data—such as stock prices and real-time weather information—from sources to a set of repositories. We focus on the problem of maintaining coherency of dynamic data items in a network of cooperating repositories. We show that cooperation among repositories—where each repository pushes updates of data items to other repositories—helps reduce system-wide communication and computation overheads for coherency maintenance. However, contrary to intuition, we also show that increasing the degree of cooperation beyond a certain point can, in fact, be detrimental to the goal of maintaining coherency at low communication and computational overheads. We present techniques (i) to derive the "optimal" degree of cooperation among repositories, (ii) to construct an efficient dissemination tree for propagating changes from sources to cooperating repositories, and (iii) to determine when to push an update from one repository to another for coherency maintenance. We evaluate the efficacy of our techniques using real-world traces of dynamically changing data items (specifically, stock prices) and show that careful dissemination of updates through a network of cooperating repositories can substantially lower the cost of coherency maintenance.
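The push decision in point (iii) above can be sketched as a simple rule: a repository forwards a new value to a dependent repository only when the change since the last value it pushed exceeds that dependent's coherency tolerance c. This is a minimal illustration of the idea, not the paper's algorithm; all class and variable names here are invented.

```python
# Sketch of a push-on-violation rule for coherency maintenance: forward a
# value to a dependent only when its cached copy may be stale by more than
# its tolerance c. Names and the topology below are illustrative only.

class Repository:
    def __init__(self, name):
        self.name = name
        self.dependents = []    # each: {"repo": ..., "c": tolerance, "last": last value pushed}

    def add_dependent(self, repo, tolerance):
        self.dependents.append({"repo": repo, "c": tolerance, "last": None})

    def receive(self, value, log):
        log.append((self.name, value))
        for d in self.dependents:
            # Push only if the dependent's cached value may deviate by more than c.
            if d["last"] is None or abs(value - d["last"]) > d["c"]:
                d["last"] = value
                d["repo"].receive(value, log)

# A toy three-node dissemination chain: source -> r1 -> r2.
source = Repository("source")
r1 = Repository("r1")
r2 = Repository("r2")
source.add_dependent(r1, tolerance=0.5)
r1.add_dependent(r2, tolerance=1.0)

log = []
for price in [100.0, 100.2, 100.9, 102.5]:
    source.receive(price, log)
```

With these (made-up) tolerances, small fluctuations are absorbed near the source, so repositories deeper in the tree see fewer messages while their cached values stay within their stated tolerance.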
Dynamic data consistency maintenance in peer-to-peer caching system
Master's thesis (Master of Science)
Disseminating streaming data in a dynamic environment: an adaptive and cost-based approach
In a distributed stream processing system, streaming data are continuously disseminated from the sources to the distributed processing servers. To enhance the dissemination efficiency, these servers are typically organized into one or more dissemination trees. In this paper, we focus on the problem of constructing dissemination trees to minimize the average loss of fidelity of the system. We observe that existing heuristic-based approaches can only explore a limited solution space and hence may lead to sub-optimal solutions. In contrast, we propose an adaptive and cost-based approach. Our cost model takes into account both the processing cost and the communication cost. Furthermore, as a distributed stream processing system is vulnerable to inaccurate statistics, runtime fluctuations of data characteristics, server workloads, and network conditions, we have designed our scheme to be adaptive to these situations: an operational dissemination tree may be incrementally transformed to a more cost-effective one. Our adaptive strategy employs distributed decisions made by the distributed servers independently, based on localized statistics collected by each server at runtime. For a relatively static environment, we also propose two static tree construction algorithms relying on a priori system statistics. These static trees can also be used as initial trees in a dynamic environment. We apply our schemes to both single- and multi-object dissemination. Our extensive performance study shows that the adaptive mechanisms are effective in a dynamic context and that the proposed static tree construction algorithms perform close to optimal in a static environment.
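A cost model of the kind described above can be sketched as follows: the cost of attaching a server to a candidate parent combines the parent's per-child processing cost with the communication cost of the link, and each server independently picks the cheapest parent. This is a toy illustration under invented numbers, not the paper's actual model.

```python
# Illustrative localized cost model for dissemination-tree construction:
# attach_cost = parent's per-child processing cost + link communication cost.
# Each server picks its parent independently, mirroring the distributed
# decisions described in the abstract. All names and costs are made up.

def attach_cost(parent, child, proc_cost, comm_cost):
    # proc_cost[p]: extra processing load server p incurs per child
    # comm_cost[(p, c)]: cost of streaming over the link p -> c
    return proc_cost[parent] + comm_cost[(parent, child)]

def best_parent(child, candidates, proc_cost, comm_cost):
    # Greedy local decision: the candidate minimizing combined cost.
    return min(candidates, key=lambda p: attach_cost(p, child, proc_cost, comm_cost))

proc_cost = {"source": 1.0, "s1": 2.0, "s2": 0.5}
comm_cost = {("source", "s3"): 4.0, ("s1", "s3"): 1.0, ("s2", "s3"): 3.5}

parent = best_parent("s3", ["source", "s1", "s2"], proc_cost, comm_cost)
```

In a running system the `proc_cost` and `comm_cost` figures would be the localized runtime statistics each server collects, and a server would re-run this decision when its measured costs drift, incrementally transforming the tree.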
Peer-To-Peer Backup for Personal Area Networks
FlashBack is a peer-to-peer backup algorithm designed for power-constrained devices running in a personal area network (PAN). Backups are performed transparently as local updates initiate the spread of backup data among a subset of the currently available peers. FlashBack limits power usage by avoiding flooding and keeping small neighbor sets. FlashBack has also been designed to utilize powered infrastructure when possible to further extend device lifetime. We present our architecture and algorithms, along with initial experimental results that illustrate FlashBack's performance characteristics.
Dynamics of Innovation in an “Open Source” Collaboration Environment: Lurking, Laboring and Launching FLOSS Projects on SourceForge
A systems analysis perspective is adopted to examine the critical properties of the Free/Libre/Open Source Software (FLOSS) mode of innovation, as reflected on the SourceForge platform (SF.net). This approach re-scales March's (1991) framework and applies it to characterize the "innovation system" of a "distributed organization" of interacting agents in a virtual collaboration environment. The innovation system of the virtual collaboration environment is an emergent property of two "coupled" processes: one involves interactions among agents searching for information to use in designing novel software products, and the other involves the mobilization of individual capabilities for application in the software development projects. Micro-dynamics of this system are studied empirically by constructing transition probability matrices representing movements of 222,835 SF.net users among 7 different activity states. Estimated probabilities are found to form first-order Markov chains describing ergodic processes. This makes it possible to compute the equilibrium distribution of agents among the states, thereby suppressing transient effects and revealing persisting patterns of project-joining and project-launching.
Keywords: innovation systems, collaborative development environments, industrial districts, exploration and exploitation dynamics, open source software, FLOSS, SourceForge, project-joining, project-founding, Markov chain analysis.
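The equilibrium computation described above can be illustrated in a few lines: for an ergodic chain with row-stochastic transition matrix P, repeatedly applying P to any starting distribution converges to the stationary distribution. The 3-state matrix below is invented for illustration; the study itself used 7 activity states and 222,835 users.

```python
# Toy stationary-distribution computation for an ergodic Markov chain.
# P is row-stochastic: P[i][j] is the probability of moving from state i
# to state j. Power iteration from any start converges to the equilibrium.
# The states and probabilities are hypothetical, not the paper's estimates.

def stationary(P, iters=200):
    n = len(P)
    pi = [1.0 / n] * n    # start from the uniform distribution
    for _ in range(iters):
        pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
    return pi

# Hypothetical states: lurking, laboring (joined a project), launching (founded one).
P = [
    [0.80, 0.15, 0.05],
    [0.30, 0.60, 0.10],
    [0.20, 0.20, 0.60],
]

pi = stationary(P)    # converges to (4/7, 2/7, 1/7) for this matrix
```

This is exactly the sense in which the equilibrium distribution "suppresses transient effects": whatever the initial mix of users across states, the long-run occupancy depends only on the transition probabilities.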
Bridging the Global Divide in AI Regulation: A Proposal for a Contextual, Coherent, and Commensurable Framework
This paper examines the current landscape of AI regulations, highlighting the divergent approaches being taken, and proposes an alternative contextual, coherent, and commensurable (3C) framework. The EU, Canada, South Korea, and Brazil follow a horizontal or lateral approach that postulates the homogeneity of AI systems, seeks to identify common causes of harm, and demands uniform human interventions. In contrast, the U.K., Israel, Switzerland, Japan, and China have pursued a context-specific or modular approach, tailoring regulations to the specific use cases of AI systems. The U.S. is reevaluating its strategy, with growing support for controlling existential risks associated with AI. Addressing such fragmentation of AI regulations is crucial to ensure the interoperability of AI. The present degree of proportionality, granularity, and foreseeability of the EU AI Act is not sufficient to garner consensus. The context-specific approach holds greater promise but requires further development in terms of details, coherency, and commensurability. To strike a balance, this paper proposes a hybrid 3C framework. To ensure contextuality, the framework categorizes AI into distinct types based on their usage and interaction with humans: autonomous, allocative, punitive, cognitive, and generative AI. To ensure coherency, each category is assigned specific regulatory objectives: safety for autonomous AI; fairness and explainability for allocative AI; accuracy and explainability for punitive AI; accuracy, robustness, and privacy for cognitive AI; and the mitigation of infringement and misuse for generative AI. To ensure commensurability, the framework promotes the adoption of international industry standards that convert principles into quantifiable metrics. In doing so, the framework is expected to foster international collaboration and standardization without imposing excessive compliance costs.
Adaptive Filters for Continuous Queries over Distributed Data Streams
We consider an environment where distributed data sources continuously stream updates to a centralized processor that monitors continuous queries over the distributed data. Significant communication overhead is incurred in the presence of rapid update streams, and we propose a new technique for reducing the overhead. Users register continuous queries with precision requirements at the central stream processor, which installs filters at remote data sources. The filters adapt to changing conditions to minimize stream rates while guaranteeing that all continuous queries still receive the updates necessary to provide answers of adequate precision at all times. Our approach enables applications to trade precision for communication overhead at a fine granularity by individually adjusting the precision constraints of continuous queries over streams in a multi-query workload.
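The core filtering idea can be sketched as follows: each remote source keeps a bound of width w around the last value it reported and suppresses updates that stay inside the bound, so the central processor knows every cached value is within w/2 of the true value. This is a minimal sketch of bound-based filtering; the names, and in particular the absence of any width-adjustment policy, are simplifications rather than the paper's actual adaptive algorithm.

```python
# Minimal source-side bounded filter: report a value only when it leaves the
# interval [center - width/2, center + width/2] around the last reported
# value. Suppressed updates cost no communication, and the processor's cached
# copy is always within width/2 of the truth. Names are illustrative.

class BoundedFilter:
    def __init__(self, width):
        self.width = width
        self.center = None    # last value reported to the central processor

    def offer(self, value):
        """Return the value if it must be streamed, else None (suppressed)."""
        if self.center is None or abs(value - self.center) > self.width / 2:
            self.center = value
            return value
        return None

f = BoundedFilter(width=2.0)
stream = [10.0, 10.4, 10.9, 11.2, 13.0, 12.5]
sent = [v for v in stream if f.offer(v) is not None]    # [10.0, 11.2, 13.0]
```

In the adaptive scheme described above, the processor would additionally shrink or grow each source's `width` over time so that tighter bounds go to the sources whose precision matters most for the registered queries.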
The EPOS Research Infrastructure: a federated approach to integrate solid Earth science data and services
The European Plate Observing System (EPOS) is a Research Infrastructure (RI) committed to enabling excellent science through the integration, accessibility, use and re-use of solid Earth science data, research products and services, as well as by promoting physical access to research facilities. This article presents and describes the EPOS RI and introduces the contents of its Delivery Framework. In November 2018, EPOS was granted ERIC (European Research Infrastructure Consortium) status by the European Commission, and EPOS ERIC was established to design and implement a long-term plan for the integration of research infrastructures for solid Earth science in Europe. Specifically, the EPOS mission is to create and operate a highly distributed and sustainable research infrastructure to provide coordinated access to harmonized, interoperable and quality-controlled data from diverse solid Earth science disciplines, together with tools for their use in analysis and modelling. EPOS relies on leading-edge e-science solutions and is committed to open access, thus enabling a step towards the change in multidisciplinary and cross-disciplinary scientific research in Earth science. The EPOS architecture and its Delivery Framework are discussed in this article to present the contributions to open science and FAIR (Findable, Accessible, Interoperable, and Reusable) data management, as well as to emphasize the community building process that supported the design, implementation and construction of the EPOS RI.
Models of higher-order, type-safe, distributed computation over autonomous persistent object stores
A remote procedure call (RPC) mechanism permits the calling of procedures in another address space. RPC is a simple but highly effective mechanism for interprocess communication and nowadays enjoys great popularity as a tool for building distributed applications. This popularity is partly a result of its overall simplicity, but also partly a consequence of more than 20 years of research in transparent distribution that has failed to deliver systems that meet the expectations of real-world application programmers.
During the same 20 years, persistent systems have proved their suitability for building complex database applications by seamlessly integrating features traditionally found in database management systems into the programming language itself. Some research effort has been invested in distributed persistent systems, but the outcomes commonly suffer from the same problems found with transparent distribution.
In this thesis I claim that a higher-order persistent RPC is useful for building distributed persistent applications. The proposed mechanism is: realistic, in the sense that it uses current technology and tolerates partial failures; understandable by application programmers; and general enough to support the development of many classes of distributed persistent applications.
In order to demonstrate the validity of these claims, I propose and have implemented three models for distributed higher-order computation over autonomous persistent stores. Each model has successively exposed new problems, which have then been overcome by the next model. Together, the three models provide a general yet simple higher-order persistent RPC that is able to operate in realistic environments with partial failures.
The real strength of this thesis is the demonstration of realism and simplicity. A higher-order persistent RPC was not only implemented but also used by programmers without experience of programming distributed applications. Furthermore, a distributed persistent application has been built using these models which would not have been feasible with a traditional (non-persistent) programming language.
- …