877 research outputs found

    Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduce Framework

    While many existing formal concept analysis algorithms are efficient, they are typically unsuitable for distributed implementation. Taking the MapReduce (MR) framework as our inspiration, we introduce a distributed approach to formal concept mining. The novelty of our method is that it uses a lightweight MapReduce runtime called Twister, which is better suited to iterative algorithms than recent distributed approaches. First, we describe the theoretical foundations underpinning our distributed formal concept analysis approach. Second, we provide a representative exemplar of how a classic centralized algorithm can be implemented in a distributed fashion using our methodology: we modify Ganter's classic algorithm, introducing a family of MR* algorithms, namely MRGanter and MRGanter+, where the prefix denotes the algorithm's lineage. To evaluate the factors that impact distributed algorithm performance, we compare our MR* algorithms with the state of the art. Experiments conducted on real datasets demonstrate that MRGanter+ is efficient, scalable, and an appealing algorithm for distributed problems. Comment: 17 pages, ICFCA 201, Formal Concept Analysis 201
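To make the distributed flavour of the MR* family concrete, here is a minimal sketch (not the authors' MRGanter code; the toy context and every name are assumptions) of computing the closure B'' of an attribute set B over a horizontally partitioned formal context: each partition derives a partial intent in a map step, and a reduce step intersects the partials.

```haskell
-- Hedged sketch: closure of an attribute set over a partitioned formal
-- context, in map/reduce style.  Not the authors' MRGanter implementation.
import qualified Data.Set as S

type Obj   = Int
type Attr  = Char
type Block = [(Obj, S.Set Attr)]  -- one horizontal partition: object -> its attributes

-- "map" phase: each worker derives a partial intent of B from its own block.
-- An empty local extent contributes the full attribute universe, the neutral
-- element of intersection.
localIntent :: S.Set Attr -> S.Set Attr -> Block -> S.Set Attr
localIntent universe b block =
  case [attrs | (_, attrs) <- block, b `S.isSubsetOf` attrs] of
    []     -> universe
    extent -> foldr1 S.intersection extent

-- "reduce" phase: intersect the partial intents to obtain the global closure B''.
closure :: S.Set Attr -> S.Set Attr -> [Block] -> S.Set Attr
closure universe b blocks =
  foldr S.intersection universe (map (localIntent universe b) blocks)

main :: IO ()
main = do
  let universe = S.fromList "abcd"
      block1   = [(1, S.fromList "abc"), (2, S.fromList "ab")]
      block2   = [(3, S.fromList "abd")]
  print (closure universe (S.fromList "a") [block1, block2])  -- fromList "ab"
```

The same pattern extends to Ganter-style enumeration, where each closure computation becomes one map/reduce round, which is why an iterative runtime such as Twister pays off.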

    Scather: programming with multi-party computation and MapReduce

    We present a prototype of a distributed computational infrastructure, an associated high-level programming language, and an underlying formal framework that allow multiple parties to leverage their own cloud-based computational resources (capable of supporting MapReduce [27] operations) in concert with multi-party computation (MPC) to execute statistical analysis algorithms that have privacy-preserving properties. Our architecture allows a data analyst unfamiliar with MPC to (1) author an analysis algorithm that is agnostic with regard to data privacy policies, (2) use an automated process to derive algorithm implementation variants that have different privacy and performance properties, and (3) compile those implementation variants so that they can be deployed on an infrastructure that allows computations to take place locally within each participant’s MapReduce cluster as well as across all the participants’ clusters using an MPC protocol. We describe implementation details of the architecture, discuss and demonstrate how the formal framework enables the exploration of tradeoffs between the efficiency and privacy properties of an analysis algorithm, and present two example applications that illustrate how such an infrastructure can be utilized in practice. This work was supported in part by NSF Grants #1430145, #1414119, #1347522, and #1012798.
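As a rough illustration of the two execution legs such an architecture has to target, the sketch below computes one joint aggregate either locally (a map/reduce-style fold over a party's own records) or across parties using additive secret sharing, a standard MPC building block. It is not Scather's language, protocol, or API; the modulus, share count, and toy data are assumptions, and randomRIO comes from the random package.

```haskell
-- Illustrative only: the same aggregate run either locally (map/reduce over
-- one party's records) or jointly via additive secret sharing.  This is not
-- Scather's protocol or API; modulus, share count and data are assumptions.
import System.Random (randomRIO)   -- from the "random" package

-- local map/reduce leg: map each record to a partial value, reduce with (+)
localSum :: [Integer] -> Integer
localSum = foldr (+) 0

q :: Integer                       -- arbitrary prime modulus for the shares
q = 2 ^ 61 - 1

-- split a local total into n additive shares mod q, so that no single
-- share reveals the total
share :: Int -> Integer -> IO [Integer]
share n secret = do
  rs <- mapM (const (randomRIO (0, q - 1))) [1 .. n - 1]
  pure (rs ++ [(secret - sum rs) `mod` q])

-- adding everyone's published share sums opens only the joint total
open :: [Integer] -> Integer
open = (`mod` q) . sum

main :: IO ()
main = do
  let partyA = [3, 5, 7]           -- toy per-party datasets
      partyB = [10, 20]
  [a1, a2] <- share 2 (localSum partyA)
  [b1, b2] <- share 2 (localSum partyB)
  -- party 1 holds (a1, b1); party 2 holds (a2, b2); each adds its shares
  print (open [a1 + b1, a2 + b2]) -- 45: the joint sum, without exchanging totals
```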

    CPL: A Core Language for Cloud Computing -- Technical Report

    Running distributed applications in the cloud involves deployment, that is, the distribution and configuration of application services and middleware infrastructure. The considerable complexity of these tasks has resulted in the emergence of declarative, JSON-based, domain-specific deployment languages for developing deployment programs. However, existing deployment programs unsafely compose artifacts written in different languages, leading to bugs that are hard to detect before run time. Furthermore, deployment languages do not provide extension points for custom implementations of existing cloud services, such as application-specific load balancing policies. To address these shortcomings, we propose CPL (Cloud Platform Language), a statically typed core language for programming both distributed applications and their deployment on a cloud platform. In CPL, application services and deployment programs interact through statically typed, extensible interfaces, and an application can trigger further deployment at run time. We provide a formal semantics of CPL and demonstrate that it enables type-safe, composable and extensible libraries of service combinators, such as load balancing and fault tolerance. Comment: Technical report accompanying the MODULARITY '16 submission
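The sketch below illustrates, under assumed names and in Haskell rather than CPL, what typed and composable service combinators can look like when a service is modelled as a typed request handler: a round-robin load balancer and a fall-back combinator wrap services through ordinary typed interfaces instead of string-level composition. It is an illustration of the idea only, not CPL's actual syntax, type system, or semantics.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
-- Hedged sketch of typed, composable service combinators; this is not CPL,
-- just the underlying idea expressed with plain Haskell types.
import Control.Exception (SomeException, try)
import Data.IORef

-- a service is a typed request handler
type Service req resp = req -> IO resp

-- round-robin load balancing as a combinator over a pool of services
roundRobin :: [Service req resp] -> IO (Service req resp)
roundRobin pool = do
  next <- newIORef 0
  pure $ \req -> do
    i <- atomicModifyIORef' next (\n -> ((n + 1) `mod` length pool, n))
    (pool !! i) req

-- simple fault tolerance: fall back to a backup service when the primary fails
orElse :: Service req resp -> Service req resp -> Service req resp
orElse primary backup req = do
  r <- try (primary req)
  case r of
    Right resp                -> pure resp
    Left (_ :: SomeException) -> backup req

main :: IO ()
main = do
  let worker name n = pure (name ++ " handled " ++ show (n :: Int))
      failing _     = ioError (userError "primary down")
  lb <- roundRobin [worker "a", worker "b"]
  mapM_ (\n -> lb n >>= putStrLn) [1, 2, 3]          -- alternates between a and b
  (failing `orElse` worker "backup") 4 >>= putStrLn  -- falls back to the backup
```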

    Big Data Refinement

    "Big data" has become a major area of research and associated funding, as well as a focus of utopian thinking. In the still growing research community, one of the favourite optimistic analogies for data processing is that of the oil refinery, extracting the essence out of the raw data. Pessimists look for their imagery to the other end of the petrol cycle, and talk about the "data exhausts" of our society. Obviously, the refinement community knows how to do "refining". This paper explores the extent to which notions of refinement and data in the formal methods community relate to the core concepts in "big data". In particular, can the data refinement paradigm can be used to explain aspects of big data processing

    A Formal, Resource Consumption-Preserving Translation of Actors to Haskell

    We present a formal translation of an actor-based language with cooperative scheduling to the functional language Haskell. The translation is proven correct with respect to a formal semantics of the source language and a high-level operational semantics of the target, i.e. a subset of Haskell. The main correctness theorem is expressed in terms of a simulation relation between the operational semantics of actor programs and their translation. This allows us to prove that resource consumption is preserved by the translation, as we establish an equivalence between the cost of the original and the Haskell-translated execution traces. Comment: Pre-proceedings paper presented at the 26th International Symposium on Logic-Based Program Synthesis and Transformation (LOPSTR 2016), Edinburgh, Scotland UK, 6-8 September 2016 (arXiv:1608.02534)
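To give a flavour of what a cooperatively scheduled actor can look like on the Haskell side, the toy sketch below keeps an actor's suspended continuations in an explicit queue and runs them one step at a time, re-enqueueing a task whenever it yields. It is only an assumed illustration; the paper's actual translation and its cost-preserving simulation relation are considerably richer.

```haskell
-- Toy model of cooperative actor scheduling: each actor owns a queue of
-- suspended computations that a scheduler runs one step at a time.
-- Not the paper's translation; all names here are assumptions.
import Data.IORef

-- a task either finishes or yields, handing back the rest of its work
data Step = Done | Yield (IO Step)

-- an actor is a mailbox of suspended computations, run cooperatively
newtype Actor = Actor (IORef [IO Step])

newActor :: IO Actor
newActor = Actor <$> newIORef []

send :: Actor -> IO Step -> IO ()
send (Actor q) task = modifyIORef q (++ [task])

-- scheduler: pick the head task, run it for one step, and re-enqueue it
-- if it yielded (cooperative, no preemption)
run :: Actor -> IO ()
run a@(Actor q) = do
  tasks <- readIORef q
  case tasks of
    []       -> pure ()
    (t : ts) -> do
      writeIORef q ts
      step <- t
      case step of
        Done    -> pure ()
        Yield k -> send a k
      run a

main :: IO ()
main = do
  a <- newActor
  send a (putStrLn "task 1, part 1" >> pure (Yield (putStrLn "task 1, part 2" >> pure Done)))
  send a (putStrLn "task 2" >> pure Done)
  run a   -- prints: task 1 part 1, task 2, task 1 part 2 (interleaved cooperatively)
```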

    A survey of large-scale reasoning on the Web of data

    As more and more data is being generated by sensor networks, social media and organizations, the Web interlinking this wealth of information becomes more complex. This is particularly true for the so-called Web of Data, in which data is semantically enriched and interlinked using ontologies. In this large and uncoordinated environment, reasoning can be used to check the consistency of the data and of associated ontologies, or to infer logical consequences which, in turn, can be used to obtain new insights from the data. However, reasoning approaches need to be scalable in order to enable reasoning over the entire Web of Data. To address this problem, several high-performance reasoning systems, which mainly implement distributed or parallel algorithms, have been proposed in the last few years. These systems differ significantly, for instance in terms of reasoning expressivity, computational properties such as completeness, or reasoning objectives. In order to provide a first complete overview of the field, this paper reports a systematic review of such scalable reasoning approaches over various ontological languages, reporting details about the methods and the conducted experiments. We highlight the shortcomings of these approaches and discuss some of the open problems related to performing scalable reasoning.
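As one concrete example of the kind of inference such systems scale up, the sketch below applies a single RDFS rule (rdfs9: if x has type C and C is a subclass of D, then x has type D) using the map/reduce-style join on the shared class term that many distributed reasoners build on. The triples and names are toy assumptions, not taken from the survey.

```haskell
-- Hedged sketch: one RDFS rule evaluated with a map/reduce-style join on the
-- class term.  The data and names are illustrative assumptions only.
import qualified Data.Map.Strict as M

type Triple = (String, String, String)   -- (subject, predicate, object)

rdfs9 :: [Triple] -> [Triple]
rdfs9 triples =
  let -- "map" phase: key every relevant triple by the class it joins on
      typed = M.fromListWith (++) [(c, [x]) | (x, "rdf:type",        c) <- triples]
      subOf = M.fromListWith (++) [(c, [d]) | (c, "rdfs:subClassOf", d) <- triples]
      -- "reduce" phase: per join key, emit the derived type triples
      derive c xs = [(x, "rdf:type", d) | d <- M.findWithDefault [] c subOf, x <- xs]
  in concat (M.elems (M.mapWithKey derive typed))

main :: IO ()
main = print (rdfs9
  [ ("alice",   "rdf:type",        "Student")
  , ("Student", "rdfs:subClassOf", "Person") ])
  -- => [("alice","rdf:type","Person")]
```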

    Proceedings of the 4th DIKU-IST Joint Workshop on the Foundations of Software
