
3,276 research outputs found

MLI: An API for Distributed Machine Learning

Author: Franklin Michael J.
Gonzalez Joseph
Jordan Michael I.
Kottalam Jey
Kraska Tim
Pan Xinghao
Smith Virginia
Sparks Evan R.
Talwalkar Ameet
Publication date: 25/10/2013

MLI is an Application Programming Interface designed to address the challenges of building Machine Learning algorithms in a distributed setting based on data-centric computing. Its primary goal is to simplify the development of high-performance, scalable, distributed algorithms. Our initial results show that, relative to existing systems, this interface can be used to build distributed implementations of a wide variety of common Machine Learning algorithms with minimal complexity and highly competitive performance and scalability.

arXiv.org e-Print Archive

Crossref

Julia: A Fresh Approach to Numerical Computing

Author: Bezanson Jeff
Edelman Alan
Karpinski Stefan
Shah Viral B.
Publication date: 01/12/2014

Bridging cultures that have often been distant, Julia combines expertise from the diverse fields of computer science and computational science to create a new approach to numerical computing. Julia is designed to be easy and fast. Julia questions notions generally held as "laws of nature" by practitioners of numerical computing: 1. High-level dynamic programs have to be slow. 2. One must prototype in one language and then rewrite in another language for speed or deployment. 3. There are parts of a system for the programmer, and other parts best left untouched as they are built by the experts. We introduce the Julia programming language and its design, a dance between specialization and abstraction. Specialization allows for custom treatment. Multiple dispatch, a technique from computer science, picks the right algorithm for the right circumstance. Abstraction, what good computation is really about, recognizes what remains the same after differences are stripped away. Abstractions in mathematics are captured as code through another technique from computer science, generic programming. Julia shows that one can have machine performance without sacrificing human convenience. Comment: 37 pages
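The abstract's claim that multiple dispatch "picks the right algorithm for the right circumstance" can be illustrated with a toy sketch. This is written in Python rather than Julia and is not how Julia implements dispatch: the `dispatch` decorator, the `Dense` and `Diagonal` classes, and the `matmul` function are all hypothetical names introduced here, and the lookup matches exact argument types only, whereas Julia selects the most specific method over a type hierarchy.

```python
# Toy multiple dispatch: the implementation is chosen from the types of ALL
# arguments, not just the first one. Hypothetical sketch, exact-type match only.
_registry = {}


def dispatch(*types):
    """Register an implementation of a function for one tuple of argument types."""
    def register(func):
        _registry.setdefault(func.__name__, {})[types] = func

        def wrapper(*args):
            impls = _registry[func.__name__]
            impl = impls.get(tuple(type(a) for a in args))
            if impl is None:
                names = ", ".join(type(a).__name__ for a in args)
                raise TypeError(f"no method {func.__name__}({names})")
            return impl(*args)

        wrapper.__name__ = func.__name__
        return wrapper
    return register


class Dense:
    def __init__(self, rows):
        self.rows = rows          # list of row lists


class Diagonal:
    def __init__(self, diag):
        self.diag = diag          # list of diagonal entries


@dispatch(Dense, Dense)
def matmul(a, b):
    # General case: full row-by-column products.
    cols = list(zip(*b.rows))
    return Dense([[sum(x * y for x, y in zip(r, c)) for c in cols] for r in a.rows])


@dispatch(Diagonal, Dense)
def matmul(a, b):
    # Specialized case: scale each row of b by the matching diagonal entry.
    return Dense([[d * x for x in row] for d, row in zip(a.diag, b.rows)])


@dispatch(Diagonal, Diagonal)
def matmul(a, b):
    # Most specialized case: elementwise product of the diagonals.
    return Diagonal([x * y for x, y in zip(a.diag, b.diag)])
```

Calling `matmul(Diagonal([2, 3]), Dense([[1, 2], [3, 4]]))` takes the row-scaling path, while two `Dense` arguments fall through to the general product. The caller writes the same `matmul` in both cases, which is the sense in which specialization and abstraction coexist.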

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Graphulo Implementation of Server-Side Sparse Matrix Multiply in the Accumulo Database

Author: Fuchs Adam
Gadepally Vijay
Hutchison Dylan
Kepner Jeremy
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 30/08/2015

The Apache Accumulo database excels at distributed storage and indexing and is ideally suited for storing graph data. Many big data analytics compute on graph data and persist their results back to the database. These graph calculations are often best performed inside the database server. The GraphBLAS standard provides a compact and efficient basis for a wide range of graph applications through a small number of sparse matrix operations. In this article, we implement GraphBLAS sparse matrix multiplication server-side by leveraging Accumulo's native, high-performance iterators. We compare the mathematics and performance of inner and outer product implementations, and show how an outer product implementation achieves optimal performance near Accumulo's peak write rate. We offer our work as a core component to the Graphulo library that will deliver matrix math primitives for graph analytics within Accumulo. Comment: To be presented at IEEE HPEC 2015: http://www.ieee-hpec.org
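The difference between the inner and outer product formulations compared in the abstract can be sketched in a few lines. This is an in-memory Python illustration, not Graphulo's Accumulo iterator code: the dictionary representation `{(row, col): value}` and the names `group_by_index`, `spgemm_inner`, and `spgemm_outer` are assumptions made for the example.

```python
from collections import defaultdict


def group_by_index(entries, axis):
    """Group sparse entries {(row, col): value} by row (axis=0) or column (axis=1)."""
    groups = defaultdict(dict)
    for (r, c), v in entries.items():
        if axis == 0:
            groups[r][c] = v
        else:
            groups[c][r] = v
    return groups


def spgemm_inner(a, b):
    """Inner product form: C[i, j] = (row i of A) . (column j of B).

    Each output entry is finished in one step, but every row of A must be
    paired with every column of B, even when they share no index k.
    """
    a_rows = group_by_index(a, 0)
    b_cols = group_by_index(b, 1)
    c = {}
    for i, row in a_rows.items():
        for j, col in b_cols.items():
            terms = [v * col[k] for k, v in row.items() if k in col]
            if terms:
                c[(i, j)] = sum(terms)
    return c


def spgemm_outer(a, b):
    """Outer product form: C = sum over k of (column k of A) x (row k of B).

    Each k emits partial products that are summed into C, so only index
    pairs that actually contribute are ever visited.
    """
    a_cols = group_by_index(a, 1)   # k -> {i: A[i, k]}
    b_rows = group_by_index(b, 0)   # k -> {j: B[k, j]}
    c = defaultdict(int)
    for k, col in a_cols.items():
        row = b_rows.get(k)
        if row is None:
            continue
        for i, a_ik in col.items():
            for j, b_kj in row.items():
                c[(i, j)] += a_ik * b_kj
    return dict(c)
```

Both functions return the same product. The outer product version turns the multiply into a stream of partial-product writes that are summed afterwards, which is one way to read the abstract's observation that its performance tracks the database's write rate.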

arXiv.org e-Print Archive

Crossref

Interactive Supercomputing with MIT Matlab

Author: Edelman Alan
Husbands Parry
Isbell Charles Lee, Jr.
Publication date: 01/01/1998

This paper describes MITMatlab, a system that enables users of supercomputers or networked PCs to work on large data sets within Matlab transparently. MITMatlab is based on the Parallel Problems Server (PPServer), a standalone 'linear algebra server' that provides a mechanism for running distributed memory algorithms on large data sets. The PPServer and MITMatlab enable high-performance interactive supercomputing. With such a tool, researchers can now use Matlab as more than a prototyping tool for experimenting with small problems. Instead, MITMatlab makes it possible to visualize and operate interactively on large data sets. This has implications not only in supercomputing, but also for Artificial Intelligence applications such as Machine Learning, Information Retrieval and Image Processing.

CiteSeerX

DSpace@MIT