Search CORE

1,529 research outputs found

Performance comparison between Java and JNI for optimal implementation of computational micro-kernels

Author: Charles Henri-Pierre
Halli Nassim A.
Mehaut Jean-François
Publication venue
Publication date: 21/12/2014
Field of study

General purpose CPUs used in high performance computing (HPC) support a vector instruction set and an out-of-order engine dedicated to increase the instruction level parallelism. Hence, related optimizations are currently critical to improve the performance of applications requiring numerical computation. Moreover, the use of a Java run-time environment such as the HotSpot Java Virtual Machine (JVM) in high performance computing is a promising alternative. It benefits from its programming flexibility, productivity and the performance is ensured by the Just-In-Time (JIT) compiler. Though, the JIT compiler suffers from two main drawbacks. First, the JIT is a black box for developers. We have no control over the generated code nor any feedback from its optimization phases like vectorization. Secondly, the time constraint narrows down the degree of optimization compared to static compilers like GCC or LLVM. So, it is compelling to use statically compiled code since it benefits from additional optimization reducing performance bottlenecks. Java enables to call native code from dynamic libraries through the Java Native Interface (JNI). Nevertheless, JNI methods are not inlined and require an additional cost to be invoked compared to Java ones. Therefore, to benefit from better static optimization, this call overhead must be leveraged by the amount of computation performed at each JNI invocation. In this paper we tackle this problem and we propose to do this analysis for a set of micro-kernels. Our goal is to select the most efficient implementation considering the amount of computation defined by the calling context. We also investigate the impact on performance of several different optimization schemes which are vectorization, out-of-order optimization, data alignment, method inlining and the use of native memory for JNI methods.Comment: Part of ADAPT Workshop proceedings, 2015 (arXiv:1412.2347

arXiv.org e-Print Archive

Hal - Université Grenoble Alpes

INRIA a CCSD electronic archive server

HAL-CEA

Julia: A Fresh Approach to Numerical Computing

Author: Bezanson Jeff
Edelman Alan
Karpinski Stefan
Shah Viral B.
Publication venue
Publication date: 01/12/2014
Field of study

Bridging cultures that have often been distant, Julia combines expertise from the diverse fields of computer science and computational science to create a new approach to numerical computing. Julia is designed to be easy and fast. Julia questions notions generally held as "laws of nature" by practitioners of numerical computing: 1. High-level dynamic programs have to be slow. 2. One must prototype in one language and then rewrite in another language for speed or deployment, and 3. There are parts of a system for the programmer, and other parts best left untouched as they are built by the experts. We introduce the Julia programming language and its design --- a dance between specialization and abstraction. Specialization allows for custom treatment. Multiple dispatch, a technique from computer science, picks the right algorithm for the right circumstance. Abstraction, what good computation is really about, recognizes what remains the same after differences are stripped away. Abstractions in mathematics are captured as code through another technique from computer science, generic programming. Julia shows that one can have machine performance without sacrificing human convenience.Comment: 37 page

arXiv.org e-Print Archive

CiteSeerX

DSpace@MIT

Tupleware: Redefining Modern Analytics

Author: Cetintemel Ugur
Crotty Andrew
Dursun Kayhan
Galakatos Alex
Kraska Tim
Zdonik Stan
Publication venue
Publication date: 30/07/2014
Field of study

There is a fundamental discrepancy between the targeted and actual users of current analytics frameworks. Most systems are designed for the data and infrastructure of the Googles and Facebooks of the world---petabytes of data distributed across large cloud deployments consisting of thousands of cheap commodity machines. Yet, the vast majority of users operate clusters ranging from a few to a few dozen nodes, analyze relatively small datasets of up to a few terabytes, and perform primarily compute-intensive operations. Targeting these users fundamentally changes the way we should build analytics systems. This paper describes the design of Tupleware, a new system specifically aimed at the challenges faced by the typical user. Tupleware's architecture brings together ideas from the database, compiler, and programming languages communities to create a powerful end-to-end solution for data analysis. We propose novel techniques that consider the data, computations, and hardware together to achieve maximum performance on a case-by-case basis. Our experimental evaluation quantifies the impact of our novel techniques and shows orders of magnitude performance improvement over alternative systems

arXiv.org e-Print Archive

CiteSeerX

The Scalable Brain Atlas: instant web-based access to public brain atlases and related content

Author: Bakker Rembrandt
Kötter Rolf
Tiesinga Paul
Publication venue
Publication date: 01/01/2014
Field of study

The Scalable Brain Atlas (SBA) is a collection of web services that provide unified access to a large collection of brain atlas templates for different species. Its main component is an atlas viewer that displays brain atlas data as a stack of slices in which stereotaxic coordinates and brain regions can be selected. These are subsequently used to launch web queries to resources that require coordinates or region names as input. It supports plugins which run inside the viewer and respond when a new slice, coordinate or region is selected. It contains 20 atlas templates in six species, and plugins to compute coordinate transformations, display anatomical connectivity and fiducial points, and retrieve properties, descriptions, definitions and 3d reconstructions of brain regions. The ambition of SBA is to provide a unified representation of all publicly available brain atlases directly in the web browser, while remaining a responsive and light weight resource that specializes in atlas comparisons, searches, coordinate transformations and interactive displays.Comment: Rolf K\"otter sadly passed away on June 9th, 2010. He co-initiated this project and played a crucial role in the design and quality assurance of the Scalable Brain Atla

arXiv.org e-Print Archive

Crossref

Springer - Publisher Connector

PubMed Central

Juelich Shared Electronic Resources

Radboud Repository

Personalized Fuzzy Text Search Using Interest Prediction and Word Vectorization

Author: Gao Tianmeng
Huang Chengwei
Liu Yun
Song Baolin
Publication venue
Publication date: 01/10/2017
Field of study

In this paper we study the personalized text search problem. The keyword based search method in conventional algorithms has a low efficiency in understanding users' intention since the semantic meaning, user profile, user interests are not always considered. Firstly, we propose a novel text search algorithm using a inverse filtering mechanism that is very efficient for label based item search. Secondly, we adopt the Bayesian network to implement the user interest prediction for an improved personalized search. According to user input, it searches the related items using keyword information, predicted user interest. Thirdly, the word vectorization is used to discover potential targets according to the semantic meaning. Experimental results show that the proposed search engine has an improved efficiency and accuracy and it can operate on embedded devices with very limited computational resources

arXiv.org e-Print Archive

Julia Programming Language Benchmark Using a Flight Simulation

Author: Sells Ray
Publication venue
Publication date
Field of study

Julias goal to provide scripting language ease-of-coding with compiled language speed is explored. The runtime speed of the relatively new Julia programming language is assessed against other commonly used languages including Python, Java, and C++. An industry-standard missile and rocket simulation, coded in multiple languages, was used as a test bench for runtime speed. All language versions of the simulation, including Julia, were coded to a highly-developed object-oriented simulation architecture tailored specifically for time-domain flight simulation. A speed-of-coding second-dimension is plotted against runtime for each language to portray a space that characterizes Julias scripting language efficiencies in the context of the other languages. With caveats, Julia runtime speed was found to be in the class of compiled or semi-compiled languages. However, some factors that affect runtime speed at the cost of ease-of-coding are shown. Julias built-in functionality for multi-core processing is briefly examined as a means for obtaining even faster runtime speed. The major contribution of this research to the extensive language benchmarking body-of-work is comparing Julia to other mainstream languages using a complex flight simulation as opposed to benchmarking with single algorithms

NASA Technical Reports Server