69,698 research outputs found

    Ringo: Interactive Graph Analytics on Big-Memory Machines

    Full text link
    We present Ringo, a system for analysis of large graphs. Graphs provide a way to represent and analyze systems of interacting objects (people, proteins, webpages) with edges between the objects denoting interactions (friendships, physical interactions, links). Mining graphs provides valuable insights about individual objects as well as the relationships among them. In building Ringo, we take advantage of the fact that machines with large memory and many cores are widely available and also relatively affordable. This allows us to build an easy-to-use interactive high-performance graph analytics system. Graphs also need to be built from input data, which often resides in the form of relational tables. Thus, Ringo provides rich functionality for manipulating raw input data tables into various kinds of graphs. Furthermore, Ringo also provides over 200 graph analytics functions that can then be applied to constructed graphs. We show that a single big-memory machine provides a very attractive platform for performing analytics on all but the largest graphs as it offers excellent performance and ease of use as compared to alternative approaches. With Ringo, we also demonstrate how to integrate graph analytics with an iterative process of trial-and-error data exploration and rapid experimentation, common in data mining workloads.Comment: 6 pages, 2 figure

    {BiQ} Analyzer {HiMod}: An Interactive Software Tool for High-throughput Locus-specific Analysis of 5-Methylcytosine and its Oxidized Derivatives

    Get PDF
    Recent data suggest important biological roles for oxidative modifications of methylated cytosines, specifically hydroxymethylation, formylation and carboxylation. Several assays are now available for profiling these DNA modifications genome-wide as well as in targeted, locus-specific settings. Here we present BiQ Analyzer HiMod, a user-friendly software tool for sequence alignment, quality control and initial analysis of locus-specific DNA modification data. The software supports four different assay types, and it leads the user from raw sequence reads to DNA modification statistics and publication-quality plots. BiQ Analyzer HiMod combines well-established graphical user interface of its predecessor tool, BiQ Analyzer HT, with new and extended analysis modes. BiQ Analyzer HiMod also includes updates of the analysis workspace, an intuitive interface, a custom vector graphics engine and support of additional input and output data formats. The tool is freely available as a stand-alone installation package from http://biq-analyzer-himod.bioinf.mpi-inf.mpg.de/

    Migrating existing multimedia courseware to Moodle

    Get PDF
    Open source course management systems offer increased flexibility for instructors and instructional designers. Communities can influence the development of these systems and on an individual basis, the possibility to modify the system software exists. Migrating existing courseware to these systems can therefore be beneficial, sometimes even required. We report here about our experience in migrating an existing courseware system consisting of multimedia content and interactive, integrated infrastructure functionality to an open source course management system called Moodle. We will assess the difficulties that we have encountered during this process and, discuss the importance of standards in this context, and we aim to provide other instructors or instructional designers with guidelines and assessment support for other migration projects

    User Applications Driven by the Community Contribution Framework MPContribs in the Materials Project

    Full text link
    This work discusses how the MPContribs framework in the Materials Project (MP) allows user-contributed data to be shown and analyzed alongside the core MP database. The Materials Project is a searchable database of electronic structure properties of over 65,000 bulk solid materials that is accessible through a web-based science-gateway. We describe the motivation for enabling user contributions to the materials data and present the framework's features and challenges in the context of two real applications. These use-cases illustrate how scientific collaborations can build applications with their own "user-contributed" data using MPContribs. The Nanoporous Materials Explorer application provides a unique search interface to a novel dataset of hundreds of thousands of materials, each with tables of user-contributed values related to material adsorption and density at varying temperature and pressure. The Unified Theoretical and Experimental x-ray Spectroscopy application discusses a full workflow for the association, dissemination and combined analyses of experimental data from the Advanced Light Source with MP's theoretical core data, using MPContribs tools for data formatting, management and exploration. The capabilities being developed for these collaborations are serving as the model for how new materials data can be incorporated into the Materials Project website with minimal staff overhead while giving powerful tools for data search and display to the user community.Comment: 12 pages, 5 figures, Proceedings of 10th Gateway Computing Environments Workshop (2015), to be published in "Concurrency in Computation: Practice and Experience

    AKARI-CAS --- Online Service for AKARI All-Sky Catalogues

    Full text link
    The AKARI All-Sky Catalogues are an important infrared astronomical database for next-generation astronomy that take over the IRAS catalog. We have developed an online service, AKARI Catalogue Archive Server (AKARI-CAS), for astronomers. The service includes useful and attractive search tools and visual tools. One of the new features of AKARI-CAS is cached SIMBAD/NED entries, which can match AKARI catalogs with other catalogs stored in SIMBAD or NED. To allow advanced queries to the databases, direct input of SQL is also supported. In those queries, fast dynamic cross-identification between registered catalogs is a remarkable feature. In addition, multiwavelength quick-look images are displayed in the visualization tools, which will increase the value of the service. In the construction of our service, we considered a wide variety of astronomers' requirements. As a result of our discussion, we concluded that supporting users' SQL submissions is the best solution for the requirements. Therefore, we implemented an RDBMS layer so that it covered important facilities including the whole processing of tables. We found that PostgreSQL is the best open-source RDBMS products for such purpose, and we wrote codes for both simple and advanced searches into the SQL stored functions. To implement such stored functions for fast radial search and cross-identification with minimum cost, we applied a simple technique that is not based on dividing celestial sphere such as HTM or HEALPix. In contrast, the Web application layer became compact, and was written in simple procedural PHP codes. In total, our system realizes cost-effective maintenance and enhancements.Comment: Yamauchi, C. et al. 2011, PASP..123..852

    Vaex: Big Data exploration in the era of Gaia

    Get PDF
    We present a new Python library called vaex, to handle extremely large tabular datasets, such as astronomical catalogues like the Gaia catalogue, N-body simulations or any other regular datasets which can be structured in rows and columns. Fast computations of statistics on regular N-dimensional grids allows analysis and visualization in the order of a billion rows per second. We use streaming algorithms, memory mapped files and a zero memory copy policy to allow exploration of datasets larger than memory, e.g. out-of-core algorithms. Vaex allows arbitrary (mathematical) transformations using normal Python expressions and (a subset of) numpy functions which are lazily evaluated and computed when needed in small chunks, which avoids wasting of RAM. Boolean expressions (which are also lazily evaluated) can be used to explore subsets of the data, which we call selections. Vaex uses a similar DataFrame API as Pandas, a very popular library, which helps migration from Pandas. Visualization is one of the key points of vaex, and is done using binned statistics in 1d (e.g. histogram), in 2d (e.g. 2d histograms with colormapping) and 3d (using volume rendering). Vaex is split in in several packages: vaex-core for the computational part, vaex-viz for visualization mostly based on matplotlib, vaex-jupyter for visualization in the Jupyter notebook/lab based in IPyWidgets, vaex-server for the (optional) client-server communication, vaex-ui for the Qt based interface, vaex-hdf5 for hdf5 based memory mapped storage, vaex-astro for astronomy related selections, transformations and memory mapped (column based) fits storage. Vaex is open source and available under MIT license on github, documentation and other information can be found on the main website: https://vaex.io, https://docs.vaex.io or https://github.com/maartenbreddels/vaexComment: 14 pages, 8 figures, Submitted to A&A, interactive version of Fig 4: https://vaex.io/paper/fig

    VisIVO - Integrated Tools and Services for Large-Scale Astrophysical Visualization

    Full text link
    VisIVO is an integrated suite of tools and services specifically designed for the Virtual Observatory. This suite constitutes a software framework for effective visual discovery in currently available (and next-generation) very large-scale astrophysical datasets. VisIVO consists of VisiVO Desktop - a stand alone application for interactive visualization on standard PCs, VisIVO Server - a grid-enabled platform for high performance visualization and VisIVO Web - a custom designed web portal supporting services based on the VisIVO Server functionality. The main characteristic of VisIVO is support for high-performance, multidimensional visualization of very large-scale astrophysical datasets. Users can obtain meaningful visualizations rapidly while preserving full and intuitive control of the relevant visualization parameters. This paper focuses on newly developed integrated tools in VisIVO Server allowing intuitive visual discovery with 3D views being created from data tables. VisIVO Server can be installed easily on any web server with a database repository. We discuss briefly aspects of our implementation of VisiVO Server on a computational grid and also outline the functionality of the services offered by VisIVO Web. Finally we conclude with a summary of our work and pointers to future developments
    • 

    corecore