Ringo: Interactive Graph Analytics on Big-Memory Machines
We present Ringo, a system for analysis of large graphs. Graphs provide a way
to represent and analyze systems of interacting objects (people, proteins,
webpages) with edges between the objects denoting interactions (friendships,
physical interactions, links). Mining graphs provides valuable insights about
individual objects as well as the relationships among them.
In building Ringo, we take advantage of the fact that machines with large
memory and many cores are widely available and also relatively affordable. This
allows us to build an easy-to-use interactive high-performance graph analytics
system. Graphs also need to be built from input data, which often resides in
the form of relational tables. Thus, Ringo provides rich functionality for
manipulating raw input data tables into various kinds of graphs. Furthermore,
Ringo also provides over 200 graph analytics functions that can then be applied
to constructed graphs.
We show that a single big-memory machine provides a very attractive platform
for performing analytics on all but the largest graphs as it offers excellent
performance and ease of use as compared to alternative approaches. With Ringo,
we also demonstrate how to integrate graph analytics with an iterative process
of trial-and-error data exploration and rapid experimentation, common in data
mining workloads.
Comment: 6 pages, 2 figures
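The table-to-graph step described in the abstract can be illustrated with a minimal sketch. It uses only the Python standard library and is not Ringo's API: the function names `table_to_graph` and `degree_ranking`, the column names, and the sample rows are all hypothetical, chosen only to show how rows of a relational table might become an undirected graph to which an analytics function is then applied.

```python
from collections import defaultdict


def table_to_graph(rows, src_col, dst_col):
    """Build an undirected adjacency map from an iterable of dict rows.

    Hypothetical helper for illustration; not part of Ringo.
    """
    adjacency = defaultdict(set)
    for row in rows:
        src, dst = row[src_col], row[dst_col]
        if src == dst:
            continue  # ignore self-loops in this sketch
        adjacency[src].add(dst)
        adjacency[dst].add(src)
    return dict(adjacency)


def degree_ranking(adjacency):
    """Return nodes sorted by descending degree, ties broken by name."""
    return sorted(adjacency, key=lambda node: (-len(adjacency[node]), node))


# A tiny "relational table": one row per friendship (made-up data).
friendships = [
    {"person": "ann", "friend": "bob"},
    {"person": "bob", "friend": "cat"},
    {"person": "ann", "friend": "cat"},
    {"person": "cat", "friend": "dan"},
]

graph = table_to_graph(friendships, "person", "friend")
ranking = degree_ranking(graph)
```

Here `ranking` lists the best-connected node first. A real system would add typed columns, joins and far more efficient storage than Python sets.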
BiQ Analyzer HiMod: An Interactive Software Tool for High-throughput Locus-specific Analysis of 5-Methylcytosine and its Oxidized Derivatives
Recent data suggest important biological roles for oxidative modifications of methylated cytosines, specifically hydroxymethylation, formylation and carboxylation. Several assays are now available for profiling these DNA modifications genome-wide as well as in targeted, locus-specific settings. Here we present BiQ Analyzer HiMod, a user-friendly software tool for sequence alignment, quality control and initial analysis of locus-specific DNA modification data. The software supports four different assay types, and it leads the user from raw sequence reads to DNA modification statistics and publication-quality plots. BiQ Analyzer HiMod combines the well-established graphical user interface of its predecessor tool, BiQ Analyzer HT, with new and extended analysis modes. BiQ Analyzer HiMod also includes updates to the analysis workspace, an intuitive interface, a custom vector graphics engine and support for additional input and output data formats. The tool is freely available as a stand-alone installation package from http://biq-analyzer-himod.bioinf.mpi-inf.mpg.de/
Migrating existing multimedia courseware to Moodle
Open source course management systems offer increased flexibility for instructors and instructional designers. Communities can influence the development of these systems, and individuals can modify the system software themselves. Migrating existing courseware to these systems can therefore be beneficial, and is sometimes even required. We report here on our experience in migrating an existing courseware system, consisting of multimedia content and interactive, integrated infrastructure functionality, to an open source course management system called Moodle. We assess the difficulties that we encountered during this process, discuss the importance of standards in this context, and aim to provide other instructors and instructional designers with guidelines and assessment support for other migration projects.
ranacapa: An R package and Shiny web app to explore environmental DNA data with exploratory statistics and interactive visualizations.
Environmental DNA (eDNA) metabarcoding is becoming a core tool in ecology and conservation biology, and is being used in a growing number of education, biodiversity monitoring, and public outreach programs in which professional research scientists engage community partners in primary research. Results from eDNA analyses can engage and educate natural resource managers, students, community scientists, and naturalists, but without significant training in bioinformatics, it can be difficult for this diverse audience to interact with eDNA results. Here we present the R package ranacapa, at the core of which is a Shiny web app that helps perform exploratory biodiversity analyses and visualizations of eDNA results. The app requires a taxonomy-by-sample matrix and a simple metadata file with descriptive information about each sample. The app enables users to explore the data with interactive figures and presents results from simple community ecology analyses. We demonstrate the value of ranacapa to two groups of community partners engaging with eDNA metabarcoding results.
User Applications Driven by the Community Contribution Framework MPContribs in the Materials Project
This work discusses how the MPContribs framework in the Materials Project
(MP) allows user-contributed data to be shown and analyzed alongside the core
MP database. The Materials Project is a searchable database of electronic
structure properties of over 65,000 bulk solid materials that is accessible
through a web-based science-gateway. We describe the motivation for enabling
user contributions to the materials data and present the framework's features
and challenges in the context of two real applications. These use-cases
illustrate how scientific collaborations can build applications with their own
"user-contributed" data using MPContribs. The Nanoporous Materials Explorer
application provides a unique search interface to a novel dataset of hundreds
of thousands of materials, each with tables of user-contributed values related
to material adsorption and density at varying temperature and pressure. The
Unified Theoretical and Experimental x-ray Spectroscopy application discusses a
full workflow for the association, dissemination and combined analyses of
experimental data from the Advanced Light Source with MP's theoretical core
data, using MPContribs tools for data formatting, management and exploration.
The capabilities being developed for these collaborations are serving as the
model for how new materials data can be incorporated into the Materials Project
website with minimal staff overhead while giving powerful tools for data search
and display to the user community.
Comment: 12 pages, 5 figures, Proceedings of 10th Gateway Computing
Environments Workshop (2015), to be published in "Concurrency and Computation:
Practice and Experience"
AKARI-CAS: Online Service for AKARI All-Sky Catalogues
The AKARI All-Sky Catalogues are an important infrared astronomical database
for next-generation astronomy and the successor to the IRAS catalog. We have
developed an online service, AKARI Catalogue Archive Server (AKARI-CAS), for
astronomers. The service includes useful and attractive search tools and visual
tools.
One of the new features of AKARI-CAS is cached SIMBAD/NED entries, which can
match AKARI catalogs with other catalogs stored in SIMBAD or NED. To allow
advanced queries to the databases, direct input of SQL is also supported. In
those queries, fast dynamic cross-identification between registered catalogs is
a remarkable feature. In addition, multiwavelength quick-look images are
displayed in the visualization tools, which will increase the value of the
service.
In the construction of our service, we considered a wide variety of
astronomers' requirements. As a result of our discussion, we concluded that
supporting users' SQL submissions is the best solution for the requirements.
Therefore, we implemented an RDBMS layer that covers the important
facilities, including all processing of tables. We found that PostgreSQL
is the most suitable open-source RDBMS product for this purpose, and we wrote
the code for both simple and advanced searches as SQL stored functions. To
implement such stored functions for fast radial search and cross-identification
at minimum cost, we applied a simple technique that does not rely on dividing
the celestial sphere, unlike HTM or HEALPix. In contrast, the Web application
layer became compact and was written in simple procedural PHP code. Overall,
our system achieves cost-effective maintenance and enhancement.
Comment: Yamauchi, C. et al. 2011, PASP..123..852
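The abstract does not spell out the "simple technique" used for radial search, so the following is only a generic sketch of a cone search that avoids sphere pixelization, not the authors' method. It assumes a cheap declination-band prefilter (the kind of range condition an ordinary index could serve) followed by an exact great-circle test using the haversine formula. The function names and the four catalog rows are hypothetical.

```python
import math


def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees, via the haversine formula."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def radial_search(catalog, ra, dec, radius_deg):
    """Return names of catalog objects within radius_deg of (ra, dec).

    catalog is an iterable of (name, ra_deg, dec_deg) tuples.
    """
    hits = []
    for name, obj_ra, obj_dec in catalog:
        # Cheap prefilter: the separation is never smaller than |delta dec|.
        if abs(obj_dec - dec) > radius_deg:
            continue
        if angular_separation_deg(ra, dec, obj_ra, obj_dec) <= radius_deg:
            hits.append(name)
    return hits


# Made-up positions in degrees, for illustration only.
catalog = [
    ("A", 10.00, 20.00),
    ("B", 10.05, 20.02),
    ("C", 10.00, 25.00),
    ("D", 359.98, 20.00),
]

near_a = radial_search(catalog, 10.0, 20.0, 0.1)
near_origin = radial_search(catalog, 0.0, 20.0, 0.1)
```

The second query shows that the haversine test handles the wrap-around at right ascension 0/360 degrees without special casing. In a database setting the same two-stage logic would live inside a stored function rather than in application code.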
Vaex: Big Data exploration in the era of Gaia
We present a new Python library called vaex, to handle extremely large
tabular datasets, such as astronomical catalogues like the Gaia catalogue,
N-body simulations or any other regular datasets which can be structured in
rows and columns. Fast computation of statistics on regular N-dimensional
grids allows analysis and visualization at rates on the order of a billion rows
per second. We use streaming algorithms, memory-mapped files and a zero memory
copy policy to allow exploration of datasets larger than memory, i.e.
out-of-core algorithms. Vaex allows arbitrary (mathematical) transformations using normal
Python expressions and (a subset of) numpy functions which are lazily evaluated
and computed when needed in small chunks, which avoids wasting of RAM. Boolean
expressions (which are also lazily evaluated) can be used to explore subsets of
the data, which we call selections. Vaex uses a DataFrame API similar to that
of Pandas, a very popular library, which eases migration from Pandas.
Visualization is one of the key points of vaex, and is done using binned
statistics in 1d (e.g. histogram), in 2d (e.g. 2d histograms with colormapping)
and 3d (using volume rendering). Vaex is split into several packages:
vaex-core for the computational part, vaex-viz for visualization mostly based
on matplotlib, vaex-jupyter for visualization in the Jupyter notebook/lab based
on IPyWidgets, vaex-server for the (optional) client-server communication,
vaex-ui for the Qt based interface, vaex-hdf5 for hdf5 based memory mapped
storage, vaex-astro for astronomy related selections, transformations and
memory mapped (column based) FITS storage. Vaex is open source and available
under the MIT license on GitHub; documentation and other information can be
found on the main website: https://vaex.io, https://docs.vaex.io or
https://github.com/maartenbreddels/vaex
Comment: 14 pages, 8 figures, Submitted to A&A, interactive version of Fig 4:
https://vaex.io/paper/fig
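The combination of lazily evaluated expressions, selections and chunked binned statistics described in the abstract can be sketched in plain Python. This is not the vaex API: `binned_count`, `iter_chunks` and the synthetic column `x` are hypothetical stand-ins, with a function of the row index playing the role of a memory-mapped column so that no full column is ever held in memory.

```python
def iter_chunks(n_rows, chunk_size):
    """Yield ranges of row indices, one chunk at a time."""
    for start in range(0, n_rows, chunk_size):
        yield range(start, min(start + chunk_size, n_rows))


def binned_count(expression, n_rows, lo, hi, n_bins,
                 chunk_size=1000, selection=None):
    """Histogram of expression(i) over [lo, hi), computed chunk by chunk.

    expression and selection are plain callables that are evaluated only
    when a row is visited (a stand-in for lazy evaluation).
    """
    counts = [0] * n_bins
    for chunk in iter_chunks(n_rows, chunk_size):
        for i in chunk:
            value = expression(i)
            if selection is not None and not selection(value):
                continue  # row is outside the current selection
            if lo <= value < hi:
                counts[int((value - lo) * n_bins / (hi - lo))] += 1
    return counts


N_ROWS = 10_000


def x(i):
    """Synthetic column: values 0..9 repeating (illustrative data only)."""
    return i % 10


# Histogram of the raw column.
plain = binned_count(x, N_ROWS, 0, 10, 5)

# "Virtual column" x**2, defined as an expression and never materialized.
squared = binned_count(lambda i: x(i) ** 2, N_ROWS, 0, 100, 4)

# The same histogram restricted to a boolean selection x >= 4.
selected = binned_count(x, N_ROWS, 0, 10, 5, selection=lambda v: v >= 4)
```

A real implementation would vectorize the inner loop and read chunks from memory-mapped files. The sketch only shows why streaming over chunks keeps memory use independent of the number of rows.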
VisIVO - Integrated Tools and Services for Large-Scale Astrophysical Visualization
VisIVO is an integrated suite of tools and services specifically designed for
the Virtual Observatory. This suite constitutes a software framework for
effective visual discovery in currently available (and next-generation) very
large-scale astrophysical datasets. VisIVO consists of VisIVO Desktop, a
stand-alone application for interactive visualization on standard PCs; VisIVO
Server, a grid-enabled platform for high-performance visualization; and VisIVO
Web, a custom-designed web portal supporting services based on the VisIVO Server
functionality. The main characteristic of VisIVO is support for
high-performance, multidimensional visualization of very large-scale
astrophysical datasets. Users can obtain meaningful visualizations rapidly
while preserving full and intuitive control of the relevant visualization
parameters. This paper focuses on newly developed integrated tools in VisIVO
Server allowing intuitive visual discovery with 3D views being created from
data tables. VisIVO Server can be installed easily on any web server with a
database repository. We briefly discuss aspects of our implementation of VisIVO
Server on a computational grid and also outline the functionality of the
services offered by VisIVO Web. Finally, we conclude with a summary of our work
and pointers to future developments.