11,518 research outputs found

    Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration.

    Get PDF
    Data visualization is an essential component of genomic data analysis. However, the size and diversity of the data sets produced by today's sequencing and array-based profiling methods present major challenges to visualization tools. The Integrative Genomics Viewer (IGV) is a high-performance viewer that efficiently handles large heterogeneous data sets, while providing a smooth and intuitive user experience at all levels of genome resolution. A key characteristic of IGV is its focus on the integrative nature of genomic studies, with support for both array-based and next-generation sequencing data, and the integration of clinical and phenotypic data. Although IGV is often used to view genomic data from public sources, its primary emphasis is to support researchers who wish to visualize and explore their own data sets or those from colleagues. To that end, IGV supports flexible loading of local and remote data sets, and is optimized to provide high-performance data visualization and exploration on standard desktop systems. IGV is freely available for download from http://www.broadinstitute.org/igv, under a GNU LGPL open-source license

    Senses, brain and spaces workshop

    Get PDF

    ROOT - A C++ Framework for Petabyte Data Storage, Statistical Analysis and Visualization

    Full text link
    ROOT is an object-oriented C++ framework conceived in the high-energy physics (HEP) community, designed for storing and analyzing petabytes of data in an efficient way. Any instance of a C++ class can be stored into a ROOT file in a machine-independent compressed binary format. In ROOT the TTree object container is optimized for statistical data analysis over very large data sets by using vertical data storage techniques. These containers can span a large number of files on local disks, the web, or a number of different shared file systems. In order to analyze this data, the user can chose out of a wide set of mathematical and statistical functions, including linear algebra classes, numerical algorithms such as integration and minimization, and various methods for performing regression analysis (fitting). In particular, ROOT offers packages for complex data modeling and fitting, as well as multivariate classification based on machine learning techniques. A central piece in these analysis tools are the histogram classes which provide binning of one- and multi-dimensional data. Results can be saved in high-quality graphical formats like Postscript and PDF or in bitmap formats like JPG or GIF. The result can also be stored into ROOT macros that allow a full recreation and rework of the graphics. Users typically create their analysis macros step by step, making use of the interactive C++ interpreter CINT, while running over small data samples. Once the development is finished, they can run these macros at full compiled speed over large data sets, using on-the-fly compilation, or by creating a stand-alone batch program. Finally, if processing farms are available, the user can reduce the execution time of intrinsically parallel tasks - e.g. data mining in HEP - by using PROOF, which will take care of optimally distributing the work over the available resources in a transparent way

    Tangos: the agile numerical galaxy organization system

    Get PDF
    We present Tangos, a Python framework and web interface for database-driven analysis of numerical structure formation simulations. To understand the role that such a tool can play, consider constructing a history for the absolute magnitude of each galaxy within a simulation. The magnitudes must first be calculated for all halos at all timesteps and then linked using a merger tree; folding the required information into a final analysis can entail significant effort. Tangos is a generic solution to this information organization problem, aiming to free users from the details of data management. At the querying stage, our example of gathering properties over history is reduced to a few clicks or a simple, single-line Python command. The framework is highly extensible; in particular, users are expected to define their own properties which tangos will write into the database. A variety of parallelization options are available and the raw simulation data can be read using existing libraries such as pynbody or yt. Finally, tangos-based databases and analysis pipelines can easily be shared with collaborators or the broader community to ensure reproducibility. User documentation is provided separately.Comment: Clarified various points and further improved code performance; accepted for publication in ApJS. Tutorials (including video) at http://tiny.cc/tango

    IAC user manual

    Get PDF
    The User Manual for the Integrated Analysis Capability (IAC) Level 1 system is presented. The IAC system currently supports the thermal, structures, controls and system dynamics technologies, and its development is influenced by the requirements for design/analysis of large space systems. The system has many features which make it applicable to general problems in engineering, and to management of data and software. Information includes basic IAC operation, executive commands, modules, solution paths, data organization and storage, IAC utilities, and module implementation

    Event views and graph reductions for understanding system level C code

    Get PDF
    Concurrent processing, runtime bindings and an extensive use of aggregate data structures make system level C codes difficult to understand. We propose event views and graph reductions as techniques to facilitate program comprehension. Starting with some domain knowledge, a user can apply these techniques to quickly identify and analyze exactly those parts of the program that are relevant to a given concern. We have built a tool called CVision to demonstrate applicability of the proposed techniques. CVi-sion is an interactive tool that allows the user to: (a) quickly get to the relevant parts of the code, (b) graphically visualize relationships between program elements, (c) interactively apply different graph reductions to eliminate irrelevant relationships. Using these capabilities, the user can quickly distill a large body of code and extract meaningful views of runtime events that capture the user\u27s concern. The proposed program comprehension techniques are demonstrated through two case studies based on Linux and XINU operating systems

    Software Infrastructure for Natural Language Processing

    Full text link
    We classify and review current approaches to software infrastructure for research, development and delivery of NLP systems. The task is motivated by a discussion of current trends in the field of NLP and Language Engineering. We describe a system called GATE (a General Architecture for Text Engineering) that provides a software infrastructure on top of which heterogeneous NLP processing modules may be evaluated and refined individually, or may be combined into larger application systems. GATE aims to support both researchers and developers working on component technologies (e.g. parsing, tagging, morphological analysis) and those working on developing end-user applications (e.g. information extraction, text summarisation, document generation, machine translation, and second language learning). GATE promotes reuse of component technology, permits specialisation and collaboration in large-scale projects, and allows for the comparison and evaluation of alternative technologies. The first release of GATE is now available - see http://www.dcs.shef.ac.uk/research/groups/nlp/gate/Comment: LaTeX, uses aclap.sty, 8 page

    Design and Implementation of a Tracer Driver: Easy and Efficient Dynamic Analyses of Constraint Logic Programs

    Full text link
    Tracers provide users with useful information about program executions. In this article, we propose a ``tracer driver''. From a single tracer, it provides a powerful front-end enabling multiple dynamic analysis tools to be easily implemented, while limiting the overhead of the trace generation. The relevant execution events are specified by flexible event patterns and a large variety of trace data can be given either systematically or ``on demand''. The proposed tracer driver has been designed in the context of constraint logic programming; experiments have been made within GNU-Prolog. Execution views provided by existing tools have been easily emulated with a negligible overhead. Experimental measures show that the flexibility and power of the described architecture lead to good performance. The tracer driver overhead is inversely proportional to the average time between two traced events. Whereas the principles of the tracer driver are independent of the traced programming language, it is best suited for high-level languages, such as constraint logic programming, where each traced execution event encompasses numerous low-level execution steps. Furthermore, constraint logic programming is especially hard to debug. The current environments do not provide all the useful dynamic analysis tools. They can significantly benefit from our tracer driver which enables dynamic analyses to be integrated at a very low cost.Comment: To appear in Theory and Practice of Logic Programming (TPLP), Cambridge University Press. 30 pages
    • …
    corecore