Simplifying the design of workflows for large-scale data exploration and visualization
Presentation. Workflows and Computational Processes. Workflows are emerging as a paradigm for representing and managing complex computations: simulations, data analysis, visualization, and data integration.
Towards enabling social analysis of scientific data
Journal Article. Web sites that support collaboration and sharing between users (e.g., Flickr, Facebook, Yahoo! Pipes) are becoming increasingly popular. An important benefit of these sites is that they enable users to leverage the wisdom of the crowds. For example, in Flickr, users tag large volumes of pictures in a mass-collaboration approach; these tags, in turn, help them more easily find the pictures they are looking for. In the (very) recent past, a new class of Web site has emerged that enables users to upload and collectively analyze many types of data (e.g., Many Eyes and Swivel). These are part of a broad phenomenon that has been called "social data analysis". This trend is expanding to the scientific domain, where a number of collaboratories are under development. While the cost of hardware decreases over time, the cost of people goes up as analyses get more involved, larger groups need to collaborate, and the volume of data manipulated increases. Science collaboratories aim to bridge this gap by allowing scientists to share, re-use, and refine their computational tasks (workflows). In this position paper, we discuss the challenges and key components needed to enable the development of effective social data analysis (SDA) sites for the scientific domain.
A Collaborative Approach to Computational Reproducibility
Although a standard in the natural sciences, reproducibility has only episodically been applied in experimental computer science. Scientific papers often present a large number of tables, plots, and pictures that summarize the obtained results, but then only loosely describe the steps taken to derive them. Not only can the methods and the implementation be complex, but their configuration may also require setting many parameters and/or depend on particular system configurations. While many researchers recognize the importance of reproducibility, the challenge of making it happen often outweighs the benefits. Fortunately, a plethora of reproducibility solutions have recently been designed and implemented by the community. In particular, packaging tools (e.g., ReproZip) and virtualization tools (e.g., Docker) are promising solutions towards facilitating reproducibility for both authors and reviewers. To address the incentive problem, we have implemented a new publication model for the Reproducibility Section of the Information Systems Journal. In this section, authors submit a reproducibility paper that explains in detail the computational assets from a previously published manuscript in Information Systems.
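To make the packaging route concrete, here is a minimal sketch, assuming ReproZip and Docker are installed, of driving ReproZip's trace/pack/reproduce workflow from Python; the experiment script name and package file name are hypothetical, not taken from the paper.

```python
# Minimal sketch of a ReproZip-based packaging workflow (illustrative only).
# "./run_experiment.sh" and "experiment.rpz" are hypothetical placeholder names.
import subprocess

def pack_experiment(command=("./run_experiment.sh",), package="experiment.rpz"):
    """Trace an experiment's execution and bundle its dependencies into a package."""
    # 1. Run the experiment under tracing; ReproZip records files and environment used.
    subprocess.run(["reprozip", "trace", *command], check=True)
    # 2. Pack the traced run (inputs, binaries, libraries) into a single .rpz file.
    subprocess.run(["reprozip", "pack", package], check=True)
    return package

def reproduce_with_docker(package="experiment.rpz", workdir="repro_run"):
    """Unpack the package and re-run the experiment inside a Docker container."""
    subprocess.run(["reprounzip", "docker", "setup", package, workdir], check=True)
    subprocess.run(["reprounzip", "docker", "run", workdir], check=True)

if __name__ == "__main__":
    pkg = pack_experiment()
    reproduce_with_docker(pkg)
```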
Designing information-preserving mapping schemes for XML
Journal Article. An XML-to-relational mapping scheme consists of a procedure for shredding XML documents into relational databases, a procedure for publishing databases back as documents, and a set of constraints the databases must satisfy. In previous work, we discussed two notions of information preservation for mapping schemes: losslessness, which guarantees the complete reconstruction of a document from a database; and validation, which guarantees that every update to a database corresponding to a valid document results in a database corresponding to another valid document. We also described one information-preserving mapping scheme, called Edge++, and showed that, under reasonable assumptions, losslessness and validation are both undecidable. This leads to the question we study in this paper: how to design information-preserving mapping schemes. We propose to do so by starting with a scheme known to be information preserving (such as Edge++) and applying to it equivalence-preserving transformations written in weakly recursive ILOG. We study a particular incarnation of this framework, the LILO algorithm, and show that it provides significant performance improvements over Edge++ and that the constraints it introduces are efficiently enforced in practice.
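As a rough illustration of what shredding and publishing mean, the following is a minimal edge-table-style sketch in Python. It is not the Edge++ scheme from the paper; it ignores attributes and mixed content, and all names and simplifications here are assumptions made for the example.

```python
# Minimal edge-table-style shredding/publishing sketch (illustrative only:
# NOT the Edge++ scheme; attributes and mixed content are ignored).
import xml.etree.ElementTree as ET

def shred(xml_text):
    """Shred a document into (node_id, parent_id, ordinal, tag, text) rows."""
    rows, next_id = [], [0]

    def visit(elem, parent_id, ordinal):
        next_id[0] += 1
        node_id = next_id[0]
        rows.append((node_id, parent_id, ordinal, elem.tag, (elem.text or "").strip()))
        for i, child in enumerate(elem):
            visit(child, node_id, i)

    visit(ET.fromstring(xml_text), None, 0)
    return rows

def publish(rows):
    """Rebuild the document from its rows (the inverse 'publishing' step)."""
    nodes, root = {}, None
    for node_id, parent_id, ordinal, tag, text in sorted(rows):  # ids are pre-order
        elem = ET.Element(tag)
        elem.text = text or None
        nodes[node_id] = elem
        if parent_id is None:
            root = elem
        else:
            nodes[parent_id].append(elem)
    return ET.tostring(root, encoding="unicode")

rows = shred("<a><b>x</b><c>y</c></a>")
print(publish(rows))  # round-trips this simple document
```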
IMAX: incremental maintenance of schema-based XML statistics
Journal Article. Current approaches for estimating the cardinality of XML queries are applicable to a static scenario wherein the underlying XML data does not change subsequent to the collection of statistics on the repository. However, in practice, many XML-based applications are dynamic and involve frequent updates to the data. In this paper, we investigate efficient strategies for incrementally maintaining statistical summaries as and when updates are applied to the data. Specifically, we propose algorithms that handle both the addition of new documents and random insertions in existing document trees. We also show, through a detailed performance evaluation, that our incremental techniques are significantly faster than the naive recomputation approach, and that estimation accuracy can be maintained even with a fixed memory budget.
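A simplified sketch of the general idea, assuming a plain path-count summary rather than the actual IMAX data structures: fold each newly added document into the existing counts instead of recomputing statistics from scratch.

```python
# Simplified sketch of incrementally maintained XML path statistics
# (illustrative only; the real IMAX summaries and update algorithms
# are schema-based and considerably more sophisticated).
import xml.etree.ElementTree as ET
from collections import Counter

class PathStatistics:
    """Counts of root-to-node tag paths, usable for rough cardinality estimates."""

    def __init__(self):
        self.counts = Counter()

    def add_document(self, xml_text):
        """Incrementally fold one new document into the summary."""
        def visit(elem, prefix):
            path = prefix + "/" + elem.tag
            self.counts[path] += 1
            for child in elem:
                visit(child, path)
        visit(ET.fromstring(xml_text), "")

    def estimate(self, path):
        """Estimated number of nodes matching a simple root-to-node path."""
        return self.counts[path]

stats = PathStatistics()
stats.add_document("<lib><book><title>X</title></book><book/></lib>")
stats.add_document("<lib><book/></lib>")   # incremental update, no recomputation
print(stats.estimate("/lib/book"))         # -> 3
```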
- …