Neuroimaging study designs, computational analyses and data provenance using the LONI pipeline.
Modern computational neuroscience employs diverse software tools and multidisciplinary expertise to analyze heterogeneous brain data. The classical problems of gathering meaningful data, fitting specific models, and discovering appropriate analysis and visualization tools give way to a new class of computational challenges: management of large and incongruous data, integration and interoperability of computational resources, and data provenance. We designed, implemented and validated a new paradigm for addressing these challenges in the neuroimaging field. Our solution is based on the LONI Pipeline environment [3], [4], a graphical workflow environment for constructing and executing complex data processing protocols. We developed study-design, database and visual language programming functionalities within the LONI Pipeline that enable the construction of complete, elaborate and robust graphical workflows for analyzing neuroimaging and other data. These workflows facilitate open sharing and communication of data and metadata, concrete processing protocols, result validation, and study replication among different investigators and research groups. The LONI Pipeline features include distributed grid-enabled infrastructure, virtualized execution environment, efficient integration, data provenance, validation and distribution of new computational tools, automated data format conversion, and an intuitive graphical user interface. We demonstrate the new LONI Pipeline features using large scale neuroimaging studies based on data from the International Consortium for Brain Mapping [5] and the Alzheimer's Disease Neuroimaging Initiative [6]. User guides, forums, instructions and downloads of the LONI Pipeline environment are available at http://pipeline.loni.ucla.edu.
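To make the workflow idea concrete, here is a minimal Python sketch of a provenance-tracked processing graph in the spirit of the LONI Pipeline. The Module/Workflow classes, module names, and commands are hypothetical illustrations for this sketch, not the actual LONI Pipeline API.

```python
# Minimal sketch of a provenance-tracked workflow graph in the spirit of the
# LONI Pipeline; the Module/Workflow classes, module names, and commands are
# hypothetical, not the real LONI Pipeline interface.
import datetime


class Module:
    def __init__(self, name, command, inputs=(), outputs=()):
        self.name = name          # human-readable module label
        self.command = command    # executable this node would invoke
        self.inputs = list(inputs)
        self.outputs = list(outputs)


class Workflow:
    def __init__(self):
        self.modules = []
        self.edges = []           # (producer, consumer) pairs
        self.provenance = []      # ordered log of execution metadata

    def add(self, module):
        self.modules.append(module)
        return module

    def connect(self, producer, consumer):
        self.edges.append((producer.name, consumer.name))

    def run(self):
        # Insertion order doubles as topological order in this sketch; a real
        # engine would schedule independent nodes on a grid in parallel.
        for m in self.modules:
            self.provenance.append({
                "module": m.name,
                "command": m.command,
                "inputs": m.inputs,
                "outputs": m.outputs,
                "timestamp": datetime.datetime.now().isoformat(),
            })


wf = Workflow()
strip = wf.add(Module("SkullStrip", "bet", ["subj01_T1.nii"], ["subj01_brain.nii"]))
reg = wf.add(Module("Register", "flirt", ["subj01_brain.nii"], ["subj01_mni.nii"]))
wf.connect(strip, reg)
wf.run()
print(wf.provenance)  # the provenance log that would accompany shared results
```

The point of the sketch is that the provenance log is produced as a side effect of execution, which is what makes result validation and study replication by other groups practical.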
A Taxonomy of Data Grids for Distributed Data Sharing, Management and Processing
Data Grids have been adopted as the platform for scientific communities that need to share, access, transport, process and manage large data collections distributed worldwide. They combine high-end computing technologies with high-performance networking and wide-area storage management techniques. In this paper, we discuss the key concepts behind Data Grids and compare them with other data sharing and distribution paradigms such as content delivery networks, peer-to-peer networks and distributed databases. We then provide comprehensive taxonomies that cover various aspects of architecture, data transportation, data replication and resource allocation and scheduling. Finally, we map the proposed taxonomy to various Data Grid systems not only to validate the taxonomy but also to identify areas for future exploration. Through this taxonomy, we aim to categorise existing systems to better understand their goals and their methodology, which helps evaluate their applicability for solving similar problems. The taxonomy also provides a "gap analysis" of this area through which researchers can identify new issues for investigation. We also hope that the proposed taxonomy and mapping give new practitioners an accessible way to understand this complex area of research.
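As a rough illustration of how such a taxonomy-to-system mapping can work in practice, the sketch below encodes a few taxonomy dimensions as Python data and validates a system description against them; the dimension names, category values, and the example system are simplified assumptions, not the paper's full classification.

```python
# Illustrative sketch of mapping a system onto a Data Grid taxonomy; the
# dimensions and categories below are simplified examples, not the paper's
# complete taxonomy.
TAXONOMY_DIMENSIONS = {
    "organization": {"hierarchical", "federated", "monadic", "hybrid"},
    "data_transport": {"gridftp", "http", "custom"},
    "replication": {"synchronous", "asynchronous", "none"},
    "scheduling": {"data-aware", "compute-only"},
}


def classify(system_name, **attributes):
    """Validate a system's attributes against the taxonomy and return the mapping."""
    for dimension, value in attributes.items():
        allowed = TAXONOMY_DIMENSIONS.get(dimension)
        if allowed is None:
            raise KeyError(f"unknown taxonomy dimension: {dimension}")
        if value not in allowed:
            raise ValueError(f"{value!r} is not a recognised {dimension} category")
    return {"system": system_name, **attributes}


# A hypothetical system classified along every dimension; gaps in coverage
# (categories no surveyed system occupies) are what the "gap analysis" exposes.
print(classify("ExampleDataGrid",
               organization="federated",
               data_transport="gridftp",
               replication="asynchronous",
               scheduling="data-aware"))
```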
Storage Solutions for Big Data Systems: A Qualitative Study and Comparison
Big data systems development is full of challenges in view of the variety of application areas and domains that this technology promises to serve. Typically, fundamental design decisions in big data systems design include choosing appropriate storage and computing infrastructures. In this age of heterogeneous systems that integrate different technologies into an optimized solution for a specific real-world problem, big data systems are no exception. As far as the storage aspect of any big data system is concerned, the primary facet is the storage infrastructure, and NoSQL appears to be the technology that fulfills its requirements. However, every big data application has variable data characteristics, and thus the corresponding data fits a different data model. This paper presents a feature and use-case analysis and comparison of the four main data models, namely document oriented, key-value, graph and wide column. Moreover, a feature analysis of 80 NoSQL solutions is provided, elaborating on the criteria and points that a developer must consider while making a choice. Typically, big data storage needs to communicate with the execution engine and other processing and visualization technologies to create a comprehensive solution. This brings the second facet of big data storage, big data file formats, into the picture. The second half of the paper compares the advantages, shortcomings and possible use cases of the available big data file formats for Hadoop, which is the foundation for most big data computing technologies. Decentralized storage and blockchain are seen as the next generation of big data storage, and their challenges and future prospects are also discussed.
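As a quick illustration of how the four data models shape the same record differently, here is a small Python sketch; the stores, keys, and field names are invented for the example.

```python
# The same logical record under the four data models compared in the paper;
# all identifiers and fields here are illustrative only.

# Document oriented: the record is a self-describing nested document.
document = {"_id": "user:42", "name": "Ada", "orders": [{"sku": "X1", "qty": 2}]}

# Key-value: the store only sees an opaque value behind a key.
key_value = {"user:42": b'{"name": "Ada"}'}

# Wide column: rows hold sparse, dynamically named columns per column family.
wide_column = {("users", "42"): {"profile:name": "Ada", "orders:X1": 2}}

# Graph: entities are nodes, relationships are typed edges with properties.
graph_nodes = {"user:42": {"name": "Ada"}, "product:X1": {"title": "Widget"}}
graph_edges = [("user:42", "ORDERED", "product:X1", {"qty": 2})]
```

Which shape fits best depends on the access pattern: nested reads favour documents, opaque blobs favour key-value, sparse column scans favour wide column, and relationship traversals favour graphs.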
A replicated file system for Grid computing
To meet the rigorous demands of large-scale data sharing in global collaborations, we present a replication scheme for NFSv4 that supports mutable replication without sacrificing strong consistency guarantees. Experimental evaluation indicates a substantial performance advantage over a single-server system. With the introduction of a hierarchical replication control protocol, the overhead of replication is negligible even when applications mostly write and replication servers are widely distributed. Evaluation with the NAS Grid Benchmarks demonstrates that our system provides comparable and often better performance than GridFTP, the de facto standard for Grid data sharing.
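A minimal sketch of the general pattern follows: reads can be served by any nearby replica, while writes are serialised through a coordinator that propagates them to every copy, preserving strong consistency. The class and method names are hypothetical; this illustrates the pattern, not the paper's NFSv4 hierarchical replication control protocol.

```python
# Minimal sketch of coordinator-serialised, strongly consistent replication;
# all names are hypothetical, not the paper's NFSv4 protocol implementation.


class ReplicaServer:
    def __init__(self, name):
        self.name = name
        self.files = {}

    def read(self, path):
        return self.files.get(path)

    def apply(self, path, data):
        self.files[path] = data


class PrimaryCoordinator(ReplicaServer):
    """Serialises updates and fans them out so every copy stays consistent."""

    def __init__(self, name, replicas):
        super().__init__(name)
        self.replicas = replicas

    def write(self, path, data):
        self.apply(path, data)          # commit locally first
        for replica in self.replicas:   # then propagate synchronously
            replica.apply(path, data)   # all copies agree before write returns


nearby = ReplicaServer("replica-eu")
far = ReplicaServer("replica-us")
primary = PrimaryCoordinator("primary", [nearby, far])
primary.write("/grid/data.dat", b"results-v1")
assert nearby.read("/grid/data.dat") == b"results-v1"  # read from the closest copy
```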
Human-Centered Approaches in Geovisualization Design: Investigating Multiple Methods Through a Long-Term Case Study
Working with three domain specialists, we investigate human-centered approaches to geovisualization following an ISO13407 taxonomy covering context of use, requirements and early stages of design. Our case study, undertaken over three years, draws attention to repeating trends: generic approaches fail to elicit adequate requirements for geovis application design; the use of real data is key to understanding needs and possibilities; and trust and knowledge must be built and developed with collaborators. These processes take time, but modified human-centered approaches can be effective. A scenario developed through contextual inquiry but supplemented with domain data and graphics is useful to geovis designers. Wireframe, paper and digital prototypes enable successful communication between specialist and geovis domains when they incorporate real and interesting data, prompting exploratory behaviour and eliciting previously unconsidered requirements. Paper prototypes are particularly successful at eliciting suggestions, especially for novel visualization. Enabling specialists to explore their data freely with a digital prototype is as effective as using a structured task protocol and is easier to administer. Autoethnography has potential for framing the design process. We conclude that a common understanding of context of use, domain data and visualization possibilities is essential to successful geovis design and develops as the design progresses. Human-centered approaches can make a significant contribution here; however, modified approaches, applied with flexibility, are most promising. We advise early, collaborative engagement with data (through simple, transient visual artefacts supported by data sketches and existing designs) before moving to successively more sophisticated data wireframes and data prototypes.
Reliable Replication at Low Cost
The emerging global scientific collaborations demand a scalable, efficient, reliable, and still convenient data access and management scheme. To fulfill these requirements, this paper describes a replicated file system that supports mutable (i.e., read/write) replication with strong consistency guarantees, small performance penalty, high failure resilience, and good scaling properties. The paper further evaluates the system using a real scientific application. The evaluation results show that the presented replication system can significantly improve the application's performance by reducing the first-time access latency to read the input data and by distributing the verification of data access to a nearby server. Furthermore, the penalty of file replication is negligible as long as applications use synchronous writes at a moderate rate.
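To see why a moderate synchronous write rate keeps the penalty small, consider a back-of-the-envelope calculation like the Python sketch below; all latency and rate figures are invented for illustration, not measurements from the paper.

```python
# Back-of-the-envelope sketch of the claim above; every number is an invented
# illustration, not a measurement from the paper.
local_write_ms = 0.5        # hypothetical local, unreplicated write latency
replicated_write_ms = 20.0  # hypothetical synchronous write reaching remote replicas
read_ms = 0.2               # hypothetical read served by a nearby replica
reads_per_sec = 995
writes_per_sec = 5          # a "moderate" synchronous write rate

total_ops = reads_per_sec + writes_per_sec
with_repl = (reads_per_sec * read_ms + writes_per_sec * replicated_write_ms) / total_ops
without_repl = (reads_per_sec * read_ms + writes_per_sec * local_write_ms) / total_ops
print(f"mean latency with replication:    {with_repl:.3f} ms/op")
print(f"mean latency without replication: {without_repl:.3f} ms/op")
# With writes at 0.5% of operations the added cost is roughly 0.1 ms per
# operation; as the synchronous write rate grows, the penalty grows with it.
```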
Advanced Wide-Area Monitoring System Design, Implementation, and Application
Wide-area monitoring systems (WAMSs) provide an unprecedented way to collect, store and analyze ultra-high-resolution synchrophasor measurements to improve dynamic observability in power grids. This dissertation focuses on designing and implementing a wide-area monitoring system and a series of applications to assist grid operators with various functionalities. The contributions of this dissertation are as follows:
First, a synchrophasor data collection system is developed to collect, store, and forward GPS-synchronized, high-resolution, rich-type, and massive-volume synchrophasor data. A distributed data storage system is developed to store the synchrophasor data, and a memory-based cache system is discussed to improve the efficiency of real-time situation awareness. In addition, a synchronization system is developed to synchronize the configurations among the cloud nodes. The reliability and fault tolerance of the developed system are also discussed.
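A common way to realise such a memory-based cache in front of a distributed store is the cache-aside pattern, sketched below in Python; the store contents and key format are hypothetical stand-ins for the systems described above.

```python
# Cache-aside sketch for serving real-time synchrophasor queries from memory,
# falling back to the (here simulated) distributed store; keys and record
# fields are hypothetical examples.
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)     # mark as most recently used
            return self.entries[key]
        return None

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def read_measurement(cache, store, key):
    """Serve real-time queries from memory; fall back to the distributed store."""
    value = cache.get(key)
    if value is None:
        value = store[key]     # slow path: distributed storage lookup
        cache.put(key, value)  # populate cache for subsequent queries
    return value


store = {"pmu42@2024-01-01T00:00:00": {"freq": 60.001, "angle": 12.7}}
cache = LRUCache()
print(read_measurement(cache, store, "pmu42@2024-01-01T00:00:00"))
```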
Second, a novel lossy synchrophasor data compression approach is proposed. This section first introduces the synchrophasor data compression problem, then proposes a methodology for lossy data compression, and finally presents the evaluation results. The feasibility of the proposed approach is discussed.
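Since the abstract does not spell out the compression algorithm, the following Python sketch shows a generic lossy baseline (quantisation plus delta encoding) with a bounded reconstruction error, just to make the problem concrete; it is not the dissertation's proposed method.

```python
# Generic lossy compression baseline for synchrophasor streams: quantise to a
# fixed tolerance, then store the first value plus small integer deltas.
# This is an illustrative baseline, not the dissertation's approach.
def compress(samples, tolerance=0.001):
    quantised = [round(s / tolerance) for s in samples]
    return [quantised[0]] + [b - a for a, b in zip(quantised, quantised[1:])]


def decompress(deltas, tolerance=0.001):
    values, acc = [], 0
    for d in deltas:
        acc += d
        values.append(acc * tolerance)
    return values


freq = [60.0001, 60.0002, 60.0002, 59.9999, 60.0000]
packed = compress(freq)
restored = decompress(packed)
# Lossy, but with error bounded by the chosen tolerance:
assert all(abs(a - b) <= 0.001 for a, b in zip(freq, restored))
```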
Third, a novel intelligent system, SynchroService, is developed to provide critical functionalities for a synchrophasor system. Functionalities including data query, event query, device management, and system authentication are discussed. Finally, the resiliency and the security of the developed system are evaluated.
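The abstract does not specify SynchroService's interface, so the Python sketch below only illustrates the general shape of a service layer with authenticated data and event queries; every class, method, and field name is a hypothetical placeholder.

```python
# Hypothetical service-layer sketch suggested by the functionality list above;
# not SynchroService's real API.
class SynchroServiceSketch:
    def __init__(self, measurements, events, devices, tokens):
        self.measurements = measurements  # time-indexed synchrophasor records
        self.events = events              # detected disturbance records
        self.devices = devices            # registered PMU metadata
        self.tokens = tokens              # valid auth tokens

    def _authenticate(self, token):
        if token not in self.tokens:
            raise PermissionError("invalid token")

    def query_data(self, token, device_id, start, end):
        self._authenticate(token)
        return [m for m in self.measurements
                if m["device"] == device_id and start <= m["t"] <= end]

    def query_events(self, token, start, end):
        self._authenticate(token)
        return [e for e in self.events if start <= e["t"] <= end]


svc = SynchroServiceSketch(
    measurements=[{"device": "pmu42", "t": 10, "freq": 59.98}],
    events=[{"t": 10, "type": "frequency-dip"}],
    devices={"pmu42": {"location": "substation-A"}},
    tokens={"secret-token"},
)
print(svc.query_events("secret-token", 0, 100))
```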
Fourth, a series of synchrophasor-based applications are developed that use the high-resolution synchrophasor data to help power system engineers monitor the performance of the grid and investigate the root causes of large power system disturbances.
Lastly, a deep learning-based event detection and verification system is developed to provide accurate event detection functionality. This section introduces the data preprocessing, model design, and performance evaluation. Finally, the implementation of the developed system is discussed.
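As a minimal illustration of the model-design step, the PyTorch sketch below classifies fixed-length windows of preprocessed synchrophasor data as event versus no-event; the layer sizes, window length, and two-class setup are assumptions for the example, not the dissertation's actual architecture.

```python
# Minimal 1-D CNN sketch for classifying PMU windows as event / no-event;
# all architecture choices here are illustrative assumptions.
import torch
import torch.nn as nn


class EventDetector(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=5, padding=2),  # local temporal features
            nn.ReLU(),
            nn.AdaptiveAvgPool1d(1),                     # summarise the window
            nn.Flatten(),
            nn.Linear(16, 2),                            # event vs. no-event logits
        )

    def forward(self, x):  # x: (batch, 1, window_len)
        return self.net(x)


model = EventDetector()
windows = torch.randn(4, 1, 120)  # four preprocessed PMU windows
logits = model(windows)
print(logits.shape)               # torch.Size([4, 2])
```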
An Assessment of PIER Electric Grid Research 2003-2014 White Paper
This white paper describes the circumstances in California around the turn of the 21st century that led the California Energy Commission (CEC) to direct additional Public Interest Energy Research funds toward critical electric grid issues, especially those arising from integrating high penetrations of variable renewable generation with the electric grid. It contains an assessment of the beneficial science and technology advances of the resultant portfolio of electric grid research projects, administered under the direction of the CEC by a competitively selected contractor, the University of California's California Institute for Energy and the Environment, from 2003 to 2014.