Enabling autoscaling for in-memory storage in cluster computing framework

Author: Shrestha Bibek Raj
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2019

2019 Spring. Includes bibliographical references. IoT-enabled devices and observational instruments continuously generate voluminous data. A large portion of these datasets is delivered with associated geospatial locations. The increased volume of geospatial data, alongside emerging geospatial services, poses computational challenges for large-scale geospatial analytics. We have designed and implemented STRETCH, an in-memory distributed geospatial storage system that preserves spatial proximity and enables proactive autoscaling for frequently accessed data. STRETCH stores data with a delayed data dispersion scheme that incrementally adds data nodes to the storage system. We have devised an autoscaling feature that proactively repartitions data to alleviate computational hotspots before they occur. We compared the performance of STRETCH with Apache Ignite, and the results show that STRETCH provides up to 3 times the throughput when the system encounters hotspots. STRETCH is built on Apache Spark and Ignite and interacts with them at runtime.

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Augmented Tree-based Routing Protocol for Scalable Ad Hoc Networks

Author: Caleffi Marcello
Ferraiuolo Giancarlo
Paura Luigi
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 01/01/2007

In ad hoc networks, scalability is a critical requirement if these technologies are to reach their full potential. Most of the proposed routing protocols do not operate efficiently with networks of more than a few hundred nodes. In this paper, we propose an augmented tree-based address space structure and a hierarchical multi-path routing protocol, referred to as Augmented Tree-based Routing (ATR), which uses such a structure to solve the scalability problem and to gain good resilience against node failure/mobility and link congestion/instability. Simulation results and performance comparisons with existing protocols substantiate the effectiveness of ATR. Comment: Routing, mobile ad hoc network, MANET, dynamic addressing, multi-path, distributed hash table, DH

arXiv.org e-Print Archive

The Parallelism Motifs of Genomic Data Analysis

Author: Awan Muaaz
Azad Ariful
Brock Benjamin
Buluc Aydin
Egan Rob
Ekanayake Saliya
Ellis Marquita
Georganas Evangelos
Guidi Giulia
Hofmeyr Steven
Oliker Leonid
Selvitopi Oguz
Teodoropol Cristina
Yelick Katherine
Publication venue: The Royal Society
Publication date: 20/01/2020

Genomic data sets are growing dramatically as the cost of sequencing continues to decline and small sequencing devices become available. Enormous community databases store and share this data with the research community, but some genomic data analysis problems require large-scale computational platforms to meet both their memory and computational requirements. These applications differ from the scientific simulations that dominate the workload on high-end parallel systems today, and they place different requirements on programming support, software libraries, and parallel architectural design. For example, they involve irregular communication patterns such as asynchronous updates to shared data structures. We consider several problems in high-performance genomics analysis, including alignment, profiling, clustering, and assembly for both single genomes and metagenomes. We identify some of the common computational patterns, or motifs, that help inform parallelization strategies and compare our motifs to some of the established lists, arguing that at least two key patterns, sorting and hashing, are missing.

arXiv.org e-Print Archive

eScholarship - University of California