Search CORE

119 research outputs found

A Tale of Two Approaches: Comparing Top-Down and Bottom-Up Strategies for Analyzing and Visualizing High-Dimensional Data

Author: Anžel Aleksandar
Publication venue: Philipps-Universität Marburg
Publication date: 05/10/2023
Field of study

The proliferation of high-throughput and sensory technologies in various fields has led to a considerable increase in data volume, complexity, and diversity. Traditional data storage, analysis, and visualization methods are struggling to keep pace with the growth of modern data sets, necessitating innovative approaches to overcome the challenges of managing, analyzing, and visualizing data across various disciplines. One such approach is utilizing novel storage media, such as deoxyribonucleic acid~(DNA), which presents efficient, stable, compact, and energy-saving storage option. Researchers are exploring the potential use of DNA as a storage medium for long-term storage of significant cultural and scientific materials. In addition to novel storage media, scientists are also focussing on developing new techniques that can integrate multiple data modalities and leverage machine learning algorithms to identify complex relationships and patterns in vast data sets. These newly-developed data management and analysis approaches have the potential to unlock previously unknown insights into various phenomena and to facilitate more effective translation of basic research findings to practical and clinical applications. Addressing these challenges necessitates different problem-solving approaches. Researchers are developing novel tools and techniques that require different viewpoints. Top-down and bottom-up approaches are essential techniques that offer valuable perspectives for managing, analyzing, and visualizing complex high-dimensional multi-modal data sets. This cumulative dissertation explores the challenges associated with handling such data and highlights top-down, bottom-up, and integrated approaches that are being developed to manage, analyze, and visualize this data. The work is conceptualized in two parts, each reflecting the two problem-solving approaches and their uses in published studies. The proposed work showcases the importance of understanding both approaches, the steps of reasoning about the problem within them, and their concretization and application in various domains

Publikations- und Dokumentenserver der Universitätsbibliothek Marburg

gLite Workload Management System Performance Measurements

Author: Balaz A
Belic A
Bogojevic A R
Svraka N
Publication venue
Publication date: 01/01/2006
Field of study

In this paper an introduction to the gLite Grid middleware and one of its most important components, Workload Management System (WMS), responsible for management of user jobs is given.Useful performance metrics of gLite WMS are defined from a Grid application point of view, and preliminary results of performance measurements are presented and briefly analyzed

CERN Document Server

A differentiated proposal of three dimension i/o performance characterization model focusing on storage environments

Author: Pioli Junior Laercio
Publication venue: 'Revista Educacao em Foco (UFJF)'
Publication date: 25/08/2020
Field of study

The I/O bottleneck remains a central issue in high-performance environments. Cloud computing, high-performance computing (HPC) and big data environments share many underneath difficulties to deliver data at a desirable time rate requested by high-performance applications. This increases the possibility of creating bottlenecks throughout the application feeding process by bottom hardware devices located in the storage system layer. In the last years, many researchers have been proposed solutions to improve the I/O architecture considering different approaches. Some of them take advantage of hardware devices while others focus on a sophisticated software approach. However, due to the complexity of dealing with high-performance environments, creating solutions to improve I/O performance in both software and hardware is challenging and gives researchers many opportunities. Classifying these improvements in different dimensions allows researchers to understand how these improvements have been built over the years and how it progresses. In addition, it also allows future efforts to be directed to research topics that have developed at a lower rate, balancing the general development process. This research present a three-dimension characterization model for classifying research works on I/O performance improvements for large scale storage computing facilities. This classification model can also be used as a guideline framework to summarize researches providing an overview of the actual scenario. We also used the proposed model to perform a systematic literature mapping that covered ten years of research on I/O performance improvements in storage environments. This study classified hundreds of distinct researches identifying which were the hardware, software, and storage systems that received more attention over the years, which were the most researches proposals elements and where these elements were evaluated. In order to justify the importance of this model and the development of solutions that targets I/O performance improvements, we evaluated a subset of these improvements using a a real and complete experimentation environment, the Grid5000. Analysis over different scenarios using a synthetic I/O benchmark demonstrates how the throughput and latency parameters behaves when performing different I/O operations using distinct storage technologies and approaches.O gargalo de E/S continua sendo um problema central em ambientes de alto desempenho. Os ambientes de computação em nuvem, computação de alto desempenho (HPC) e big data compartilham muitas dificuldades para fornecer dados em uma taxa de tempo desejável solicitada por aplicações de alto desempenho. Isso aumenta a possibilidade de criar gargalos em todo o processo de alimentação de aplicativos pelos dispositivos de hardware inferiores localizados na camada do sistema de armazenamento. Nos últimos anos, muitos pesquisadores propuseram soluções para melhorar a arquitetura de E/S considerando diferentes abordagens. Alguns deles aproveitam os dispositivos de hardware, enquanto outros se concentram em uma abordagem sofisticada de software. No entanto, devido à complexidade de lidar com ambientes de alto desempenho, criar soluções para melhorar o desempenho de E/S em software e hardware é um desafio e oferece aos pesquisadores muitas oportunidades. A classificação dessas melhorias em diferentes dimensões permite que os pesquisadores entendam como essas melhorias foram construídas ao longo dos anos e como elas progridem. Além disso, também permite que futuros esforços sejam direcionados para tópicos de pesquisa que se desenvolveram em menor proporção, equilibrando o processo geral de desenvolvimento. Esta pesquisa apresenta um modelo de caracterização tridimensional para classificar trabalhos de pesquisa sobre melhorias de desempenho de E/S para instalações de computação de armazenamento em larga escala. Esse modelo de classificação também pode ser usado como uma estrutura de diretrizes para resumir as pesquisas, fornecendo uma visão geral do cenário real. Também usamos o modelo proposto para realizar um mapeamento sistemático da literatura que abrangeu dez anos de pesquisa sobre melhorias no desempenho de E/S em ambientes de armazenamento. Este estudo classificou centenas de pesquisas distintas, identificando quais eram os dispositivos de hardware, software e sistemas de armazenamento que receberam mais atenção ao longo dos anos, quais foram os elementos de proposta mais pesquisados e onde esses elementos foram avaliados. Para justificar a importância desse modelo e o desenvolvimento de soluções que visam melhorias no desempenho de E/S, avaliamos um subconjunto dessas melhorias usando um ambiente de experimentação real e completo, o Grid5000. Análises em cenários diferentes usando um benchmark de E/S sintética demonstra como os parâmetros de vazão e latência se comportam ao executar diferentes operações de E/S usando tecnologias e abordagens distintas de armazenamento

Repositório Institucional - UFJF

Large-Scale Data Management and Analysis (LSDMA) - Big Data in Science

Author: Jung Christopher
Streit Achim
Publication venue: Karlsruher Institut für Technologie
Publication date: 01/01/2014
Field of study

KITopen

An Intelligent Visual Analysis Scheme for Automatic Disassembly Processes in the Recycling Industry

Author: Yildiz Erenus
Publication venue
Publication date: 20/04/2021
Field of study

Georg-August-University Göttingen

Recommended from our members

Tribological Performance of the Head-Disk Interface in Perpendicular Magnetic Recording and Heat-Assisted Magnetic Recording

Author: Trinh Tan Duy
Publication venue: eScholarship, University of California
Publication date: 01/01/2019
Field of study

International Data Corporation (IDC) estimates that hard disk drives will still be the main storage device for storing digital data in the next 10 years, holding approximately 80% of the data inside data centers. To increase the areal density of hard disk drives, the mechanical spacing between the head and disk surface has decreased to approximately 1nm. At such a small spacing, tribology of the head-disk interface, including head-disk contacts, wear, material buildup, and lubricant transfer, become increasingly more important for the reliability of hard disk drives. In addition to small spacing, heat-assisted magnetic recording (HAMR) technology aims to deliver higher areal density recording by heating up the media surface to a few hundred Celsius degrees, facilitating the writing process. High temperature at the head and disk surfaces cause serious reliability issues for the head-disk interface (HDI). Therefore, understanding of the main factors that affect the reliability of the head-disk interface is an essential task. In this dissertation, the effect of bias voltage and helium environment on the tribological performance of the head-disk interface is investigated. To do this, we first simulated the flying characteristics of the slider as a function of bias voltage in air and helium environment. Thereafter, an experimental study was performed using custom built tester located inside a sealed environmental chamber to study the effect of air and helium on wear and lubricant redistribution at the head-disk interface during load-unload. We investigated the effect of bias voltage and relative humidity on wear, material buildup, and nano-corrosion on the slider surface. Finally, we have studied laser current and laser optical power in heat-assisted magnetic recording as a function of operating radius, head-disk clearance, media design, and their effects on the life-time of the head-disk interface. The results of this dissertation provide guidance for the effect of bias voltage, relative humidity, and helium environment on wear, material buildup, corrosion, and lubricant transfer at the head-disk interface. More importantly, our experimental study in heat-assisted magnetic recording leads to a better understanding of the main factors that cause failure of the HAMR head-disk interface. Our results are important for the improvement of the tribological performance and reliability of perpendicular magnetic recording (PMR) and heat-assisted magnetic recording (HAMR) head-disk interface

eScholarship - University of California

Autonomous management of cost, performance, and resource uncertainty for migration of applications to infrastructure-as-a-service (IaaS) clouds

Author: Lloyd Wes J.
Publication venue: Colorado State University. Libraries
Publication date: 01/01/2014
Field of study

2014 Fall.Includes bibliographical references.Infrastructure-as-a-Service (IaaS) clouds abstract physical hardware to provide computing resources on demand as a software service. This abstraction leads to the simplistic view that computing resources are homogeneous and infinite scaling potential exists to easily resolve all performance challenges. Adoption of cloud computing, in practice however, presents many resource management challenges forcing practitioners to balance cost and performance tradeoffs to successfully migrate applications. These challenges can be broken down into three primary concerns that involve determining what, where, and when infrastructure should be provisioned. In this dissertation we address these challenges including: (1) performance variance from resource heterogeneity, virtualization overhead, and the plethora of vaguely defined resource types; (2) virtual machine (VM) placement, component composition, service isolation, provisioning variation, and resource contention for multitenancy; and (3) dynamic scaling and resource elasticity to alleviate performance bottlenecks. These resource management challenges are addressed through the development and evaluation of autonomous algorithms and methodologies that result in demonstrably better performance and lower monetary costs for application deployments to both public and private IaaS clouds. This dissertation makes three primary contributions to advance cloud infrastructure management for application hosting. First, it includes design of resource utilization models based on step-wise multiple linear regression and artificial neural networks that support prediction of better performing component compositions. The total number of possible compositions is governed by Bell's Number that results in a combinatorially explosive search space. Second, it includes algorithms to improve VM placements to mitigate resource heterogeneity and contention using a load-aware VM placement scheduler, and autonomous detection of under-performing VMs to spur replacement. Third, it describes a workload cost prediction methodology that harnesses regression models and heuristics to support determination of infrastructure alternatives that reduce hosting costs. Our methodology achieves infrastructure predictions with an average mean absolute error of only 0.3125 VMs for multiple workloads

Mountain Scholar (Digital Collections of Colorado and Wyoming)

Augmented Reality

Author
Publication venue: 'IntechOpen'
Publication date: 20/04/2021
Field of study

Augmented Reality (AR) is a natural development from virtual reality (VR), which was developed several decades earlier. AR complements VR in many ways. Due to the advantages of the user being able to see both the real and virtual objects simultaneously, AR is far more intuitive, but it's not completely detached from human factors and other restrictions. AR doesn't consume as much time and effort in the applications because it's not required to construct the entire virtual scene and the environment. In this book, several new and emerging application areas of AR are presented and divided into three sections. The first section contains applications in outdoor and mobile AR, such as construction, restoration, security and surveillance. The second section deals with AR in medical, biological, and human bodies. The third and final section contains a number of new and useful applications in daily living and learning

Directory of Open Access Books (DOAB)

The Applications of Workload Characterization in The World of Massive Data Storage

Author: He Weiping
Publication venue
Publication date: 01/08/2015
Field of study

University of Minnesota Ph.D. dissertation. August 2015. Major: Computer Science. Advisor: David Du. 1 computer file (PDF); x, 116 pages.The digital world is expanding exponentially because of the growth of various applications in domains including scientific fields, enterprise environment and internet services. Importantly, these applications have drastically different storage requirements including parallel I/O performance and storage capacity. Various technologies have been developed in order to better satisfy different storage requirements. I/O middleware software, parallel file systems and storage arrays are developed to improve I/O performance by increasing I/O parallelism at different levels. New storage media and data recording technologies such as shingled magnetic recording (SMR) are also developed to increase the storage capacity. This work focuses on improving existing technologies and designing new schemes based on I/O workload characterizations in corresponding storage environments. The contributions of this work can be summarized into four pieces, two on improving parallel I/O performance and two on increasing storage capacity. First, we design a comprehensive parallel I/O workload characterization and generation framework (called PIONEER) which can be used to synthesize a particular parallel I/O workload with desired I/O characteristics or precisely emulate a High Performance Computing (HPC) application of interest. Second, we propose a non-intrusive I/O middleware (called IO-Engine) to automatically improve a given parallel I/O workload in Lustre which is a widely used HPC or parallel I/O system. IO-Engine can explore the correlations between different software layers in the deep I/O path, as well as workload patterns at runtime to transparently transform the workload patterns and tune related I/O parameters in the system. Third, we design several novel static address mapping schemes for shingled write disks (SWDs) to minimize the write amplification overhead in hard drives adopting SMR technology. Fourth, we propose a track-level shingled translation layer (T-STL) for SWDs with hybrid update strategy (in-place update plus out-of-place update). T-STL uses dynamic address mapping scheme and performs garbage collection operations by migrating selected disk tracks. This scheme can provider larger storage capacity and better overall performance with the same effective storage percentages when compared to the static address mapping schemes

University of Minnesota Digital Conservancy