290 research outputs found

    SparkFlow : towards high-performance data analytics for Spark-based genome analysis

    Recent advances in DNA sequencing technology have triggered full-scale next-generation sequencing (NGS) research. Big Data (BD) is becoming the main driver in analyzing these large-scale bioinformatic data. However, this complicated process has become the system bottleneck, requiring an amalgamation of scalable approaches to deliver the needed performance and hide the deployment complexity. Utilizing cutting-edge scientific workflows can robustly address these challenges. This paper presents a Spark-based alignment workflow called SparkFlow for massive NGS analysis over Singularity containers. SparkFlow is highly scalable, reproducible, and capable of parallelizing computation by utilizing data-level parallelism and load balancing techniques in HPC and Cloud environments. The proposed workflow capitalizes on benchmarking two state-of-the-art NGS workflows, i.e., BaseRecalibrator and ApplyBQSR. SparkFlow realizes the ability to accelerate large-scale cancer genomic analysis by scaling vertically (HyperThreading) and horizontally (on-demand provisioning). Our results demonstrate an inevitable trade-off between the targeted applications and processor architecture. SparkFlow achieves a decisive improvement in NGS computation performance, throughput, and scalability while maintaining deployment complexity. The paper's findings aim to pave the way for a wide range of revolutionary enhancements and future trends within the High-performance Data Analytics (HPDA) genome analysis realm.

    Multi-platform media: how newspapers are adapting to the digital era

    No abstract available

    DIGITALIZATION OF HIGHER EDUCATION: IMPACTS ON MANAGEMENT PRACTICES AND INSTITUTIONAL DEVELOPMENT. A LITERATURE REVIEW

    For an extended period, the higher education community has been diligently endeavoring to implement technologically advanced and more efficacious methodologies. The aim is to enhance effectiveness and cultivate a generation of graduates equipped to navigate the evolving labor market dynamics and adapt to the influences of globalization. The outbreak of the Covid-19 pandemic significantly catalyzed the adoption of digital education, often referred to as "E-Learning," as a predominant mode of instruction across a majority of countries. This shift was necessitated by the imperative to adhere to social distancing measures and prevent the potential collapse of the educational infrastructure. In the wake of this transformative paradigm, educational institutions were compelled to engineer inventive management approaches to effectively traverse this altered landscape, marking the dawn of a new era. This era is characterized by a profound dependence on advanced technology and unfettered information accessibility as pivotal factors for sustaining and optimizing performance. This paper aims to explain the basic ideas behind managing higher education while exploring the existing research that supports these ideas. By breaking down the various aspects and tools involved, the goal is to shed light on the complex nature of managing higher education. This exploration eventually leads to an examination of how the digitalization of education impacts different functional areas of education management. Through this in-depth analysis, a clear connection emerges between the need for digitalization and the necessity to update management systems. This connection is crucial for not only achieving but also sustaining effective operation in this new era that combines technology and education.

    An evaluation of galaxy and ruffus-scripting workflows system for DNA-seq analysis

    Magister Scientiae - MSc
    Functional genomics determines the biological functions of genes on a global scale by using large volumes of data obtained through techniques including next-generation sequencing (NGS). The application of NGS in biomedical research is gaining momentum, and with its adoption becoming more widespread, there is an increasing need for access to customizable computational workflows that can simplify, and offer access to, compute-intensive analyses of genomic data. In this study, the Galaxy and Ruffus frameworks were designed and implemented with a view to address the challenges faced in biomedical research. Galaxy, a graphical web-based framework, allows researchers to build a graphical NGS data analysis pipeline for accessible, reproducible, and collaborative data-sharing. Ruffus, a UNIX command-line framework used by bioinformaticians as a Python library to write scripts in object-oriented style, allows for building a workflow in terms of task dependencies and execution logic. In this study, a dual data analysis technique was explored which focuses on a comparative evaluation of the Galaxy and Ruffus frameworks that are used in composing analysis pipelines. To this end, we developed an analysis pipeline in Galaxy, and in Ruffus, for the analysis of Mycobacterium tuberculosis sequence data. Furthermore, this study aimed to compare the Galaxy framework to Ruffus, with preliminary analysis revealing that the analysis pipeline in Galaxy displayed a higher percentage of load and store instructions. In comparison, pipelines in Ruffus tended to be CPU bound and memory intensive. The CPU usage, memory utilization, and runtime execution are graphically represented in this study. Our evaluation suggests that workflow frameworks differ distinctly in features ranging from ease of use, flexibility, and portability to architectural design.

    Artificial Intelligence and Digital Work: The Sociotechnical Reversal

    A well-designed information system (IS) in the classical view comprises two interrelated yet different subsystems: one that represents the technological dimension of work, and one that represents the social dimension. When these subsystems are heralded as equally important, they constitute a sociotechnical whole, producing economic outcomes such as profit and efficiency, plus humanistic outcomes, such as engagement and well-being. We see, increasingly, this classical view becoming obliviated. In this conceptual paper, we reflect upon the role of humans and technology in these changing work environments. As technical aspects of Artificial Intelligence and digital technologies come to dominate the social side of work, we suggest that a sociotechnical reversal is under way. Whereas this technosocial reality might be well motivated by advances in efficiency and productivity, the effects on well-being and engagement are less well understood. Consequently, we provide a set of theoretically derived principles to guide these changes in the digital workplace.

    Clinical deployment environments: Five pillars of translational machine learning for health

    Machine Learning for Health (ML4H) has demonstrated efficacy in computer imaging and other self-contained digital workflows, but has failed to substantially impact routine clinical care. This is no longer because of poor adoption of Electronic Health Records Systems (EHRS), but because ML4H needs an infrastructure for development, deployment and evaluation within the healthcare institution. In this paper, we propose a design pattern called a Clinical Deployment Environment (CDE). We sketch the five pillars of the CDE: (1) real-world development supported by live data, where ML4H teams can iteratively build and test at the bedside; (2) an ML-Ops platform that brings the rigour and standards of continuous deployment to ML4H; (3) design and supervision by those with expertise in AI safety; (4) the methods of implementation science that enable the algorithmic insights to influence the behaviour of clinicians and patients; and (5) continuous evaluation that uses randomisation to avoid bias but in an agile manner. The CDE is intended to answer the same requirements that bio-medicine articulated in establishing the translational medicine domain. It envisions a transition from "real-world" data to "real-world" development.