18 research outputs found

    A Computational Workflow for Interdisciplinary Deep Learning Projects utilizing bwHPC Infrastructure

    Get PDF
    Deep neural networks can solve complex tasks through accurate function approximation. The process from submitting domain data and defining requirements to obtaining analyzed data consists of multiple steps and does not reduce to a single straightforward procedure. One of the core questions is therefore: what does an application development process that facilitates interaction between data scientists and domain experts look like? Practically, two connected challenges need to be addressed. First, large amounts of domain-specific data have to be handled. Second, when dealing with complex deep neural networks, model training has to be designed in a computationally efficient manner. While tailored solutions for these challenges in interdisciplinary deep learning projects exist, a comprehensive and structured approach is missing. Hence, we present a computational workflow that supports such projects with respect to data handling, integration of cluster computing resources such as the bwHPC infrastructure, and development processes. We exemplify our proposal by means of a biomedical image analysis project.
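
    One building block of such a workflow, the hand-off of a training run to the cluster scheduler, can be pictured with a minimal sketch. It assumes a SLURM-managed bwHPC cluster; the partition, module, script, and path names below are illustrative placeholders rather than details from the publication.

        import subprocess
        from pathlib import Path

        def submit_training_job(data_dir: str, epochs: int = 50) -> str:
            """Write a SLURM batch script for one training run and submit it with sbatch."""
            lines = [
                "#!/bin/bash",
                "#SBATCH --partition=gpu",    # hypothetical GPU partition name
                "#SBATCH --gres=gpu:1",
                "#SBATCH --time=24:00:00",
                "module load devel/python",   # module name is an assumption
                f"python train.py --data {data_dir} --epochs {epochs}",
            ]
            job_file = Path("train_job.sbatch")
            job_file.write_text("\n".join(lines) + "\n")
            # sbatch reports "Submitted batch job <id>" on success
            result = subprocess.run(["sbatch", str(job_file)],
                                    capture_output=True, text=True, check=True)
            return result.stdout.strip()

        if __name__ == "__main__":
            print(submit_training_job("/path/to/domain_data"))

    In a full workflow, data staging by the domain experts and the training script maintained by the data scientists would sit on either side of such a submission step.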

    Precise and Automated Tomographic Reconstruction with a Limited Number of Projections

    Get PDF
    This thesis proposes a parameter-optimized iterative reconstruction method (optimized-CGTV) for tomographic reconstruction from a limited number of projections, subject to minimization of the total variation (TV). The reconstruction problem is solved with the regularization parameter chosen by means of a discrete L-curve. The optimized-CGTV method is incorporated into an automatic framework for parallel 3D reconstruction on a computer cluster to achieve a rapid reconstruction process.
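
    The discrete L-curve idea mentioned above can be sketched generically: compute reconstructions for a set of candidate regularization parameters, plot the residual norm against the TV value on a log-log scale, and pick the point of maximum curvature. The sketch below assumes those norms have already been computed and is not the thesis's exact optimized-CGTV procedure.

        import numpy as np

        def lcurve_corner(residual_norms, tv_norms):
            """Return the index of maximum curvature on the discrete L-curve.

            residual_norms[i] and tv_norms[i] belong to the reconstruction obtained
            with the i-th candidate regularization parameter."""
            x = np.log(np.asarray(residual_norms, dtype=float))
            y = np.log(np.asarray(tv_norms, dtype=float))
            dx, dy = np.gradient(x), np.gradient(y)
            ddx, ddy = np.gradient(dx), np.gradient(dy)
            curvature = np.abs(dx * ddy - dy * ddx) / (dx**2 + dy**2 + 1e-12)**1.5
            return int(np.argmax(curvature))

        # lambdas[lcurve_corner(residual_norms, tv_norms)] selects the parameter

    The corner of the log-log curve balances data fidelity against the strength of the TV regularization.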

    Helmholtz Portfolio Theme Large-Scale Data Management and Analysis (LSDMA)

    Get PDF
    The Helmholtz Association funded the "Large-Scale Data Management and Analysis" portfolio theme from 2012 to 2016. Four Helmholtz centres, six universities, and another research institution in Germany joined forces to enable data-intensive science by optimising data life cycles in selected scientific communities. In our Data Life Cycle Labs, data experts performed joint R&D together with the scientific communities. The Data Services Integration Team focused on generic solutions applied by several communities.

    New Methods to Improve Large-Scale Microscopy Image Analysis with Prior Knowledge and Uncertainty

    Get PDF
    Multidimensional imaging techniques provide powerful ways to examine various kinds of scientific questions. The routinely produced data sets in the terabyte range, however, can hardly be analyzed manually and require extensive use of automated image analysis. The present work introduces a new concept for the estimation and propagation of uncertainty involved in image analysis operators, as well as new segmentation algorithms that are suitable for terabyte-scale analyses of 3D+t microscopy images.
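
    As a toy illustration of what estimating and propagating operator uncertainty can mean, the sketch below attaches a per-pixel uncertainty map to a simple thresholding operator and combines uncertainties of chained operators under an independence assumption. The formulas and operators are illustrative only and do not reproduce the concept developed in this work.

        import numpy as np

        def threshold_with_uncertainty(img, threshold, scale=0.1):
            """Segment by thresholding and attach a per-pixel uncertainty map;
            pixels whose intensity lies close to the threshold are most uncertain."""
            mask = img > threshold
            uncertainty = np.exp(-np.abs(img - threshold) / scale)
            return mask, uncertainty

        def combine_uncertainty(unc_a, unc_b):
            """Propagate uncertainty across two chained operators (independence assumed)."""
            return 1.0 - (1.0 - unc_a) * (1.0 - unc_b)

        slice_2d = np.random.rand(64, 64)          # stand-in for one image slice
        mask, unc = threshold_with_uncertainty(slice_2d, threshold=0.5)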

    An ensemble-averaged, cell density-based digital model of zebrafish embryo development derived from light-sheet microscopy data with single-cell resolution

    Get PDF
    A new era in developmental biology has been ushered in by recent advances in the quantitative imaging of all-cell morphogenesis in living organisms. Here we have developed a light-sheet fluorescence microscopy-based framework with single-cell resolution for the identification and characterization of subtle phenotypical changes in millimeter-sized organisms. Such a comparative study requires analyses of entire ensembles to be able to distinguish sample-to-sample variations from definitive phenotypical changes. We present a kinetic digital model of zebrafish embryos up to 16 h of development. The model is based on the precise overlay and averaging of data taken on multiple individuals and describes the cell density and its migration direction at every point in time. Quantitative metrics for multi-sample comparative studies have been introduced to analyze developmental variations within the ensemble. The digital model may serve as a canvas on which the behavior of cellular subpopulations can be studied. As an example, we have investigated cellular rearrangements during germ layer formation at the onset of gastrulation. A comparison of the one-eyed pinhead (oep) mutant with the digital model of the wild-type embryo reveals its abnormal development at the onset of gastrulation, many hours before changes are obvious to the eye.
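
    The ensemble-averaging step behind such a density model can be sketched in a few lines, assuming cell centroids have already been detected and registered to a common reference frame; the grid size and bounds below are arbitrary placeholders, not the published pipeline.

        import numpy as np

        def density_model(ensemble, grid_shape=(64, 64, 64),
                          bounds=((0, 1), (0, 1), (0, 1))):
            """Average registered cell centroids of several embryos into one density grid.

            ensemble: list of (N_i x 3) arrays of cell positions per embryo, already
            registered to a common coordinate frame covered by `bounds`."""
            model = np.zeros(grid_shape)
            for cells in ensemble:
                counts, _ = np.histogramdd(cells, bins=grid_shape, range=bounds)
                model += counts
            return model / len(ensemble)

        # repeating this per recorded time point yields a kinetic (time-resolved) model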

    Generic Metadata Handling in Scientific Data Life Cycles

    Get PDF
    Scientific data life cycles define how data is created, handled, accessed, and analyzed by users. Such data life cycles become increasingly sophisticated as the sciences they serve grow more demanding and complex with the coming advent of exascale data and computing. The overarching data life cycle management background includes multiple abstraction categories: data sources, data and metadata management, computing and workflow management, security, data sinks, and methods for enabling utilization. The challenges in this context are manifold. One is to hide the complexity from the user and to enable seamless use of resources with regard to usability and efficiency. Another is to enable generic metadata management that is not restricted to one use case but can be adapted to further ones with limited effort. Metadata management is essential for scientists to save time by avoiding the need to keep track of data manually, for example of its content and location. As the number of files grows into the millions, managing data without metadata becomes increasingly difficult. The solution is therefore to employ metadata management that organizes data based on information about it. Previously, use cases tended to support either highly specific metadata management or none at all. Now, a generic metadata management concept is available that can be used to efficiently integrate metadata capabilities with use cases. The concept was implemented within the MoSGrid data life cycle, which enables molecular simulations on distributed HPC-enabled data and computing infrastructures. The implementation provides easy-to-use and effective metadata management: automated extraction, annotation, and indexing of metadata were designed, developed, and integrated, and search capabilities are provided via a seamless user interface. Further analysis runs can be started directly from search results. A complete evaluation of the concept, both in general and along the example implementation, is presented. In conclusion, the generic metadata management concept advances the state of the art in scientific data life cycle management.
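
    The principle of automated metadata extraction, indexing, and search can be shown with a toy, standard-library-only sketch; it is not the MoSGrid implementation, and the extracted fields and in-memory index stand in for a real extraction and indexing service.

        from pathlib import Path

        def extract_metadata(path: Path) -> dict:
            """Harvest a few generic metadata fields from a file."""
            info = path.stat()
            return {"name": path.name, "suffix": path.suffix, "size_bytes": info.st_size}

        def build_index(root: Path) -> dict:
            """Map each (field, value) pair to the set of files carrying it."""
            index = {}
            for path in root.rglob("*"):
                if path.is_file():
                    for field, value in extract_metadata(path).items():
                        index.setdefault((field, value), set()).add(str(path))
            return index

        def search(index: dict, field: str, value) -> set:
            """Return all files whose extracted metadata matches the query exactly."""
            return index.get((field, value), set())

        # index = build_index(Path("simulation_results")); search(index, "suffix", ".log")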

    Novel high performance techniques for high definition computer aided tomography

    Get PDF
    International Mention in the doctoral degree. Medical image processing is an interdisciplinary field in which multiple research areas are involved: image acquisition, scanner design, image reconstruction algorithms, visualization, etc. X-ray Computed Tomography (CT) is a medical imaging modality based on the attenuation suffered by X-rays as they pass through the body. Intrinsic differences in the attenuation properties of bone, air, and soft tissue result in high-contrast images of anatomical structures. The main objective of CT is to obtain tomographic images from radiographs acquired with X-ray scanners. The process of building a 3D image or volume from the 2D radiographs is known as reconstruction. One of the latest trends in CT is the reduction of the radiation dose delivered to patients through a decrease in the amount of acquired data. This reduction produces artefacts in the final images if conventional reconstruction methods are used, making it advisable to employ iterative reconstruction algorithms. Among the numerous reconstruction algorithms available, two types stand out: traditional algorithms, which are fast but cannot deliver high-quality images when data are limited; and iterative algorithms, which are slower but more reliable when traditional methods do not meet the quality requirements. One of the priorities of reconstruction is to obtain the final images in near real time in order to reduce the time spent on diagnosis. To accomplish this objective, new high-performance techniques and methods for accelerating these types of algorithms are needed. This thesis addresses the challenges of both traditional and iterative reconstruction algorithms with regard to acceleration and image quality. One common approach to accelerating these algorithms is the use of shared-memory and heterogeneous architectures. In this thesis, we propose a novel simulation/reconstruction framework, FUX-Sim. This framework follows the hypothesis that the development of new flexible X-ray systems can benefit from computer simulations, which may also enable performance to be checked before expensive real systems are implemented. Its modular design abstracts the complexities of programming for accelerated devices in order to facilitate the development and evaluation of the different configurations and geometries available. To obtain near real-time execution, low-level optimizations of the main components of the framework are provided for Graphics Processing Unit (GPU) architectures. Another alternative tackled in this thesis is the acceleration of iterative reconstruction algorithms using distributed-memory architectures. We present a novel architecture that unifies the two most important paradigms in scientific computing today: High Performance Computing (HPC) and Big Data. The proposed architecture combines Big Data frameworks with the advantages of accelerated computing. The methods proposed in this thesis provide more flexible scanner configurations and offer an accelerated solution. Regarding performance, our approach is as competitive as the solutions found in the literature. Additionally, we demonstrate that our solution scales with the size of the problem, enabling the reconstruction of high-resolution images.
    This work has been mainly funded by an FPU fellowship (FPU14/03875) from the Spanish Ministry of Education. It has also been partially supported by other grants:
    • DPI2016-79075-R, "Nuevos escenarios de tomografía por rayos X", Spanish Ministry of Economy and Competitiveness.
    • TIN2016-79637-P, "Towards Unification of HPC and Big Data Paradigms", Spanish Ministry of Economy and Competitiveness.
    • Short-term scientific mission (STSM) grant from the NESUS COST Action IC1305.
    • TIN2013-41350-P, "Scalable Data Management Techniques for High-End Computing Systems", Spanish Ministry of Economy and Competitiveness.
    • RTC-2014-3028-1 NECRA, "Nuevos escenarios clínicos con radiología avanzada", Spanish Ministry of Economy and Competitiveness.
    Doctoral programme: Programa Oficial de Doctorado en Ciencia y Tecnología Informática. Committee: President: José Daniel García Sánchez; Secretary: Katzlin Olcoz Herrero; Member: Domenico Tali
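
    FUX-Sim and the distributed HPC/Big Data architecture are not reproduced here; as an illustration of the kind of iterative update that such frameworks accelerate on GPUs and clusters, the sketch below implements a classical scheme (SIRT) on a dense system matrix standing in for the real projector.

        import numpy as np

        def sirt(A, b, iterations=50, relax=1.0):
            """Simultaneous Iterative Reconstruction Technique on a dense system matrix.

            A: (measurements x voxels) projection matrix, b: measured projection data."""
            R = 1.0 / np.maximum(A.sum(axis=1), 1e-12)    # inverse row sums
            C = 1.0 / np.maximum(A.sum(axis=0), 1e-12)    # inverse column sums
            x = np.zeros(A.shape[1])
            for _ in range(iterations):
                residual = b - A @ x                      # forward project and compare
                x += relax * C * (A.T @ (R * residual))   # weighted backprojection
            return x

    In a production setting the dense matrix would be replaced by GPU forward and back projectors, and the per-iteration work would be distributed across cluster nodes.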
