42 research outputs found
A nested model for AI design and validation
The growing AI field faces trust, transparency, fairness, and discrimination challenges. Despite the need for new regulations, there is a mismatch between regulatory science and AI, preventing a consistent framework. A five-layer nested model for AI design and validation aims to address these issues and streamline AI application design and validation, improving fairness, trust, and AI adoption. This model aligns with regulations, addresses AI practitioners' daily challenges, and offers prescriptive guidance for determining appropriate evaluation approaches by identifying unique validity threats. We have three recommendations motivated by this model: (1) authors should distinguish between layers when claiming contributions to clarify the specific areas in which the contribution is made and to avoid confusion; (2) authors should explicitly state upstream assumptions to ensure that the context and limitations of their AI system are clearly understood; and (3) AI venues should promote thorough testing and validation of AI systems and their compliance with regulatory requirements.
Analyzing colony dynamics and visualizing cell diversity in spatiotemporal experiments
Hattab G. Analyzing colony dynamics and visualizing cell diversity in spatiotemporal experiments. Bielefeld: Universität Bielefeld; 2018. Bioimaging technologies enable the description of the life cycle of organisms at the microscopic scale, for example bacterial cells. In the particular case of time-lapse imaging, the coupling of experimental setups and marker protocols results in the acquisition of biological changes in spatiotemporal experiments. Such experiments are devised to obtain time-lapse image data, which I refer to as biomovies. Understanding how a cell behaves at every time point is crucial. In fact, this motivated all cell studies in the literature, which are single-cell oriented. For the present biomovies, the task is to identify similarly fluorescing subpopulations across space and time.
My interest lies in isogenic bacterial populations of *Sinorhizobium meliloti*. The biomovies' particularity is a dynamic range of high values for a set of different properties (e.g., cell density and cell count), which leads to a bottleneck. State-of-the-art methods cannot address such a task, partly because they cannot handle highly dense populations and adapt poorly to different experimental setups. In particular, they fall short either at the segmentation step (to delineate individual cells and extract their abstraction, e.g., the cell centroid) or at the tracking step (to follow identified cells in each frame). To gain insight into bacterial growth at the population level, I claim that one does not really need to know the fate of each single cell.
In the context of this thesis, I present a series of pipelines and algorithms. First, preprocessing pipelines to reduce noise and enhance the object-to-background contrast. Second, an adaptive algorithm to correct spatial shifts in the images of each biomovie (i.e., registration). Third and last, a modular algorithm that constructs coherent patch lineages by employing two adapted data abstractions, the particle and the patch, that are essential to solving the aforementioned bottleneck and are defined as follows: A particle is an intuitive geometric abstraction that results from considering whether the neighborhood around a pixel falls within a cell by checking for signal characteristics such as signal intensity, edge orientation, fluorescence signals, or texture. A patch is the aggregation of spatially contiguous particle trajectories that feature similar fluorescence patterns.
The methodology that creates coherent patch lineages is automatic and modular. By integrating aspects of object recognition and spatiotemporal changes, it lays the foundation for investigating colony growth. All of the aforementioned pipelines represent a new methodological contribution to the field of lineage analysis and colony growth. I evaluate the proposed pipelines and algorithms on both simulated and biological data. In turn, this enabled me to validate the algorithms and to interpret changes in colony growth and differences among the conditions of an experiment. In particular, I found that under the same condition, two isogenic bacterial colonies grew differently when faced with the same stress. The methods pioneered herein provide a key step to investigating colony growth.
Interactive polar diagrams for model comparison
Objective
Evaluating the performance of multiple complex models, such as those found in biology, medicine, climatology, and machine learning, is often challenging with conventional approaches when several evaluation metrics are used simultaneously. The traditional approach, which presents multi-model evaluation scores in a table, makes it hard to determine the similarities between the models and their order of performance.
Methods
By combining statistics, information theory, and data visualization, juxtaposed Taylor and Mutual Information Diagrams permit users to track and summarize the performance of one model or a collection of different models. To uncover linear and nonlinear relationships between models, users may visualize one or both charts.
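As a rough illustration of the quantities such diagrams summarize, the sketch below computes the statistics behind a Taylor diagram (standard deviations, Pearson correlation, and centered RMS difference, which are linked by the law-of-cosines identity E'^2 = s_r^2 + s_m^2 - 2 s_r s_m R) and a simple histogram estimate of mutual information. It is not the polar-diagrams API: the function names `taylor_statistics` and `mutual_information`, the bin count, and the synthetic data are assumptions made for this example.

```python
import numpy as np


def taylor_statistics(reference, model):
    """Return (std_ref, std_model, correlation, centered_rmsd) for two series."""
    r = np.asarray(reference, dtype=float)
    m = np.asarray(model, dtype=float)
    std_r = r.std()  # population standard deviation (ddof=0)
    std_m = m.std()
    corr = np.corrcoef(r, m)[0, 1]  # Pearson correlation coefficient
    # Centered RMS difference: RMS of the difference between the anomalies.
    crmsd = np.sqrt(np.mean(((m - m.mean()) - (r - r.mean())) ** 2))
    return std_r, std_m, corr, crmsd


def mutual_information(x, y, bins=8):
    """Plug-in histogram estimate of mutual information, in bits."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()            # joint probabilities
    px = pxy.sum(axis=1, keepdims=True)  # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)  # marginal of y, shape (1, bins)
    nz = pxy > 0                         # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log2(pxy[nz] / (px @ py)[nz])))


# Hypothetical data: one linearly related model and one nonlinearly related model.
rng = np.random.default_rng(0)
reference = rng.normal(size=500)
model_a = reference + 0.3 * rng.normal(size=500)  # linear relationship plus noise
model_b = reference ** 2                          # purely nonlinear relationship

for name, model in [("model_a", model_a), ("model_b", model_b)]:
    std_r, std_m, corr, crmsd = taylor_statistics(reference, model)
    mi = mutual_information(reference, model)
    print(name, round(corr, 3), round(crmsd, 3), round(mi, 3))
```

A correlation-based view can miss a dependence like the one in `model_b`, which is one reason to look at an information-theoretic summary alongside it.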
Results
Our library presents the first publicly available implementation of the Mutual Information Diagram and its new interactive capabilities, as well as the first publicly available implementation of an interactive Taylor Diagram. Extensions have been implemented so that both diagrams can display temporality, multimodality, and multivariate data sets, and feature one scalar model property such as uncertainty. Our library, named polar-diagrams, supports both continuous and categorical attributes.
Conclusion
The library can be used to quickly and easily assess the performance of complex models, such as those found in machine learning, climate, or biomedical domains.
ViCAR: An Adaptive and Landmark-Free Registration of Time Lapse Image Data from Microfluidics Experiments
Hattab G, Schlüter J-P, Becker A, Nattkemper TW. ViCAR: An Adaptive and Landmark-Free Registration of Time Lapse Image Data from Microfluidics Experiments. Frontiers in Genetics. 2017;8: 69. In order to understand gene function in bacterial life cycles, time lapse bioimaging is applied in combination with different marker protocols in so-called microfluidics chambers (i.e., a multi-well plate). In one experiment, a series of T images is recorded for one visual field, with a pixel resolution of 60 nm/px. Any (semi-)automatic analysis of the data is hampered by strong image noise, low contrast and, last but not least, considerable irregular shifts during the acquisition. Image registration corrects such shifts, enabling the next steps of the analysis (e.g., feature extraction or tracking). Image alignment faces two obstacles in this microscopic context: (a) highly dynamic structural changes in the sample (i.e., colony growth) and (b) an individual data set-specific sample environment, which makes the application of landmark-based alignments almost impossible. We present a computational image registration solution, which we refer to as ViCAR: (Vi)sual (C)ues based (A)daptive (R)egistration, for such microfluidics experiments, consisting of (1) the detection of particular polygons (outlined and segmented ones, referred to as visual cues), (2) the adaptive retrieval of three coordinates throughout different sets of frames, and finally (3) an image registration based on the relation of these points, correcting both rotation and translation. We tested ViCAR with different data sets and found that it provides an effective spatial alignment, thereby paving the way to extract temporal features pertinent to each resulting bacterial colony. By using ViCAR, we achieved an image registration with 99.9% of image closeness, based on the average rmsd of 4 × 10⁻² pixels, and superior results compared to a state-of-the-art algorithm.
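To make step (3) concrete, the sketch below estimates a rotation and a translation from three matched points and reports the rmsd after alignment. It uses a standard least-squares rigid fit (the Kabsch approach via SVD) as a stand-in; the abstract does not say how ViCAR computes the transform, so the technique, the function names `estimate_rigid_transform` and `rmsd`, and the example coordinates are all assumptions.

```python
import numpy as np


def estimate_rigid_transform(src, dst):
    """Least-squares rotation R and translation t such that R @ src_i + t ~ dst_i."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    src_c = src.mean(axis=0)
    dst_c = dst.mean(axis=0)
    # Cross-covariance of the centered point sets.
    H = (src - src_c).T @ (dst - dst_c)
    U, _, Vt = np.linalg.svd(H)
    # Guard against a reflection so that R is a proper rotation (det = +1).
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, d]) @ U.T
    t = dst_c - R @ src_c
    return R, t


def rmsd(a, b):
    """Root-mean-square deviation between two matched point sets."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean(np.sum((a - b) ** 2, axis=1))))


# Hypothetical example: three points in a reference frame (pixel coordinates).
reference_pts = np.array([[10.0, 12.0], [200.0, 15.0], [105.0, 180.0]])

# Simulate a later frame that drifted: rotated by 2 degrees and shifted.
angle = np.deg2rad(2.0)
R_true = np.array([[np.cos(angle), -np.sin(angle)],
                   [np.sin(angle),  np.cos(angle)]])
t_true = np.array([3.5, -1.25])
shifted_pts = reference_pts @ R_true.T + t_true

# Estimate the correction that maps the shifted frame back onto the reference.
R_est, t_est = estimate_rigid_transform(shifted_pts, reference_pts)
corrected_pts = shifted_pts @ R_est.T + t_est
print("rmsd before:", rmsd(shifted_pts, reference_pts))
print("rmsd after: ", rmsd(corrected_pts, reference_pts))
```

With noise-free correspondences the fit is exact up to floating-point error; with real detections the residual rmsd gives a rough measure of how well the frames align.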
MOSGA: Modular Open-Source Genome Annotator
The generation of high-quality assemblies, even for large eukaryotic genomes,
has become a routine task for many biologists thanks to recent advances in
sequencing technologies. However, the annotation of these assemblies - a
crucial step towards unlocking the biology of the organism of interest - has
remained a complex challenge that often requires advanced bioinformatics
expertise. Here we present MOSGA, a genome annotation framework for eukaryotic
genomes with a user-friendly web-interface that generates and integrates
annotations from various tools. The aggregated results can be analyzed with a
fully integrated genome browser and are provided in a format ready for
submission to NCBI. MOSGA is built on a portable, customizable, and easily
extendible Snakemake backend, and thus, can be tailored to a wide range of
users and projects. We provide MOSGA as a freely available public web service
at https://mosga.mathematik.uni-marburg.de and as a docker container at
registry.gitlab.com/mosga/mosga:latest. Source code can be found at
https://gitlab.com/mosga/mosg
A Novel methodology for characterizing cell subpopulations in automated time-lapse microscopy
Hattab G, Wiesmann V, Becker A, Munzner T, Nattkemper TW. A Novel methodology for characterizing cell subpopulations in automated time-lapse microscopy. Frontiers in Bioengineering and Biotechnology. 2018;6: 17. Time-lapse imaging of cell colonies in microfluidic chambers provides time series of bioimages, i.e., biomovies. They show the behavior of cells over time under controlled conditions. One of the main remaining bottlenecks in this area of research is the analysis of experimental data and the extraction of cell growth characteristics, such as lineage information. The extraction of the cell line by human observers is time-consuming and error-prone. Previously proposed methods often fail because of their reliance on the accurate detection of a single cell, which is not possible for high density, high diversity of cell shapes and numbers, and high-resolution images with high noise. Our task is to characterize subpopulations in biomovies. In order to shift the analysis of the data from the individual cell level to cellular groups with similar fluorescence or even subpopulations, we propose to represent the cells by two new abstractions: the particle and the patch. We use a three-step framework: preprocessing, particle tracking, and construction of the patch lineage. First, preprocessing improves the signal-to-noise ratio and spatially aligns the biomovie frames. Second, cell sampling is performed by assuming particles, which represent a part of a cell, a cell, or a group of contiguous cells in space. Particle analysis includes the following: particle tracking, trajectory linking, filtering, and color information. Particle tracking consists of following the spatiotemporal position of a particle and gives rise to coherent particle trajectories over time. Typical tracking problems may occur (e.g., appearance or disappearance of cells, spurious artifacts). They are effectively processed using trajectory linking and filtering.
Third, the construction of the patch lineage consists of joining particle trajectories that share common attributes (i.e., proximity and fluorescence intensity) and feature common ancestry. This step is based on patch finding, patching trajectory propagation, patch splitting, and patch merging. The main idea is to group together the trajectories of particles in order to gain spatial coherence. The final result of CYCASP is the complete graph of the patch lineage. Finally, the graph encodes the temporal and spatial coherence of the development of cellular colonies. We present results showing a computation time of less than 5 min for biomovies and simulated films. The method presented here allowed for the separation of colonies into subpopulations and allowed us to interpret the growth of colonies in a timely manner.
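The grouping idea behind patch finding can be sketched for a single frame as follows: particles that lie close together and have similar fluorescence intensity are merged into one group with a union-find structure. This is only an illustration of grouping by proximity and intensity, not the CYCASP implementation; the function name `group_into_patches`, the thresholds `max_dist` and `max_diff`, and the sample particles are hypothetical.

```python
def group_into_patches(positions, intensities, max_dist=2.0, max_diff=0.1):
    """Label particles so that nearby, similarly fluorescing ones share a patch id."""
    n = len(positions)
    parent = list(range(n))  # union-find forest: each particle starts alone

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            dx = positions[i][0] - positions[j][0]
            dy = positions[i][1] - positions[j][1]
            close = (dx * dx + dy * dy) ** 0.5 <= max_dist
            similar = abs(intensities[i] - intensities[j]) <= max_diff
            if close and similar:
                parent[find(i)] = find(j)  # merge the two groups

    # Convert root indices into consecutive patch labels 0, 1, 2, ...
    labels = {}
    return [labels.setdefault(find(i), len(labels)) for i in range(n)]


# Hypothetical particles: (x, y) positions and normalized fluorescence intensities.
positions = [(0.0, 0.0), (1.0, 0.0), (1.5, 0.5), (10.0, 10.0), (1.0, 1.0)]
intensities = [0.90, 0.85, 0.88, 0.90, 0.20]

patch_ids = group_into_patches(positions, intensities)
print(patch_ids)
```

In this example the first three particles form one group, the distant particle forms its own, and the last particle stays separate because its intensity differs even though it is spatially close. Extending the idea over time would require linking groups across frames, which this sketch does not attempt.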
Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2.
Affiliation: Hufsky, Franziska. Friedrich Schiller University Jena; Germany
Affiliation: Lamkiewicz, Kevin. Friedrich Schiller University Jena; Germany
Affiliation: Almeida, Alexandre. the Wellcome Sanger Institute; United Kingdom
Affiliation: Aouacheria, Abdel. Centre National de la Recherche Scientifique; France
Affiliation: Arighi, Cecilia. Biocuration and Literature Access at PIR; United States
Affiliation: Bateman, Alex. European Bioinformatics Institute. Head of Protein Sequence Resources; United Kingdom
Affiliation: Baumbach, Jan. Universitat Technical Zu Munich; Germany
Affiliation: Beerenwinkel, Niko. Universitat Technical Zu Munich; Germany
Affiliation: Brandt, Christian. Jena University Hospital; Germany
Affiliation: Cacciabue, Marco Polo Domingo. Instituto Nacional de Tecnología Agropecuaria. Centro de Investigación En Ciencias Veterinarias y Agronómicas. Instituto de Agrobiotecnología y Biología Molecular. Consejo Nacional de Investigaciones Científicas y Técnicas. Oficina de Coordinación Administrativa Parque Centenario. Instituto de Agrobiotecnología y Biología Molecular; Argentina. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Chuguransky, Sara Rocío. European Bioinformatics Institute; United Kingdom. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina
Affiliation: Drechsel, Oliver. Robert Koch-Institute; Germany
Affiliation: Finn, Robert D. Biocurator for Pfam and InterPro databases; United Kingdom
Affiliation: Fritz, Adrian. Helmholtz Centre for Infection Research; Germany
Affiliation: Fuchs, Stephan. Robert Koch-Institute; Germany
Affiliation: Hattab, Georges. University Marburg; Germany
Affiliation: Hauschild, Anne Christin. University Marburg; Germany
Affiliation: Heider, Dominik. University Marburg; Germany
Affiliation: Hoffmann, Marie. Freie Universität Berlin; Germany
Affiliation: Hölzer, Martin. Friedrich Schiller University Jena; Germany
Affiliation: Hoops, Stefan. University of Virginia; United States
Affiliation: Kaderali, Lars. University Medicine Greifswald; Germany
Affiliation: Kalvari, Ioanna. European Bioinformatics Institute; United Kingdom
Affiliation: von Kleist, Max. Robert Koch-Institute; Germany
Affiliation: Kmiecinski, Renó. Robert Koch-Institute; Germany
Affiliation: Kühnert, Denise. Max Planck Institute for the Science of Human History; Germany
Affiliation: Lasso, Gorka. Albert Einstein College of Medicine; United States
Affiliation: Libin, Pieter. Hasselt University; Belgium
Affiliation: List, Markus. Universitat Technical Zu Munich; Germany
Affiliation: Löchel, Hannah F. University Marburg; Germany
Computational strategies to combat COVID-19: useful tools to accelerate SARS-CoV-2 and coronavirus research
SARS-CoV-2 (severe acute respiratory syndrome coronavirus 2) is a novel virus of the family Coronaviridae. The virus causes the infectious disease COVID-19. The biology of coronaviruses has been studied for many years. However, bioinformatics tools designed explicitly for SARS-CoV-2 have only recently been developed as a rapid reaction to the need for fast detection, understanding and treatment of COVID-19. To control the ongoing COVID-19 pandemic, it is of utmost importance to get insight into the evolution and pathogenesis of the virus. In this review, we cover bioinformatics workflows and tools for the routine detection of SARS-CoV-2 infection, the reliable analysis of sequencing data, the tracking of the COVID-19 pandemic and evaluation of containment measures, the study of coronavirus evolution, the discovery of potential drug targets and development of therapeutic strategies. For each tool, we briefly describe its use case and how it advances research specifically for SARS-CoV-2. All tools are free to use and available online, either through web applications or public code repositories. Peer Reviewed
Are we ready to track climate-driven shifts in marine species across international boundaries? - A global survey of scientific bottom trawl data
Marine biota are redistributing at a rapid pace in response to climate change and shifting seascapes. While changes in fish populations and community structure threaten the sustainability of fisheries, our capacity to adapt by tracking and projecting marine species remains a challenge due to data discontinuities in biological observations, lack of data availability, and mismatch between data and real species distributions. To assess the extent of this challenge, we review the global status and accessibility of ongoing scientific bottom trawl surveys. In total, we gathered metadata for 283,925 samples from 95 surveys conducted regularly from 2001 to 2019. We identified that 59% of the metadata collected are not publicly available, highlighting that the availability of data is the most important challenge to assess species redistributions under global climate change. Given that the primary purpose of surveys is to provide independent data to inform stock assessment of commercially important populations, we further highlight that single surveys do not cover the full range of the main commercial demersal fish species. An average of 18 surveys is needed to cover at least 50% of species ranges, demonstrating the importance of combining multiple surveys to evaluate species range shifts. We assess the potential for combining surveys to track transboundary species redistributions and show that differences in sampling schemes and inconsistency in sampling can be overcome with spatio-temporal modeling to follow species density redistributions. In light of our global assessment, we establish a framework for improving the management and conservation of transboundary and migrating marine demersal species. We provide directions to improve data availability and encourage countries to share survey data, to assess species vulnerabilities, and to support management adaptation in a time of climate-driven ocean changes. In press.
ghattab/banana-physics: v1.0
Physics gone bananas! A set of A7 cards to learn some of the most commonly used physical constants.