Test-time augmentation for deep learning-based cell segmentation on microscopy images
Recent advances in deep learning have revolutionized the way microscopy images of cells are processed. Deep learning network architectures have a large number of parameters, so reaching high accuracy requires a massive amount of annotated data. A common way of improving accuracy builds on artificially increasing the training set with different augmentation techniques. A less common way relies on test-time augmentation (TTA), which generates transformed versions of the image for prediction and merges the results. In this paper we describe how we have incorporated the test-time augmentation prediction method into two major segmentation approaches used in the single-cell analysis of microscopy images: semantic segmentation based on the U-Net model, and instance segmentation based on the Mask R-CNN model. Our findings show that even if only simple test-time augmentations (such as rotation or flipping, with proper merging methods) are applied, TTA can significantly improve prediction accuracy. We used images of tissue and cell cultures from the Data Science Bowl (DSB) 2018 nuclei segmentation competition and other sources. Additionally, by boosting the highest-scoring DSB method with TTA, we could further improve prediction accuracy, and our method reached a best-ever score at the DSB.
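The rotate/flip-and-merge scheme described in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: `model_predict` is a hypothetical stand-in for a segmentation network's forward pass (e.g. a U-Net producing a per-pixel probability map), and averaging is just one of the possible merging methods.

```python
import numpy as np

def tta_predict(model_predict, image):
    """Average predictions over simple test-time augmentations.

    model_predict: hypothetical callable mapping an HxW image to an
    HxW probability map (stands in for a U-Net forward pass).
    """
    preds = []
    # Rotations by 0/90/180/270 degrees, with and without a horizontal flip,
    # give 8 augmented views of the input image.
    for flip in (False, True):
        base = np.fliplr(image) if flip else image
        for k in range(4):
            aug = np.rot90(base, k)
            p = model_predict(aug)
            # Invert the transform so predictions align with the original image.
            p = np.rot90(p, -k)
            if flip:
                p = np.fliplr(p)
            preds.append(p)
    # Merge by averaging the aligned probability maps.
    return np.mean(preds, axis=0)
```

A sanity check on the inverse transforms: with the identity as `model_predict`, every augmented prediction maps back onto the original image, so the merged output equals the input.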
Correction to “Improved False Discovery Rate Estimation Procedure for Shotgun Proteomics”
Tandem Mass Spectrum Identification via Cascaded Search
Accurate assignment of peptide sequences to observed fragmentation spectra is hindered by the large number of hypotheses that must be considered for each observed spectrum. A high score assigned to a particular peptide–spectrum match (PSM) may not end up being statistically significant after multiple testing correction. Researchers can mitigate this problem by controlling the hypothesis space in various ways: considering only peptides resulting from enzymatic cleavages, ignoring possible post-translational modifications or single nucleotide variants, etc. However, these strategies sacrifice identifications of spectra generated by rarer types of peptides. In this work, we introduce a statistical testing framework, cascade search, that directly addresses this problem. The method requires that the user specify a priori a statistical confidence threshold as well as a series of peptide databases. For instance, such a cascade of databases could include fully tryptic, semitryptic, and nonenzymatic peptides, or peptides with increasing numbers of modifications. Cascade search then gradually expands the list of candidate peptides from more likely peptides toward rare peptides, sequestering at each stage any spectrum that is identified with the specified statistical confidence. We compare cascade search to a standard procedure that lumps all of the peptides into a single database, as well as to a previously described group FDR procedure that computes the FDR separately within each database. We demonstrate, using simulated and real data, that cascade search identifies more spectra at a fixed FDR threshold than either the ungrouped or the grouped approach. Cascade search thus provides a general method for maximizing the number of identified spectra in a statistically rigorous fashion.
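The staged search-and-sequester loop described above can be sketched as follows. This is an illustrative skeleton, not the paper's implementation: `search_fn` is a hypothetical function returning the best PSM and a per-match confidence estimate (q-value), whereas a real cascade search would estimate confidence statistically over each stage's results.

```python
def cascade_search(spectra, databases, search_fn, confidence_threshold):
    """Sketch of cascade search over a series of peptide databases.

    search_fn(spectrum, database) is a hypothetical callable returning
    (peptide, q_value) for the best peptide-spectrum match in that database.
    Databases are ordered from most likely peptides (e.g. fully tryptic)
    toward rarer ones (e.g. semitryptic, nonenzymatic).
    """
    accepted = {}
    remaining = list(spectra)
    for db in databases:
        still_unidentified = []
        for spectrum in remaining:
            peptide, q_value = search_fn(spectrum, db)
            if q_value <= confidence_threshold:
                # Sequester: this spectrum counts as identified at this
                # stage and is never re-searched against later databases.
                accepted[spectrum] = (peptide, db)
            else:
                still_unidentified.append(spectrum)
        remaining = still_unidentified
    return accepted
```

Because each spectrum is removed from the candidate pool as soon as it is confidently identified, later (larger, rarer) databases only compete for the spectra that earlier stages could not explain, which is what keeps the multiple-testing burden under control.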
Infrastructure Aware Scientific Workflows and Infrastructure Aware Workflow Managers in Science Gateways
The SHIWA project successfully solved the workflow interoperability problem for workflows running in the same grid infrastructure. However, in the more generic case, when the workflows run in different infrastructures, the problem had not yet been solved. In the current paper we present a solution by introducing a new type of workflow called the infrastructure-aware workflow. These are scientific workflows extended with new node types that enable the on-the-fly creation and destruction of the required infrastructures in the clouds. The paper presents the semantics of these new types of nodes and workflows and shows how they solve the workflow interoperability problem. The paper also describes how these new types of workflows can be implemented by a new service called Occopus, and how this service can be integrated with existing SHIWA Simulation Platform services, such as the WS-PGRADE/gUSE portal, to provide the functionality required to solve the workflow interoperability problem.
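The key semantic idea, create-infrastructure and destroy-infrastructure nodes bracketing the ordinary compute nodes of a workflow, can be illustrated with a toy Python sketch. All names here are illustrative assumptions; this is not the Occopus API or the WS-PGRADE/gUSE node model.

```python
from contextlib import contextmanager

@contextmanager
def on_the_fly_infrastructure(descriptor):
    """Hypothetical stand-in for the new deploy/destroy node types.

    In the real system the descriptor would tell a deployment service
    (such as Occopus) which cloud resources to build; here we only
    return a fake handle and mark it destroyed on exit.
    """
    handle = {"descriptor": descriptor, "status": "running"}
    try:
        yield handle                      # compute nodes run in here
    finally:
        handle["status"] = "destroyed"    # the teardown node always runs

def run_workflow(tasks, descriptor):
    """Run each task against infrastructure created only for this run."""
    with on_the_fly_infrastructure(descriptor) as infra:
        return [task(infra) for task in tasks]
```

The context-manager shape captures the guarantee the new node types provide: the infrastructure exists exactly for the lifetime of the workflow section that needs it, regardless of which cloud it is created in.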