    Knowing the Distance: Understanding the Gap Between Synthetic and Real Data For Face Parsing

    The use of synthetic data for training computer vision algorithms has become increasingly popular due to its cost-effectiveness, scalability, and ability to provide accurate multi-modality labels. Although recent studies have demonstrated impressive results when training networks solely on synthetic data, there remains a performance gap between synthetic and real data that is commonly attributed to a lack of photorealism. The aim of this study is to investigate the gap in greater detail for the face parsing task. We differentiate between three types of gaps: distribution gap, label gap, and photorealism gap. Our findings show that the distribution gap is the largest contributor to the performance gap, accounting for over 50% of it. By addressing this gap and accounting for the label gap, we demonstrate that a model trained on synthetic data achieves comparable results to one trained on a similar amount of real data. This suggests that synthetic data is a viable alternative to real data, especially when real data is limited or difficult to obtain. Our study highlights the importance of content diversity in synthetic datasets and challenges the notion that the photorealism gap is the most critical factor affecting the performance of computer vision models trained on synthetic data.

    The complete mitochondrial genome of the acid-tolerant fungus Penicillium ShG4C

    The complete mitochondrial genome of the acid-tolerant fungus Penicillium ShG4C, isolated from oxidized sediments of an abandoned polymetallic mine site, has been sequenced using a high-throughput sequencing approach. The mitochondrial genome is a circular DNA molecule with a size of 26,725 bp. It encodes a usual set of mitochondrial genes, including 15 protein-coding genes, large and small ribosomal RNAs and 27 tRNA genes. All genes are located on the H-strand and transcribed in one direction. Taxonomic analysis based on concatenated sequences of mitochondrial proteins confirmed the taxonomic position of this fungus within the genus Penicillium. The sequence of the complete mitochondrial genome of Penicillium ShG4C was deposited in DDBJ/EMBL/GenBank under accession number KX931017.

    Experimental and numerical investigation on spark ignition of linearly-arranged non-premixed swirling burners

    The ignition characteristics of a non-premixed multiple-burner linear combustion chamber were investigated experimentally and numerically, focusing on the determination of the mechanisms driving flame propagation from burner to burner. For different inter-burner spacings, overall equivalence ratios and bulk velocities, measurements of the velocity field and the mixture fraction distribution have been performed, respectively, with laser Doppler anemometry and planar laser-induced fluorescence of acetone in the un-ignited flow. It was shown that in every individual burner, gas mixes with air within a central recirculation zone (CRZ) where the mixture is flammable except in the axial central rich gas jet and the annular air jet. Flammable mixture from the CRZ is extracted by the annular jet and this results in the existence of bridges of positive flammability factor in the inter-burner region. These bridges allow flame fragments to travel from the CRZ of the ignited burner to the CRZ of the adjacent unignited one, leading to burner-to-burner flame propagation. The probability that sparking within a burner results in ignition of the adjacent one was obtained by performing many separate ignition trials with a laser spark. Ignition probability contours were also computed using a previously developed stochastic low-order ignition model and a large eddy simulation (LES) time-averaged solution of the cold flow. The quantification of the probability that a flame kernel leads to burner ignition explained the differences existing between experimental results and the model. The results presented in this article extend our understanding of the mechanisms underlying the global ignition behavior of non-premixed annular combustion chambers. The authors gratefully acknowledge financial assistance from the EPSRC.

    Federated learning enables big data for rare cancer boundary detection.

    Although machine learning (ML) has shown promise across disciplines, out-of-sample generalizability is concerning. This is currently addressed by sharing multi-site data, but such centralization is challenging or infeasible to scale due to various limitations. Federated ML (FL) provides an alternative paradigm for accurate and generalizable ML, by only sharing numerical model updates. Here we present the largest FL study to date, involving data from 71 sites across 6 continents, to generate an automatic tumor boundary detector for the rare disease of glioblastoma, reporting the largest such dataset in the literature (n = 6,314). We demonstrate a 33% delineation improvement for the surgically targetable tumor, and 23% for the complete tumor extent, over a publicly trained model. We anticipate our study to: 1) enable more healthcare studies informed by large diverse data, ensuring meaningful results for rare diseases and underrepresented populations, 2) facilitate further analyses for glioblastoma by releasing our consensus model, and 3) demonstrate the FL effectiveness at such scale and task-complexity as a paradigm shift for multi-site collaborations, alleviating the need for data-sharing.

    Author Correction: Federated learning enables big data for rare cancer boundary detection.

    DOI: 10.1038/s41467-023-36188-7. Nature Communications 14.

    Distribution Patterns of Antibiotic Resistance Genes and Their Bacterial Hosts in a Manure Lagoon of a Large-Scale Swine Finishing Facility

    The spread of antibiotic resistance genes (ARGs) that are present in livestock manures, which are discharged into the environment, is a severe threat to human and animal health. Here, we used 16S rRNA gene profiling and metagenomic analysis to characterize microbial community composition and antibiotic resistance in a manure storage lagoon from a large-scale swine finishing facility. Manure samples were collected at intervals of two years. Both the prokaryotic community and the resistome were dominated by the Firmicutes, Proteobacteria and Bacteroidota. Metagenomic analysis of two samples revealed 726 and 641 ARGs classified into 59 and 46 AMR gene families. Besides multidrug efflux pumps, the predominating ARGs potentially encoded resistance to tetracyclines, macrolide–lincosamide–streptogramin, aminoglycosides, peptide antibiotics, rifamycin, chloramphenicol, and beta-lactams. Genes from all predominant AMR gene families were found in both samples, indicating overall long-term stability of the resistome. Antibiotic efflux pumps were the primary type of ARGs in the Proteobacteria, while antibiotic target alteration or protection was the main mechanism of resistance in the Firmicutes, Actinobacteriota and Bacteroidota. Metagenome-assembled genomes (MAGs) of four multidrug-resistant strains were assembled. The first MAG, assigned to Escherichia flexneri, contained 46 ARGs, including multidrug efflux pumps, modified porins, beta-lactamases, and genes conferring resistance to peptide antibiotics. The second MAG, assigned to the family Alcaligenaceae, contained 18 ARGs encoding resistance to macrolide–lincosamide–streptogramin, tetracyclines, aminoglycosides and diaminopyrimidines. Two other MAGs, representing the genera Atopostipes and Prevotella, contained four and seven ARGs, respectively. All these MAGs represented minor community members and accounted for less than 0.3% of the whole metagenome. Overall, a few lineages that originated from the gut but are relatively rare in the manure storage lagoon are the main source of ARGs, and some of them carry multiple resistance determinants.