643 research outputs found

    Reflections on the Evolution of the BookTracker Visualization Platform

    Understanding the trade data of historical books is crucial for researchers investigating the distribution and provenance of incunabula (books printed between 1450 and 1500). We incrementally developed BookTracker, a platform featuring multiple visualization and visual analytics applications to support these research efforts. This platform leverages data from the Material Evidence in Incunabula (MEI) database, which records detailed information on the provenance, ownership, and use of 15th-century printed books. BookTracker began with a focus on providing visualization and visual analytics solutions to effectively present the chronological and geographical information of each book’s provenance. Through three years of collaborative work with domain experts, we continually explored the MEI data and discovered more possibilities for visualization to represent this rich information. Gradually, a suite of specialized visualization tools for specific analytical purposes was developed, including DanteSearchVis, DanteExploreVis, KURF2022, KURF2023, and OwnershipTracker. These tools now comprise the BookTracker platform, which has evolved to explore various features and aspects of the data. This paper details the evolution of BookTracker’s design and development alongside domain experts, highlighting the reflections and lessons learned from its application in various research projects. We discuss this long-term collaborative visualization project, hoping to offer our experience as a case study for similar research in the future.

    Visualizing Historical Book Trade Data: An Iterative Design Study with Close Collaboration with Domain Experts

    The circulation of historical books has always been an area of interest for historians. However, the data used to represent the journey of a book across different places and times can be difficult for domain experts to digest because geographical and chronological features are buried within text-based presentations. This situation provides an opportunity for collaboration between visualization researchers and historians. This paper describes a design study where a variant of the Nine-Stage Framework was employed to develop a Visual Analytics (VA) tool called DanteExploreVis. This tool was designed to aid domain experts in exploring, explaining, and presenting book trade data from multiple perspectives. We discuss the design choices made and how each panel in the interface meets the domain requirements. We also present the results of a qualitative evaluation conducted with domain experts. The main contributions of this paper include: 1) the development of a VA tool to support domain experts in exploring, explaining, and presenting book trade data; 2) a comprehensive documentation of the iterative design, development, and evaluation process following the variant Nine-Stage Framework; 3) a summary of the insights gained and lessons learned from this design study in the context of the humanities field; and 4) reflections on how our approach could be applied in a more generalizable way.

    Factors that affect the growth and photosynthesis of the filamentous green algae, Chaetomorpha valida, in static sea cucumber aquaculture ponds with high salinity and high pH

    Chaetomorpha valida, a dominant filamentous green alga, can be harmful to sea cucumber growth in aquaculture ponds of China. In order to understand the environmental factors affecting the growth of C. valida in sea cucumber aquaculture ecosystems, a combination of field investigations and laboratory experiments was conducted. Field surveys over one year revealed that C. valida survived in sea cucumber aquaculture ponds in salinities ranging from 24.3 ± 0.01‰ to 32.0 ± 0.02‰ and a pH range of 7.5 ± 0.02 to 8.6 ± 0.04. The high salinity and pH during the period of low C. valida biomass from January to May lay the foundation for its rapid growth in the following months of June to October. Because many factors interact in the field environment, laboratory experiments were conducted to determine the isolated effects of pH and salinity on C. valida growth. In laboratory experiments, samples were incubated under different salinity and pH conditions at 25 °C, with a light intensity of 108 μmol photon·m−2·s−1, and a photoperiod of 12 L:12 D. Results showed that salinity and pH significantly affect the growth and Fv/Fm (quantum yield of photosynthesis) of C. valida (p < 0.01). C. valida grew the longest at a salinity of 34‰ and a pH of 8.0. At 34‰ salinity, C. valida grew to 26.44 ± 5.89 cm in 16 days. At a pH of 8.0, C. valida grew to 67.96 ± 4.45 cm in 32 days. Fv/Fm was 0.635 ± 0.002 at a salinity of 32‰, and 0.550 ± 0.006 to 0.660 ± 0.001 at pH 7.0 to 8.5. Based on these results, we conclude that C. valida can bloom in sea cucumber ponds due to the high salinity and pH of coastal sea waters, which promote growth and maintain the photosynthetic activity of C. valida.

    Self-adaptive k-means based on a covering algorithm

    The K-means algorithm is one of the ten classic algorithms in the area of data mining and has been studied by researchers in numerous fields for a long time. However, the value of the clustering number k in the K-means algorithm is not always easy to determine, and the selection of the initial centers is vulnerable to outliers. This paper proposes an improved K-means clustering algorithm called the covering K-means algorithm (C-K-means). The C-K-means algorithm can not only acquire efficient and accurate clustering results but also self-adaptively provide a reasonable number of clusters based on the data features. It includes two phases: the initialization of the covering algorithm (CA) and the Lloyd iteration of the K-means. The first phase executes the CA. CA self-organizes and recognizes the number of clusters k based on the similarities in the data, and it requires neither the number of clusters to be prespecified nor the initial centers to be manually selected. Therefore, it has a “blind” feature, that is, k is not preselected. The second phase performs the Lloyd iteration based on the results of the first phase. The C-K-means algorithm combines the advantages of CA and K-means. Experiments are carried out on the Spark platform, and the results verify the good scalability of the C-K-means algorithm. This algorithm can effectively solve the problem of large-scale data clustering. Extensive experiments on real data sets show that the accuracy and efficiency of the C-K-means algorithm outperform those of the existing algorithms under both sequential and parallel conditions.
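    The two-phase structure described in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions: the abstract does not specify the covering algorithm, so `covering_init` below is a hypothetical greedy radius-based stand-in (the `radius` parameter is invented for this sketch), not the authors' CA. Only the second phase, the Lloyd iteration, is the standard K-means loop.

```python
import math

def covering_init(points, radius):
    """Phase 1 (hypothetical stand-in, NOT the paper's covering algorithm):
    greedily open a new cover whenever a point lies outside every existing
    cover. The cover centers become the initial centroids, so k comes from
    the data instead of being preselected."""
    centers = []
    for p in points:
        if all(math.dist(p, c) > radius for c in centers):
            centers.append(tuple(p))
    return centers

def assign(points, centers):
    """Index of the nearest center for each point."""
    return [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            for p in points]

def lloyd(points, centers, max_iter=100):
    """Phase 2: standard Lloyd iteration starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        labels = assign(points, centers)
        new_centers = []
        for i, old in enumerate(centers):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                # Move the center to the mean of its assigned points.
                new_centers.append(
                    tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                # Keep an empty cluster's center where it was.
                new_centers.append(old)
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, assign(points, centers)

points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
initial = covering_init(points, radius=3.0)
centers, labels = lloyd(points, initial)
```

    In this toy run the number of clusters is never passed in; it falls out of the covering step, which is the "blind" property the abstract refers to.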

    SegRap2023: A Benchmark of Organs-at-Risk and Gross Tumor Volume Segmentation for Radiotherapy Planning of Nasopharyngeal Carcinoma

    Radiation therapy is a primary and effective treatment strategy for nasopharyngeal carcinoma (NPC). The precise delineation of Gross Tumor Volumes (GTVs) and Organs-At-Risk (OARs) is crucial in radiation treatment, directly impacting patient prognosis. Previously, the delineation of GTVs and OARs was performed by experienced radiation oncologists. Recently, deep learning has achieved promising results in many medical image segmentation tasks. However, for NPC OAR and GTV segmentation, few public datasets are available for model development and evaluation. To alleviate this problem, the SegRap2023 challenge was organized in conjunction with MICCAI2023 and presented a large-scale benchmark for OAR and GTV segmentation with 400 Computed Tomography (CT) scans from 200 NPC patients, each with a pair of pre-aligned non-contrast and contrast-enhanced CT scans. The challenge's goal was to segment 45 OARs and 2 GTVs from the paired CT scans. In this paper, we detail the challenge and analyze the solutions of all participants. The average Dice similarity coefficient scores for all submissions ranged from 76.68% to 86.70% for OARs and from 70.42% to 73.44% for GTVs. We conclude that the segmentation of large-size OARs is well-addressed, and more efforts are needed for GTVs and small-size or thin-structure OARs. The benchmark will remain publicly available at https://segrap2023.grand-challenge.org
    Comment: A challenge report of SegRap2023 (organized in conjunction with MICCAI2023)
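    For readers unfamiliar with the metric reported above, the Dice similarity coefficient between a predicted and a reference segmentation is 2|A∩B| / (|A| + |B|). The following is a minimal sketch that assumes each mask is given as a collection of voxel indices; the function name and the handling of two empty masks are choices made for this illustration, not details taken from the challenge's evaluation code.

```python
def dice(pred, truth):
    """Dice similarity coefficient between two sets of voxel indices."""
    pred, truth = set(pred), set(truth)
    if not pred and not truth:
        # Assumed convention for this sketch: two empty masks agree perfectly.
        return 1.0
    return 2 * len(pred & truth) / (len(pred) + len(truth))

# Two masks of three voxels each that share two voxels.
score = dice({1, 2, 3}, {2, 3, 4})
```

    A score of 1.0 means the two masks coincide exactly and 0.0 means they share no voxels, so the percentages quoted in the abstract correspond to this value multiplied by 100.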

    Retrospective evaluation of whole exome and genome mutation calls in 746 cancer samples

    Funder: NCI U24CA211006
    The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) curated consensus somatic mutation calls using whole exome sequencing (WES) and whole genome sequencing (WGS), respectively. Here, as part of the ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG) Consortium, which aggregated whole genome sequencing data from 2,658 cancers across 38 tumour types, we compare WES and WGS side-by-side from 746 TCGA samples, finding that ~80% of mutations overlap in covered exonic regions. We estimate that low variant allele fraction (VAF < 15%) and clonal heterogeneity contribute up to 68% of private WGS mutations and 71% of private WES mutations. We observe that ~30% of private WGS mutations trace to mutations identified by a single variant caller in WES consensus efforts. WGS captures both ~50% more variation in exonic regions and unobserved mutations in loci with variable GC-content. Together, our analysis highlights technological divergences between two reproducible somatic variant detection efforts.