31 research outputs found

    The successes and challenges of harmonising juvenile idiopathic arthritis (JIA) datasets to create a large-scale JIA data resource

    Background: CLUSTER is a UK consortium focussed on precision medicine research in JIA and JIA-associated uveitis. As part of this programme, a large-scale JIA data resource was created by harmonising and pooling existing real-world studies. Here we present the challenges and progress towards creation of this unique large JIA dataset. Methods: Four real-world studies contributed data; two clinical datasets of JIA patients starting first-line methotrexate (MTX) or tumour necrosis factor inhibitors (TNFi) were created. Variables were selected based on a previously developed core dataset, and encrypted NHS numbers were used to identify children contributing similar data across multiple studies. Results: Of 7013 records (from 5435 individuals), 2882 (from 1304 individuals) represented the same child across studies. The final datasets contain 2899 (MTX) and 2401 (TNFi) unique patients; 1018 are in both datasets. Missingness ranged from 10% to 60% and was not improved through harmonisation. Conclusions: Combining data across studies has achieved dataset sizes rarely seen in JIA, which is invaluable to progressing research. The loss of variable specificity, the extent of missingness, and their impact on future analyses require further consideration.
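    The record-linkage step and the reported counts can be illustrated with a minimal sketch. It assumes each record is a (study, pseudonymised identifier) pair; the function names and the toy records are hypothetical and are not taken from the CLUSTER pipeline. The final lines only check that the figures quoted in the abstract are arithmetically consistent.

```python
from collections import defaultdict


def link_records(records):
    """Map each pseudonymised identifier to the set of studies it appears in."""
    studies_by_id = defaultdict(set)
    for study, pid in records:
        studies_by_id[pid].add(study)
    return studies_by_id


def summarise(records):
    """Count records, individuals, and those shared across more than one study."""
    studies_by_id = link_records(records)
    shared_ids = {pid for pid, studies in studies_by_id.items() if len(studies) > 1}
    shared_records = sum(1 for _, pid in records if pid in shared_ids)
    return {
        "records": len(records),
        "individuals": len(studies_by_id),
        "shared_records": shared_records,
        "shared_individuals": len(shared_ids),
    }


def union_size(n_a, n_b, n_both):
    """Inclusion-exclusion: individuals present in at least one of two datasets."""
    return n_a + n_b - n_both


# Toy data (hypothetical): id1 appears in two studies, id3 in three.
toy = [
    ("study_A", "id1"), ("study_B", "id1"),
    ("study_A", "id2"),
    ("study_C", "id3"), ("study_B", "id3"), ("study_D", "id3"),
]
summary = summarise(toy)

# Consistency check on the figures quoted in the abstract.
single_study_individuals = 5435 - 1304            # individuals seen in one study only
records_check = single_study_individuals + 2882   # should equal the 7013 total records
patients_in_either = union_size(2899, 2401, 1018) # MTX or TNFi dataset
```

    Under these assumptions the quoted figures agree with each other: 4131 single-study individuals plus 2882 shared records gives 7013 records, and the two final datasets together cover 4282 distinct patients.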

    Teaching PDC in the Time of COVID: Hands-on Materials for Remote Learning

    In response to shifts in the hardware foundations of computing, parallel and distributed computing (PDC) is now a key piece of the core CS curriculum. For CS educators, the COVID-19 pandemic and the resulting switch to remote learning add new challenges to the tasks of helping learners understand abstract PDC concepts and equipping them with hands-on practical skills. This paper presents several novel materials for teaching PDC remotely, including (i) a Runestone Interactive virtual handout for learning how to run OpenMP multithreaded programs on a Raspberry Pi, and (ii) Google Colab and Jupyter notebooks for running mpi4py instances on remote systems and thus learning about MPI distributed multiprocessing. The authors piloted these strategies during a multi-day faculty development workshop on teaching PDC. Assessment data indicate that the materials greatly aided professional development and preparedness to teach PDC.

    Efficiency of Shared-Memory Multiprocessors for a Genetic Sequence Similarity Search Algorithm

    Molecular biologists who conduct large-scale genetic sequencing projects are producing an ever-increasing amount of sequence data. GenBank, the primary repository for DNA sequence data, is doubling in size every 1.3 years. Keeping pace with the analysis of this data is a difficult task. One of the most successful techniques for analyzing genetic data is sequence similarity analysis: the comparison of unknown sequences against known sequences kept in databases. As biologists gather more sequence data, sequence similarity algorithms become more and more useful, but take longer and longer to run. BLAST is one of the most popular sequence similarity algorithms in use today, but its running time is proportional to the size of the database. Sequence similarity analysis using BLAST is becoming a bottleneck. Shared-Memory Multiprocessors (SMPs) may offer performance that scales with the growth of the genetic databases. This paper analyzes the performance of BLAST on SMPs, to improve our theoretic..
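    The idea of dividing a database search among workers that share memory can be sketched as follows. This is not BLAST and not the paper's method: the scoring is a deliberately naive count of shared k-mers, the function names and toy sequences are hypothetical, and Python threads are used only to show the partitioning pattern (the interpreter lock means CPU-bound Python code will not necessarily speed up this way).

```python
from concurrent.futures import ThreadPoolExecutor


def kmers(seq, k=3):
    """Return the set of length-k substrings of a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def score_chunk(query_kmers, chunk, k=3):
    """Score every (name, sequence) pair in one slice of the database."""
    return [(name, len(query_kmers & kmers(seq, k))) for name, seq in chunk]


def parallel_search(query, database, workers=2, k=3):
    """Split the database across workers and merge the ranked hits."""
    query_kmers = kmers(query, k)
    # Round-robin partition: each worker scans roughly 1/workers of the database.
    chunks = [database[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda c: score_chunk(query_kmers, c, k), chunks)
        hits = [hit for part in results for hit in part]
    # Highest score first; ties broken by name for a stable order.
    return sorted(hits, key=lambda h: (-h[1], h[0]))


# Toy database (hypothetical sequences).
database = [
    ("seq1", "ACGTACGT"),
    ("seq2", "TTTTGGGG"),
    ("seq3", "ACGTTTTT"),
    ("seq4", "GGGGCCCC"),
]
hits = parallel_search("ACGTAC", database)
```

    Because each worker scans only its own slice, the work per worker shrinks as workers are added, which is the scaling behaviour the abstract hopes SMPs can provide as databases grow.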

    Using Inexpensive Microclusters and Accessible Materials for Cost-Effective Parallel and Distributed Computing Education

    With parallel and distributed computing (PDC) now in the core CS curriculum, CS educators are building new pedagogical tools to teach their students about this cutting-edge area of computing. In this paper, we present an innovative approach we call microclusters (personal, portable Beowulf clusters) that provide students with hands-on PDC learning experiences. We present several different microclusters, each built using a different combination of single-board computers (SBCs) as its compute nodes, including various ODROID models, Nvidia’s Jetson TK1, Adapteva’s Parallella, and the Raspberry Pi. We explore different ways that CS educators are using these systems in their teaching, and describe specific courses in which CS educators have used microclusters. Finally, we present an overview of sources of free PDC pedagogical materials that can be used with microclusters.