1,742 research outputs found

    Evaluating a Cluster of Low-Power ARM64 Single-Board Computers with MapReduce

    With the meteoric rise of enormous data collection in science, industry, and the cloud, methods for processing massive datasets have become more crucial than ever. MapReduce is a restricted programming model for expressing parallel computations as simple serial functions, and an execution framework for distributing those computations over large datasets residing on clusters of commodity hardware. MapReduce abstracts away the challenging low-level synchronization and scalability details that parallel and distributed computing often necessitate, reducing the conceptual burden on programmers and scientists who require data processing at scale. Typically, MapReduce clusters are implemented using inexpensive commodity hardware, emphasizing quantity over quality due to the fault-tolerant nature of the MapReduce execution framework. The nascent explosion of inexpensive single-board computers designed around multi-core 64-bit ARM processors, such as the Raspberry Pi 3, Pine64, and Odroid C2, has opened new avenues for inexpensive and low-power cluster computing. In this thesis, we implement a novel cluster around low-power ARM64 single-board computers and the Disco Python MapReduce execution framework. We use MapReduce to empirically evaluate our cluster by solving the Word Count and Inverted Link Index problems for the Wikipedia article dataset. We benchmark our MapReduce solutions against local solutions of the same algorithms for a conventional low-power x86 platform. We show that our cluster outperforms the conventional platform for larger benchmarks, thus demonstrating low-power single-board computers as a viable avenue for data-intensive cluster computing.
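
The Word Count problem mentioned in the abstract above can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce phases under simple assumptions: it does not use the Disco API, and the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) and the sample input are hypothetical.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map phase: emit a (word, 1) pair for every whitespace-separated word."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce phase: sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run the three phases serially on an in-memory list of lines."""
    mapped = chain.from_iterable(map_words(line) for line in lines)
    return dict(reduce_counts(w, c) for w, c in shuffle(mapped).items())


# Hypothetical sample input, not taken from the Wikipedia dataset.
sample = ["the quick brown fox", "the lazy dog", "The end"]
counts = word_count(sample)
```

In a real cluster the map and reduce functions would run on different nodes and the shuffle would move data over the network; the serial version only shows how the problem decomposes.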

    Linear Mode Connectivity in Sparse Neural Networks

    With rising interest in sparse neural networks, we study how neural network pruning with synthetic data leads to sparse networks with unique training properties. We find that distilled data, a synthetic summarization of the real data, paired with Iterative Magnitude Pruning (IMP) unveils a new class of sparse networks that are more stable to SGD noise on the real data than either the dense model or subnetworks found with real data in IMP. That is, synthetically chosen subnetworks often train to the same minima, or exhibit linear mode connectivity. We study this through linear interpolation, loss landscape visualizations, and measuring the diagonal of the Hessian. While dataset distillation as a field is still young, we find that these properties lead to synthetic subnetworks matching the performance of traditional IMP with up to 150x fewer training points in settings where distilled data applies.
    Comment: Published in NeurIPS 2023 UniReps Workshop
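
The linear interpolation check named in the abstract above can be illustrated with a toy sketch. The barrier definition used here (the largest gap between the loss on the straight path and the straight line joining the endpoint losses) is one common convention and is an assumption, not necessarily the paper's exact measure. The two loss functions and all names are hypothetical stand-ins for a trained network's loss.

```python
def interpolate(theta_a, theta_b, alpha):
    """Point on the straight line between two parameter vectors."""
    return [(1 - alpha) * a + alpha * b for a, b in zip(theta_a, theta_b)]


def loss_barrier(loss_fn, theta_a, theta_b, steps=11):
    """Largest excess of the path loss over the interpolated endpoint losses."""
    loss_a, loss_b = loss_fn(theta_a), loss_fn(theta_b)
    barrier = 0.0
    for i in range(steps):
        alpha = i / (steps - 1)
        path_loss = loss_fn(interpolate(theta_a, theta_b, alpha))
        baseline = (1 - alpha) * loss_a + alpha * loss_b
        barrier = max(barrier, path_loss - baseline)
    return barrier


def convex_loss(theta):
    """Toy bowl-shaped loss: any two points are linearly connected."""
    return sum(x * x for x in theta)


def double_well_loss(theta):
    """Toy loss with minima at -1 and +1 separated by a bump at 0."""
    return sum((x * x - 1) ** 2 for x in theta)


connected = loss_barrier(convex_loss, [1.0, 0.0], [0.0, 1.0])
separated = loss_barrier(double_well_loss, [-1.0], [1.0])
```

A barrier near zero suggests the two solutions sit in the same basin along the straight path, while a large barrier suggests they are separated.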

    On the analysis of tuberculosis studies with intermittent missing sputum data

    In randomized studies evaluating treatments for tuberculosis (TB), individuals are scheduled to be routinely evaluated for the presence of TB using sputum cultures. One important endpoint in such studies is the time of culture conversion, the first visit at which a patient’s sputum culture is negative and remains negative. This article addresses how to draw inference about treatment effects when sputum cultures are intermittently missing for some patients. We discuss inference under a novel benchmark assumption and under a class of assumptions indexed by a treatment-specific sensitivity parameter that quantify departures from the benchmark assumption. We motivate and illustrate our approach using data from a randomized trial comparing the effectiveness of two treatments for adult TB patients in Brazil.
    Affiliation: Scharfstein, Daniel. Johns Hopkins University; United States
    Affiliation: Rotnitzky, Andrea Gloria. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina. Universidad Torcuato Di Tella. Departamento de Economía; Argentina
    Affiliation: Abraham, Maria. Statistics Collaborative; United States
    Affiliation: McDermott, Aidan. Johns Hopkins University; United States
    Affiliation: Chaisson, Richard. Johns Hopkins University; United States
    Affiliation: Geiter, Lawrence. Otsuka Novel Products; United States

    Neuroqueer Literacies in a Physics Context: A Discussion on Changing the Physics Classroom Using a Neuroqueer Literacy Framework

    Life experience, identity, and the relationship between ourselves and the world around us, among other factors, all affect and shape how we, as scientists, construct knowledge. Neurodiversity, the diversity of minds, is an interesting concept when keeping this in mind. Being neurodivergent, or neuroqueer (the viewing of being neurodivergent as a queer thing, along with the intersection of neurodiversity and queerness), means having non-neurotypical ways of perceiving and interacting with the world, and especially of creating knowledge about the rules and regulations, both natural and societal, that govern it locally and broadly. Neuroqueer physicists, therefore, have unique non-normative ways of doing physics, the study of the rules (which is done societally) that govern the natural world. It is imperative that, when teaching neurodivergent students, we encourage and support this non-normative way of thinking about physics, help them do physics in ways in which they will be successful, and support the development of Neuroqueer (Scientific) Literacies, drawing on Kleekamp's and Smilges's works on literacy. We here present a brief overview of Neuroqueer Literacies and how to apply them in the physics classroom.
    Comment: 8 pages, 1 figure. Submitted to The Physics Teacher

    The TACL model: A framework for safeguarding children with a disability in sport

    This study represents the first investigation of how children with a disability can be safeguarded in Rugby Union. In study 1, a questionnaire containing quantitative questions was completed by 389 safeguarding volunteers regarding their experiences of working with a child with a disability in their role. Descriptive statistics revealed that 76% of this sample had worked with a child with a disability in Rugby Union and that 28% continue to do so on a weekly basis. In study 2, a qualitative survey was completed by 329 safeguarding volunteers and interviews were conducted with a geographically representative sample of 14 Safeguarding Officers. This study focused on developing a model of promising practice with respect to safeguarding children with a disability in Rugby Union. Based on an inductive thematic analysis of the qualitative survey and interview data, the TACL model was developed: Trigger (creating a system that sensitively identifies children with a disability), Action Plan (creating an individualized approach such that the child is effectively included and protected), Communicate (ensuring that all key stakeholders are informed about the plan) and Learn (ensuring that cases of good practice are identified and disseminated). The name TACL (pronounced "tackle") was chosen to promote proactive strategies and to provide a label relevant to the language of Rugby Union. These strategies are proposed as the basis for the safeguarding of children with a disability.

    Validation of the Social Security Death Index (SSDI): An Important Readily-Available Outcomes Database for Researchers

    Study Objective: To determine the accuracy of the online Social Security Death Index (SSDI) for determining death outcomes. Methods: We selected 30 patients who were determined to be dead and 90 patients thought to be alive after an ED visit, as determined by a web-based search of the SSDI. For those thought to be dead we requested death certificates. We then had a research coordinator, blinded to the results of the SSDI search, complete direct follow-up by contacting the patients, their families, or their primary care physicians to determine vital status. To determine the sensitivity and specificity of the SSDI for death at six months in this cohort, we used direct follow-up as the criterion reference and calculated 95% confidence intervals. Results: Direct follow-up was completed for 90% (108 of 120) of the patients. Of those patients, 20 were determined to be dead and 88 alive. The dead were more likely to be male (57%) and older [mean age 83.9 (95% CI 79.1 – 88.7) vs. 60.9 (95% CI 56.4 – 65.4) for those alive]. The sensitivity of the SSDI for those with completed direct follow-up was 100% (95% CI 91-100%), with specificity of 100% (95% CI 98-100%). Of the 12 patients who could not be contacted through direct follow-up, the SSDI indicated that 10 were dead and two were alive. Conclusions: SSDI is an accurate measure of death outcomes and appears to have the advantage of finding deaths among patients lost to follow-up.
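
As a worked illustration of the sensitivity and specificity reported above, the arithmetic can be sketched as follows. The confusion-matrix counts are inferred from the abstract (20 dead and 88 alive on direct follow-up, both measures reported as 100%), so the zero false-negative and false-positive counts are an assumption. The confidence intervals are not recomputed because the abstract does not state the interval method.

```python
def sensitivity(true_pos, false_neg):
    """Share of reference-standard deaths that the index also flags as dead."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Share of reference-standard survivors that the index lists as alive."""
    return true_neg / (true_neg + false_pos)


# Counts inferred from the abstract; direct follow-up is the reference.
true_pos, false_neg = 20, 0   # dead on follow-up, SSDI dead / SSDI alive
true_neg, false_pos = 88, 0   # alive on follow-up, SSDI alive / SSDI dead

sens = sensitivity(true_pos, false_neg)
spec = specificity(true_neg, false_pos)
followed_up = true_pos + false_neg + true_neg + false_pos
```

The total of 108 followed-up patients matches the 90% (108 of 120) completion rate given in the Results.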

    UniCat: Crafting a Stronger Fusion Baseline for Multimodal Re-Identification

    Multimodal Re-Identification (ReID) is a popular retrieval task that aims to re-identify objects across diverse data streams, prompting many researchers to integrate multiple modalities into a unified representation. While such fusion promises a holistic view, our investigations shed light on potential pitfalls. We uncover that prevailing late-fusion techniques often produce suboptimal latent representations when compared to methods that train modalities in isolation. We argue that this effect is largely due to the inadvertent relaxation of the training objectives on individual modalities when using fusion, what others have termed modality laziness. We present a nuanced point of view: this relaxation can lead to certain modalities failing to fully harness available task-relevant information, and yet offers a protective veil to noisy modalities, preventing them from overfitting to task-irrelevant data. Our findings also show that unimodal concatenation (UniCat) and other late-fusion ensembling of unimodal backbones, when paired with best-known training techniques, exceed the current state-of-the-art performance across several multimodal ReID benchmarks. By unveiling the double-edged sword of "modality laziness", we motivate future research in balancing local modality strengths with global representations.
    Comment: Accepted NeurIPS 2023 UniReps, 9 pages, 4 tables
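
The idea of concatenating unimodal embeddings for retrieval, as described in the abstract above, can be sketched roughly as follows. This is a minimal illustration under stated assumptions: the per-modality L2 normalization, the cosine-similarity ranking, and all names and vectors are hypothetical and are not taken from the paper, whose backbones and training recipe are not reproduced here.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumed here so no modality dominates)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def concat_embeddings(per_modality):
    """Late fusion by concatenating independently produced embeddings."""
    fused = []
    for vec in per_modality:
        fused.extend(l2_normalize(vec))
    return fused


def cosine(u, v):
    """Cosine similarity between two fused embeddings."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical two-modality embeddings for one query and two gallery items.
query = concat_embeddings([[3.0, 4.0], [1.0, 0.0]])
match = concat_embeddings([[6.0, 8.0], [2.0, 0.0]])
other = concat_embeddings([[4.0, -3.0], [0.0, 1.0]])

sim_match = cosine(query, match)
sim_other = cosine(query, other)
```

Because each backbone produces its embedding independently, each modality could be trained on its own objective before fusion, which is the property the abstract contrasts with jointly trained fusion.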

    A Meta-Narrative Review on the Use of R.O.S.E in Telecytology for the Patient, Pathologist, and Cytologist

    Advancements in technology have given rise to a path of convenience, ease, and flexibility for workers to work remotely. A new tool in laboratory diagnostics is Telecytology for Rapid On-Site Evaluation (R.O.S.E.). A cytologist processes a specimen on site and captures an image or video of the findings. The media is sent directly to a pathologist for further evaluation, and then a diagnosis is given to the patient. With this being a relatively new practice, we need to ask what the advantages and disadvantages are for everyone involved: the patient, the cytologist, and the pathologist. We found articles that were less than five years old and reviewed the methodologies in the articles. We found that there were many advantages, including decreased diagnosis time, availability to patients in rural areas, and fewer repeated procedures. We also found disadvantages such as extensive training requirements and the possibility of incorrect diagnoses. Our findings indicate there is success in using Telecytology for R.O.S.E., but that faults are present to some degree. As technology continues to advance, we expect more studies to be conducted that highlight the success of Telecytology with Rapid On-Site Evaluation.

    GraFT: Gradual Fusion Transformer for Multimodal Re-Identification

    Object Re-Identification (ReID) is pivotal in computer vision, witnessing an escalating demand for adept multimodal representation learning. Current models, although promising, reveal scalability limitations with increasing modalities as they rely heavily on late fusion, which postpones the integration of specific modality insights. Addressing this, we introduce the Gradual Fusion Transformer (GraFT) for multimodal ReID. At its core, GraFT employs learnable fusion tokens that guide self-attention across encoders, adeptly capturing both modality-specific and object-specific features. Further bolstering its efficacy, we introduce a novel training paradigm combined with an augmented triplet loss, optimizing the ReID feature embedding space. We demonstrate these enhancements through extensive ablation studies and show that GraFT consistently surpasses established multimodal ReID benchmarks. Additionally, aiming for deployment versatility, we've integrated neural network pruning into GraFT, offering a balance between model size and performance.
    Comment: 3 Borderline Reviews at WACV, 8 pages, 5 figures, 8 tables

    Randomized Controlled Trial of Prophylactic Antibiotics for Dog Bites with Refined Cost Model

    Reprints available through open access a