Exploring the relationship between students' level of content knowledge and their ability to engage in scientific argumentation using structural equation modeling
The release of the Next Generation Science Standards (NGSS) in 2013 introduced science standards that are rich in core ideas as well as science and engineering practices. The NGSS views science content and science practice as closely interconnected. The aim of this study is to explore the relationship between students' level of science content knowledge and their ability to engage in the practice of scientific argumentation. Specifically, this study teases content knowledge apart into domain-general and discipline-specific knowledge. To this end, this study explores the following research questions: (1) What is the relationship between students' content knowledge and their ability to engage in scientific argumentation? (2) How do the different dimensions of argumentation vary in difficulty? To explore these research questions, factor analysis, Item Response Theory, and Structural Equation Modeling are used. The results indicate that there is a stronger relationship between discipline-specific knowledge and argumentation. By contributing to the understanding of the connection between content knowledge and argumentation, this study has the potential to inform and improve argumentation instruction, which ultimately can provide students with more authentic science experiences in the classroom.
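As a minimal sketch of how such a structural equation model might be specified, the snippet below uses the semopy package with two latent content-knowledge factors predicting a latent argumentation factor. The indicator names (dg1..dg3, ds1..ds3, arg1..arg3) and the data file are illustrative assumptions, not the study's actual variables.

```python
# Hypothetical SEM sketch: two content-knowledge factors predicting argumentation.
# Indicator and file names are illustrative placeholders, not from the study.
import pandas as pd
import semopy

model_desc = """
# measurement model (latent factors from observed item scores)
DomainGeneral =~ dg1 + dg2 + dg3
DisciplineSpecific =~ ds1 + ds2 + ds3
Argumentation =~ arg1 + arg2 + arg3
# structural model
Argumentation ~ DomainGeneral + DisciplineSpecific
"""

data = pd.read_csv("student_scores.csv")  # hypothetical item-score file
model = semopy.Model(model_desc)
model.fit(data)
print(model.inspect())  # compare the two structural path coefficients
```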
Model Performance Prediction for Hyperparameter Optimization of Deep Learning Models Using High Performance Computing and Quantum Annealing
Hyperparameter Optimization (HPO) of deep learning-based models tends to be a compute-resource-intensive process, as it usually requires training the target model with many different hyperparameter configurations. We show that integrating model performance prediction with early stopping methods holds great potential to speed up the HPO process of deep learning models. Moreover, we propose a novel algorithm called Swift-Hyperband that can use either classical or quantum support vector regression for performance prediction and benefit from distributed High Performance Computing environments. This algorithm is tested not only for the Machine-Learned Particle Flow model used in High Energy Physics, but also for a wider range of target models from domains such as computer vision and natural language processing. Swift-Hyperband is shown to find comparable (or better) hyperparameters while using less computational resources in all test cases.
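A minimal sketch of the core idea, regressing final validation loss on early-epoch losses and stopping unpromising configurations, is shown below. It uses scikit-learn's SVR and a hypothetical `train_one_epoch` callback; this illustrates the general technique, not the Swift-Hyperband implementation itself.

```python
# Sketch: predict final validation loss from a partial learning curve and
# stop unpromising configurations early. Illustrative, not Swift-Hyperband.
import numpy as np
from sklearn.svm import SVR

def predict_final_loss(partial_curve, curves_seen, finals_seen):
    """Regress final loss on early-epoch losses from completed runs."""
    svr = SVR(kernel="rbf")
    svr.fit(np.array(curves_seen), np.array(finals_seen))
    return svr.predict(np.array([partial_curve]))[0]

def run_with_early_stopping(configs, train_one_epoch, probe_epochs=5, max_epochs=50):
    # configs: hashable hyperparameter settings (e.g. tuples); lower loss is better
    curves_seen, finals_seen, results = [], [], {}
    for cfg in configs:
        curve = [train_one_epoch(cfg, epoch) for epoch in range(probe_epochs)]
        if curves_seen:  # need some history before the predictor is useful
            predicted = predict_final_loss(curve, curves_seen, finals_seen)
            if predicted > np.median(finals_seen):  # unpromising: stop early
                continue
        for epoch in range(probe_epochs, max_epochs):
            curve.append(train_one_epoch(cfg, epoch))
        curves_seen.append(curve[:probe_epochs])
        finals_seen.append(curve[-1])
        results[cfg] = curve[-1]
    return min(results, key=results.get)  # config with lowest final loss
```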
Hyperparameter optimization, quantum-assisted model performance prediction, and benchmarking of AI-based High Energy Physics workloads using HPC
Training and Hyperparameter Optimization (HPO) of deep learning-based AI models are often compute resource intensive and call for the use of large-scale distributed resources as well as scalable and resource-efficient hyperparameter search algorithms. This work studies the potential of using model performance prediction to aid the HPO process carried out on High Performance Computing systems. In addition, a quantum annealer is used to train the performance predictor, and a method is proposed to overcome some of the problems derived from the current limitations of quantum systems as well as to increase the stability of solutions. This allows results to be achieved on a quantum machine that are comparable to those obtained on a classical machine, showing how quantum computers could be integrated within classical machine learning tuning pipelines. Furthermore, results are presented from the development of a containerized benchmark based on an AI model for collision event reconstruction that allows us to compare and assess the suitability of different hardware accelerators for training deep neural networks.

Comment: 5 pages, 7 figures. Submitted to the proceedings of the ACAT 2022 conference, to be published in the Journal of Physics: Conference Series.
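One generic way to increase the stability of solutions from a sampler that returns many near-optimal candidates, as annealers do, is to average over the lowest-energy samples from repeated reads. The sketch below illustrates that general idea under an assumed `sample_qubo` callback; it is not the paper's specific method.

```python
# Sketch: stabilize annealer output by averaging the best of many samples.
# `sample_qubo` is a hypothetical callback returning one candidate solution
# vector per call (e.g. decoded SVR coefficients); illustrative only.
import numpy as np

def stabilized_solution(sample_qubo, qubo, num_reads=100, keep_fraction=0.2):
    samples = np.array([sample_qubo(qubo) for _ in range(num_reads)])
    energies = np.einsum("si,ij,sj->s", samples, qubo, samples)  # x^T Q x per sample
    n_keep = max(1, int(keep_fraction * num_reads))
    best = samples[np.argsort(energies)[:n_keep]]  # lowest-energy candidates
    return best.mean(axis=0)  # averaged solution is less read-to-read noisy
```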
Progress towards an improved particle flow algorithm at CMS with machine learning
The particle-flow (PF) algorithm, which infers particles based on tracks and
calorimeter clusters, is of central importance to event reconstruction in the
CMS experiment at the CERN LHC, and has been a focus of development in light of
planned Phase-2 running conditions with an increased pileup and detector
granularity. In recent years, the machine-learned particle-flow (MLPF)
algorithm, a graph neural network that performs PF reconstruction, has been
explored in CMS, with the possible advantages of directly optimizing for the
physical quantities of interest, being highly reconfigurable to new conditions,
and being a natural fit for deployment to heterogeneous accelerators. We
discuss progress in CMS towards an improved implementation of the MLPF
reconstruction, now optimized using generator/simulation-level particle
information as the target for the first time. This paves the way to potentially
improving the detector response in terms of physical quantities of interest. We
describe the simulation-based training target, progress and studies on
event-based loss terms, details on the model hyperparameter tuning, as well as
physics validation with respect to the current PF algorithm in terms of
high-level physical quantities such as the jet and missing transverse momentum
resolutions. We find that the MLPF algorithm, trained on generator/simulation-level particle information for the first time, results in particle and jet reconstruction performance broadly compatible with the baseline PF, setting the stage for improving the physics performance through additional training statistics and model tuning.

Comment: 7 pages, 4 figures, 1 table.
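To make the setup concrete, the sketch below shows the general shape of such a model: a network that takes per-element detector features (tracks and calorimeter clusters) and predicts a particle class plus kinematics per element. The layer sizes, feature counts, and targets are illustrative placeholders, not the actual MLPF architecture.

```python
# Sketch: GNN-style model mapping detector elements (tracks, clusters) to
# particle candidates. Dimensions and target classes are illustrative only.
import torch
import torch.nn as nn

class SimplePFModel(nn.Module):
    def __init__(self, n_features=15, n_hidden=64, n_classes=6):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(n_features, n_hidden), nn.ReLU())
        self.message = nn.Sequential(nn.Linear(2 * n_hidden, n_hidden), nn.ReLU())
        self.classify = nn.Linear(n_hidden, n_classes)  # particle type (or "none")
        self.regress = nn.Linear(n_hidden, 4)           # e.g. pt, eta, phi, energy

    def forward(self, x, adj):
        # x: (n_elements, n_features); adj: (n_elements, n_elements) 0/1 graph
        h = self.encode(x)
        neighbors = adj @ h / adj.sum(dim=1, keepdim=True).clamp(min=1)
        h = self.message(torch.cat([h, neighbors], dim=1))  # one message pass
        return self.classify(h), self.regress(h)
```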
Scalable neural network models and terascale datasets for particle-flow reconstruction
We study scalable machine learning models for full event reconstruction in
high-energy electron-positron collisions based on a highly granular detector
simulation. Particle-flow (PF) reconstruction can be formulated as a supervised
learning task using tracks and calorimeter clusters or hits. We compare a graph neural network and a kernel-based transformer and demonstrate that both avoid
quadratic memory allocation and computational cost while achieving realistic PF
reconstruction. We show that hyperparameter tuning on a supercomputer
significantly improves the physics performance of the models. We also
demonstrate that the resulting model is highly portable across hardware
processors, supporting Nvidia, AMD, and Intel Habana cards. Finally, we
demonstrate that the model can be trained on highly granular inputs consisting
of tracks and calorimeter hits, resulting in a competitive physics performance
with the baseline. Datasets and software to reproduce the studies are published following the findable, accessible, interoperable, and reusable (FAIR) principles.

Comment: 19 pages, 7 figures.
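A compact sketch of how a kernel-based transformer avoids quadratic memory: with a feature map phi, attention phi(Q)(phi(K)^T V) is computed in O(N) by accumulating the small key-value summary first. The elu+1 feature map below follows the common "linear attention" recipe and is an assumption, not necessarily the paper's exact choice.

```python
# Sketch: linear (kernelized) attention, O(N) in sequence length instead of
# O(N^2). The elu+1 feature map is one common choice, assumed for illustration.
import torch

def linear_attention(q, k, v, eps=1e-6):
    # q, k: (n, d); v: (n, d_v)
    phi_q = torch.nn.functional.elu(q) + 1  # positive feature map
    phi_k = torch.nn.functional.elu(k) + 1
    kv = phi_k.T @ v                                # (d, d_v) key-value summary
    z = phi_q @ phi_k.sum(dim=0, keepdim=True).T    # (n, 1) normalizer
    return (phi_q @ kv) / (z + eps)                 # never materializes (n, n)
```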
Kappa-symmetric SL(2,R) covariant D-brane actions
A superspace formulation of IIB supergravity which includes the field
strengths of the duals of the usual physical one, three and five-form field
strengths as well as the eleven-form field strength is given. The
superembedding formalism is used to construct kappa-symmetric SL(2,R) covariant
D-brane actions in an arbitrary supergravity background.

Comment: 20 pages. Minor clarification in text. References added.
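For orientation, kappa-symmetry is the local fermionic symmetry that removes half of the worldvolume fermions. Its standard schematic form, written here for reference rather than quoted from the paper, is

```latex
\delta z^M E_M{}^{a} = 0, \qquad
\delta z^M E_M{}^{\alpha} = \bigl[(1+\Gamma)\kappa\bigr]^{\alpha}, \qquad
\Gamma^2 = 1,
```

so that the projector (1+Gamma)/2 eliminates half of the fermionic directions along the brane.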
Do We Still Need Clinical Language Models?
Although recent advances in scaling large language models (LLMs) have
resulted in improvements on many NLP tasks, it remains unclear whether these
models trained primarily with general web text are the right tool in highly
specialized, safety-critical domains such as clinical text. Recent results have
suggested that LLMs encode a surprising amount of medical knowledge. This
raises an important question regarding the utility of smaller domain-specific
language models. With the success of general-domain LLMs, is there still a need
for specialized clinical models? To investigate this question, we conduct an
extensive empirical analysis of 12 language models, ranging from 220M to 175B
parameters, measuring their performance on 3 different clinical tasks that test
their ability to parse and reason over electronic health records. As part of
our experiments, we train T5-Base and T5-Large models from scratch on clinical
notes from MIMIC III and IV to directly investigate the efficiency of clinical
tokens. We show that relatively small specialized clinical models substantially
outperform all in-context learning approaches, even when finetuned on limited
annotated data. Further, we find that pretraining on clinical tokens allows for
smaller, more parameter-efficient models that either match or outperform much
larger language models trained on general text. We release the code and the
models used under the PhysioNet Credentialed Health Data license and data use
agreement.
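A minimal sketch of the specialized-model side of this comparison, finetuning a small T5 on a clinical task, is shown below. The model identifier is a real Hugging Face checkpoint, but the example data and columns are placeholders; MIMIC notes require credentialed access and are not shown.

```python
# Sketch: finetune a small T5 on a clinical seq2seq task. The example data
# are placeholders; MIMIC itself requires credentialed access.
import torch
from torch.utils.data import DataLoader
from transformers import T5ForConditionalGeneration, T5TokenizerFast

tokenizer = T5TokenizerFast.from_pretrained("t5-base")
model = T5ForConditionalGeneration.from_pretrained("t5-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

def collate(batch):
    inputs = tokenizer([ex["note"] for ex in batch], truncation=True,
                       padding=True, return_tensors="pt")
    targets = tokenizer([ex["label"] for ex in batch], truncation=True,
                        padding=True, return_tensors="pt")
    labels = targets["input_ids"]
    labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss
    inputs["labels"] = labels
    return inputs

train_data = [{"note": "...", "label": "..."}]  # placeholder examples
loader = DataLoader(train_data, batch_size=8, collate_fn=collate, shuffle=True)

model.train()
for batch in loader:
    loss = model(**batch).loss  # seq2seq cross-entropy on the label tokens
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```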
How should the advent of large language models affect the practice of science?
Large language models (LLMs) are being increasingly incorporated into
scientific workflows. However, we have yet to fully grasp the implications of
this integration. How should the advent of large language models affect the
practice of science? For this opinion piece, we have invited four diverse
groups of scientists to reflect on this query, sharing their perspectives and
engaging in debate. Schulz et al. make the argument that working with LLMs is
not fundamentally different from working with human collaborators, while Bender
et al. argue that LLMs are often misused and over-hyped, and that their
limitations warrant a focus on more specialized, easily interpretable tools.
Marelli et al. emphasize the importance of transparent attribution and
responsible use of LLMs. Finally, Botvinick and Gershman advocate that humans
should retain responsibility for determining the scientific roadmap. To
facilitate the discussion, the four perspectives are complemented with a
response from each group. By putting these different perspectives in
conversation, we aim to bring attention to important considerations within the
academic community regarding the adoption of LLMs and their impact on both
current and future scientific practices.
INTERGROWTH-21st Project international INTER-NDA standards for child development at 2 years of age: an international prospective population-based study.
OBJECTIVES: To describe the construction of the international INTERGROWTH-21st Neurodevelopment Assessment (INTER-NDA) standards for child development at 2 years by reporting the cognitive, language, motor and behaviour outcomes in optimally healthy and nourished children in the INTERGROWTH-21st Project.

DESIGN: Population-based cohort study, the INTERGROWTH-21st Project.

SETTING: Brazil, India, Italy, Kenya and the UK.

PARTICIPANTS: 1181 children prospectively recruited from early fetal life according to the prescriptive WHO approach, and confirmed to be at low risk of adverse perinatal and postnatal outcomes.

PRIMARY MEASURES: Scaled INTER-NDA domain scores for cognition, language, fine and gross motor skills and behaviour; vision outcomes measured on the Cardiff tests; attentional problems and emotional reactivity measured on the respective subscales of the preschool Child Behaviour Checklist; and the age of acquisition of the WHO gross motor milestones.

RESULTS: Scaled INTER-NDA domain scores are presented as centiles, which were constructed according to the prescriptive WHO approach and excluded children born preterm and those with significant postnatal/neurological morbidity. For all domains except negative behaviour, higher scores reflect better outcomes, and the threshold for normality was defined as ≥10th centile. For the INTER-NDA's cognitive, fine motor, gross motor, language and positive behaviour domains these are ≥38.5, ≥25.7, ≥51.7, ≥17.8 and ≥51.4, respectively. The threshold for normality for the INTER-NDA's negative behaviour domain is ≤50.0, that is, ≤90th centile. At 22-30 months of age, the cohort overlapped with the WHO motor milestone centiles, showed low postnatal morbidity (<10%), and had vision outcomes, attentional problems and emotional reactivity scores within the respective normative ranges.

CONCLUSIONS: From this large, healthy and well-nourished international cohort, we have constructed, using the WHO prescriptive methodology, international INTER-NDA standards for child development at 2 years of age. Standards, rather than references, are recommended for population-level screening and the identification of children at risk of adverse outcomes.
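A small sketch of how the reported normality thresholds might be applied in screening code follows; the domain names and cutoffs mirror the abstract, but the helper itself is purely illustrative and not an official scoring tool.

```python
# Sketch: apply the INTER-NDA normality thresholds reported above.
# Purely illustrative screening helper; not an official scoring tool.
THRESHOLDS = {  # score >= cutoff is within the normal range (>=10th centile)
    "cognitive": 38.5,
    "fine_motor": 25.7,
    "gross_motor": 51.7,
    "language": 17.8,
    "positive_behaviour": 51.4,
}
NEGATIVE_BEHAVIOUR_MAX = 50.0  # score <= cutoff is normal (<=90th centile)

def flag_domains(scores):
    """Return domains where a child's scaled scores fall outside the standards."""
    flagged = [d for d, cutoff in THRESHOLDS.items()
               if scores.get(d) is not None and scores[d] < cutoff]
    neg = scores.get("negative_behaviour")
    if neg is not None and neg > NEGATIVE_BEHAVIOUR_MAX:
        flagged.append("negative_behaviour")
    return flagged

print(flag_domains({"cognitive": 40.0, "language": 15.0, "negative_behaviour": 62.0}))
# -> ['language', 'negative_behaviour']
```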