Teaching Smaller Language Models To Generalise To Unseen Compositional Questions
We equip a smaller Language Model to generalise to answering challenging
compositional questions that have not been seen in training. To do so we
propose a combination of multitask supervised pretraining on up to 93 tasks
designed to instill diverse reasoning abilities, and a dense retrieval system
that aims to retrieve a set of evidential paragraph fragments. Recent progress
in question-answering has been achieved either through prompting methods
against very large pretrained Language Models in zero or few-shot fashion, or
by fine-tuning smaller models, sometimes in conjunction with information
retrieval. We focus on the less explored question of the extent to which
zero-shot generalisation can be enabled in smaller models with retrieval
against a corpus within which sufficient information to answer a particular
question may not exist. We establish strong baselines in this setting for
diverse evaluation datasets (StrategyQA, CommonsenseQA, IIRC, DROP, Musique and
ARC-DA), and show that performance can be significantly improved by adding
retrieval-augmented training datasets which are designed to expose our models
to a variety of heuristic reasoning strategies such as weighing partial
evidence or ignoring an irrelevant context.
Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval
When provided with sufficient explanatory context, smaller Language Models
have been shown to exhibit strong reasoning ability on challenging short-answer
question-answering tasks where the questions are unseen in training. We
evaluate two methods for further improvement in this setting. Both methods
focus on combining rationales generated by a larger Language Model with longer
contexts created from a multi-hop dense retrieval system. The first method
(RR) involves training a Rationale Ranking model to score both
generated rationales and retrieved contexts with respect to relevance and
truthfulness. We then use the scores to derive combined contexts from both
knowledge sources using a number of combinatory strategies. For the second
method (RATD) we train a smaller Reasoning model using
retrieval-augmented training datasets such that it becomes proficient at
utilising relevant information from longer text sequences that may be only
partially evidential and frequently contain many irrelevant sentences.
Generally we find that both methods are effective but that the RATD
method is more straightforward to apply and produces the strongest results in
the unseen setting on which we focus. Our single best Reasoning model using
only 440 million parameters materially improves upon strong comparable prior
baselines for unseen evaluation datasets (StrategyQA 58.9 → 61.7
acc., CommonsenseQA 63.6 → 72.7 acc., ARC-DA 31.6 →
52.1 F1, IIRC 25.5 → 27.3 F1) and a version utilising our prior
knowledge of each type of question in selecting a context combination strategy
does even better. Our proposed models also generally outperform direct prompts
against much larger models (BLOOM 175B and StableVicuna 13B) in both few-shot
chain-of-thought and few-shot answer-only settings.
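The size of the improvements quoted in the abstract above can be worked out directly. The following is a minimal sketch that reads each pair of scores as (prior baseline, reported model); the dictionary layout and the helper name `absolute_gains` are illustrative assumptions, not taken from the paper.

```python
# Score pairs as quoted in the abstract: (prior baseline, reported model).
# The dictionary layout and helper name are illustrative, not from the paper.
scores = {
    "StrategyQA (acc.)": (58.9, 61.7),
    "CommonsenseQA (acc.)": (63.6, 72.7),
    "ARC-DA (F1)": (31.6, 52.1),
    "IIRC (F1)": (25.5, 27.3),
}


def absolute_gains(pairs):
    """Return the absolute gain in points for each dataset, rounded to one decimal."""
    return {name: round(new - old, 1) for name, (old, new) in pairs.items()}


gains = absolute_gains(scores)
for name, gain in gains.items():
    print(f"{name}: +{gain} points")
```

On these figures the largest absolute gain is on ARC-DA and the smallest on IIRC.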
Anti-Search for the Glueball Candidate f_J(2220) in Two-Photon Interactions
Using 13.3 fb^{-1} of e^+e^- data recorded with the CLEO II and CLEO II.V
detector configurations at CESR, we have searched for f_J(2220) decays to
K^0_{S} K^0_{S} in untagged two-photon interactions. We report an upper limit
on the product of the two-photon partial width and the branching fraction,
\Gamma_{\gamma\gamma} \cdot B(f_J(2220) \to K^0_{S} K^0_{S}), of less than 1.1 eV at
the 95% C.L.; systematic uncertainties are included. This dataset is four times
larger than that used in the previous CLEO publication.
Comment: 10 pages postscript, also available through
http://w4.lns.cornell.edu/public/CLNS, Submitted to PRD (R
Two-Body B Meson Decays to eta and eta' -- Observation of B -> eta' K
In a sample of 6.6 million produced B mesons we have observed decays B ->
eta' K, with branching fractions BR(B+ -> eta' K+) = (6.5 +1.5 -1.4 +- 0.9) x
10^{-5} and BR(B0 -> eta' K0) = (4.7 +2.7 -2.0 +- 0.9) x 10^{-5}. We have
searched with comparable sensitivity for 17 related decays to final states
containing an eta or eta' meson accompanied by a single particle or low-lying
resonance. Our upper limits for these constrain theoretical interpretations of
the B -> eta' K signal.
Comment: 12 page postscript file, postscript file also available through
http://w4.lns.cornell.edu/public/CLN
Prevention of Sexual and Gender Harassment and Abuse in Sport: Initiatives in Europe and Beyond
How much compression should a scramjet inlet do?
The supersonic combustion ramjet, or scramjet, is the engine cycle most suitable for sustained hypersonic flight in the atmosphere. This paper examines a key decision in the design of the inlet or intake of these hypersonic airbreathing engines, namely, the level of compression to be performed. Too much compression can lead to onerous system-level issues including the need for bleed or variable geometry, while too little compression can result in low cycle efficiency and poor combustion of fuel. An analysis of the important factors that affect the choice of scramjet inlet compression ratio has been performed for hydrogen-fueled scramjets at Mach 6, 8, 10, and 12. It was found that, contrary to classical thermodynamic analyses, scramjet cycle efficiency reaches an optimum at a relatively low compression ratio between 50 and 100 for all Mach numbers. Practical constraints related to nonequilibrium flow effects, inlet starting, and boundary-layer separation were also shown to favor a low compression ratio. The lower limit on compression was found to be set by the need to complete the combustion reaction in the available engine length and is therefore dependent on engine scale. On the basis of these factors it is recommended that scramjet inlet compression ratio be set to the minimum that satisfies the robust combustion requirement, with the caveat that it not be below 50 in order to maintain high cycle efficiency. For typical wind-tunnel-scale engines, this results in a requirement for the inlet to compress airflow entering the combustor to a pressure of approximately 1/2 atm, regardless of the flight Mach number.
Measurement of the muonic branching fractions of the Υ(1S) and Υ(3S)
Using the CLEO detector at the Cornell Electron Storage Ring, we have measured the muonic branching fractions B_μμ of the Υ(1S) and Υ(3S) to be (2.52 ± 0.07 ± 0.07)% and (2.02 ± 0.19 ± 0.33)%, respectively. © 1989 The American Physical Society
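Each branching fraction in the abstract above carries two uncertainties. A minimal sketch of how they could be combined into one total uncertainty follows. It assumes the usual convention that the first is statistical and the second systematic, and that the two are uncorrelated so they add in quadrature; the abstract states neither point, and the helper name `total_uncertainty` is illustrative.

```python
import math


def total_uncertainty(stat, syst):
    """Combine two uncertainties in quadrature, assuming they are uncorrelated."""
    return math.hypot(stat, syst)


# Values in percent, as quoted in the abstract: (central value, first, second).
# Treating the first uncertainty as statistical and the second as systematic
# is an assumption based on common convention, not stated in the abstract.
measurements = {
    "Upsilon(1S)": (2.52, 0.07, 0.07),
    "Upsilon(3S)": (2.02, 0.19, 0.33),
}

totals = {
    name: total_uncertainty(stat, syst)
    for name, (_, stat, syst) in measurements.items()
}

for name, (value, _, _) in measurements.items():
    print(f"{name}: ({value:.2f} +/- {totals[name]:.2f})%")
```

Under these assumptions the combined uncertainty is about 0.10% for the Υ(1S) and about 0.38% for the Υ(3S).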
Recent progress in neutrino factory and muon collider research within the Muon collaboration
We describe the status of our effort to realize a first neutrino factory and the progress made in understanding the problems associated with the collection and cooling of muons towards that end. We summarize the physics that can be done with neutrino factories as well as with intense cold beams of muons. The physics potential of muon colliders is reviewed, both as Higgs factories and as compact high-energy lepton colliders. The status and timescale of our research and development effort are reviewed, as well as the latest designs in cooling channels, including the promise of ring coolers in achieving longitudinal and transverse cooling simultaneously. We detail the efforts being made to mount an international cooling experiment to demonstrate the ionization cooling of muons.