
    Teaching Smaller Language Models To Generalise To Unseen Compositional Questions

    We equip a smaller Language Model to generalise to answering challenging compositional questions that have not been seen in training. To do so we propose a combination of multitask supervised pretraining on up to 93 tasks designed to instill diverse reasoning abilities, and a dense retrieval system that aims to retrieve a set of evidential paragraph fragments. Recent progress in question-answering has been achieved either through prompting methods against very large pretrained Language Models in zero- or few-shot fashion, or by fine-tuning smaller models, sometimes in conjunction with information retrieval. We focus on the less explored question of the extent to which zero-shot generalisation can be enabled in smaller models with retrieval against a corpus within which sufficient information to answer a particular question may not exist. We establish strong baselines in this setting for diverse evaluation datasets (StrategyQA, CommonsenseQA, IIRC, DROP, Musique and ARC-DA), and show that performance can be significantly improved by adding retrieval-augmented training datasets which are designed to expose our models to a variety of heuristic reasoning strategies such as weighing partial evidence or ignoring an irrelevant context.

    Answering Unseen Questions With Smaller Language Models Using Rationale Generation and Dense Retrieval

    When provided with sufficient explanatory context, smaller Language Models have been shown to exhibit strong reasoning ability on challenging short-answer question-answering tasks where the questions are unseen in training. We evaluate two methods for further improvement in this setting. Both methods focus on combining rationales generated by a larger Language Model with longer contexts created from a multi-hop dense retrieval system. The first method (RR) involves training a Rationale Ranking model to score both generated rationales and retrieved contexts with respect to relevance and truthfulness. We then use the scores to derive combined contexts from both knowledge sources using a number of combinatory strategies. For the second method (RATD) we train a smaller Reasoning model using retrieval-augmented training datasets such that it becomes proficient at utilising relevant information from longer text sequences that may be only partially evidential and frequently contain many irrelevant sentences. Generally we find that both methods are effective but that the RATD method is more straightforward to apply and produces the strongest results in the unseen setting on which we focus. Our single best Reasoning model using only 440 million parameters materially improves upon strong comparable prior baselines for unseen evaluation datasets (StrategyQA 58.9 → 61.7 acc., CommonsenseQA 63.6 → 72.7 acc., ARC-DA 31.6 → 52.1 F1, IIRC 25.5 → 27.3 F1) and a version utilising our prior knowledge of each type of question in selecting a context combination strategy does even better. Our proposed models also generally outperform direct prompts against much larger models (BLOOM 175B and StableVicuna 13B) in both few-shot chain-of-thought and few-shot answer-only settings.

    Design of Modular, Shape-transitioning Inlets for a Conical Hypersonic Vehicle

    For a hypersonic vehicle propelled by scramjet engines, integration of the engines and airframe is highly desirable. Thus, the forward capture shape of the engine inlet should conform to the vehicle body shape. Furthermore, the use of modular engines places a constraint on the shape of the inlet sidewalls. Finally, one may desire a combustor cross-section shape that is different from that of the inlet. These shape constraints for the inlet can be accommodated by employing a streamline-tracing and lofting technique. This design technique was developed by Smart for inlets with a rectangular-to-elliptical shape transition. In this paper, we generalise that technique to produce inlets that conform to arbitrary shape requirements. As an example, we show the design of a body-integrated hypersonic inlet on a winged-cone vehicle, typical of what might be used in a three-stage orbital launch system. The special challenge of inlet design for this conical vehicle at an angle of attack is also discussed: the bow shock sits relatively close to the vehicle body.

    A novel, high-sensitivity, bacteriophage-based assay identifies low level Mycobacterium tuberculosis bacteraemia in immunocompetent patients with active and incipient tuberculosis

    Haematogenous dissemination of M. tuberculosis (Mtb) is critical to pathogenesis of progressive tuberculous infection in animal models. Using a novel phage-based blood assay, we report the first concordant evidence in well-characterised immunocompetent human cohorts, demonstrating associations of Mtb bacteraemia with progressive phenotypes of latent infection and active pulmonary TB, respectively.

    Studies of the decays $D^0 \rightarrow K_S^0K^-\pi^+$ and $D^0 \rightarrow K_S^0K^+\pi^-$

    The first measurements of the coherence factor $R_{K_S^0K\pi}$ and the average strong-phase difference $\delta^{K_S^0K\pi}$ in $D^0 \to K_S^0 K^\mp\pi^\pm$ decays are reported. These parameters can be used to improve the determination of the unitarity triangle angle $\gamma$ in $B^- \rightarrow \widetilde{D}K^-$ decays, where $\widetilde{D}$ is either a $D^0$ or a $\overline{D}^0$ meson decaying to the same final state, and also in studies of charm mixing. The measurements of the coherence factor and strong-phase difference are made using quantum-correlated, fully reconstructed $D^0\overline{D}^0$ pairs produced in $e^+e^-$ collisions at the $\psi(3770)$ resonance. The measured values are $R_{K_S^0K\pi} = 0.70 \pm 0.08$ and $\delta^{K_S^0K\pi} = (0.1 \pm 15.7)^\circ$ for an unrestricted kinematic region and $R_{K^*K} = 0.94 \pm 0.12$ and $\delta^{K^*K} = (-16.6 \pm 18.4)^\circ$ for a region where the combined $K_S^0 \pi^\pm$ invariant mass is within 100 MeV/$c^2$ of the $K^{*}(892)^\pm$ mass. These results indicate a significant level of coherence in the decay. In addition, isobar models are presented for the two decays, which show the dominance of the $K^*(892)^\pm$ resonance. The branching ratio $\mathcal{B}(D^0 \rightarrow K_S^0K^+\pi^-)/\mathcal{B}(D^0 \rightarrow K_S^0K^-\pi^+)$ is determined to be $0.592 \pm 0.044$ (stat.) $\pm 0.018$ (syst.), which is more precise than previous measurements.
    Comment: 38 pages. Version 3 updated to include the erratum information. Errors corrected in Eqs. (25), (26), and (28). Fit results updated accordingly, and external inputs updated to latest best known values. Typo corrected in Eq. (3), with no other consequence.
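
As a rough illustration of how the two quoted uncertainties on the branching ratio might be read together, the sketch below adds the statistical and systematic components in quadrature. This assumes the two components are independent and Gaussian, which the abstract does not state; the function name `combine_in_quadrature` is hypothetical and is not taken from the paper.

```python
import math


def combine_in_quadrature(*uncertainties: float) -> float:
    """Combine independent uncertainties as the square root of the sum of squares.

    Assumption: the components are uncorrelated, which the abstract does not confirm.
    """
    return math.sqrt(sum(u * u for u in uncertainties))


# Values quoted in the abstract for B(D0 -> KS K+ pi-) / B(D0 -> KS K- pi+).
ratio = 0.592
stat = 0.044
syst = 0.018

total = combine_in_quadrature(stat, syst)
print(f"ratio = {ratio:.3f} +/- {total:.3f} (stat and syst combined)")
```

Under these assumptions the statistical term dominates, so the combined uncertainty is only slightly larger than the statistical one alone.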

    Updated Measurement of the Strong Phase in D0 --> K+pi- Decay Using Quantum Correlations in e+e- --> D0 D0bar at CLEO

    We analyze a sample of 3 million quantum-correlated D0 D0bar pairs from 818 pb^-1 of e+e- collision data collected with the CLEO-c detector at E_cm = 3.77 GeV, to give an updated measurement of $\cos\delta$ and a first determination of $\sin\delta$, where $\delta$ is the relative strong phase between doubly Cabibbo-suppressed D0 --> K+pi- and Cabibbo-favored D0bar --> K+pi- decay amplitudes. With no inputs from other experiments, we find $\cos\delta = 0.81^{+0.22+0.07}_{-0.18-0.05}$, $\sin\delta = -0.01 \pm 0.41 \pm 0.04$, and $|\delta| = (10^{+28+13}_{-53-0})^\circ$. By including external measurements of mixing parameters, we find alternative values of $\cos\delta = 1.15^{+0.19+0.00}_{-0.17-0.08}$, $\sin\delta = 0.56^{+0.32+0.21}_{-0.31-0.20}$, and $\delta = (18^{+11}_{-17})^\circ$. Our results can be used to improve the world average uncertainty on the mixing parameter y by approximately 10%.
    Comment: Minor revisions, version accepted by PR

    Observation of the Dalitz Decay $D_{s}^{*+} \to D_{s}^{+} e^{+} e^{-}$

    Using 586 $\textrm{pb}^{-1}$ of $e^{+}e^{-}$ collision data acquired at $\sqrt{s}=4.170$ GeV with the CLEO-c detector at the Cornell Electron Storage Ring, we report the first observation of $D_{s}^{*+} \to D_{s}^{+} e^{+} e^{-}$ with a significance of $5.3 \sigma$. The ratio of branching fractions $\mathcal{B}(D_{s}^{*+} \to D_{s}^{+} e^{+} e^{-}) / \mathcal{B}(D_{s}^{*+} \to D_{s}^{+} \gamma)$ is measured to be $[0.72^{+0.15}_{-0.13} (\textrm{stat}) \pm 0.10 (\textrm{syst})]\%$, which is consistent with theoretical expectations.
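
To give a sense of scale for the quoted significance, the sketch below converts a number of standard deviations into a tail probability. It assumes a one-sided Gaussian convention, which is common for observation claims but is not stated in the abstract; the function name `one_sided_p_value` is hypothetical.

```python
import math


def one_sided_p_value(n_sigma: float) -> float:
    """Upper-tail probability of a standard normal beyond n_sigma.

    Assumption: one-sided Gaussian convention; the abstract does not say
    which convention the authors used.
    """
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))


# Significance quoted in the abstract.
p = one_sided_p_value(5.3)
print(f"5.3 sigma corresponds to a one-sided p-value of about {p:.2e}")
```

Under a two-sided convention the probability would be twice this value.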