43 research outputs found

    The 2018 European heatwave led to stem dehydration but not to consistent growth reductions in forests

    Heatwaves exert disproportionately strong and sometimes irreversible impacts on forest ecosystems. These impacts remain poorly understood at the tree and species level and across large spatial scales. Here, we investigate the effects of the record-breaking 2018 European heatwave on tree growth and tree water status using a collection of high-temporal-resolution dendrometer data from 21 species across 53 sites. Relative to the two preceding years, annual stem growth was not consistently reduced by the 2018 heatwave, but stems experienced twice the temporary shrinkage due to depletion of water reserves. Conifer species were less capable of rehydrating overnight than broadleaves across gradients of soil and atmospheric drought, suggesting less resilience toward transient stress. In particular, Norway spruce and Scots pine experienced extensive stem dehydration. Our high-resolution dendrometer network was suitable to disentangle the effects of a severe heatwave on tree growth and desiccation at large spatial scales in situ, and provided insights on which species may be more vulnerable to climate extremes.

    Humanity's Last Exam

    Benchmarks are important tools for tracking the rapid advancements in large language model (LLM) capabilities. However, benchmarks are not keeping pace in difficulty: LLMs now achieve over 90% accuracy on popular benchmarks like MMLU, limiting informed measurement of state-of-the-art LLM capabilities. In response, we introduce Humanity's Last Exam (HLE), a multi-modal benchmark at the frontier of human knowledge, designed to be the final closed-ended academic benchmark of its kind with broad subject coverage. HLE consists of 3,000 questions across dozens of subjects, including mathematics, humanities, and the natural sciences. HLE is developed globally by subject-matter experts and consists of multiple-choice and short-answer questions suitable for automated grading. Each question has a known solution that is unambiguous and easily verifiable, but cannot be quickly answered via internet retrieval. State-of-the-art LLMs demonstrate low accuracy and calibration on HLE, highlighting a significant gap between current LLM capabilities and the expert human frontier on closed-ended academic questions. To inform research and policymaking upon a clear understanding of model capabilities, we publicly release HLE at https://lastexam.ai

    The Thermoanaerobacter Glycobiome Reveals Mechanisms of Pentose and Hexose Co-Utilization in Bacteria

    Author Summary: Renewable liquid fuels derived from lignocellulosic biomass could alleviate global energy shortage and climate change. Cellulose and hemicellulose are the main components of lignocellulosic biomass. Therefore, the ability to simultaneously utilize pentose and hexose (i.e., co-utilization) has been a crucial challenge for industrial microbes producing lignocellulosic biofuels. Certain thermoanaerobic bacteria demonstrate this unusual talent, but the genetic foundation and molecular mechanism of this process remain unknown. In this study, we reconstructed the structure and dynamics of the first genome-wide carbon utilization network of thermoanaerobes. This transcriptome-based co-expression network reveals that glucose, xylose, fructose, and cellobiose catabolism are each featured on distinct functional modules. Furthermore, the dynamics of the network suggest a distinct yet collaborative nature between glucose and xylose catabolism. In addition, we experimentally demonstrated that these novel network-derived features can be rationally exploited for product-yield enhancement via optimized timing and balanced loading of the carbon supply in a substrate-specific manner. Thus, the newly discovered modular and precisely regulated network elucidates unique features of thermoanaerobic glycobiomes and reveals novel perturbation strategies and targets for the enhanced thermophilic production of lignocellulosic biofuels.

    Large expert-curated database for benchmarking document similarity detection in biomedical literature search

    Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical research. Peer reviewed.