6 research outputs found

    Large Language Models are Zero Shot Hypothesis Proposers

    Significant scientific discoveries have driven the progress of human civilisation. The explosion of scientific literature and data has created information barriers across disciplines that have slowed the pace of scientific discovery. Large Language Models (LLMs) hold a wealth of global and interdisciplinary knowledge that promises to break down these information barriers and foster a new wave of scientific discovery. However, the potential of LLMs for scientific discovery has not been formally explored. In this paper, we start by investigating whether LLMs can propose scientific hypotheses. To this end, we construct a dataset consisting of background knowledge and hypothesis pairs from biomedical literature. The dataset is divided into training, seen, and unseen test sets based on the publication date to control visibility. We subsequently evaluate the hypothesis generation capabilities of various top-tier instructed models in zero-shot, few-shot, and fine-tuning settings, including both closed and open-source LLMs. Additionally, we introduce an LLM-based multi-agent cooperative framework with different role designs and external tools to enhance the capabilities related to generating hypotheses. We also design four metrics through a comprehensive review to evaluate the generated hypotheses for both ChatGPT-based and human evaluations. Through experiments and analyses, we arrive at the following findings: 1) LLMs surprisingly generate untrained yet validated hypotheses from testing literature. 2) Increasing uncertainty facilitates candidate generation, potentially enhancing zero-shot hypothesis generation capabilities. These findings strongly support the potential of LLMs as catalysts for new scientific discoveries and guide further exploration. Comment: Instruction Workshop @ NeurIPS 202
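The date-based split described in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the `Pair` record, the `split_by_date` helper, the cutoff date, and the `seen_fraction` parameter are all hypothetical, and the assumption that the seen test set is held out from pre-cutoff papers while the unseen test set contains only post-cutoff papers is one plausible reading of "to control visibility".

```python
from dataclasses import dataclass
from datetime import date, timedelta
import random


@dataclass
class Pair:
    """Hypothetical record: one background/hypothesis pair and its publication date."""
    background: str
    hypothesis: str
    published: date


def split_by_date(pairs, cutoff, seen_fraction=0.2, seed=0):
    """Return (train, seen_test, unseen_test).

    Assumption: items published before `cutoff` may have been visible to a
    model during pre-training, so they feed the training and "seen" test sets;
    items published on or after `cutoff` form the "unseen" test set.
    """
    before = [p for p in pairs if p.published < cutoff]
    unseen_test = [p for p in pairs if p.published >= cutoff]

    # Hold out a random share of the pre-cutoff items as the seen test set.
    rng = random.Random(seed)
    rng.shuffle(before)
    n_seen = int(len(before) * seen_fraction)
    return before[n_seen:], before[:n_seen], unseen_test


# Toy data: 24 synthetic pairs spaced 30 days apart, starting 2022-01-01.
pairs = [
    Pair(f"background {i}", f"hypothesis {i}", date(2022, 1, 1) + timedelta(days=30 * i))
    for i in range(24)
]
train, seen_test, unseen_test = split_by_date(pairs, cutoff=date(2023, 1, 1))
print(len(train), len(seen_test), len(unseen_test))
```

The cutoff would in practice be chosen relative to the training-data horizon of the model under evaluation; the value above is only a placeholder.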

    Robust semi-supervised classification based on data augmented online ELMs with deep features

    One important strategy in semi-supervised learning is to utilize the predicted pseudo labels of unlabeled data to relieve the overdependence of supervised learning algorithms on ground truth. However, the performance of such semi-supervised methods relies heavily on the quality of the pseudo labels. To address this issue, a robust semi-supervised classification method, named data augmented online extreme learning machines (ELMs) with deep features (DF-DAELM), is proposed. This method first extracts features and infers labels for unlabeled data through self-training. Then, with the learned features and inferred labels, two noise-robust shallow classifiers based on data augmentation (i.e., SLI-OELM and CR-OELM) are proposed to eliminate the adverse effects of noise on classifier training. Specifically, inspired by label smoothing, a data augmentation method, SLI-OELM, is designed based on stochastic linear interpolation to improve the robustness of classifiers based on ELMs. Furthermore, based on the smoothing assumption, the proposed CR-OELM utilizes an ℓ₂-norm consistency regularization term to implicitly weight noisy samples. Comprehensive experiments demonstrate that DF-DAELM achieves competitive or even better performance on CIFAR-10/100 and SVHN compared with related state-of-the-art methods. Meanwhile, for the proposed classifiers, experimental results on the MNIST dataset with different noise levels and sample scales demonstrate their superior performance, especially when the sample scale is small (≤ 20 K) and the noise is strong (40% ~ 80%).

    Synergistic organic dye degradation and hydrogen production using Bi2Te3/Te/C single-catalyst nanowires

    Over-consumption of limited fossil fuels has caused serious environmental pollution and a global energy crisis, threatening human life and biodiversity. As an ideal, environmentally friendly renewable energy source, hydrogen can satisfy human clean energy requirements. Therefore, whether hydrogen can be catalytically generated in the wastewater treatment process is a highly meaningful question. Herein, Bi2Te3/Te/C heterojunction nanowires with high specific surface area and rich pore structure were successfully synthesized. The efficient catalytic degradation process is accompanied by the generation of hydrogen. The catalytic degradation of methylene blue and methyl orange was achieved in less than 20 s and 150 s, respectively. Meanwhile, in scaled-up degradation/hydrogen production experiments, fast and efficient H2 production from NaBH4 can be realized in the presence of Bi2Te3/Te/C nanowires. The efficient synergistic organic dye degradation and hydrogen production is attributed to efficient carrier transfer and accumulation at the hetero-interface. In contrast to previous work, rapid degradation of organic dyes and hydrogen production by decomposition of NaBH4 were achieved without the help of high-cost catalysts such as precious metals. This work could provide an alternative pathway for the future degradation of organic matter in synergistic heterogeneous catalytic wastewater treatment and the recovery of by-products including hydrogen.

    The Sixth Visual Object Tracking VOT2018 Challenge Results

    The Visual Object Tracking challenge VOT2018 is the sixth annual tracker benchmarking activity organized by the VOT initiative. Results of over eighty trackers are presented; many are state-of-the-art trackers published at major computer vision conferences or in journals in recent years. The evaluation included the standard VOT and other popular methodologies for short-term tracking analysis and a “real-time” experiment simulating a situation where a tracker processes images as if provided by a continuously running sensor. A long-term tracking subchallenge has been introduced to the set of standard VOT subchallenges. The new subchallenge focuses on long-term tracking properties, namely coping with target disappearance and reappearance. A new dataset has been compiled and a performance evaluation methodology that focuses on long-term tracking capabilities has been adopted. The VOT toolkit has been updated to support both the standard short-term and the new long-term tracking subchallenges. Performance of the tested trackers typically far exceeds standard baselines. The source code for most of the trackers is publicly available from the VOT page. The dataset, the evaluation kit and the results are publicly available at the challenge website (http://votchallenge.net). Funding agencies: Slovenian Research Agency - Slovenia [P2-0214, P2-0094, J2-8175]; Czech Science Foundation / Grant Agency of the Czech Republic [GACR P103/12/G084]; WASP; VR (EMC2); SSF (SymbiCloud); SNIC; AIT Strategic Research Programme 2017 Visua