34 research outputs found

    Which LLM to Play? Convergence-Aware Online Model Selection with Time-Increasing Bandits

    Web-based applications such as chatbots, search engines and news recommendations continue to grow in scale and complexity with the recent surge in the adoption of LLMs. Online model selection has thus garnered increasing attention due to the need to choose the best model among a diverse set while balancing task reward and exploration cost. Organizations face decisions such as whether to employ a costly API-based LLM or a locally finetuned small LLM, weighing cost against performance. Traditional selection methods often evaluate every candidate model before choosing one, which is becoming impractical given the rising costs of training and finetuning LLMs. Moreover, it is undesirable to allocate excessive resources towards exploring poor-performing models. While some recent works leverage online bandit algorithms to manage this exploration-exploitation trade-off in model selection, they tend to overlook the increasing-then-converging trend in model performance as a model is iteratively finetuned, leading to less accurate predictions and suboptimal model selections. In this paper, we propose a time-increasing bandit algorithm, TI-UCB, which effectively predicts the increase in model performance due to finetuning and efficiently balances exploration and exploitation in model selection. To further capture the converging points of models, we develop a change detection mechanism that compares consecutive increase predictions. We theoretically prove that our algorithm achieves a logarithmic regret upper bound in a typical increasing bandit setting, which implies a fast convergence rate. The advantage of our method is also empirically validated through extensive experiments on classification model selection and online selection of LLMs. Our results highlight the importance of utilizing the increasing-then-converging pattern for more efficient and economical model selection in the deployment of LLMs. Comment: Accepted by WWW'24 (Oral)
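
For orientation, the kind of online bandit baseline the abstract contrasts itself with can be sketched as plain UCB1. This is not the authors' TI-UCB: it assumes stationary rewards and does not model the increasing-then-converging trend or any change detection. The two "models" and their success rates (0.3 and 0.7) are hypothetical stand-ins.

```python
import math
import random

def ucb1(reward_fns, horizon, seed=0):
    """Standard UCB1 over a list of reward functions (hypothetical candidate models)."""
    rng = random.Random(seed)
    n_arms = len(reward_fns)
    counts = [0] * n_arms      # number of times each arm was played
    means = [0.0] * n_arms     # running mean reward of each arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1        # play every arm once before using the index
        else:
            # Pick the arm with the largest mean plus exploration bonus.
            arm = max(
                range(n_arms),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = reward_fns[arm](rng)
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean update
    return counts, means

# Hypothetical models with fixed (non-increasing) success probabilities.
arms = [
    lambda rng: float(rng.random() < 0.3),
    lambda rng: float(rng.random() < 0.7),
]
counts, means = ucb1(arms, horizon=2000)
```

Because the reward means here are fixed, UCB1 concentrates its plays on the better arm; the abstract's point is that this assumption breaks down when a model's performance keeps rising during finetuning.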

    Automated Quantification of Traffic Particulate Emissions via an Image Analysis Pipeline

    Traffic emissions are known to contribute significantly to air pollution around the world, especially in heavily urbanized cities such as Singapore. It has previously been shown that particulate pollution along major roadways exhibits a strong correlation with increased traffic during peak hours, and that reductions in traffic emissions can lead to better health outcomes. However, in many instances, obtaining proper counts of vehicular traffic remains manual and extremely laborious. This restricts one's ability to carry out longitudinal monitoring over extended periods, for example, when trying to understand the efficacy of intervention measures such as new traffic regulations (e.g. car-pooling) or for computational modelling. Hence, in this study, we propose and implement an integrated machine learning pipeline that uses traffic images to obtain vehicular counts that can be easily integrated with other measurements to facilitate various studies. We verify the utility and accuracy of this pipeline on an open-source dataset of traffic images obtained for a location in Singapore and compare the obtained vehicular counts with collocated particulate measurement data obtained over a 2-week period in 2022. The roadside particulate emission is observed to correlate well with the obtained vehicular counts, with a correlation coefficient of 0.93, indicating that this method can indeed serve as a quick and effective correlate of particulate emissions.
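
The correlation step reported above can be illustrated with a minimal sketch of the Pearson correlation coefficient. The abstract does not say which correlation measure was used, so Pearson is an assumption here, and the hourly vehicle counts and particulate readings below are made-up illustrative values, not data from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical hourly vehicle counts and particulate readings (arbitrary units).
vehicle_counts = [120, 340, 560, 610, 480, 300, 150]
particulates = [11.0, 18.5, 27.0, 30.2, 24.1, 17.0, 12.3]
r = pearson(vehicle_counts, particulates)
```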

    Robust estimation of bacterial cell count from optical density

    Optical density (OD) is widely used to estimate the density of cells in liquid culture, but cannot be compared between instruments without a standardized calibration protocol and is challenging to relate to actual cell count. We address this with an interlaboratory study comparing three simple, low-cost, and highly accessible OD calibration protocols across 244 laboratories, applied to eight strains of constitutive GFP-expressing E. coli. Based on our results, we recommend calibrating OD to estimated cell count using serial dilution of silica microspheres, which produces highly precise calibration (95.5% of residuals <1.2-fold), is easily assessed for quality control, also assesses instrument effective linear range, and can be combined with fluorescence calibration to obtain units of Molecules of Equivalent Fluorescein (MEFL) per cell, allowing direct comparison and data fusion with flow cytometry measurements: in our study, fluorescence per cell measurements showed only a 1.07-fold mean difference between plate reader and flow cytometry data
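
The idea of calibrating OD against a serial dilution of particles with known counts can be sketched as a proportional fit. This is a simplified illustration, not the study's protocol: the starting particle count, the two-fold dilution factor, the OD readings, and the assumption that all points lie in the instrument's linear range are hypothetical.

```python
def fit_through_origin(od, counts):
    """Least-squares slope k for the model counts = k * od (no intercept)."""
    return sum(o * c for o, c in zip(od, counts)) / sum(o * o for o in od)

def fold_error(predicted, actual):
    """Symmetric fold difference, always >= 1."""
    return max(predicted / actual, actual / predicted)

# Hypothetical two-fold serial dilution starting from 3e8 particles per well.
particles = [3.0e8 / 2 ** k for k in range(8)]
# Hypothetical OD readings: proportional to particle count with small deviations.
noise = [1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97, 1.00]
od_readings = [p * 2.0e-9 * n for p, n in zip(particles, noise)]

particles_per_od = fit_through_origin(od_readings, particles)
fold_errors = [fold_error(particles_per_od * o, p)
               for o, p in zip(od_readings, particles)]

# Convert a new (hypothetical) OD reading into an estimated cell count.
estimated_cells = particles_per_od * 0.25
```

Reporting residuals as fold errors mirrors the way the abstract states calibration precision (residuals below 1.2-fold).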

    ERTool: A Python Package for Efficient Implementation of the Evidential Reasoning Approach for Multi-Source Evidence Fusion

    Background: Multi-source evidence fusion aims to process and combine evidence from different sources to support rational and reliable decision-making. The evidential reasoning (ER) approach is a useful method for dealing with uncertain information from multiple sources. It has been widely used in business analytics, healthcare management, and other fields for optimal decision-making. However, computerized implementation of the ER approach usually requires considerable expertise and effort. At present, some ER-based computerized tools, such as the intelligent decision system (IDS), have been developed by professionals to provide decision support. Nevertheless, IDS is not open source, and its user interfaces are somewhat complicated for non-professional users. The lack of a free-to-access and easy-to-use computerized tool limits the application of ER. Methods: We designed and developed a Python package that efficiently implements the ER approach for multi-source evidence fusion. Building on it, we created an online web-based system that provides not only real-time evidence fusion but also visual illustrations of the combined results. Finally, a comparison study between the Python package and IDS was conducted. Results: A Python package, ERTool, was developed to implement the ER approach automatically and efficiently. The online version of ERTool provides a more convenient way to handle evidence fusion tasks. Conclusions: ERTool, which is compatible with Python 3 and can be installed through the Python Package Index at https://pypi.org/project/ERTool/, was developed to implement the ER approach. ERTool has advantages in easy accessibility, clean interfaces, and high computing efficiency, making it a key tool for researchers and practitioners in multiple evidence-based decision-making. It helps bridge the gap between the algorithmic ER approach and its practical application and facilitates its widespread adoption in general decision-making contexts.

    Deliver Bioinformatics Services in Public Cloud: Challenges and Research Framework

    Bioinformatics is a developing interdisciplinary science that applies information technologies to biological research. Techniques from this emerging field have shown great potential in many business areas, including drug design and agriculture. Meanwhile, this new computational field has also become one of the largest consumers of computational power, as analyses in bioinformatics are often extremely computationally intensive or data intensive. Although several projects have tentatively explored deploying bioinformatics applications to cloud environments, the deployment is ad hoc and restricted to a single private cloud environment. Moreover, the complex and varied demands of bench biologists and bioinformaticians also bring new challenges to bioinformatics cloud development. In this paper, we first identify the key participants and their interactions in a public bioinformatics cloud environment, where bioinformatic analyses are consumed as services on top of a cloud infrastructure. After that, we propose a research framework to discuss the domain-specific technical challenges in delivering such a solution. Finally, we summarize the existing related research efforts based on our framework and introduce our ongoing Web Lab project. © 2011 IEEE.

    Electrocatalytic NAD(P)H regeneration for biosynthesis

    Its highly efficient chemoselectivity, stereoselectivity, and regioselectivity make enzyme catalysis an ideal pathway for synthesizing various chemicals across a broad range of applications. However, enzyme cofactors are necessary but expensive, and the converted state of the cofactor does not favor the forward direction of the reaction. Cofactor regeneration using electrochemical methods has the advantages of simple operation, low cost, easy process monitoring, and easy product separation, and the electrical energy is green and sustainable. Therefore, bioelectrocatalysis, which combines electrochemical cofactor regeneration with enzymatic catalysis, has great potential in synthesis. In this review, we detail the mechanism of cofactor regeneration and categorize the common electron mediators and enzymes used in cofactor regeneration. The reaction types and recent progress in electrochemically coupled enzymatic catalysis are summarized. The main challenges of such electroenzymatic catalysis are pointed out, and future developments in this field are anticipated.

    Noise exposure in occupational setting associated with elevated blood pressure in China

    Background: Hypertension is the primary extra-auditory adverse outcome of occupational noise exposure. This study investigated the associations of noise exposure in an occupational setting with blood pressure and risk of hypertension. Methods: A total of 1,390 occupational noise-exposed workers and 1,399 frequency-matched non-noise-exposed subjects were recruited from cross-sectional surveys of an occupational noise-exposed population and the general population, respectively. Blood pressure was measured using a mercury sphygmomanometer following a standard protocol. Multiple logistic regression was used to calculate the odds ratio (OR) and 95% confidence interval (CI) of noise exposure, adjusted for potential confounders. Results: Noise-exposed subjects had significantly higher levels of systolic blood pressure (SBP) (125.1 ± 13.9 mm Hg) and diastolic blood pressure (DBP) (77.6 ± 10.7 mm Hg) than control subjects (SBP: 117.2 ± 15.7 mm Hg, DBP: 70.0 ± 10.5 mm Hg) (P  0.05). Conclusions: Occupational noise exposure was associated with higher levels of SBP and DBP and with an increased risk of hypertension. These findings indicate that effective and feasible measures should be implemented to reduce the risk of hypertension caused by occupational noise exposure.
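
As a rough illustration of the OR and 95% CI quantities mentioned in the Methods, the sketch below computes an unadjusted odds ratio with a Wald (log-scale) confidence interval from a 2x2 table. It is not the study's analysis, which used multiple logistic regression with adjustment for confounders, and the cell counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Unadjusted odds ratio and Wald 95% CI from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) for a 2x2 table.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: hypertension among exposed vs. unexposed subjects.
or_value, ci_low, ci_high = odds_ratio_ci(40, 60, 20, 80)
```

A confidence interval whose lower bound stays above 1 is the usual indication that the exposure is associated with higher odds of the outcome.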

    Innovative Materials for Energy Storage and Conversion

    Metal chalcogenides (MCs) for sodium-ion batteries (SIBs) have gained increasing attention owing to their low cost and high theoretical capacity. However, their poor electrochemical stability and slow kinetics hinder their practical application as anodes for SIBs. Hence, various strategies have been used to solve these problems, such as dimension reduction, composite formation, doping functionalization, morphology control, coating encapsulation, and electrolyte modification. In this work, the recent progress of MCs as electrodes for SIBs is comprehensively reviewed. The review covers the synthesis methods, modification strategies, and corresponding basic reaction mechanisms of MCs with layered and non-layered structures. Finally, the challenges, potential solutions, and future prospects of metal chalcogenides as SIB anode materials are also discussed.