    Is GPT-4 a Good Data Analyst?

    As large language models (LLMs) have demonstrated powerful capabilities across many domains and tasks, including context understanding, code generation, language generation, and data storytelling, many data analysts are concerned that their jobs will be replaced by AI. This controversial topic has drawn considerable public attention, yet opinions remain divergent and no definitive conclusion has been reached. Motivated by this, we raise the research question "Is GPT-4 a good data analyst?" and aim to answer it through head-to-head comparative studies. Specifically, we treat GPT-4 as a data analyst performing end-to-end data analysis on databases from a wide range of domains, and we propose a framework with carefully designed prompts to guide GPT-4 through the experiments. We also design several task-specific evaluation metrics to systematically compare GPT-4 against several professional human data analysts. Experimental results show that GPT-4 achieves performance comparable to humans. We also provide in-depth discussion of our results to shed light on the further studies needed before one can conclude that GPT-4 can replace data analysts. Comment: 11 pages, 2 figures
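
    The paper's actual prompts are not reproduced here; as a rough illustration of the prompt-then-execute pattern such a framework relies on, a minimal sketch (the schema, question, prompt wording, and sales.db file are all hypothetical):

```python
# Minimal sketch of the prompt-then-execute pattern: GPT-4 writes SQL
# for a question about a known schema, and we run it locally.
# The schema, question, prompt wording, and database file are
# illustrative only, not the authors' actual setup.
import sqlite3
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

schema = "sales(region TEXT, month TEXT, revenue REAL)"
question = "Which region had the highest total revenue?"

prompt = (
    f"You are a data analyst. Given the SQLite table {schema}, "
    f"write a single SQL query that answers: {question} "
    "Return only the SQL, with no explanation or formatting."
)
resp = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": prompt}],
)
sql = resp.choices[0].message.content.strip()  # assumes bare SQL comes back

conn = sqlite3.connect("sales.db")   # hypothetical database file
print(conn.execute(sql).fetchall())  # the analyst step: inspect the result
```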

    The neurological and non-neurological roles of the primary microcephaly-associated protein ASPM

    Primary microcephaly (MCPH) is a neurological disorder characterized by small brain size that results in numerous developmental problems, including intellectual disability, motor and speech delays, and seizures. To date, over 30 MCPH-causing genes (MCPHs) have been identified. Among these, MCPH5, which encodes the abnormal spindle-like microcephaly-associated protein (ASPM), is the most frequently mutated. ASPM regulates mitotic events, cell proliferation, the replication stress response, DNA repair, and tumorigenesis. Moreover, using a data-mining approach, we have confirmed that high ASPM expression correlates with poor prognosis in several types of tumors. Here, we summarize the neurological and non-neurological functions of ASPM and provide insight into its implications for the diagnosis and treatment of MCPH and cancer.
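
    The data-mining claim above amounts to an expression-stratified survival comparison; a minimal sketch of such a check (the cohort file and column names are hypothetical, not the authors' pipeline):

```python
# Hedged sketch of an expression-vs-prognosis check: split tumor
# samples at the median ASPM expression level and compare survival
# curves. The dataframe and its column names are hypothetical.
import pandas as pd
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

df = pd.read_csv("tumor_cohort.csv")  # hypothetical: expression + follow-up
high = df[df["ASPM_expr"] >= df["ASPM_expr"].median()]
low = df[df["ASPM_expr"] < df["ASPM_expr"].median()]

# Log-rank test for a survival difference between the two strata.
result = logrank_test(
    high["survival_months"], low["survival_months"],
    event_observed_A=high["death_event"], event_observed_B=low["death_event"],
)
print(f"log-rank p = {result.p_value:.4f}")

# Kaplan-Meier curves for visual comparison.
kmf = KaplanMeierFitter()
kmf.fit(high["survival_months"], high["death_event"], label="ASPM high")
ax = kmf.plot_survival_function()
kmf.fit(low["survival_months"], low["death_event"], label="ASPM low")
kmf.plot_survival_function(ax=ax)
```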

    Can ChatGPT-like Generative Models Guarantee Factual Accuracy? On the Mistakes of New Generation Search Engines

    Although large conversational AI models such as OpenAI's ChatGPT have demonstrated great potential, we question whether such models can guarantee factual accuracy. Recently, technology companies such as Microsoft and Google have announced new services that aim to combine search engines with conversational AI. However, we have found numerous mistakes in the public demonstrations, suggesting that we should not readily trust the factual claims of these AI models. Rather than criticizing specific models or companies, we hope to call on researchers and developers to improve the transparency and factual correctness of AI models.

    Competing for Shareable Arms in Multi-Player Multi-Armed Bandits

    Competition for shareable and limited resources among strategic agents has long been studied. In reality, agents often must learn about and maximize the rewards of the resources at the same time. To design an individualized competing policy, we model the competition between agents in a novel multi-player multi-armed bandit (MPMAB) setting in which players are selfish and aim to maximize their own rewards. In addition, when several players pull the same arm, we assume that they share the arm's reward equally in expectation. Under this setting, we first analyze the Nash equilibrium when the arms' rewards are known. Subsequently, we propose a novel Selfish MPMAB with Averaging Allocation (SMAA) approach based on this equilibrium. We show theoretically that SMAA achieves a good regret guarantee for each player when all players follow the algorithm. We further establish that no single selfish player can significantly increase their rewards by deviating, nor can they detrimentally affect other players' rewards without incurring substantial losses themselves. We finally validate the effectiveness of the method in extensive synthetic experiments. Comment: ICML 2023
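
    A toy illustration of the reward-sharing rule and the equilibrium intuition behind it (the arm means and player counts are made up, and this is not the SMAA algorithm itself):

```python
# Toy illustration of the reward-sharing rule in the MPMAB setting
# described above: when k players pull the same arm, each receives
# the arm's reward divided by k in expectation. Values are invented
# to show why spreading out across arms can be an equilibrium.
import numpy as np

rng = np.random.default_rng(0)
arm_means = np.array([0.9, 0.6])  # hypothetical Bernoulli arm means

def expected_share(mean, k):
    """Per-player expected reward when k players share one arm."""
    return mean / k

# Two players crowding the best arm vs. splitting across arms:
print("both on arm 0:", expected_share(arm_means[0], 2))   # 0.45 each
print("split 0 and 1:", expected_share(arm_means[0], 1),
      expected_share(arm_means[1], 1))                      # 0.9 and 0.6

# Monte Carlo check of the sharing rule for k = 2 on arm 0:
draws = rng.binomial(1, arm_means[0], size=100_000) / 2
print("simulated per-player share:", draws.mean())  # ~0.45
```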

    Unlocking Temporal Question Answering for Large Language Models Using Code Execution

    Large language models (LLMs) have made significant progress in natural language processing (NLP) and are used extensively in various applications. Recent work, such as chain-of-thought (CoT) prompting, has shown that intermediate reasoning steps can improve the performance of LLMs on complex reasoning tasks such as math problems and symbolic question answering. However, we observe that LLMs struggle with temporal reasoning: our preliminary experiments show that generating intermediate reasoning steps does not always improve performance on complex temporal question-answering tasks. We therefore propose a novel framework that combines the extraction capability of LLMs with the logical reasoning capability of a Python solver to tackle this issue. Extensive experiments and analysis demonstrate the effectiveness of our framework in handling intricate time-bound reasoning tasks.
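
    As a rough illustration of this division of labor, a minimal sketch in which hard-coded facts stand in for the LLM's extraction step (the question and field names are invented; the paper's actual interface is not reproduced here):

```python
# Sketch of the extract-then-compute division of labor: an LLM turns
# the question's time expressions into structured values, and plain
# Python performs the date arithmetic. The "facts" dict below stands
# in for the LLM's extraction output.
from datetime import date

# Hypothetical LLM extraction for: "How old was the company when it
# launched its second product?"
facts = {"founded": date(2011, 3, 14), "second_launch": date(2016, 9, 2)}

delta = facts["second_launch"] - facts["founded"]
years, days = divmod(delta.days, 365)  # coarse year/day split
print(f"About {years} years and {days} days old.")
```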

    Serum Creatinine Level: A Supplemental Index to Distinguish Duchenne Muscular Dystrophy from Becker Muscular Dystrophy

    Background. To improve the assessment of dystrophinopathy, the aim of this study was to determine whether serum creatinine (Crn) level reflects disease severity. Methods. Biochemical, Vignos score, and genetic data were collected from 212 boys with dystrophinopathy. Results. Serum Crn level had a strong inverse correlation with Vignos score by simple correlation (r = −0.793) and by partial correlation analysis after adjustment for age, height, and weight (r = −0.791; both P < 0.01). Serum Crn level was significantly higher in patients with in-frame than out-of-frame mutations (Z = −4.716, P < 0.01) and in Becker muscular dystrophy (BMD) patients than in Duchenne muscular dystrophy (DMD) patients at ages 4, 5, 7, and 9 yr (all P < 0.0125). After adjusting for age, height, and weight, BMD patients still had a significantly higher serum Crn level than DMD patients (F = 7.140, t = 6.277, P < 0.01). Conclusions. Serum Crn level reflected disease severity and may serve as a supplemental index to distinguish DMD from BMD in clinical practice.
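
    For readers unfamiliar with the partial-correlation adjustment reported above, a minimal residual-based sketch of that computation (the cohort file and column names are hypothetical, not the study's actual pipeline):

```python
# Sketch of a partial correlation: correlate creatinine with Vignos
# score after regressing age, height, and weight out of both
# variables. The dataframe and its column names are hypothetical.
import numpy as np
import pandas as pd
from scipy import stats

df = pd.read_csv("dystrophinopathy.csv")  # hypothetical cohort file
covars = df[["age", "height", "weight"]].to_numpy(dtype=float)
X = np.column_stack([np.ones(len(df)), covars])  # design matrix + intercept

def residualize(y):
    """Residuals of y after ordinary least squares on the covariates."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ beta

r, p = stats.pearsonr(residualize(df["crn"].to_numpy(dtype=float)),
                      residualize(df["vignos"].to_numpy(dtype=float)))
print(f"partial r = {r:.3f}, P = {p:.3g}")
```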

    Single Fasting Plasma Glucose Versus 75-g Oral Glucose-Tolerance Test in Prediction of Adverse Perinatal Outcomes: A Cohort Study

    Background: There remains uncertainty regarding whether a single fasting glucose measurement is sufficient to predict the risk of adverse perinatal outcomes. Methods: We included 12,594 pregnant women who underwent a 75-g oral glucose-tolerance test (OGTT) at 22–28 weeks' gestation in the Born in Guangzhou Cohort Study, China. Outcomes were a large-for-gestational-age (LGA) baby, cesarean section, and spontaneous preterm birth. We calculated the areas under the receiver operating characteristic curves (AUCs) to assess the capacity of OGTT glucose values to predict adverse outcomes, and compared the AUCs of the different OGTT components. Results: 1325 women (10.5%) had an LGA baby. Glucose measurements were linearly associated with LGA, with the strongest association for fasting glucose (odds ratio 1.37, 95% confidence interval 1.30–1.45). Weaker associations were observed for cesarean section and spontaneous preterm birth. Fasting glucose alone had discriminative power for predicting LGA comparable to the combination of fasting, 1-h, and 2-h glucose values during the OGTT (AUC 0.611 vs. 0.614, P = 0.166). LGA risk was consistently increased in women with abnormal fasting glucose (≥5.1 mmol/l), irrespective of 1-h or 2-h glucose levels. Conclusions: A single fasting glucose measurement performs comparably to the 75-g OGTT in predicting the risk of having an LGA baby.
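
    As a rough illustration of the AUC comparison described above, a minimal sketch contrasting fasting glucose alone with all three OGTT values as predictors (the data file and column names are hypothetical; the study's actual statistical methods are not reproduced here):

```python
# Sketch of an AUC comparison: logistic regression on fasting glucose
# alone vs. on all three OGTT values, scored by ROC AUC for the LGA
# outcome. The data file and its column names are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("ogtt_cohort.csv")  # hypothetical cohort file
y = df["lga"]  # 1 if large-for-gestational-age baby, else 0

for name, cols in [("fasting only", ["glu_0h"]),
                   ("full OGTT", ["glu_0h", "glu_1h", "glu_2h"])]:
    X_tr, X_te, y_tr, y_te = train_test_split(
        df[cols], y, test_size=0.3, random_state=0, stratify=y)
    model = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```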

    Qwen Technical Report

    Large language models (LLMs) have revolutionized the field of artificial intelligence, enabling natural language processing tasks that were previously thought to be exclusive to humans. In this work, we introduce Qwen, the first installment of our large language model series. Qwen is a comprehensive series that encompasses distinct models with varying parameter counts: Qwen, the base pretrained language models, and Qwen-Chat, chat models finetuned with human alignment techniques. The base models consistently demonstrate superior performance across a multitude of downstream tasks, and the chat models, particularly those trained with Reinforcement Learning from Human Feedback (RLHF), are highly competitive. The chat models possess advanced tool-use and planning capabilities for building agent applications, showing impressive performance, even compared with larger models, on complex tasks such as operating a code interpreter. Furthermore, we have developed coding-specialized models, Code-Qwen and Code-Qwen-Chat, as well as mathematics-focused models, Math-Qwen-Chat, all built on the base language models. These models significantly outperform open-source models and fall only slightly behind proprietary models. Comment: 59 pages, 5 figures
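
    For reference, a minimal usage sketch of a released Qwen chat checkpoint via Hugging Face transformers (the Qwen/Qwen-7B-Chat ID and the custom chat() helper follow the public release; other sizes and later releases may differ):

```python
# Minimal usage sketch for a released Qwen chat checkpoint via
# Hugging Face transformers. The model ID and the custom .chat()
# helper follow the public Qwen-7B-Chat release; trust_remote_code
# is required because the chat logic ships with the checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "Qwen/Qwen-7B-Chat", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True).eval()

response, history = model.chat(
    tokenizer, "Summarize RLHF in one sentence.", history=None)
print(response)
```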