23 research outputs found

    The Confidence-Competence Gap in Large Language Models: A Cognitive Study

    Large Language Models (LLMs) have acquired ubiquitous attention for their performances across diverse domains. Our study here searches through LLMs' cognitive abilities and confidence dynamics. We dive deep into understanding the alignment between their self-assessed confidence and actual performance. We exploit these models with diverse sets of questionnaires and real-world scenarios and extract how LLMs exhibit confidence in their responses. Our findings reveal intriguing instances where models demonstrate high confidence even when they answer incorrectly. This is reminiscent of the Dunning-Kruger effect observed in human psychology. In contrast, there are cases where models exhibit low confidence with correct answers revealing potential underestimation biases. Our results underscore the need for a deeper understanding of their cognitive processes. By examining the nuances of LLMs' self-assessment mechanism, this investigation provides noteworthy revelations that serve to advance the functionalities and broaden the potential applications of these formidable language models.
    Comment: 19 pages, 8 Figures, to be published in a journal (Journal TBD), All Authors contributed equally and were Supervised by Chandra Dhaka

    Agronomic management and climate change scenario simulations on productivity of rice, maize and wheat in central Nepal using DSSAT ver 4.5 crop model

    Average productivity of rice (3.50 t/ha), maize (2.50 t/ha) and wheat (2.45 t/ha) in Nepal is far below potential productivity, and precise agronomic management and changing climatic scenarios have been reported as the most challenging factors at present. The Cropping System Model (CSM)-Crop Estimation through Resource and Environment Synthesis (CERES) for Rice, Maize and Wheat, embedded in the Decision Support System for Agro-technology Transfer (DSSAT) ver. 4.5, was evaluated using datasets from farmers' field experiments in central Nepal (Terai-Nawalpur and mid-hill-Kaski districts), and the model showed high sensitivity to changes in agronomic management and climate change scenarios. Model calibration was done using the maximum attainable yield treatments for all tested cultivars, while validation used the remaining treatments to predict growth, phenology and yield of all crop cultivars; the results were found to match the observed results perfectly. Further, different agronomic management options and climate change scenarios as advocated by the IPCC for 2020, 2050 and 2080, from a baseline of 1995, were studied to simulate the growth and yield performance of diverse crop cultivars. The hybrids and short-duration cultivars of all three cereals were found to be more affected by climate change than the local and long-duration cultivars. The simulation results for rice, maize and wheat using the DSSAT ver. 4.5 model highlight the utmost importance of developing new climate-ready crop cultivars to feed future generations under the different climate change scenarios suggested by IPCC (2007), and the simulation results should be extrapolated to the major domains of similar agro-ecozones in Nepal. The CSM-CERES model is suggested as a reliable and valid approach for strategic decision support, especially with regard to climate change adaptation measures in Nepal

    Productivity and Profitability Assessment of Drought Tolerant Rice Cultivars under Different Crop Management Practices in Central Terai of Nepal

    Reduction in productivity has led to lower profitability of rice production in Nepal. Proper selection of resource conservation technologies and drought tolerant cultivars are potential strategies for determining the productivity of rice in drought prone areas. Thus, a field experiment was conducted in the central terai of Nepal during 2014 to assess the productivity and profitability of drought tolerant rice cultivars under different crop management practices. The experiment was carried out in a strip-plot design with three replications, consisting of four drought tolerant rice cultivars and three crop management practices. The analyzed data revealed that SRI (System of Rice Intensification) produced significantly higher grain yield (5.28 t ha-1) than the other management practices. The straw yield of SRI (5.12 t ha-1) was also significantly higher than that of the other management practices. The cultivars had no influence on grain yield, but the straw yield was significantly influenced by cultivars, with the highest straw yield in Sukkha-3 (5.21 t ha-1). Similarly, the SRI management practice also had significantly higher gross returns (NRs. 144652 ha-1), net return (NRs. 56647 ha-1) and B:C ratio (1.64:1). Thus, the SRI management practice can be adopted as an adaptation approach for obtaining higher productivity and profitability in the central terai and similar agro-climatic regions of Nepal
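The B:C ratio reported above can be cross-checked from the two return figures. This is a minimal sketch that assumes the common definition of B:C ratio as gross return divided by total cost, with total cost taken as gross return minus net return. The abstract does not state the formula, and the function name `benefit_cost_ratio` is illustrative only.

```python
# Figures reported in the abstract for SRI (NRs. per hectare).
gross_return = 144_652
net_return = 56_647


def benefit_cost_ratio(gross: float, net: float) -> float:
    """Gross return divided by total cost.

    Assumption (not stated in the abstract): total cost = gross - net.
    """
    total_cost = gross - net
    return gross / total_cost


implied_cost = gross_return - net_return
bc_ratio = benefit_cost_ratio(gross_return, net_return)

print(f"Implied total cost: NRs. {implied_cost} per ha")
print(f"B:C ratio: {bc_ratio:.2f}:1")
```

Under that assumption, the implied cost is NRs. 88005 per hectare and the ratio rounds to the reported 1.64:1.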

    Cervical cancer screening in Nepal: Ethical considerations

    © 2015 Gyawali et al. This work is published by Dove Medical Press Limited, and licensed under Creative Commons Attribution – Non Commercial (unported, v3.0) License. The full terms of the License are available at http://creativecommons.org/licenses/by-nc/3.0/. Non-commercial uses of the work are permitted without any further permission from Dove Medical Press Limited, provided the work is properly attributed. Permissions beyond the scope of the License are administered by Dove Medical Press Limited. Information on how to request permission may be found at: http://www.dovepress.com/permissions.php
    Cervical cancer is the leading cause of cancer deaths for women worldwide. Cervical screening and early treatment can help to prevent cervical cancers. Cervical screening programs in Nepal are often associated with a number of socioeconomic, cultural, and ethical challenges. This paper discusses some central ethical challenges in providing cervical cancer screening in the Nepalese context and culture. It is necessary to address these challenges for successful implementation of such screening programs

    Assessment of Yield and Yield Attributing Characters of Hybrid Maize using Nutrient Expert® Maize Model in Eastern Terai of Nepal

    Indiscriminate use of fertilizer and lack of site specific nutrient management technology are the main causes of low maize productivity in Nepal. Thus, field experiments on farmers' fields were conducted on maize to assess the productivity at two sites of Jhapa district, viz. Damak and Gauradaha, using the Nutrient Expert® Maize model from November 2015 to May 2016. The experiment was laid out in a Randomized Complete Block Design consisting of two treatments, viz. NE (Nutrient Expert recommendation) and FFP (Farmer's Fertilizer Practice), with twenty replications. The results revealed significant differences in terms of grain yield, stover yield, biological yield, and yield attributing characters. NE based practices produced higher grain yield (9.22 t ha-1), which was 86.6 percent higher than FFP (4.94 t ha-1). Similarly, higher average cob number m-2 (8.2), average kernel rows cob-1 (14.2), average kernels number row-1 (589.9) and test weight (361.4 g) were recorded in NE based practice. Thus, NE based practice can be adopted for obtaining higher productivity in the eastern terai region of Nepal

    Agronomic performance and genotypic diversity for morphological traits among early maize genotypes

    Detailed information on the genetic diversity between maize germplasm (Zea mays L.) is useful for their systematic and efficient use in breeding programs. Fourteen early maize genotypes were studied to assess their performance and genotypic diversity at Doti, Nepal in 2015. Days to tasseling, days to silking, plant height, ear height, ear length, ear diameter and grain yield differed significantly among genotypes. Genotypes SO3TEY-PO-BM, COMPOL-NIOBP and ACROSS-99402 were found to be higher yielders with earlier maturity. Days to tasseling (0.85), days to silking (0.82), plant height (0.79), ear length (0.71) and ear diameter (0.66) were found to be highly heritable traits. Grain yield (0.39) and ear height (0.47) showed medium heritability, and the remaining traits showed low heritability. High PCV was observed for grain yield (35.10%), number of plants/plot (34.46%), tasseling-silking interval (26.85%), harvested ears/plot (24.45%) and husk cover rating (22.85%), whereas other traits showed medium to low PCV. Grain yield showed high GCV (21.96%), ear height and husk cover had medium GCV, and the remaining traits showed low GCV (<10%). Plant height (r = 0.498), harvested plants/plot (r = 0.412), harvested ears/plot (r = 0.762), ear length (r = 0.472) and ear diameter (r = 0.470) showed significant positive correlation with grain yield. The yield can be improved if selection is applied in favor of those yield components

    Upper Gastrointestinal Bleeding Induced by Gastric Ulcer Secondary to Strongyloidiasis: A Case Report

    Strongyloidiasis, a parasitic infestation by Strongyloides stercoralis, involves the gastrointestinal tract with a spectrum from duodenitis to enterocolitis. However, gastric involvement manifesting as upper gastrointestinal bleeding is an extremely rare presentation of Strongyloides stercoralis infection. Irregular excretion of larvae, unclear symptoms, a paucity of effective diagnostic tools and low parasitic load make it difficult for clinicians to reach the diagnosis of strongyloidiasis. Here, we present a case of upper gastrointestinal bleeding due to a large gastric ulcer whose aetiology was identified, by diagnosis of exclusion, to be Strongyloides stercoralis infection of the gastric region

    May Measurement Month 2017: An analysis of blood pressure screening results in Nepal - South Asia

    Hypertension is the leading risk factor of mortality in Nepal, accounting for ∼33 000 deaths in 2016. However, more than 50% of the hypertensive patients are unaware of their status. We participated in the May Measurement Month 2017 (MMM17) project initiated worldwide by the International Society of Hypertension to raise awareness of the importance of blood pressure (BP) screening. In this paper, we discuss the screening results of MMM17 in Nepal. An opportunistic cross-sectional survey of volunteers aged ≥18 years was carried out in May 2017 following the standard MMM protocol. Data were collected from 18 screening sites in 7 districts covering 5 provinces. Screenings were conducted in health facilities, public places, or participants' homes. Trained volunteers with a health science background and female community health volunteers were mobilized to take part in the screening. A total of 5972 individuals were screened and of 5968 participants, for whom a mean of the 2nd and 3rd readings was available, 1456 (24.4%) participants had hypertension; 908 (16.8%) of those not receiving treatment were hypertensive; and 248 (45.2%) of those being treated had uncontrolled BP. MMM17 is the first nationwide BP screening campaign undertaken in Nepal. Given the suboptimal treatment and control rates identified in the study, there is a strong imperative to scale up hypertension prevention, screening, and management programmes. These results suggest that opportunistic screening can identify significant numbers with hypertension. Mobilization of existing volunteer networks and support of community stakeholders would be necessary to improve the overall impact and sustainability of future screening programmes

    A New Approach for Identifying Safety Improvement Sites on Rural Highways: A Validation Study

    The research presented in this paper examines a new proposed approach for identifying safety improvement sites on rural highways. Unlike conventional approaches, the proposed approach does not require crash history, but rather utilizes classified variables for traffic volume, geometric features, and roadside characteristics that do not require access to exact data or extensive technical expertise. The research validates the performance of the proposed approach using field data from a large sample of rural two-lane highway segments in the state of Oregon including traffic, roadway, and crash data. A mathematical model for the prediction of the EB expected number of crashes using multivariate regression analysis is developed and used as the network screening criterion. The model’s independent variables include roadway geometry, roadside characteristics, and traffic exposure, while the dependent variable is the EB expected number of crashes. Using observed crash history as a reference, the performance of the proposed approach was compared to two of the well-established methods in practice, namely, the Empirical Bayes (EB) and the potential for safety improvement (PSI) methods. The study results suggest that by using crash density for highway segments, the performance of the proposed method was lower than that of the EB and PSI methods. This is despite the high R-square value of the predictive model used in the proposed method. However, when using crash frequencies for highway segments, the performance of the proposed method was found comparable to the well-established EB and PSI methods

    Do Large Language Models Show Human-like Biases? Exploring Confidence—Competence Gap in AI

    This study investigates self-assessment tendencies in Large Language Models (LLMs), examining if patterns resemble human cognitive biases like the Dunning–Kruger effect. LLMs, including GPT, BARD, Claude, and LLaMA, are evaluated using confidence scores on reasoning tasks. The models provide self-assessed confidence levels before and after responding to different questions. The results show cases where high confidence does not correlate with correctness, suggesting overconfidence. Conversely, low confidence despite accurate responses indicates potential underestimation. The confidence scores vary across problem categories and difficulties, reducing confidence for complex queries. GPT-4 displays consistent confidence, while LLaMA and Claude demonstrate more variations. Some of these patterns resemble the Dunning–Kruger effect, where incompetence leads to inflated self-evaluations. While not conclusively evident, these observations parallel this phenomenon and provide a foundation to further explore the alignment of competence and confidence in LLMs. As LLMs continue to expand their societal roles, further research into their self-assessment mechanisms is warranted to fully understand their capabilities and limitations
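The overconfidence and underestimation patterns described above can be illustrated with a small calculation. This is a rough sketch, not the study's method: the records are invented, the confidence scale of 0 to 1 and the thresholds `high` and `low` are assumptions, and the names `confidence_gap` and `flag_mismatches` are hypothetical.

```python
from statistics import mean

# Hypothetical records: (self-reported confidence in [0, 1], answer was correct?)
records = [
    (0.90, True),
    (0.95, False),
    (0.40, True),
    (0.80, False),
    (0.60, True),
    (0.30, False),
]


def confidence_gap(rows):
    """Mean confidence minus accuracy; a positive value suggests overconfidence."""
    accuracy = mean(1.0 if correct else 0.0 for _, correct in rows)
    avg_confidence = mean(conf for conf, _ in rows)
    return avg_confidence - accuracy


def flag_mismatches(rows, high=0.8, low=0.5):
    """Split out confident-but-wrong and unsure-but-right answers.

    The thresholds are arbitrary assumptions chosen for illustration.
    """
    overconfident = [r for r in rows if r[0] >= high and not r[1]]
    underconfident = [r for r in rows if r[0] <= low and r[1]]
    return overconfident, underconfident


gap = confidence_gap(records)
overconfident, underconfident = flag_mismatches(records)
print(f"confidence gap: {gap:+.3f}")
print(f"overconfident answers: {len(overconfident)}")
print(f"underconfident answers: {len(underconfident)}")
```

A gap near zero would indicate that average confidence tracks accuracy, while the two flagged lists correspond to the high-confidence errors and low-confidence correct answers the abstract mentions.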