25 research outputs found

    Demand in the electricity market: analysis using big data

    The traditional business model of energy companies has been changing in recent years. The introduction of smart meters has led to an exponential increase in the volume of data available, and analysis of these data can help find consumption patterns among electricity customers to reduce costs and protect the environment. Power plants generate electricity to cover peak consumption at specific times. A set of techniques called “demand response” tries to solve this problem using artificial intelligence proposals. This document proposes a method for processing large volumes of data such as those generated by smart meters. Big data techniques are used for the preprocessing as well as for the optimization and execution of this analysis: specifically, a distributed version of the k-means algorithm and several internal clustering validation indices for big data in Spark. The source data correspond to the consumption of electricity customers in Bogota, Colombia during the year 2018. The analysis carried out in this study helps characterize consumers. This greater knowledge about consumer habits and types of customers can enhance the work of utilities.
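To make the clustering step in the abstract above concrete, the following is a minimal single-machine sketch of k-means followed by one internal validation index (the mean silhouette). It is not the distributed Spark version the abstract describes, and the abstract does not say which indices were used; the function names and the tiny `profiles` dataset (morning, afternoon, evening consumption in kWh) are hypothetical and exist only for illustration.

```python
import random
from math import dist


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct points as initial centroids
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [min(range(k), key=lambda c: dist(p, centroids[c])) for p in points]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:  # converged
            break
        centroids = new_centroids
    return labels, centroids


def silhouette(points, labels):
    """Mean silhouette width; assumes at least two non-empty clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(sum(dist(p, q) for q in other) / len(other)
                for m, other in clusters.items() if m != lab)
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical consumption profiles (kWh): evening-heavy vs. daytime-heavy customers.
profiles = [
    (0.2, 0.3, 0.9), (0.25, 0.35, 1.0), (0.3, 0.3, 0.8),
    (1.5, 1.6, 0.4), (1.4, 1.7, 0.5), (1.6, 1.5, 0.45),
]
labels, centroids = kmeans(profiles, k=2)
print(labels, round(silhouette(profiles, labels), 3))
```

A silhouette close to 1 suggests compact, well-separated clusters, so comparing the score across several values of `k` is one common way to choose the number of customer groups.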

    Intelligent and Distributed Data Warehouse for Student’s Academic Performance Analysis

    In the academic world, a large amount of data is handled each day, ranging from students’ assessments to their socio-economic data. In order to analyze this historical information, an interesting alternative is to implement a Data Warehouse. However, Data Warehouses are not able to perform predictive analysis by themselves, so machine intelligence techniques can be used for sorting, grouping, and predicting based on historical information to improve the analysis quality. This work describes a Data Warehouse architecture to carry out an academic performance analysis of students.

    ChatGPT and Bard Responses to Polarizing Questions

    Recent developments in natural language processing have demonstrated the potential of large language models (LLMs) to improve a range of educational and learning outcomes. Of recent chatbots based on LLMs, ChatGPT and Bard have made it clear that artificial intelligence (AI) technology will have significant implications for the way we obtain and search for information. However, these tools sometimes produce text that is convincing but incorrect, known as hallucinations. As such, their use can distort scientific facts and spread misinformation. To counter polarizing responses on these tools, it is critical to provide an overview of such responses so stakeholders can determine which topics tend to produce more contentious responses -- key to developing targeted regulatory policy and interventions. In addition, there currently exists no annotated dataset of ChatGPT and Bard responses around possibly polarizing topics, central to the above aims. We address the indicated issues through the following contribution: Focusing on highly polarizing topics in the US, we created and described a dataset of ChatGPT and Bard responses. Broadly, our results indicated a left-leaning bias for both ChatGPT and Bard, with Bard more likely to provide responses around polarizing topics. Bard seemed to have fewer guardrails around controversial topics, and appeared more willing to provide comprehensive and somewhat human-like responses. Bard may thus be more likely to be abused by malicious actors. Stakeholders may utilize our findings to mitigate misinformative and/or polarizing responses from LLMs.

    Forecast of the demand for hourly electric energy by artificial neural networks

    Obtaining an accurate forecast of energy demand is fundamental to supporting the various decision processes of a country's electricity service agents. For market operators, greater precision in short-term load forecasting implies more efficient scheduling of electricity generation resources, which means a reduction in costs. In the long term, it constitutes a main indicator for the generation of investment signals for future installed capacity. This research proposes a forecasting model for hourly electrical energy demand in Bogota, Colombia over a full week, using Artificial Neural Networks.

    Data mining to identify risk factors associated with university students dropout

    This paper presents the identification of university student dropout patterns by means of data mining techniques. The database consists of a series of questionnaires and interviews conducted with students from several universities in Colombia. The information was processed with the Weka software following the Knowledge Extraction Process methodology, with the purpose of facilitating the interpretation of results and finding useful knowledge about the students. The partial results of data mining processing on the information about the cohorts of Industrial Engineering students from 2016 to 2018 are analyzed and discussed, finding relationships between family, economic, and academic issues that indicate a probable dropout risk in students with common behaviors. These relationships provide sufficient and appropriate information for the decision-making process in addressing university dropout.
    Universidad Peruana de Ciencias Aplicadas, Universidad de la Costa, Universidad Libre Seccional Barranquilla, Corporación Universitaria Latinoamericana

    Dropout-permanence analysis of university students using data mining

    Dropout is a rejection method present in every educational system, related to the various selection processes, academic performance, and the efficiency of the system in general; that is, it is the result of the combination and effect of different variables. In this sense, the dropout of university students related to their academic performance has been a matter of concern for several years. Academic information is analyzed in order to identify factors that influence student dropout at the University of Mumbai, India, by using a data mining technique. The data source contains information provided at entrance (personal and educational background) and information generated during the study period. The data selection and cleansing are performed using different criteria for the representation and implementation of classification algorithms such as decision trees, Bayesian networks, and rules. The following factors are identified as influential variables in dropout: approved courses, quantity and results of attended courses, and the origin and age at entry of the student. Through this process, it was possible to identify the attributes that characterize the dropout cases and their relationship with academic performance, especially in the first year of the degree program.

    Megafaunal Community Structure of Andaman Seamounts Including the Back-Arc Basin – A Quantitative Exploration from the Indian Ocean

    Species-rich benthic communities have been reported from some seamounts, predominantly from the Atlantic and Pacific Oceans, but the fauna and habitats on Indian Ocean seamounts are still poorly known. This study focuses on two seamounts, a submarine volcano (cratered seamount – CSM) and a non-volcano (SM2) in the Andaman Back-arc Basin (ABB), and the basin itself. The main purpose was to explore and generate regional biodiversity data from the summit and flank (upper slope) of the Andaman seamounts for comparison with other seamounts worldwide. We also investigated how substratum types affect the megafaunal community structure along the ABB. Underwater video recordings from TeleVision guided Gripper (TVG) lowerings were used to describe the benthic community structure along the ABB and both seamounts. We found 13 varieties of substratum in the study area. The CSM has hard substratum, such as boulders and cobbles, whereas the SM2 was dominated by cobbles and fine sediment. The highest abundance of megabenthic communities was recorded on the flank of the CSM. Species richness and diversity were higher at the flank of the CSM than in other areas of the ABB. Non-metric multi-dimensional scaling (nMDS) analysis of substratum types showed 50% similarity between the flanks of both seamounts, because both sites have a component of cobbles mixed with fine sediments in their substratum. Further, nMDS of faunal abundance revealed two groups, each restricted to one of the seamounts, suggesting faunal distinctness between them. The sessile fauna (corals and poriferans) showed a significant positive relation with cobble and fine-sediment substratum, while the mobile categories (echinoderms and arthropods) showed a significant positive relation with fine sediments only.

    Conservation Patterns of HIV-1 RT Connection and RNase H Domains: Identification of New Mutations in NRTI-Treated Patients

    Background: Although extensive HIV drug resistance information is available for the first 400 amino acids of its reverse transcriptase, the impact of antiretroviral treatment on the C-terminal domains of Pol (thumb, connection and RNase H) is poorly understood. Methods and Findings: We wanted to characterize conserved regions in RT C-terminal domains among HIV-1 group M subtypes and CRF. Additionally, we wished to identify NRTI-related mutations in HIV-1 RT C-terminal domains. We sequenced 118 RNase H domains from clinical viral isolates in Brazil, and analyzed 510 thumb and connection domain and 450 RNase H domain sequences collected from public HIV sequence databases, together with their treatment status and histories. Drug-naïve and NRTI-treated datasets were compared for intra- and inter-group conservation, and differences were determined using Fisher’s exact tests. One third of RT C-terminal residues were found to be conserved among group M variants. Three mutations were found exclusively in NRTI-treated isolates. Nine mutations in the connection and 6 mutations in the RNase H were associated with NRTI treatment in subtype B. Some of them lay in or close to amino acid residues which contact nucleic acid or near the RNase H active site. Several of the residues pointed out herein have been recently associated with NRTI exposure or increased drug resistance to NRTIs. Conclusions: This is the first comprehensive genotypic analysis of a large sequence dataset that describes NRTI-related mutations in HIV-1 RT C-terminal domains in vivo. The findings on the conservation of RT C-terminal domains may pave the way to more rational drug design initiatives targeting those regions.
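The group comparison mentioned in the abstract above relies on Fisher's exact test. As a rough illustration only, the sketch below computes a two-sided p-value for a single 2×2 table (mutation present or absent, in NRTI-treated versus drug-naïve isolates) using the hypergeometric distribution. The function name and the counts are hypothetical and are not taken from the study, which does not describe its implementation.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so ties with the observed table are included.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))


# Hypothetical counts: mutation present / absent in treated vs. naive isolates.
treated_with, treated_without = 12, 28
naive_with, naive_without = 2, 38
p_value = fisher_exact_two_sided(treated_with, treated_without, naive_with, naive_without)
print(f"two-sided p = {p_value:.4g}")
```

When many residues are tested in this way, a multiple-testing correction is usually worth considering; whether the study applied one is not stated in the abstract.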