25 research outputs found
Demand in the electricity market: analysis using big data
The traditional business model of energy companies has been changing in recent years. The introduction of smart meters has led to an exponential increase in the volume of data available, and its analysis can help find consumption patterns among electricity customers to reduce costs and protect the environment. Power plants generate electricity to cover peak consumption at specific times. A set of techniques called “demand response” tries to solve this problem using artificial intelligence approaches. This document proposes a method for processing large volumes of data such as those generated by smart meters. Big data techniques are used for the preprocessing as well as for the optimization and execution of this analysis: specifically, a distributed version of the k-means algorithm and several internal clustering validation indices for big data in Spark. The source data correspond to the consumption of electricity customers in Bogota, Colombia during the year 2018. The analysis of consumers carried out in this study helps characterize them. This greater knowledge about consumer habits and types of customers can enhance the work of utilities.
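As a rough, single-machine illustration of the clustering step this abstract describes, the sketch below runs plain k-means on a few toy load profiles and scores the result with the silhouette coefficient, one common internal validation index. It is not the authors' distributed Spark implementation; the data, the function names, and the choice of k = 2 are all assumptions made for the example.

```python
import math
import random


def kmeans(points, k, iters=50, seed=0):
    """Plain (non-distributed) k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct starting points
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid becomes the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, labels


def silhouette(points, labels):
    """Mean silhouette coefficient; values near 1 mean compact, well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two non-empty clusters")
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for singleton clusters
            continue
        # a: mean distance to the other members of the point's own cluster.
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)


# Hypothetical load profiles: (morning kWh, evening kWh) per customer.
profiles = [(1.0, 4.0), (1.2, 3.8), (0.9, 4.1), (3.9, 1.0), (4.1, 1.2), (4.0, 0.9)]
centroids, labels = kmeans(profiles, k=2)
score = silhouette(profiles, labels)
```

In practice one would compute such an index for several values of k and keep the k that scores best; at smart-meter scale, both steps would run on a distributed engine such as Spark rather than in a single process.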
Intelligent and Distributed Data Warehouse for Student’s Academic Performance Analysis
In the academic world, a large amount of data is handled each day, ranging from students’ assessments to their socio-economic data. In order to analyze this historical information, an interesting alternative is to implement a Data Warehouse. However, Data Warehouses are not able to perform predictive analysis by themselves, so machine intelligence techniques can be used for sorting, grouping, and predicting based on historical information to improve the analysis quality. This work describes a Data Warehouse architecture to carry out an academic performance analysis of students.
ChatGPT and Bard Responses to Polarizing Questions
Recent developments in natural language processing have demonstrated the
potential of large language models (LLMs) to improve a range of educational and
learning outcomes. Of recent chatbots based on LLMs, ChatGPT and Bard have made
it clear that artificial intelligence (AI) technology will have significant
implications on the way we obtain and search for information. However, these
tools sometimes produce text that is convincing, but often incorrect, known as
hallucinations. As such, their use can distort scientific facts and spread
misinformation. To counter polarizing responses on these tools, it is critical
to provide an overview of such responses so stakeholders can determine which
topics tend to produce more contentious responses -- key to developing targeted
regulatory policy and interventions. In addition, there currently exists no
annotated dataset of ChatGPT and Bard responses around possibly polarizing
topics, central to the above aims. We address the indicated issues through the
following contribution: Focusing on highly polarizing topics in the US, we
created and described a dataset of ChatGPT and Bard responses. Broadly, our
results indicated a left-leaning bias for both ChatGPT and Bard, with Bard more
likely to provide responses around polarizing topics. Bard seemed to have fewer
guardrails around controversial topics, and appeared more willing to provide
comprehensive, and somewhat human-like responses. Bard may thus be more likely
to be abused by malicious actors. Stakeholders may utilize our findings to mitigate
misinformative and/or polarizing responses from LLM
Forecast of the demand for hourly electric energy by artificial neural networks
Obtaining an accurate forecast of energy demand is fundamental to supporting the decision processes of the electricity service agents in a country. For market operators, greater precision in short-term load forecasting implies more efficient scheduling of electricity generation resources, which means a reduction in costs. In the long term, it constitutes a main indicator for the generation of investment signals for future installed capacity. This research proposes a forecasting model for the demand for electrical energy in Bogota, Colombia at an hourly level over a full week, through Artificial Neural Network
Data mining to identify risk factors associated with university students dropout
This paper presents the identification of university student dropout
patterns by means of data mining techniques. The database consists of a series of
questionnaires and interviews to students from several universities in Colombia.
The information was processed by the Weka software following the Knowledge
Extraction Process methodology with the purpose of facilitating the interpretation of results and finding useful knowledge about the students. The partial
results of data mining processing on the information about the generations of
students of Industrial Engineering from 2016 to 2018 are analyzed and discussed, finding relationships between family, economic, and academic issues
that indicate a probable desertion risk in students with common behaviors.
These relationships provide enough and appropriate information for the
decision-making process in the treatment of university dropout.
Universidad Peruana de Ciencias Aplicadas, Universidad de la Costa, Universidad Libre Seccional Barranquilla, Corporación Universitaria Latinoamericana
Dropout-permanence analysis of university students using data mining
Dropout is a rejection method present in every educational system,
related to the various selection processes, academic performance, and the efficiency of the system in general, that is, the result of the combination and effect
of different variables. In this sense, the dropout of university students related to
their academic performance has been a matter of concern for several years.
Academic information is analyzed in order to identify factors that influence
students’ dropout at the University of Mumbai, India, by using a data mining
technique. The data source contains information provided at entrance
(personal and educational background) and information generated during the study
period. The data selection and cleansing are made using different criteria of
representation and implementation of classification algorithms such as decision
trees, Bayesian networks, and rules. The following factors are identified as
influential variables in the desertion: approved courses, quantity and results of
attended courses, origin and age of entry of the student. Through this process, it
was possible to identify the attributes that characterize the dropout cases and
their relationship with the academic performance, especially in the first year of
the career
Megafaunal Community Structure of Andaman Seamounts Including the Back-Arc Basin – A Quantitative Exploration from the Indian Ocean
Species-rich benthic communities have been reported from some seamounts, predominantly from the Atlantic and Pacific Oceans, but the fauna and habitats on Indian Ocean seamounts are still poorly known. This study focuses on two seamounts, a submarine volcano (cratered seamount – CSM) and a non-volcano (SM2) in the Andaman Back–arc Basin (ABB), and the basin itself. The main purpose was to explore and generate regional biodiversity data from the summit and flank (upper slope) of the Andaman seamounts for comparison with other seamounts worldwide. We also investigated how substratum types affect the megafaunal community structure along the ABB. Underwater video recordings from TeleVision guided Gripper (TVG) lowerings were used to describe the benthic community structure along the ABB and both seamounts. We found 13 varieties of substratum in the study area. The CSM has hard substratum, such as boulders and cobbles, whereas the SM2 was dominated by cobbles and fine sediment. The highest abundance of megabenthic communities was recorded on the flank of the CSM. Species richness and diversity were higher at the flank of the CSM than in other areas of the ABB. Non-metric multi-dimensional scaling (nMDS) analysis of substratum types showed 50% similarity between the flanks of both seamounts, because both sites have a component of cobbles mixed with fine sediments in their substratum. Further, nMDS of faunal abundance revealed two groups, each restricted to one of the seamounts, suggesting faunal distinctness between them. The sessile fauna (corals and poriferans) showed a significant positive relation with cobble and fine-sediment substratum, while the mobile categories (echinoderms and arthropods) showed a significant positive relation with fine sediments only.
Conservation Patterns of HIV-1 RT Connection and RNase H Domains: Identification of New Mutations in NRTI-Treated Patients
Background: Although extensive HIV drug resistance information is available for the first 400 amino acids of its reverse
transcriptase, the impact of antiretroviral treatment in C-terminal domains of Pol (thumb, connection and RNase H) is poorly
understood. Methods and Findings: We wanted to characterize conserved regions in RT C-terminal domains among HIV-1 group M
subtypes and CRF. Additionally, we wished to identify NRTI-related mutations in HIV-1 RT C-terminal domains. We sequenced 118 RNase H domains from clinical viral isolates in Brazil, and analyzed 510 thumb and connection domain and 450 RNase H domain sequences collected from public HIV sequence databases, together with their treatment status and histories. Drug-naïve and NRTI-treated datasets were compared for intra- and inter-group conservation, and differences were determined using Fisher’s exact tests. One third of RT C-terminal residues were found to be conserved among group M variants. Three mutations were found exclusively in NRTI-treated isolates. Nine mutations in the connection and 6 mutations
in the RNase H were associated with NRTI treatment in subtype B. Some of them lay in or close to amino acid residues which
contact nucleic acid or near the RNase H active site. Several of the residues pointed out herein have been recently associated with NRTI exposure or increased drug resistance to NRTIs. Conclusions: This is the first comprehensive genotypic analysis of a large sequence dataset that describes NRTI-related
mutations in HIV-1 RT C-terminal domains in vivo. The findings on the conservation of RT C-terminal domains may pave the way to more rational drug design initiatives targeting those regions.
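To make the comparison between drug-naïve and NRTI-treated datasets more concrete, here is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table (mutation present or absent, by treatment group). The function name and the counts are invented for illustration and are not taken from the study; the two-sided rule used here (summing all tables no more probable than the observed one) is one common convention and may differ from the one the authors applied.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (e.g. treated / naive); columns are outcomes
    (e.g. mutation present / absent). All margins are held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is at most as probable as the observed one;
    # the small tolerance guards against floating-point ties.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


# Hypothetical counts: mutation seen in 8 of 10 treated isolates
# and in 1 of 6 drug-naive isolates.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
```

An exact test of this kind suits small per-mutation counts, where the large-sample approximation behind a chi-squared test may not hold.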