15 research outputs found

    Genetic Programming Approach for Classification Problem using GPU

    Genetic programming (GP) is a machine learning technique based on the evolution of computer programs using a genetic algorithm. Genetic programming has proven to be a good technique for solving data set classification problems, but at a high computational cost. The objective of this research is to accelerate the execution of classification algorithms by proposing a general model for executing the fitness (adjustment) function of the individuals of the population on the GPU. The computation times of each phase of the evolutionary process and the operation of the GPU parallel programming model were studied. Genetic programming is interesting to parallelize from the perspective of evolving a population of individuals in parallel
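
    The idea of evaluating the fitness function in a data-parallel way can be illustrated with a minimal sketch. It is not the model proposed in the work above: it runs on the CPU with NumPy, and the tuple-based tree encoding, the `evaluate`, `fitness`, and `evaluate_population` helpers, and the toy data are all assumptions made for illustration. It only shows why the step parallelizes well: each individual is applied to every data row at once, and individuals are scored independently of one another.

```python
import numpy as np

# Hypothetical encoding (an assumption, not from the source):
# ("x", i) reads feature i, ("const", c) is a constant,
# ("add" | "sub" | "mul", left, right) combines two subtrees.
OPS = {"add": np.add, "sub": np.subtract, "mul": np.multiply}


def evaluate(tree, X):
    """Evaluate one expression tree on every row of X at once."""
    kind = tree[0]
    if kind == "x":
        return X[:, tree[1]]
    if kind == "const":
        return np.full(X.shape[0], tree[1], dtype=float)
    return OPS[kind](evaluate(tree[1], X), evaluate(tree[2], X))


def fitness(tree, X, y):
    """Accuracy of the rule 'predict class 1 when the tree output is > 0'."""
    predictions = (evaluate(tree, X) > 0).astype(int)
    return float(np.mean(predictions == y))


def evaluate_population(population, X, y):
    """Score every individual; each score is independent of the others."""
    return [fitness(tree, X, y) for tree in population]


# Toy data, invented for the example.
X = np.array([[1.0, 2.0], [3.0, 1.0], [0.5, 0.5], [2.0, 4.0]])
y = np.array([0, 1, 0, 0])
population = [
    ("sub", ("x", 0), ("x", 1)),        # class 1 when x0 - x1 > 0
    ("add", ("x", 0), ("const", 1.0)),  # class 1 when x0 + 1 > 0
]
scores = evaluate_population(population, X, y)
```

    On a GPU, the per-row work inside `evaluate` and the per-individual loop in `evaluate_population` are the two natural axes to distribute across threads.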

    Exploring the influence of ICT in online students through data mining tools

    ABSTRACT The aim of the present work is to evaluate age-related differences in digital competence, usage of, and attitude towards ICT in a sample of 1231 online students of a distance university. To fulfill this goal, hypothesis testing, correlation analysis, and data mining techniques were applied to the results of a 72-item survey. Results showed no strong differences between the extreme age groups. In addition, some interesting correlations between variables were found, along with additional information obtained through association rules. This study led to better knowledge of online students in order to improve teaching and learning processes.
    Keywords Association rules, ICT attitude, ICT usages, distance education, online education, correlation, Mann-Whitney test, digital competence.
    INTRODUCTION During the last decades the proportion of higher education students taking at least one online course has increased markedly. In this paper we try to address this problem by pursuing the following objectives: (1) compare digital competence, uses, and attitude towards ICT between young students and students over 50; (2) analyze relationships between variables in young students and students over 50; (3) obtain additional information about relationships between variables and age groups by using data mining tools. This paper is organized as follows. The next section presents a brief selection of related works. The method is described in section three. Section four presents the results. Finally, section five concludes the paper with discussion and plans for future work.
    RELATED WORKS Recently, several studies have addressed the use of data mining techniques in education, especially association rule mining. Fattah et al. presented an association rule discovery model to investigate and analyze a university admission system database. Romero et al. explored the extraction of rare association rules when gathering student usage data from a Moodle system. Merceron and Yacef gave an interpretation of two measures of interest for association rules: cosine and added value. Kumar and Chadha
    METHOD 3.1 Participants, variables and instruments A total of 1231 students participated voluntarily (with informed consent) in this study, 600 females and 631 males. They were al

    Reducing gaps in quantitative association rules: A genetic programming free-parameter algorithm

    The extraction of useful information for decision making is a challenge in many different domains. Association rule mining is one of the most important techniques in this field, discovering relationships of interest among patterns. Although association rule mining is an area of great interest for many researchers, the search for well-grouped continuous values, that is, the discovery of rules whose patterns do not cover unnecessary ranges of values, is still a challenge. Existing algorithms for mining association rules in continuous domains are mainly based on a non-deterministic search and require a high number of parameters to be optimised. These parameters hinder the mining process, and data mining experts who want to use the algorithms must know them in detail. We therefore present a grammar-guided genetic programming algorithm that does not require as many parameters as other existing approaches and enables the discovery of quantitative association rules comprising small-size gaps. The algorithm is verified over a varied set of data, comparing the results to those of other association rule mining algorithms from several paradigms. Additionally, some resulting rules from different paradigms are analysed, demonstrating the effectiveness of our model for reducing gaps in numerical features

    Semi-supervised learning to discover the average scale of graduation of university students

    Institutions of higher education overlook, to a certain extent, the factors that delay the progression rates of university students. The delay cannot always be revealed because of the diversity of study programs, from the beginning of the degree to the completion of the program and graduation. This work used a student data set covering five complete academic years (first to fifth year), with 53 variables and 849 observations from different degree programs. Variables were explored, and data mining with semi-supervised learning techniques was used to discover associations that detect graduation categories of students. The rules of interest were discovered using the support, confidence, lift, and conviction metrics of association rules. The findings suggest that the ages of the teaching staff between the second and third years, as well as the average-grade category across years and the employability of students, are the main factors influencing the graduation rates of university students
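
    The four rule metrics named in the abstract above have standard definitions, which the following minimal sketch computes for a rule "antecedent implies consequent". The `rule_metrics` helper, the item names, and the four transactions are invented for illustration and are not taken from the study or its data set.

```python
import math


def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, lift, and conviction of antecedent -> consequent.

    transactions: list of sets of items (hypothetical toy representation).
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    n = len(transactions)
    supp_a = sum(a <= t for t in transactions) / n         # P(A)
    supp_c = sum(c <= t for t in transactions) / n         # P(C)
    supp_ac = sum((a | c) <= t for t in transactions) / n  # P(A and C)
    if supp_a == 0:
        raise ValueError("antecedent never occurs; confidence is undefined")
    confidence = supp_ac / supp_a
    lift = confidence / supp_c if supp_c > 0 else math.inf
    # Conviction is infinite when the rule never fails (confidence == 1).
    conviction = math.inf if confidence == 1 else (1 - supp_c) / (1 - confidence)
    return {
        "support": supp_ac,
        "confidence": confidence,
        "lift": lift,
        "conviction": conviction,
    }


# Toy transactions, invented for the example.
transactions = [
    {"employed", "late_graduation"},
    {"employed", "late_graduation"},
    {"employed", "on_time"},
    {"not_employed", "on_time"},
]
m = rule_metrics(transactions, {"employed"}, {"late_graduation"})
```

    A lift above 1 and a conviction above 1 both indicate that the consequent occurs more often with the antecedent than would be expected if the two were independent.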

    Knowledge extraction from courses and online learning activities

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics.
    Technological advancement has led to the increasing use of all types of electronic devices, which causes large volumes of data to be constantly generated and stored in repositories. This growth in data through Information Technology (IT) systems makes it necessary to continue its exploration and analysis to support institutions in the decision-making process. Due to the importance of education in society, this field has been the target of several studies over the years. Taking that into account, and knowing that association rules and regression analysis are among the most popular data mining algorithms for finding hidden patterns in data, the purpose of this work is to find interesting trends across courses considering the students' grades, as well as to study whether, and to what extent, students' learning performance is related to their interaction in Moodle. The data used were collected through the netp@ and Moodle systems and consist of all student learning data and activity/log history. These data belong to students of all master's programs who attended the academic years between 2012-2013 and 2020-2021. We chose the Sample, Explore, Modify, Model, and Assess (SEMMA) methodology because its steps suit the study's goals. Through the Partial Least Squares Regression (PLSR) algorithm, it was shown that Gestão do Conhecimento, Metodologias de Investigação and Métodos Descritivos de Data Mining are the most important courses affecting the grades of the Dissertation/Work Project/Internship Report in the Business Intelligence specialization. In addition, according to the predictive model, Metodologias de Investigação was the most important variable for predicting the performance of the Dissertation/Work Project/Internship Report in the Information Systems and Technologies Management specialization. Finally, the association rule algorithms used were Apriori, FP-Growth and Eclat. Their results showed that courses with continuous assessment methods achieve better academic performance than others. Furthermore, higher levels of online interaction are associated with better achievement

    A soft computing decision support framework for e-learning

    Thesis by compendium of publications.
    Supported by technological development and its impact on everyday activities, e-Learning and b-Learning (Blended Learning) have experienced rapid growth, mainly in higher education and training. Their inherent ability to bridge both physical and cultural distances, to disseminate knowledge, and to decrease the costs of the teaching-learning process allows them to reach anywhere and anyone. The educational community is divided as to their role in the future. It is believed that by 2019 half of the world's higher education courses will be delivered through e-Learning. While supporters say that this will be the educational mode of the future, detractors point out that it is a fashion, that abandonment rates are huge, and that its massification and potentially low quality will cause its fall, leaving it an important supporting role alongside traditional education. There are, however, two interrelated features on which there seems to be consensus. On the one hand, there is the enormous amount of information and evidence that Learning Management Systems (LMS) generate during the e-Learning process, which is the basis of the part of the process that can be automated. On the other hand, there is the fundamental role of e-tutors and e-trainers, who are the guarantors of educational quality. They are continually overwhelmed by the need to provide timely and effective feedback to students, to manage endless particular situations and cases that require decision making, and to process stored information. In this sense, the tools that e-Learning platforms currently provide for obtaining reports and a certain level of follow-up are neither sufficient nor fully adequate. It is at this point of convergence between information and trainer that current LMS developments are centered, and it is here that the proposed thesis tries to innovate.
    This research proposes and develops a platform focused on decision support in e-Learning environments. Using soft computing and data mining techniques, it extracts knowledge from the data produced and stored by e-Learning systems, allowing the classification, analysis and generalization of the extracted knowledge. It includes tools to identify models of students' learning behavior and, from them, to predict their future performance and enable trainers to provide adequate feedback. Likewise, students can self-assess, avoid ineffective behavior patterns, and obtain real clues about how to improve their performance in the course, through appropriate routes and strategies based on the behavioral model of successful students. The methodological basis of these functionalities is Fuzzy Inductive Reasoning (FIR), which is particularly useful in the modeling of dynamic systems. During the research, the FIR methodology was improved and strengthened by the inclusion of several algorithms. First, an algorithm called CR-FIR determines the causal relevance of the variables involved in modeling the learning and assessment of students. In the present thesis, CR-FIR has been tested on a comprehensive set of classical test data, as well as on real data sets from different areas of knowledge. Secondly, the detection of atypical behaviors in virtual campuses was approached using the Generative Topographic Mapping (GTM) methodology, which is a probabilistic alternative to the well-known Self-Organizing Maps. GTM was used simultaneously for clustering, visualization and detection of atypical data. The core of the platform is an algorithm for extracting linguistic rules in a language understandable to educational experts, which helps them to obtain patterns of student learning behavior. To achieve this functionality, the LR-FIR algorithm (Extraction of Linguistic Rules in FIR) was designed and developed as an extension of FIR that makes it possible both to characterize general behavior and to identify interesting patterns.
    When the platform was applied to several real e-Learning courses, the results obtained demonstrated its feasibility and originality. The teachers' perception of the usability of the tool is very good, and they consider that it could be a valuable resource for reducing the time that e-Learning courses demand of the trainer. The identification of student behavior models and the prediction processes have been validated for their usefulness by expert trainers. LR-FIR has been applied and evaluated on a wide set of real problems, not all of them in the educational field, obtaining good results. The structure of the platform makes it possible to assume that its use is potentially valuable in those domains where knowledge management plays a preponderant role, or where decision-making processes are a key element, e.g. e-business, e-marketing, and customer management, to mention just a few. The soft computing tools used and developed in this research (FIR, CR-FIR, LR-FIR and GTM) have been applied successfully in other real domains, such as music, medicine, and weather behavior.
    Postprint (published version)