Predicción y análisis de interacciones de usuarios en plataformas de enseñanza online (Prediction and analysis of user interactions in online teaching platforms)

Author: González-Gallego Sosa Miguel Ángel
Publication date: 01/06/2016

Online education platforms generate a large amount of metadata about interactions among students and between students and the platform. Teachers can use this information to improve the course and the students' educational experience. In this context, the aim of this final degree project (TFG) is to analyze the interactions performed by students in online courses and to predict student behavior from their access patterns to the platform. Because of the volume of data involved, we use parallel computing tools such as Apache Spark to preprocess the data generated by the platform. With Apache Spark we build an application that extracts the students' access patterns and reduces the large volume of metadata generated in an online course. Finally, we apply machine learning algorithms to predict target variables related to the students' interaction with the course, such as the probability of dropout or academic performance. This is also done with Apache Spark. Specifically, we use the Random Forest algorithm from Spark's MLlib library in order to obtain the best result when predicting the course's target variables.
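The preprocessing step described in this abstract (reducing raw platform logs to a per-student access pattern) can be illustrated with a minimal sketch. It uses plain Python rather than Apache Spark, and the event layout `(student_id, ISO timestamp)`, the function name `extract_access_patterns`, and the two features are assumptions made for illustration, not the thesis's actual schema or feature set.

```python
from collections import defaultdict
from datetime import datetime


def extract_access_patterns(events):
    """Collapse raw (student_id, ISO timestamp) events into per-student features.

    Hypothetical feature set: total number of events and number of distinct
    days on which the student accessed the platform.
    """
    active_days = defaultdict(set)
    event_counts = defaultdict(int)
    for student_id, timestamp in events:
        day = datetime.fromisoformat(timestamp).date()
        active_days[student_id].add(day)
        event_counts[student_id] += 1
    return {
        student: {
            "events": event_counts[student],
            "active_days": len(active_days[student]),
        }
        for student in event_counts
    }


# Hypothetical sample log, invented for this example.
sample_events = [
    ("s1", "2016-01-04T10:00:00"),
    ("s1", "2016-01-04T18:30:00"),
    ("s1", "2016-01-06T09:15:00"),
    ("s2", "2016-01-05T12:00:00"),
]
patterns = extract_access_patterns(sample_events)
```

A compact feature table of this kind is what a classifier such as Random Forest would then take as input; in a Spark setting the same grouping would be expressed as a distributed aggregation per student.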

Biblos-e Archivo

Database integrated analytics using R : initial experiences with SQL-Server + R

Author: Berral Josep Ll.
Poggi Nicolas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2016

© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.

Nowadays most data scientists use functional or semi-functional languages such as SQL, Scala or R to process data obtained directly from databases. This process requires fetching the data, processing it, and storing it again, and it tends to be done outside the DB, often in complex data-flows. Recently, database service providers have decided to integrate "R-as-a-Service" into their DB solutions. The analytics engine is called directly from the SQL query tree, and results are returned as part of the same query. Here we show a first taste of this technology by testing the portability of our ALOJA-ML analytics framework, coded in R, to Microsoft SQL-Server 2016, one of the recently released SQL+R solutions. In this work we discuss some data-flow schemes for porting a local DB + analytics engine architecture towards Big Data, focusing especially on the new DB Integrated Analytics approach, and commenting on the first experiences in usability and performance obtained from these new services and capabilities.

Peer reviewed. Postprint (author's final draft).

Crossref

UPCommons. Portal del coneixement obert de la UPC

Real-Time Context-Aware Microservice Architecture for Predictive Analytics and Smart Decision-Making

Author: Boubeta Puig Juan
Caravaca Diosdado José Antonio
Chávez de la O Francisco
García de Prado Fontela Alfonso
Ortiz Bellot Guadalupe
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2019

The impressive evolution of the Internet of Things and the great amount of data flowing through its systems provide an inspiring scenario for Big Data analytics, real-time context-aware predictions and smart decision-making. However, this requires a scalable system for constant stream processing that can also make decisions and take actions based on the predictions performed. This paper proposes a scalable architecture that provides real-time context-aware actions based on predictive stream processing of data, as an evolution of a previously presented event-driven service-oriented architecture which already permitted the context-aware detection and notification of relevant data. For this purpose, we have defined and implemented a microservice-based architecture. As a result, our architecture has been enhanced twofold. On the one hand, it has been supplied with reliable predictions through the use of predictive analytics and complex event processing techniques, which permit the notification of relevant context-aware information ahead of time. On the other hand, it has been refactored towards a microservice architecture pattern, greatly improving its maintenance and evolution. The performance of the architecture has been evaluated with an air quality case study.

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio de Objetos de Docencia e Investigación de la Universidad de Cádiz

Customer churn prediction in telecom using machine learning and social network analysis in big data platform

Author: Ahmad Abdelrahim Kasem
Aljoumaa Kadan
Jafar Assef
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/03/2019

Customer churn is a major problem and one of the most important concerns for large companies. Because of its direct effect on revenue, especially in the telecom field, companies are seeking ways to predict which customers are likely to churn. Identifying the factors that increase customer churn is therefore important in order to take the actions needed to reduce it. The main contribution of our work is a churn prediction model which helps telecom operators predict the customers who are most likely to churn. The model developed in this work uses machine learning techniques on a big data platform and introduces a new approach to feature engineering and selection. To measure the performance of the model, the standard Area Under Curve (AUC) measure is adopted, and the AUC value obtained is 93.3%. Another main contribution is the use of the customer social network in the prediction model by extracting Social Network Analysis (SNA) features. The use of SNA improved the performance of the model from 84% to 93.3% AUC. The model was prepared and tested in a Spark environment on a large dataset created by transforming big raw data provided by the SyriaTel telecom company. The dataset contained all customers' information over 9 months, and was used to train, test, and evaluate the system at SyriaTel. Four algorithms were tested: Decision Tree, Random Forest, Gradient Boosted Machine Tree "GBM" and Extreme Gradient Boosting "XGBOOST". The best results were obtained with the XGBOOST algorithm, which was used for classification in this churn prediction model.

Comment: 24 pages, 14 figures. PDF https://rdcu.be/budK
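For readers unfamiliar with the AUC measure cited in this abstract, the following minimal sketch computes it from its pairwise definition: the probability that a randomly chosen positive example (here, a churner) receives a higher score than a randomly chosen negative one, with ties counted as half. The function name `auc` and the sample labels and scores are made up for illustration and are not data or code from the paper.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    labels: iterable of 0/1 class labels (1 = churner in this illustration).
    scores: iterable of predicted scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores, invented for this example.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.4, 0.35, 0.2, 0.8]
result = auc(labels, scores)  # 5 of 6 positive/negative pairs are ordered correctly
```

The quadratic pairwise loop is only suitable for small samples; on large datasets a rank-based computation or a library routine would normally be used instead.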

arXiv.org e-Print Archive

Directory of Open Access Journals