91 research outputs found

    Nonlinear Boosting Projections for Ensemble Construction

    In this paper we propose a novel approach to ensemble construction that uses nonlinear projections to achieve both accuracy and diversity among the individual classifiers. The proposed approach combines the philosophy of boosting, which puts more effort into difficult instances, with the basis of the random subspace method. Our main contribution is that, instead of using a random subspace, we construct a projection that takes into account the instances that posed the most difficulty for previous classifiers. In this way, consecutive nonlinear projections are created by a neural network trained using only incorrectly classified instances. The feature subspace induced by the hidden layer of this network is used as the input space for a new classifier. The method is compared with bagging and boosting techniques and shows improved performance on a large set of 44 problems from the UCI Machine Learning Repository. An additional study showed that the proposed approach is less sensitive to noise in the data than boosting methods.

    Teaching push-down automata and Turing machines

    In this paper we present the new version of a tool to assist in teaching formal languages and automata theory. The previous version of the tool provided algorithms for regular expressions, finite automata and context-free grammars. The new version can also simulate push-down automata and Turing machines.
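
    To illustrate what simulating a push-down automaton involves, the following is a minimal sketch and not the tool's implementation, which the abstract does not describe. The transition table `DELTA`, the state names and the function `accepts` are hypothetical; the automaton is deterministic and recognizes the language a^n b^n (n >= 1).

```python
from typing import Dict, List, Tuple

# (state, input symbol, stack top) -> (next state, symbols to push, top of stack first)
Transition = Dict[Tuple[str, str, str], Tuple[str, str]]

# Hypothetical deterministic PDA for { a^n b^n : n >= 1 }; 'Z' marks the stack bottom.
DELTA: Transition = {
    ("q0", "a", "Z"): ("q0", "AZ"),  # first 'a': push one A above the bottom marker
    ("q0", "a", "A"): ("q0", "AA"),  # further 'a': push another A
    ("q0", "b", "A"): ("q1", ""),    # first 'b': pop one A and switch to matching mode
    ("q1", "b", "A"): ("q1", ""),    # further 'b': pop one A
}


def accepts(word: str, delta: Transition = DELTA) -> bool:
    """Run the PDA on `word`; accept in state q1 with only the bottom marker left."""
    state = "q0"
    stack: List[str] = ["Z"]
    for symbol in word:
        if not stack:
            return False
        key = (state, symbol, stack[-1])
        if key not in delta:
            return False  # no applicable transition: reject
        state, push = delta[key]
        stack.pop()
        stack.extend(reversed(push))  # leftmost pushed symbol ends up on top
    return state == "q1" and stack == ["Z"]
```

    Calling `accepts("aabb")` walks the stack up on each `a` and back down on each `b`; a word such as `"aab"` is rejected because an `A` remains on the stack.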

    Monitoring Students at the University: Design and Application of a Moodle Plugin

    Early detection of at-risk students is essential, especially in the university environment. Moreover, personalized learning has been shown to increase motivation and lower student dropout rates. At present, the average dropout rates among students following courses leading to the award of Spanish university degrees are around 18% and 42.8% for face-to-face teaching and online courses, respectively. The objectives of this study are: (1) to design and implement a Modular Object-Oriented Dynamic Learning Environment (Moodle) plugin, “eOrientation”, for the early detection of at-risk students; (2) to test the effectiveness of the “eOrientation” plugin on university students. We worked with 279 third-year students following health sciences degrees. A process for extracting information records was also implemented. In addition, a learning analytics module was developed, through which both supervised and unsupervised Machine Learning techniques can be applied. All these measures facilitated the personalized monitoring of the students and the easier detection of students at academic risk. The use of this tool could be of great importance to teachers and university governing teams, as it can assist the early detection of students at academic risk. Future studies will be aimed at testing the plugin using the Moodle environment on degree courses at other universities. Funding: Consejería de Educación de la Junta de Castilla y León (Spain) (Department of Education of the Junta de Castilla y León), grant number BU032G19, and grants from the University of Burgos for the dissemination and the improvement of teaching innovation experiences of the Vice-Rectorate of Teaching and Research Staff, the Vice-Rectorate for Research and Knowledge Transfer, 2020, and the Departamento de Ciencias de la Salud of the University of Burgos (Spain).

    Exploratory study of the effects of a virtual group intervention on meaning in life in a population of women

    To evaluate the effects of a virtual intervention on meaning in life, the Dimensional Scale of Meaning in Life (Escala Dimensional de Sentido de Vida) was administered before and after the intervention to two groups of women. The original group of 80 women was reduced to 60 owing to circumstances that included the onset of psychopathology and inconsistencies in the assessment. A control group (30 women) and an intervention group (30 women) were used. The intervention was the Meaning Group (Grupo de sentido), a standardized logotherapeutic technique that enables a change in the perception of meaning in life through group work. The sessions were held entirely online using the Zoom platform. The results showed no significant differences between the control and intervention groups, only within-group differences in the intervention group.

    A simulator of the IL language of the IEC 61131-3 standard to support the Industrial Automation course

    Industrial Programmable Automata, or Programmable Logic Controllers (hereafter PLCs), are electronic devices widely used in industrial automation. PLCs are programmed with software developed by the device manufacturers themselves. Most of these environments are proprietary and require payment for an expensive licence. This is an obstacle not only to migrating systems in industry but also to teaching how such systems work. In 1990, a working group of the International Electrotechnical Commission (IEC) defined the IEC 61131 standard, the first step towards the standardization of programmable controllers and their peripherals. This teaching resource presents software for editing, validating and simulating PLC programs, specifically programs written in Instruction List (hereafter IL), a language within the IEC 61131-3 standard that closely resembles assembly language. The software performs the lexical and syntactic analysis of the language and provides a user interface that displays the results of simulating programs written in IL.
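
    The accumulator-based, assembly-like style of IL can be illustrated with a small sketch. This is not the software described above: it assumes a tiny subset of IL (the operators `LD`, `ST`, `ADD`, `SUB` and `MUL` on integer literals and variables), ignores labels, comments, modifiers and data types, and the names `tokenize` and `run` are hypothetical.

```python
import re
from typing import Dict, Tuple

# One token is either an identifier or an (optionally negative) integer literal.
TOKEN = re.compile(r"\s*([A-Za-z_]\w*|-?\d+)")


def tokenize(line: str) -> Tuple[str, str]:
    """Lexical step: split one IL line into (operator, operand)."""
    parts = TOKEN.findall(line)
    if len(parts) != 2:
        raise SyntaxError(f"expected 'OPERATOR operand', got: {line!r}")
    return parts[0].upper(), parts[1]


def run(program: str, variables: Dict[str, int]) -> Dict[str, int]:
    """Simulate the program; every instruction works on a single accumulator."""
    acc = 0
    for line in program.strip().splitlines():
        op, operand = tokenize(line)
        if op == "ST":  # store the accumulator into a variable
            variables[operand] = acc
            continue
        # Operand is an integer literal or the name of an existing variable.
        value = int(operand) if operand.lstrip("-").isdigit() else variables[operand]
        if op == "LD":  # load the operand into the accumulator
            acc = value
        elif op == "ADD":
            acc += value
        elif op == "SUB":
            acc -= value
        elif op == "MUL":
            acc *= value
        else:
            raise SyntaxError(f"unsupported operator: {op}")
    return variables
```

    For example, `run("LD x\nADD 3\nMUL 2\nST y", {"x": 4})` loads `x`, adds 3, doubles the result and stores it in `y`.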

    When is resampling beneficial for feature selection with imbalanced wide data?

    This paper studies the effects that combinations of balancing and feature selection techniques have on wide data (many more attributes than instances) when different classifiers are used. To this end, an extensive study is carried out using 14 datasets, 3 balancing strategies, and 7 feature selection algorithms. The evaluation uses 5 classification algorithms, analyzes the results for different percentages of selected features, and establishes statistical significance using Bayesian tests. Some general conclusions of the study are that it is better to apply random undersampling (RUS) before feature selection, while random oversampling (ROS) and SMOTE offer better results when applied afterwards. Classifier-specific results are also obtained: for Gaussian SVM, for example, the best performance is achieved when feature selection is done with SVM-RFE before balancing the data with RUS. Funding: “La Caixa” Foundation, under agreement LCF/PR/PR18/51130007. This work was also supported by the Junta de Castilla y León under project BU055P20 (JCyL/FEDER, UE) and by the Ministry of Science and Innovation under project PID2020-119894GB-I00, co-financed through European Union FEDER funds.
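
    The ordering question studied here (balance first, then select features, or the reverse) can be made concrete with a small sketch of the "RUS before feature selection" pipeline. This is an illustration under stated assumptions, not the paper's experimental code: the feature ranking below is a simple stand-in filter (absolute difference of class means for binary labels 0/1), not one of the seven selection algorithms evaluated, and all function names are hypothetical.

```python
import random
from collections import defaultdict
from typing import List, Sequence, Tuple

Row = Tuple[Sequence[float], int]  # (feature vector, class label)


def random_undersample(rows: List[Row], seed: int = 0) -> List[Row]:
    """RUS: keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for row in rows:
        by_class[row[1]].append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced: List[Row] = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    return balanced


def top_k_features(rows: List[Row], k: int) -> List[int]:
    """Stand-in filter: rank features by |mean(class 1) - mean(class 0)|."""
    n_features = len(rows[0][0])
    scores = []
    for j in range(n_features):
        pos = [x[j] for x, y in rows if y == 1]
        neg = [x[j] for x, y in rows if y == 0]
        scores.append(abs(sum(pos) / len(pos) - sum(neg) / len(neg)))
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def rus_then_select(rows: List[Row], k: int, seed: int = 0) -> Tuple[List[Row], List[int]]:
    """Balance first, then choose features using only the balanced sample."""
    balanced = random_undersample(rows, seed)
    return balanced, top_k_features(balanced, k)
```

    Swapping the two calls inside `rus_then_select` gives the opposite ordering, in which the features are ranked on the full imbalanced data before any instances are discarded.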

    A tool to support the teaching of instance selection algorithms

    In computer engineering curricula, data mining and machine learning are increasingly relevant in undergraduate, master's and PhD courses alike. Evidence of this is the emergence of several tools that facilitate the learning of algorithms related to the discipline. Some of these tools run the algorithms step by step and show the results of each step; others let the student change the algorithm parameters and visualize the results. However, for the specific case of instance selection algorithms, such tools are virtually nonexistent. This teaching resource presents a tool implemented to fill this gap. “Instance Selection”, as the application is called, shows the operation of both classical instance selection algorithms and some of the most modern ones, allowing step-by-step execution and displaying the intermediate results to facilitate teaching. The main advantages of the application are that it implements several algorithms, which allows them to be compared; it is multi-platform; it allows incremental visualization of the steps of the implemented algorithms; its interface is prepared for several languages; and it includes comprehensive help. Peer reviewed.

    Evolutionary prototype selection for multi-output regression

    A novel approach to prototype selection for multi-output regression data sets is presented. A multi-objective evolutionary algorithm is used to evaluate the selections using two criteria: training data set compression and prediction quality expressed in terms of root mean squared error. A multi-target regressor based on k-NN was used to evaluate the error during training, while the tests were performed using four different multi-target predictive models. The distance matrices used by the multi-target regressor were cached to speed up execution. Multiple Pareto fronts were also used to prevent overfitting and to obtain a broader range of solutions, by using different probabilities in the initialization of populations and different evolutionary parameters in each one. The results obtained with the benchmark data sets showed that the proposed method greatly reduced data set size and, at the same time, improved the predictive capabilities of the multi-output regressors trained on the reduced data set. Funding: NCN (Polish National Science Center) grant “Evolutionary Methods in Data Selection” No. 2017/01/X/ST6/00202, project TIN2015-67534-P (MINECO/FEDER, UE) of the Ministerio de Economía y Competitividad of the Spanish Government, and project BU085P17 (JCyL/FEDER, UE) of the Junta de Castilla y León, cofinanced with European Union FEDER funds.
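
    The two criteria named above can be sketched as a pair of objectives and a Pareto filter. This is a minimal illustration, not the authors' evolutionary algorithm: it assumes that compression is expressed as the fraction of the training set retained (so both objectives are minimized) and that RMSE is pooled over all samples and outputs; the names `rmse`, `dominates` and `pareto_front` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

# (fraction of the training set retained, RMSE); both are to be minimized.
Candidate = Tuple[float, float]


def rmse(y_true: Sequence[Sequence[float]], y_pred: Sequence[Sequence[float]]) -> float:
    """Root mean squared error pooled over all samples and all outputs."""
    squared = [
        (t - p) ** 2
        for row_true, row_pred in zip(y_true, y_pred)
        for t, p in zip(row_true, row_pred)
    ]
    return math.sqrt(sum(squared) / len(squared))


def dominates(a: Candidate, b: Candidate) -> bool:
    """a dominates b if it is no worse on both objectives and differs from b."""
    return a[0] <= b[0] and a[1] <= b[1] and a != b


def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the selections that no other selection dominates."""
    return [c for c in candidates if not any(dominates(o, c) for o in candidates)]
```

    Each point on the resulting front is a different trade-off between a smaller training set and a lower error, which is why a multi-objective search returns a range of solutions rather than a single one.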

    Experiences of Collaborating with Companies on Final-Year Projects in the Ingeniería en Informática de Gestión Degree at the University of Burgos

    This article presents some of the experiences that the Languages and Computer Systems area (Lenguajes y Sistemas Informáticos) of the University of Burgos has had with final-year projects carried out in collaboration with companies. In the authors' opinion, it is extremely valuable for students to be able to use their final-year project to take the lead in a work experience in a real environment.

    Approx-SMOTE: Fast SMOTE for Big Data on Apache Spark

    One of the main goals of Big Data research is to find new data mining methods that are able to process large amounts of data in acceptable times. In Big Data classification, as in traditional classification, class imbalance is a common problem that must be addressed, and in the case of Big Data the solution must also run in an acceptable execution time. In this paper we present Approx-SMOTE, a parallel implementation of the SMOTE algorithm for the Apache Spark framework. The key difference from the original SMOTE, besides parallelism, is that it uses an approximated version of k-Nearest Neighbor, which makes it highly scalable. Although an implementation of SMOTE for Big Data already exists (SMOTE-BD), it uses an exact Nearest Neighbor search, which does not make it entirely scalable. Approx-SMOTE, on the other hand, achieves run times up to 30 times faster without sacrificing the improved classification performance offered by the original SMOTE. Funding: “La Caixa” Foundation, under agreement LCF/PR/PR18/51130007. This work was supported by the Junta de Castilla y León under project BU055P20 and by the Ministry of Science and Innovation of Spain under project PID2020-119894GB-I00, co-financed through European Union FEDER funds. It was also supported through the Consejería de Educación of the Junta de Castilla y León and the European Social Fund through a pre-doctoral grant (EDU/1100/2017). This material is based upon work supported by Google Cloud.
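
    For readers unfamiliar with SMOTE, the sketch below shows the basic interpolation step on a single machine. It is not Approx-SMOTE and does not use Apache Spark: the neighbor search here is an exact brute-force k-NN, which is precisely the step the abstract says Approx-SMOTE replaces with an approximate search. The function names and parameters are hypothetical.

```python
import math
import random
from typing import List, Sequence

Point = Sequence[float]


def k_nearest(i: int, points: List[Point], k: int) -> List[int]:
    """Exact brute-force k-NN: indices of the k points closest to points[i]."""
    others = (j for j in range(len(points)) if j != i)
    return sorted(others, key=lambda j: math.dist(points[i], points[j]))[:k]


def smote(minority: List[Point], n_new: int, k: int = 5, seed: int = 0) -> List[List[float]]:
    """Create n_new synthetic minority samples by interpolating towards neighbors."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))           # pick a minority sample
        j = rng.choice(k_nearest(i, minority, k))  # pick one of its k neighbors
        gap = rng.random()                         # position along the segment i -> j
        synthetic.append([a + gap * (b - a) for a, b in zip(minority[i], minority[j])])
    return synthetic
```

    The brute-force search compares every pair of minority points, so its cost grows quadratically with the size of the minority class; this is the bottleneck that motivates an approximate neighbor search at Big Data scale.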