38 research outputs found

    Introduction to Programming Using Mobile Phones and MIT App Inventor

    At the beginning of each year, we ask our new undergraduate students in Computer Engineering whether they have ever developed a computer program. Surprisingly, the most frequent answer is no. The few students who have attended a Computer Science training module usually have some basic programming notions; however, most of our students coming straight from high school have never programmed. This lack of basic programming skills is a major drawback when taking programming-related courses. This is especially true of the Computer Organization course, taught during the first semester of the first year: one of its main objectives is to explain the processor architecture, so a large part of it revolves around programming in assembly language. To tackle this lack of basic programming skills, a workshop on mobile application programming using MIT App Inventor is offered to freshmen. The workshop has been very well received by the students, and we believe that it has helped improve their performance in programming-related courses, in particular the Computer Organization course.

    ¿Puedo programar mi móvil? Pero si acabo de llegar

    Surprisingly, whenever we ask newly enrolled students in the Degree in Computer Engineering whether they have ever programmed, most answer no. The few who have completed a vocational training module in computer science usually have some notion of programming, but most of those coming from high school have none. This lack of basic programming skills is a disadvantage in courses related to programming. In our degree, the disadvantage is especially evident in the Computer Structure course (Estructura de computadores), taught in the first semester of the first year. Although it is not a conventional programming course, it requires students to acquire skills related to computer architecture and, therefore, to programming in assembly language. To address this lack of prior knowledge, a workshop on mobile programming using MIT App Inventor has been taught. The workshop was widely accepted and very well rated by the students, and we believe it has contributed to improving their results in the Computer Structure course.

    Animaciones interactivas para la enseñanza y aprendizaje de los protocolos de coherencia de cachés

    Among the educational objectives of advanced computer architecture courses is usually that students be able to describe and analyze how cache coherence protocols work. Although these protocols are relatively simple, many different situations must be analyzed to understand how they address all the details of the problem they solve, which makes them difficult to explain and to understand. A tool that graphically illustrates how these protocols operate would greatly facilitate teaching and learning them. To improve the teaching of this subject, we have developed three interactive animations that show how three of the most frequently used cache coherence protocols work. For each protocol, a sequence of read and write operations illustrates all the situations that can arise. The animations let the student step forward and backward to better understand and study the actions that take place at each step. Peer reviewed.
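
As a rough illustration of the kind of state changes such animations walk through, the sketch below replays read and write operations on a single memory block shared by several private caches. The abstract does not name the three protocols, so the choice of MSI (Modified, Shared, Invalid) is an assumption made here only because it is a widely taught example; the names `MSIBus` and `replay` are hypothetical and unrelated to the authors' tool.

```python
from enum import Enum


class State(Enum):
    MODIFIED = "M"
    SHARED = "S"
    INVALID = "I"


class MSIBus:
    """Hypothetical model: one memory block, N private caches, a snooping bus."""

    def __init__(self, num_caches):
        self.states = [State.INVALID] * num_caches

    def read(self, cache_id):
        if self.states[cache_id] is State.INVALID:
            # Read miss: a Modified copy elsewhere is written back and downgraded.
            for other, state in enumerate(self.states):
                if other != cache_id and state is State.MODIFIED:
                    self.states[other] = State.SHARED
            self.states[cache_id] = State.SHARED
        # A read hit (Shared or Modified) changes no state.

    def write(self, cache_id):
        if self.states[cache_id] is not State.MODIFIED:
            # Gaining write permission invalidates every other copy.
            for other in range(len(self.states)):
                if other != cache_id:
                    self.states[other] = State.INVALID
            self.states[cache_id] = State.MODIFIED

    def snapshot(self):
        """State of every cache as a compact string, e.g. 'SIM'."""
        return "".join(state.value for state in self.states)


def replay(num_caches, operations):
    """Return the snapshot before the first operation and after each one."""
    bus = MSIBus(num_caches)
    history = [bus.snapshot()]
    for op, cache_id in operations:
        if op == "read":
            bus.read(cache_id)
        else:
            bus.write(cache_id)
        history.append(bus.snapshot())
    return history


if __name__ == "__main__":
    steps = [("read", 0), ("read", 1), ("write", 2), ("read", 0)]
    for snapshot in replay(3, steps):
        print(snapshot)
```

Because `replay` keeps every intermediate snapshot, stepping backward is just a matter of indexing earlier entries in the returned list, which loosely mirrors the forward/backward navigation described in the abstract.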

    Utilizando ARMSim y QtARMSim para la docencia de Arquitectura de Computadores

    Many of the learning objectives of introductory Computer Architecture courses focus on the aspects that make up an assembly language programmer's view of a computer. As a rule, these objectives are defined around a specific computer architecture, normally chosen to be as simple as possible while also motivating students. The ARM architecture is an ideal vehicle for teaching Computer Architecture. On the one hand, being a RISC (Reduced Instruction Set Computer) architecture, it is relatively simple. On the other, it is a current and widespread architecture (especially in mobile devices, smartphones and tablets), which motivates students. Laboratory work on ARM calls for either a simulator or a development tool running on an ARM machine. Since the subject is taught in the first years of the degree, the chosen application should be easy to use and sufficiently flexible. It should also be free software, so that it can be adapted if necessary, as well as cross-platform and free of charge, so that students who wish to can install it on their own computers. After evaluating several options, we decided to develop and release our own ARM simulator, ARMSim, and a graphical interface for it, QtARMSim. The simulation engine, ARMSim, and its interface, QtARMSim, were used during the 2014–15 academic year. The feedback received from both students and laboratory instructors has been very positive.

    Utilizando Arduino Due en la docencia de la entrada/salida

    Abstract: Input/output (I/O) and its management usually form part of introductory computer architecture courses. The very nature of the topic and its diversity mean that practical sessions are usually carried out either on simple, purpose-built devices or on simulators, which distances them from real devices and makes them less appealing. However, current and simple devices such as Arduino boards can give students a more realistic and attractive view of I/O while keeping the environments and systems easy to use, which we consider a priority in the first years of the degree. In our case, since our computer architecture teaching is currently based on the ARM architecture, we have chosen the Arduino Due, whose microcontroller, the ATSAM3X8E, implements the Cortex-M3 version of the ARM architecture. To carry out the I/O laboratory sessions, we have slightly modified the Arduino environment so that it accepts assembly source code, and we have designed a small board with an RGB LED and a push button, which has allowed us to propose simple but eye-catching exercises. The microcontroller's own built-in devices have been enough to cover other aspects of I/O and to offer more complex examples that motivate students. The first experience with this environment has been satisfactory both for the teaching staff of the courses in which it has been used and for the students, in whom it has also fostered an interest in continuing to work with Arduino boards in their own projects.

    Using machine learning to model the training scalability of convolutional neural networks on clusters of GPUs

    In this work, we build a general piece-wise model to analyze data-parallel (DP) training costs of convolutional neural networks (CNNs) on clusters of GPUs. This general model is based on i) multi-layer perceptrons (MLPs) that model the NVIDIA cuDNN/cuBLAS library kernels involved in training some state-of-the-art CNNs; and ii) an analytical model of the NVIDIA NCCL Allreduce collective primitive using the Ring algorithm. The CNN training scalability study performed with this model, in combination with the Roofline technique, for varying batch sizes, node (floating-point) arithmetic performance, node memory bandwidth, network link bandwidth, and cluster dimension unveils some crucial bottlenecks at both the GPU and cluster levels. To support this analysis, we validate the accuracy of the proposed model against a Python library for distributed deep learning training. Funding for open access charge: CRUE-Universitat Jaume
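
To make the two analytical ingredients named above more concrete, here is a minimal sketch of a textbook latency-bandwidth estimate for a ring Allreduce and of the Roofline bound. It is not the model from the paper, whose exact form the abstract does not give; the function names, parameters, and units (bytes, bytes per second, seconds, FLOP/s) are assumptions chosen for illustration.

```python
def ring_allreduce_time(message_bytes, num_nodes, link_bandwidth, latency=0.0):
    """Textbook estimate of ring Allreduce time (an assumption, not NCCL's own model).

    The ring runs a reduce-scatter followed by an allgather. Each phase takes
    (num_nodes - 1) steps, and every step moves message_bytes / num_nodes bytes.
    """
    if num_nodes < 2:
        return 0.0  # a single node has nothing to communicate
    steps = 2 * (num_nodes - 1)
    chunk_bytes = message_bytes / num_nodes
    return steps * (latency + chunk_bytes / link_bandwidth)


def roofline_performance(arithmetic_intensity, peak_flops, memory_bandwidth):
    """Roofline bound: attainable FLOP/s for a kernel with the given FLOP/byte ratio."""
    return min(peak_flops, arithmetic_intensity * memory_bandwidth)


if __name__ == "__main__":
    # Hypothetical figures: 100 MB of gradients over 10 GB/s links.
    for nodes in (2, 4, 8, 16):
        seconds = ring_allreduce_time(100e6, nodes, 10e9, latency=5e-6)
        print(nodes, seconds)
    print(roofline_performance(4.0, 10e12, 900e9))
```

In this estimate the bandwidth term tends towards twice the message size divided by the link bandwidth as the node count grows, so under these assumptions the communication cost depends only weakly on cluster size once latency is small.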

    Animaciones interactivas para la enseñanza y aprendizaje de los protocolos de coherencia de cachés

    Among the educational objectives of advanced computer architecture courses is usually that students be able to describe and analyze how cache coherence protocols work. Although these protocols are relatively simple, many different situations must be analyzed to understand how they address all the details of the problem they solve, which makes them difficult to explain and to understand. A tool that graphically illustrates how these protocols operate would greatly facilitate teaching and learning them. To improve the teaching of this subject, we have developed three interactive animations that show how three of the most frequently used cache coherence protocols work. For each protocol, a sequence of read and write operations illustrates all the situations that can arise. The animations let the student step forward and backward to better understand and study the actions that take place at each step. This work was partially funded by the project «Activitats formatives per a assignatures de la matèria Arquitectura de Computadores» of the Unitat de Suport Educatiu of the Universitat Jaume I (10G136-16).

    PyDTNN: A user-friendly and extensible framework for distributed deep learning

    We introduce a framework for training deep neural networks on clusters of computers with the following appealing properties: (1) it is developed in Python, exposing a friendly interface that provides an accessible entry point for the newcomer; (2) it is extensible, offering a customizable tool for the more advanced user in deep learning; (3) it covers the main functionality appearing in convolutional neural networks; and (4) it delivers reasonable inter-node parallel performance exploiting data parallelism by leveraging MPI via MPI4Py for communication and NumPy for the efficient execution of (multithreaded) numerical kernels.
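
The data-parallel scheme mentioned in point (4) can be sketched without any MPI installation by simulating the workers in a single process. The code below is not taken from PyDTNN: the one-parameter linear model, the function names, and the `allreduce_mean` stand-in are all hypothetical. In a real MPI program the averaging step would typically be a collective such as mpi4py's `Comm.Allreduce` followed by a division by the number of processes.

```python
def local_gradient(weight, shard):
    """Gradient of the mean squared error of y = weight * x over one data shard."""
    return sum(2.0 * (weight * x - y) * x for x, y in shard) / len(shard)


def allreduce_mean(values):
    """Stand-in for an MPI allreduce: every simulated worker gets the global mean."""
    mean = sum(values) / len(values)
    return [mean] * len(values)


def data_parallel_step(weight, shards, learning_rate):
    """One SGD step in which each simulated worker owns one equally sized shard.

    Returns the updated weight held by every worker replica.
    """
    gradients = [local_gradient(weight, shard) for shard in shards]
    averaged = allreduce_mean(gradients)
    # Every replica applies the same averaged gradient, so replicas stay in sync.
    return [weight - learning_rate * grad for grad in averaged]


if __name__ == "__main__":
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]  # samples of y = 2x
    shards = [data[:2], data[2:]]
    print(data_parallel_step(0.0, shards, 0.01))
```

With equally sized shards, the mean of the per-shard gradients equals the gradient over the whole batch, which is why the replicas follow the same trajectory as single-process training with the full batch.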

    Performance–energy trade-offs of deep learning convolution algorithms on ARM processors

    In this work, we assess the performance and energy efficiency of high-performance codes for the convolution operator, based on the direct, explicit/implicit lowering and Winograd algorithms used for deep learning (DL) inference on a series of ARM-based processor architectures. Specifically, we evaluate the NVIDIA Denver2 and Carmel processors, as well as the ARM Cortex-A57 and Cortex-A78AE CPUs as part of a recent set of NVIDIA Jetson platforms. The performance–energy evaluation is carried out using the ResNet-50 v1.5 convolutional neural network (CNN) on varying configurations of convolution algorithms, number of threads/cores, and operating frequencies on the tested processor cores. The results demonstrate that the best throughput is obtained on all platforms with the Winograd convolution operator running on all the cores at their highest frequency. However, if the goal is to reduce the energy footprint, there is no rule of thumb for the optimal configuration. Funding for open access charge: CRUE-Universitat Jaume

    Performance–energy trade-offs of deep learning convolution algorithms on ARM processors

    In this work, we assess the performance and energy efficiency of high-performance codes for the convolution operator, based on the direct, explicit/implicit lowering and Winograd algorithms used for deep learning (DL) inference on a series of ARM-based processor architectures. Specifically, we evaluate the NVIDIA Denver2 and Carmel processors, as well as the ARM Cortex-A57 and Cortex-A78AE CPUs as part of a recent set of NVIDIA Jetson platforms. The performance–energy evaluation is carried out using the ResNet-50 v1.5 convolutional neural network (CNN) on varying configurations of convolution algorithms, number of threads/cores, and operating frequencies on the tested processor cores. The results demonstrate that the best throughput is obtained on all platforms with the Winograd convolution operator running on all the cores at their highest frequency. However, if the goal is to reduce the energy footprint, there is no rule of thumb for the optimal configuration. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer Nature. This research was funded by Project PID2020-113656RB-C21/C22 supported by MCIN/AEI/10.13039/501100011033. Manuel F. Dolz was also supported by the Plan Gen–T grant CDEIGENT/2018/014 of the Generalitat Valenciana. Héctor Martínez is a POSTDOC_21_00025 fellow supported by Junta de Andalucía. Adrián Castelló is a FJC2019-039222-I fellow supported by MCIN/AEI/10.13039/501100011033. Antonio Maciá is a PRE2021-099284 fellow supported by MCIN/AEI/10.13039/501100011033.
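
For readers unfamiliar with the algorithm families compared above, the sketch below contrasts a direct convolution with an explicit lowering (often called im2col) on a single-channel 2D input, in plain Python. It is a toy illustration under simplifying assumptions (no padding, unit stride, one channel, and the cross-correlation form commonly used in deep learning) and bears no relation to the high-performance codes evaluated in the paper. All function names are hypothetical, and the Winograd variant is not shown.

```python
def direct_conv2d(image, kernel):
    """Direct algorithm: accumulate each output pixel straight from the input."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


def im2col(image, kh, kw):
    """Explicit lowering: one row per output pixel, holding its flattened patch."""
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [image[i + di][j + dj] for di in range(kh) for dj in range(kw)]
        for i in range(out_h)
        for j in range(out_w)
    ]


def lowered_conv2d(image, kernel):
    """Convolution as a matrix-vector product over the lowered patch matrix."""
    kh, kw = len(kernel), len(kernel[0])
    out_w = len(image[0]) - kw + 1
    flat_kernel = [value for row in kernel for value in row]
    patches = im2col(image, kh, kw)
    flat = [sum(p * k for p, k in zip(patch, flat_kernel)) for patch in patches]
    # Reshape the flat result back into rows of out_w output pixels.
    return [flat[r * out_w:(r + 1) * out_w] for r in range(len(flat) // out_w)]


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    kernel = [[1, 0], [0, -1]]
    print(direct_conv2d(image, kernel))
    print(lowered_conv2d(image, kernel))
```

The lowered version trades extra memory (the patch matrix duplicates overlapping input values) for the ability to reuse a tuned matrix-multiplication routine. An implicit lowering avoids materializing that matrix in full.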