
    Comprehensive characterization of an open source document search engine

    This work performs a thorough characterization and analysis of the open source Lucene search library. The article describes in detail the architecture, functionality, and micro-architectural behavior of the search engine, and investigates prominent online document search research issues. In particular, we study how intra-server index partitioning affects response time and throughput, explore the potential use of low-power servers for document search, and examine the sources of performance degradation and the causes of tail latencies. Some of our main conclusions are the following: (a) intra-server index partitioning can reduce tail latencies, but with diminishing benefits as incoming query traffic increases; (b) low-power servers, given enough partitioning, can provide the same average and tail response times as conventional high-performance servers; (c) index search is a CPU-intensive, cache-friendly application; and (d) C-states are the main culprits for performance degradation in document search.
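    As a rough illustration of conclusion (a), the short Python sketch below (our own toy model, not the paper's methodology; all constants are invented) fans each query out to N index partitions searched in parallel, takes the slowest partition as the query latency, and prints mean and 99th-percentile latency as the incoming traffic grows.

import random
import statistics

def per_partition_latency(docs_per_partition, load):
    # Service time grows with the partition size; queueing delay grows with load.
    base = docs_per_partition * 1e-6                      # per-document match cost (arbitrary)
    queueing = random.expovariate(1.0 / (0.002 * load))   # load-dependent waiting time
    return base + queueing

def query_latency(total_docs, partitions, load):
    # Partitions are searched in parallel; the slowest one determines the query latency.
    docs_per_partition = total_docs // partitions
    return max(per_partition_latency(docs_per_partition, load) for _ in range(partitions))

def p99(samples):
    return sorted(samples)[int(0.99 * len(samples))]

if __name__ == "__main__":
    random.seed(0)
    total_docs = 1_000_000
    for load in (1, 4, 16):                               # relative incoming query traffic
        for parts in (1, 2, 4, 8):
            lats = [query_latency(total_docs, parts, load) for _ in range(5000)]
            print(f"load={load:2d} partitions={parts}: "
                  f"mean={statistics.mean(lats)*1e3:7.2f} ms  p99={p99(lats)*1e3:7.2f} ms")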

    A teaching innovation experience following the European Higher Education Area guidelines in digital design education

    The new higher education framework established by the Bologna Declaration has introduced into our teaching methodology a series of very significant changes that, at the Catalan regional level, have been led by DURSI (Departament d'Universitats, Recerca i Societat de la Informació, the Catalan Ministry of Universities, Research and the Information Society). To this end, the ministry proposed adapting certain degrees to the future European Higher Education Area model, Computer Engineering among them. Our department (Department of Microelectronics and Electronic Systems) has begun introducing a new teaching methodology in a small number of core subjects of that degree, which share a high level of technological specialization and a large number of students (more than 200). The main goal of this paper is to describe the adaptation process required to carry out this educational reform in the subject we teach (Digital System Design), as well as the didactic tools used to implement the new methodology in a subject with a strong laboratory component; these tools also foster greater participation and attainment of objectives by the students. To carry out this process successfully, we used a teaching methodology based on Problem Based Learning (PBL) [2], intensive use of e-learning, and a progressive introduction of cooperative learning [3], always following the guidelines of the Unit for Innovation in Higher Education Teaching (IDES) of the UAB. The subject has been structured in three main sections: theory classes, problem seminars, and laboratory sessions. In the results section we present the high level of objectives fulfilled, the significant improvement in cooperative work, the increase in organization and planning skills, and the good problem-solving ability achieved. Moreover, 91% of the students who followed the seminars passed the subject.

    Precision-Aware application execution for Energy-optimization in HPC node system

    Power consumption is a critical consideration in high-performance computing systems, and it is becoming the limiting factor in building and operating petascale and exascale systems. When studying the power consumption of existing systems running HPC workloads, we find that power, energy, and performance are closely related, which opens the possibility of optimizing energy consumption without sacrificing performance (much or at all). In this paper, we propose an HPC system running a GNU/Linux OS and a Real Time Resource Manager (RTRM) that is aware of, and monitors, the health of the platform. On this system, an application for disaster management runs. The application can run with different QoS levels depending on the situation. We define two main scenarios. In normal execution there is no risk of a disaster, although the system must still run to anticipate a sudden change in the situation. In the second scenario, the likelihood of a disaster is very high, so more resources are allocated to improve the precision on which human decisions are based. The paper shows that, at design time, it is possible to describe different optimal operating points that are then used at runtime by the RTOS together with the application. This environment helps the system, which must run 24/7, to save energy at the cost of losing precision. The paper presents a model execution that improves the precision of the results by 65% on average by increasing the number of iterations from 1e3 to 1e4. This also makes the execution time one order of magnitude longer, which leads to the need for a multi-node solution. The optimal trade-off between precision and execution time is computed by the RTOS with a time overhead of less than 10% compared to a native execution.
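    The following is a minimal sketch of the runtime selection idea described above, with purely hypothetical numbers: the operating-point table, deadlines, and energy budgets are our assumptions, not the paper's measurements. Operating points are characterized at design time, and the manager picks the most precise point that fits the current deadline and energy budget.

from dataclasses import dataclass

@dataclass
class OperatingPoint:
    iterations: int
    precision: float        # relative result quality, 0..1
    exec_time_s: float      # characterized at design time
    energy_j: float         # characterized at design time

# Hypothetical design-time characterization (illustrative values only).
POINTS = [
    OperatingPoint(iterations=1_000,  precision=0.60, exec_time_s=12.0,  energy_j=900.0),
    OperatingPoint(iterations=10_000, precision=0.99, exec_time_s=120.0, energy_j=8500.0),
]

def select_point(deadline_s: float, energy_budget_j: float) -> OperatingPoint:
    """Return the most precise point that fits the deadline and energy budget."""
    feasible = [p for p in POINTS
                if p.exec_time_s <= deadline_s and p.energy_j <= energy_budget_j]
    if not feasible:
        # Degrade gracefully: fall back to the cheapest point.
        return min(POINTS, key=lambda p: p.energy_j)
    return max(feasible, key=lambda p: p.precision)

if __name__ == "__main__":
    # Normal operation: tight energy budget, relaxed precision requirement.
    print("normal    ->", select_point(deadline_s=60.0,  energy_budget_j=2000.0))
    # High disaster risk: more resources allocated, precision prioritized.
    print("high-risk ->", select_point(deadline_s=300.0, energy_budget_j=10000.0))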

    Differential metabolic profiles associated to movement behaviour of stream-resident brown trout (Salmo trutta)

    The mechanisms that can contribute to fish movement strategies and the associated behaviour can be complex and related to the physiology, genetics, and ecology of each species. In the case of the brown trout (Salmo trutta), recent research has observed individual differences in mobility in a population living in a high mountain river reach (Pyrenees, NE Spain). The population is mostly sedentary, but a small percentage of individuals exhibit mobile behaviour, mainly upstream movements. Metabolomics can reflect changes in physiological processes and can identify different profiles depending on behaviour. Here, a non-targeted metabolomics approach with minimally invasive sampling was used to find possible changes in the blood metabolomic profile of S. trutta related to its movement behaviour. Results showed a differentiation in the metabolomic profiles of the trout and different concentration levels of some metabolites (e.g. cortisol) according to the home range classification (pattern of movements: sedentary or mobile). The change in metabolomic profiles generally occurs during upstream movement and probably reflects the shift in metabolite profile from the non-mobile to the mobile season. This study reveals the contribution of metabolomic analyses to better understanding the behaviour of organisms. This study has been supported and financed by the Biodiversity Conservation Plan of ENDESA, S.A. (ENEL Group).

    Activating cannabinoid receptor 2 preserves axonal health through GSK-3β/NRF2 axis in adrenoleukodystrophy

    Aberrant endocannabinoid signaling accompanies several neurodegenerative disorders, including multiple sclerosis. Here, we report altered endocannabinoid signaling in X-linked adrenoleukodystrophy (X-ALD), a rare neurometabolic demyelinating syndrome caused by malfunction of the peroxisomal ABCD1 transporter, resulting in the accumulation of very long-chain fatty acids (VLCFAs). We found abnormal levels of cannabinoid receptor 2 (CB2r) and related endocannabinoid enzymes in the brain and peripheral blood mononuclear cells (PBMCs) of X-ALD patients and in the spinal cord of a murine model of X-ALD. Preclinical treatment with a selective CB2r agonist (JWH133) halted axonal degeneration and the associated locomotor deficits, along with normalization of microgliosis. Moreover, the drug improved the main metabolic disturbances underlying this model, particularly in redox and lipid homeostatic pathways, including increased lipid droplets in motor neurons, through modulation of the GSK-3β/NRF2 axis. JWH133 inhibited reactive oxygen species elicited by excess VLCFAs in primary microglial cultures of Abcd1-null mice. Furthermore, we uncovered intertwined redox and CB2r signaling in the murine spinal cords and in patient PBMC samples obtained from a phase II clinical trial with antioxidants (NCT01495260). These findings highlight CB2r signaling as a potential therapeutic target for X-ALD and perhaps other neurodegenerative disorders that present with dysregulated redox and lipid homeostasis. This study was funded by the Institute of Health Carlos III through projects [PI19/01008] to SF and [PI20/00759] to AP (co-funded by the European Regional Development Fund, ERDF, a way to build Europe), the Miguel Servet program [CPII16/00016] to SF, and [PFIS, FI18/00141] to LPS (co-funded by the European Social Fund, ESF, investing in your future). This study was also funded by grants from the Spanish Ministry of Health, Social Services and Equality (EC10-137), the Autonomous Government of Catalonia [2017SGR1206], the Hesperia Foundation, CERTIS Obres i Serveis, and the crowdfunding campaign Arnau’97 to AP. JP was a predoctoral fellow of IDIBELL. The Center for Biomedical Research on Rare Diseases (CIBERER), an initiative of the Institute of Health Carlos III, funded the position of MR. Locomotor experiments were performed by the SEFALer unit F5 led by AP, which belongs to the CIBERER structure. We thank the CERCA Program/Generalitat de Catalunya for institutional support.

    Oxidative damage compromises energy metabolism in the axonal degeneration mouse model of X-adrenoleukodystrophy

    Aims: Chronic metabolic impairment and oxidative stress are associated with the pathogenesis of axonal dysfunction in a growing number of neurodegenerative conditions. To investigate the intertwining of both noxious factors, we have chosen the mouse model of adrenoleukodystrophy (X-ALD), which exhibits axonal degeneration in the spinal cord and motor disability. The disease is caused by loss of function of the ABCD1 transporter, which is involved in the import and degradation of very long-chain fatty acids (VLCFA) in peroxisomes. Oxidative stress due to VLCFA excess appears early in the neurodegenerative cascade. Results: In this study, we demonstrate by redox proteomics that oxidative damage to proteins specifically affects five key enzymes of glycolysis and the TCA (tricarboxylic acid) cycle in spinal cords of Abcd1(-) mice, and pyruvate kinase in human X-ALD fibroblasts. We also show that NADH and ATP levels are significantly diminished in these samples, together with a decrease in pyruvate kinase activity and GSH levels and an increase in NADPH. Innovation: Treating Abcd1(-) mice with the antioxidants N-acetylcysteine and alpha-lipoic acid (LA) prevents protein oxidation; preserves NADH, NADPH, ATP, and GSH levels; and normalizes pyruvate kinase activity, which implies that oxidative stress provoked by VLCFA excess results in bioenergetic dysfunction at a presymptomatic stage. Conclusion: Our results provide mechanistic insight into the beneficial effects of antioxidants and enhance the rationale for translation into clinical trials for X-adrenoleukodystrophy. Antioxid. Redox Signal. 15, 2095-2107.

    Design Space Exploration of heterogeneous SoC Platforms for a Data-Dominant Application

    The main goal of this thesis is to obtain a set of implementations of a system specified at high level, mapped down to different architectural platforms. This allows a fair comparison that includes building the whole system and completing the design chain to the various silicon targets. The comparison uses four variables for its evaluation (execution time, chip area, energy consumption, and design time) and produces a map of the optimal implementation points according to a given set of operating requirements. I built a complete MPEG-4 Main Profile compressor IP. This video coding standard is a well-known reference example, quite popular in the scientific literature, and the compressor is also a good example of a data-flow application; therefore, the results extracted from this thesis can be extended to other data-flow applications. I considered it necessary to compute image compression under real-time constraints, and therefore wanted the most flexible design possible in order to map the same specification onto the different platforms. For that purpose, I chose SystemC/C++ as the system-level description language and set up the implementation flows for the different architectural and silicon platforms. This framework allows comparing implementations in a reasonably objective way, since all results come from a single reference model and all designs were mapped onto the same silicon technology (90 nm CMOS). The result of this research work is a set of criteria and a map of the available solutions over the performance space, rather than an assertion that a single solution is better than the others. My intention has been to develop techniques and formulate methods that increase design productivity. This development can be further applied to the new interconnection paradigm: designs that use DVFS techniques and NoC-based MPSoC design-space explorations. We consider the most important contribution to be the development of the model used for the different experiments: the MPEG compressor realized in SystemC/C++. It is designed so that multiple implementations are possible, ranging from a largely hardware implementation to one loaded onto a VLIW accelerator. For the FPGA and the ASIC, two implementations each have been carried out. We obtained results for seven different implementations targeting four different HW platforms (FPGA, ASIC, DSP, and ASIP) with diverse internal architectures, selected to obtain optimal points. In the case of the ASIC, we completed the layouts of the two solutions, yielding an efficiency increase of 56% in speed versus 26% in energy for the FSME solution (20% in speed and 57% in energy for the FAST solution). For the ISPs, code improvements were applied to obtain better solutions than those that would result from a direct implementation of the code; for the ASIP, the improvements were made not only in the code but also in the silicon microarchitecture that forms the VLIW. Another contribution is the implementation of a functional NoC model in SystemC.
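    The design-space map described above is essentially a Pareto front over four lower-is-better metrics. The small Python helper below is an illustrative sketch of that idea only; the implementation names and numbers are hypothetical and not the thesis results.

from dataclasses import dataclass
from typing import List

@dataclass
class Implementation:
    name: str
    exec_time: float    # normalized execution time
    area: float         # normalized chip area
    energy: float       # normalized energy consumption
    design_time: float  # normalized design effort

def dominates(a: Implementation, b: Implementation) -> bool:
    """a dominates b if it is no worse in every metric and strictly better in at least one."""
    metrics = ("exec_time", "area", "energy", "design_time")
    no_worse = all(getattr(a, m) <= getattr(b, m) for m in metrics)
    better = any(getattr(a, m) < getattr(b, m) for m in metrics)
    return no_worse and better

def pareto_front(points: List[Implementation]) -> List[Implementation]:
    # Keep only the points that no other point dominates.
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]

if __name__ == "__main__":
    # Hypothetical, purely illustrative numbers for a few target styles.
    candidates = [
        Implementation("ASIC-FSME", 0.2, 0.9, 0.3, 1.0),
        Implementation("ASIC-FAST", 0.3, 0.7, 0.2, 1.0),
        Implementation("FPGA",      0.5, 0.8, 0.6, 0.6),
        Implementation("ASIP-VLIW", 0.7, 0.5, 0.5, 0.8),
        Implementation("DSP",       0.9, 0.3, 0.8, 0.3),
    ]
    for p in pareto_front(candidates):
        print(p)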

    Towards a Scalable Software Defined Network-on-Chip for Next Generation Cloud

    The rapid evolution of Cloud-based services and the growing interest in deep learning (DL)-based applications are putting increasing pressure on hyperscalers and general-purpose hardware designers to provide more efficient and scalable systems. Cloud-based infrastructures must consist of more energy-efficient components, and the evolution must take place from the core of the infrastructure (i.e., data centers (DCs)) to the edges (edge computing) to adequately support new and future applications. Adaptability/elasticity is one of the features required to increase performance-to-power ratios. Hardware-based mechanisms have been proposed to support system reconfiguration mostly at the processing-element level, while fewer studies have addressed scalable, modular interconnect sub-systems. In this paper, we propose a scalable Software Defined Network-on-Chip (SDNoC)-based architecture. Thanks to a modular design approach, our solution can easily be adapted to support devices ranging from low-power computing nodes placed at the edge of the Cloud to high-performance many-core processors in Cloud DCs. The proposed design merges the benefits of hierarchical network-on-chip (NoC) topologies (by fusing the ring and the 2D-mesh topologies) with those brought by dynamic reconfiguration (i.e., adaptation). Our interconnect allows creating different types of virtualised topologies aimed at serving different communication requirements, thus providing better resource partitioning (virtual tiles) for concurrent tasks. To further allow the software layer to control and monitor the NoC subsystem, a few customised instructions supporting a data-driven program execution model (PXM) are added to the processing element's instruction set architecture (ISA). In general, data-driven programming and execution models are well suited to DL applications. We also introduce a mechanism that maps a high-level programming language embedding concurrent execution models onto the basic functionalities offered by our SDNoC, easing the programming of the proposed system. In the reported experiments, we compared our lightweight reconfigurable architecture to a conventional flattened 2D-mesh interconnect. Results show that our design provides a 9.5% increase in data traffic throughput and a 2.2× reduction in average packet latency, compared to a flattened 2D-mesh topology connecting the same number of processing elements (PEs) (up to 1024 cores). Power and resource consumption (on FPGA devices) are also low, confirming the good scalability of the proposed architecture.
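    As a back-of-the-envelope illustration of why fusing rings with a 2D mesh can help at scale, the Python sketch below (our own simplification with assumed parameters, not the paper's RTL or simulator) compares the average hop count of a flat 32×32 mesh of PEs with a 16×16 mesh of routers, each serving a small local ring of 4 PEs.

import itertools

def flat_mesh_hops(side):
    """Average Manhattan distance between all PE pairs in a side x side mesh."""
    pes = list(itertools.product(range(side), repeat=2))
    dists = [abs(ax - bx) + abs(ay - by)
             for (ax, ay), (bx, by) in itertools.combinations(pes, 2)]
    return sum(dists) / len(dists)

def hierarchical_hops(mesh_side, ring_size):
    """Average hops: local ring to router + mesh between routers + router to remote ring."""
    avg_ring = ring_size / 4.0  # mean distance on a small bidirectional ring (approximation)
    routers = list(itertools.product(range(mesh_side), repeat=2))
    dists = [abs(ax - bx) + abs(ay - by)
             for (ax, ay), (bx, by) in itertools.combinations(routers, 2)]
    avg_mesh = sum(dists) / len(dists)
    return avg_ring + avg_mesh + avg_ring

if __name__ == "__main__":
    # 1024 PEs: flat 32x32 mesh vs 16x16 routers each serving a 4-PE local ring.
    print(f"flat 32x32 mesh of PEs   : {flat_mesh_hops(32):.2f} hops on average")
    print(f"16x16 mesh of 4-PE rings : {hierarchical_hops(16, 4):.2f} hops on average")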