Automatic vector generation guided by a functional metric
Verification is still the bottleneck of the complex digital system design process. Formal techniques have advanced in their capacity to handle more complex descriptions, but they still suffer from memory or time explosion. Simulation-based techniques handle descriptions of any size or complexity, but their efficiency falls as system complexity grows, because the number of simulation tests needed to maintain coverage increases exponentially. Semi-formal techniques combine the advantages of simulation and formal techniques, increasing the efficiency of simulation-based verification. In this area, several research works have introduced techniques that automate the generation of vectors driven by traditional coverage metrics. However, these techniques do not ensure the detection of 100% of faults. This paper presents a novel technique for the generation of vectors. A major benefit of the technique is that it generates test-benches more efficiently than techniques based on structural metrics. It is more efficient because it relies on a novel coverage metric that is more directly correlated to functional faults than structural coverage metrics (line, branch, etc.). The proposed coverage metric is based on an abstraction of the system as a set of polynomials, where all system behaviors are described by a set of coefficients. By assuming a finite precision of coefficients and a maximum degree of polynomials, all the system behaviors, both correct and incorrect, can be modeled. The technique applies mathematical theories (computer algebra and number theory) to calculate the coverage and to generate vectors that maximize coverage. Moreover, a tool that implements the technique has been developed. It takes a C-based system description and outputs the coverage and the generated vectors.
FOTV: a generic device offloading framework for OpenMP
Since the introduction of the “target” directive in the 4.0 specification, the usage of OpenMP for heterogeneous computing programming has increased significantly. However, compiler support limits its usage because the code for the accelerated region has to be generated at compile time. This restricts the usage of accelerator-specific design flows (e.g. FPGA hardware synthesis) and the support of new devices, which typically requires extending and modifying the compiler itself.
This paper explores a solution to these limitations: a generic device that is supported by the OpenMP compiler but whose functionality is defined at runtime. The generic device framework has been integrated in an OpenMP compiler (LLVM/Clang). It acts as a device type for the compiler and interfaces with the physical devices to execute the accelerated code. The framework has an API that provides support for new devices and accelerated code without additional OpenMP compiler modifications. It also includes a code generator that extracts the source code of OpenMP target regions for external compilation chains.
To evaluate the approach, we present a new device implementation that allows executing OpenCL code as an OpenMP target region. We study the overhead that the framework produces and show that it is minimal and comparable to other OpenMP devices. This work was done as part of the FitOptiVis project, funded by the ECSEL Joint Undertaking, grant H2020-ECSEL-2017-2-783162, and the Spanish MICINN, grant PCI2018-093057. It was partially funded by the Platino project, funded by the MICINN, grant TEC2017-86722-C4-3-R.
Method and device for updating data in electronic devices
Method for updating data in an electronic device connected to a communications network, and the device itself, where the electronic device comprises at least a first memory unit of non-volatile type, a second memory unit of volatile type, and a processor. Data sets are stored in the memory units, and the first memory unit stores a first table with at least one entry for each data set. The method comprises the following steps:
- Receiving an update data file over the communications network.
- For each data set: storing the new version of the data set in free positions of one of the memory units; and adding to the first table an entry indicating the position in the memory unit and whether that entry is valid.
- Updating a second table stored in the second memory unit.
A computer program product comprising computer-executable instructions for carrying out the procedure according to the method of the invention. A digital data storage medium encoding a machine-executable program of instructions for carrying out the procedure according to the method of the invention. Application: 201301076 (13.11.2013). Application publication no.: ES2481343A2 (05.08.2014). Patent no.: ES2481343B2 (23.10.2015).
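The update steps in the record above can be illustrated with a minimal sketch. It is not the patented implementation: the class and field names (`Device`, `Entry`, `flash_table`, `ram_table`) are hypothetical, the non-volatile memory is modeled as an append-only byte array, and marking older entries invalid when a new version arrives is an assumption about how the validity flag is used.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """One row of the first table (kept in non-volatile memory)."""
    name: str       # data set identifier (hypothetical field)
    position: int   # offset of this version in non-volatile memory
    length: int     # size of this version in bytes
    valid: bool     # whether this entry is the one to use


@dataclass
class Device:
    flash: bytearray = field(default_factory=bytearray)  # stands in for non-volatile memory
    flash_table: list = field(default_factory=list)      # first table, only ever appended to
    ram_table: dict = field(default_factory=dict)        # second table, in volatile memory

    def apply_update(self, update_file: dict) -> None:
        """Store each new data set version in free space and record it in the first table."""
        for name, payload in update_file.items():
            position = len(self.flash)        # next free position (assumption: append-only)
            self.flash.extend(payload)
            for entry in self.flash_table:    # assumption: older versions are flagged invalid
                if entry.name == name:
                    entry.valid = False
            self.flash_table.append(Entry(name, position, len(payload), True))
        self.rebuild_ram_table()

    def rebuild_ram_table(self) -> None:
        """Update the second table so lookups only see valid entries."""
        self.ram_table = {
            e.name: (e.position, e.length) for e in self.flash_table if e.valid
        }

    def read(self, name: str) -> bytes:
        position, length = self.ram_table[name]
        return bytes(self.flash[position:position + length])
```

In this sketch nothing already written is erased or moved during an update: a new version is written to free space and only table entries change, which is one plausible way to keep the number of non-volatile erase operations low.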
Memory-efficient belief propagation for high-definition real-time stereo matching systems
Tele-presence systems will enable participants to feel like they are physically together. To improve this feeling, these systems are starting to include depth estimation capabilities. Typical requirements for these systems include high definition, good-quality results and low latency. Benchmarks demonstrate that stereo-matching algorithms using Belief Propagation (BP) produce the best results. The execution time of the BP algorithm on a CPU cannot satisfy real-time requirements with high-definition images. GPU-based implementations of BP algorithms are only able to work in real time with small to medium-sized images because memory traffic limits their applicability. The inherent parallelism of the BP algorithm makes FPGA-based solutions a good choice. However, even though the memory bandwidth of a commercial FPGA-based ASIC-prototyping board is high, it is still not enough to meet the requirements of real-time operation, high definition and a good immersive feeling. The work presented estimates depth maps in less than 40 milliseconds for high-definition images at 30 fps with 80 disparity levels. The proposed double BP topology and the new data-cost estimation improve on the performance of classical BP while reducing the memory traffic by about 21%. Moreover, the adaptive message compression method and the distribution of messages in memory reduce the number of memory accesses by more than 70% with an almost negligible loss of performance. The total memory traffic reduction is about 90%, with sufficient quality to be classified within the first 40 positions in the Middlebury ranking. This work has been partially supported by the CDTI under project CENIT-VISION 2007-1007 and the CICYT under TEC2008-04107.
Design space exploration in heterogeneous platforms using OpenMP
In the fields of high performance computing (HPC) and embedded systems, the current trend is to employ heterogeneous platforms which integrate general purpose CPUs with specialized accelerators such as GPUs and FPGAs. Programming these architectures to approach their theoretical performance limits is a complex issue. In this article, we present a design methodology targeting heterogeneous platforms which combines a novel dynamic offloading mechanism for OpenMP and a scheduling strategy for assigning tasks to accelerator devices. The current OpenMP offloading model depends on the compiler supporting each target device, with many architectures still unsupported by the most popular compilers, such as GCC and Clang. In our approach, the software and/or hardware design flows for programming the accelerators are dissociated from the host OpenMP compiler and the device-specific implementations are dynamically loaded at runtime. Moreover, the assignment of tasks to computing resources is dynamically evaluated at runtime, with the aim of maximizing performance when using the available resources. The proposed methodology has been applied to a video processing system as a test case. The results demonstrate the flexibility of the proposal by exploiting different heterogeneous platforms and design particularities of devices, leading to a significant performance improvement. This work has been funded by FEDER/Ministerio de Ciencia, Innovación y Universidades – Agencia Estatal de Investigación/TEC2017-86722-C4-3-R, and also under the FitOptiVis Project (ECSEL2017-1-737451), which is funded by the EU (H2020) and Ministerio de Ciencia, Innovación y Universidades.
Runtime reconfigurable system for decommissioned satellite identification and capture
The increasing number of space missions has led to an accumulation of space debris that can no longer be ignored. A possible solution for eliminating debris in space consists of launching satellites capable of detecting these obstacles and then destroying them. This paper presents a system for decommissioned satellite identification and capture. The system was developed with a methodology which provides support for component management as well as runtime system reconfiguration. The proposed solution is able to reconfigure itself at runtime with different configurations that provide different performance and energy consumption strategies to adapt to the environmental conditions during a space mission. This work was done as part of the FitOptiVis project, funded by the ECSEL Joint Undertaking (grant H2020-ECSEL-2017-2-783162), and the Platino project, funded by Spanish MINECO (Reference TEC2017-86722-C4-3-R).
Pre-silicon FEC decoding verification on SoC FPGAs
Forward error correction (FEC) decoding hardware modules are challenging to verify at the pre-silicon stage, when they are usually described at register-transfer (RT)/logic level with a hardware description language (HDL). They tend to hide faults because they inherently correct errors, and the required simulations, which involve a massive number of inputs, are too slow. In this work, two verification techniques based on FPGA prototyping are applied to complement these simulations: matching a golden model against the implementation with thousands of random codewords, and computing codeword/bit error rate (CER/BER) curves. For this purpose, a system on chip (SoC) field-programmable gate array (FPGA) is used. Several replicas of the decoder are implemented in the programmable hardware part (exploiting the parallel capabilities of hardware), and the verification is managed by parallel programming of the software part of the SoC (exploiting the presence of multiple processing cores). The presented approach allows a seamless integration with high-level models, does not need expensive testing/emulation platforms and obtains the results in a reasonable amount of time. This work has been supported by Project TEC2017-86722-C4-3-R, funded by Spanish MICINN/AEI.
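The two checks named in this abstract, golden-model matching and BER estimation, can be illustrated in software with a toy example. The sketch below is only an illustration under loud assumptions: the abstract does not name the code being decoded, so a 3-bit repetition code stands in for it, `dut_decode` is a hypothetical stand-in for the hardware decoder, and the channel is a simple bit-flip model.

```python
import random


def golden_decode(triple):
    """Reference (golden) model: majority vote over three received bits."""
    return 1 if sum(triple) >= 2 else 0


def dut_decode(triple):
    """Hypothetical implementation under test: the same vote written as logic gates."""
    a, b, c = triple
    return (a & b) | (a & c) | (b & c)


def count_mismatches(n_words, rng):
    """Feed random received words to both decoders and count disagreements."""
    mismatches = 0
    for _ in range(n_words):
        word = tuple(rng.randint(0, 1) for _ in range(3))
        if golden_decode(word) != dut_decode(word):
            mismatches += 1
    return mismatches


def bit_error_rate(flip_prob, n_bits, rng):
    """Estimate the decoded BER over a channel that flips each bit with flip_prob."""
    errors = 0
    for _ in range(n_bits):
        bit = rng.randint(0, 1)
        # Repetition encoding: send the bit three times, each copy may be flipped.
        received = tuple(bit ^ (rng.random() < flip_prob) for _ in range(3))
        errors += dut_decode(received) != bit
    return errors / n_bits


if __name__ == "__main__":
    rng = random.Random(0)
    print("mismatches:", count_mismatches(10_000, rng))
    for p in (0.01, 0.05, 0.2):
        print(p, bit_error_rate(p, 20_000, rng))
```

Sweeping `flip_prob` and plotting the returned rates gives one point per channel condition of a BER curve. A decoder fault that the error-correcting behavior masks in a direct comparison may still show up as a shift of that curve.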
Method and system for spatial localization using light markers in any environment
Method and system for the spatial localization of a target in a three-dimensional environment that contains at least one light marker, comprising:
- a stereo camera to capture a first image frame at the current instant and a second image frame at a previous instant;
- an angle-measuring device to obtain a rotation angle of the target;
- a signal processor with access to a memory that stores, among other things, a radius of the at least one marker detected at the current time instant n and at the previous time instant n-1, configured to calculate the coordinates (x_i, y_i) of the target at a time instant i as follows:
  - if the rotation angles at the current and previous time instants differ, (x_n, y_n) = (x_{n-1}, y_{n-1});
  - if the two image frames are equal, (x_n, y_n) = (x_{n-1}, y_{n-1});
  - otherwise:
    - if the radii are equal and there are several markers, (x_n, y_n) are calculated by triangulation using both image frames;
    - if the radii are different and there are several markers, (x_n, y_n) are calculated by triangulation using a single image frame;
    - if the radii are different and there is a single marker, (x_n, y_n) are calculated by stereo geometry;
    - if the radii are equal and there is a single marker, (x_n, y_n) are calculated using the image coordinates of the marker at the current and previous instants.
Application: 201500011 (23.12.2014). Application publication no.: ES2543038A1 (13.08.2015). Patent no.: ES2543038B2 (26.11.2015).
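The case analysis in this record amounts to a small decision procedure. The sketch below encodes only the selection of the calculation method, not the geometry itself. The function name, parameter names and returned labels are hypothetical and chosen for illustration.

```python
def select_method(angle_changed: bool, frames_equal: bool,
                  radii_equal: bool, marker_count: int) -> str:
    """Pick how (x_n, y_n) is obtained, following the cases listed in the abstract."""
    if marker_count < 1:
        # The abstract assumes at least one light marker in the environment.
        raise ValueError("at least one marker is required")

    # Rotation angle changed, or nothing changed in the image: keep (x_{n-1}, y_{n-1}).
    if angle_changed or frames_equal:
        return "keep_previous"

    several_markers = marker_count > 1
    if several_markers:
        # Equal radii: triangulate with both frames; different radii: a single frame.
        return "triangulation_two_frames" if radii_equal else "triangulation_one_frame"

    # Single marker: different radii use stereo geometry, equal radii use the
    # marker's image coordinates at the current and previous instants.
    return "image_coordinates_both_instants" if radii_equal else "stereo_geometry"
```

For instance, `select_method(False, False, True, 3)` selects triangulation over both frames, while any call with `angle_changed=True` keeps the previous coordinates regardless of the other inputs.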
PSAL: study, analysis and implementation of high-level synthesis algorithms
In recent years there has been great progress in the development of computer-aided design (CAD) tools for microelectronics, driven largely by the growing complexity of digital integrated circuits. This progress has mainly affected the automation of design from the logic level to the layout, while the initial stages (specification of the algorithm and determination of the architecture) still depend on the designer. This thesis addresses the study, analysis and implementation of high-level synthesis tools capable of proposing the digital system architecture that best implements the behavior described at the algorithmic level while satisfying a set of constraints imposed by the designer. The systems developed, PSAL1 and PSAL2, start from an algorithmic description in VHDL or ISPS and generate an architecture described in VHDL, CVS, BK or DDL, using the high-level synthesis algorithms proposed in the doctoral thesis. Connecting these tools with register-transfer-level synthesis systems provides an automatic design methodology from the algorithmic level to the layout.
Method and device for the efficient updating of data in electronic devices
The invention relates to a method and a device for efficiently updating data in electronic devices, solving in a simple manner the problems presented by existing techniques. The invention allows the device to be updated rapidly, with low energy consumption, and minimising the number of times the non-volatile memory unit (for example, flash) is erased, at an advantageous cost. Application: PCT/ES2014/000050 (27.03.2014). Publication no.: WO2015/071507A1 (21.05.2015).