461 research outputs found
LLM for SoC Security: A Paradigm Shift
As the ubiquity and complexity of system-on-chip (SoC) designs increase
across electronic devices, the task of incorporating security into an SoC
design flow poses significant challenges. Existing security solutions are
inadequate to provide effective verification of modern SoC designs due to their
limitations in scalability, comprehensiveness, and adaptability. On the other
hand, Large Language Models (LLMs) are celebrated for their remarkable success
in natural language understanding, advanced reasoning, and program synthesis
tasks. Recognizing an opportunity, our research delves into leveraging the
emergent capabilities of Generative Pre-trained Transformers (GPTs) to address
the existing gaps in SoC security, aiming for a more efficient, scalable, and
adaptable methodology. By integrating LLMs into the SoC security verification
paradigm, we open a new frontier of possibilities and challenges to ensure the
security of increasingly complex SoCs. This paper offers an in-depth analysis
of existing works, showcases practical case studies, demonstrates comprehensive
experiments, and provides useful promoting guidelines. We also present the
achievements, prospects, and challenges of employing LLM in different SoC
security verification tasks.Comment: 42 page
Combating Fake News: A Gravity Well Simulation to Model Echo Chamber Formation In Social Media
Fake news has become a serious concern as distributing misinformation has become easier and more impactful. A solution is critically required. One solution is to ban fake news, but that approach could create more problems than it solves, and would also be problematic from the beginning, as it must first be identified to be banned. We initially propose a method to automatically recognize suspected fake news, and to provide news consumers with more information as to its veracity. We suggest that fake news is comprised of two components: premises and misleading content. Fake news can be condensed down to a collection of premises, which may or may not be true, and to various forms of misleading material, including biased arguments and language, misdirection, and manipulation. Misleading content can then be exposed. While valuable, this framework’s utility may be limited by artificial intelligence, which can be used to alter fake news strategies at a rate exceeding the ability to update the framework. Therefore, we propose a model for identifying echo chambers, which are widely reported to be havens for fake news producers and consumers. We simulate a social media interest group as a gravity well, through which we identify the online groups postured to become echo chambers, and thus a source for fake news consumption and replication. This echo chamber model is supported by three pillars related to the social media group: technology employed, topic explored, and confirmation bias of group members. The model is validated by modeling and analyzing 19 subreddits on the Reddit social media platform. Contributions include a working definition for fake news, a framework for recognizing fake news, a generic model for social media echo chambers including three pillars central to echo chamber formation, and a gravity well simulation for social media groups, implemented for 19 subreddits
XMD: An Expansive Hardware-telemetry based Mobile Malware Detector to enhance Endpoint Detection
Hardware-based Malware Detectors (HMDs) have shown promise in detecting
malicious workloads. However, the current HMDs focus solely on the CPU core of
a System-on-Chip (SoC) and, therefore, do not exploit the full potential of the
hardware telemetry. In this paper, we propose XMD, an HMD that uses an
expansive set of telemetry channels extracted from the different subsystems of
SoC. XMD exploits the thread-level profiling power of the CPU-core telemetry,
and the global profiling power of non-core telemetry channels, to achieve
significantly better detection performance than currently used Hardware
Performance Counter (HPC) based detectors. We leverage the concept of manifold
hypothesis to analytically prove that adding non-core telemetry channels
improves the separability of the benign and malware classes, resulting in
performance gains. We train and evaluate XMD using hardware telemetries
collected from 723 benign applications and 1033 malware samples on a commodity
Android Operating System (OS)-based mobile device. XMD improves over currently
used HPC-based detectors by 32.91% for the in-distribution test data. XMD
achieves the best detection performance of 86.54% with a false positive rate of
2.9%, compared to the detection rate of 80%, offered by the best performing
signature-based Anti-Virus(AV) on VirusTotal, on the same set of malware
samples.Comment: Revised version based on peer review feedback. Manuscript to appear
in IEEE Transactions on Information Forensics and Securit
Survey of Soft Error Mitigation Techniques Applied to LEON3 Soft Processors on SRAM-Based FPGAs
Soft-core processors implemented in SRAM-based FPGAs are an attractive option for applications to be employed in radiation environments due to their flexibility, relatively-low application development costs, and reconfigurability features enabling them to adapt to the evolving mission needs. Despite the advantages soft-core processors possess, they are seldom used in critical applications because they are more sensitive to radiation than their hard-core counterparts. For instance, both the logic and signal routing circuitry of a soft-core processor as well as its user memory are susceptible to radiation-induced faults. Therefore, soft-core processors must be appropriately hardened against ionizing-radiation to become a feasible design choice for harsh environments and thus to reap all their benefits. This survey henceforth discusses various techniques to protect the configuration and user memories of an LEON3 soft processor, which is one of the most widely used soft-core processors in radiation environments, as reported in the state-of-the-art literature, with the objective of facilitating the choice of right fault-mitigation solution for any given soft-core processor
LLM for SoC Security: A Paradigm Shift
As the ubiquity and complexity of system-on-chip (SoC) designs increase across electronic devices, the task of incorporating security into an SoC design flow poses significant challenges. Existing security solutions are inadequate to provide effective verification of modern SoC designs due to their limitations in scalability, comprehensiveness, and adaptability. On the other hand, Large Language Models (LLMs) are celebrated for their remarkable success in natural language understanding, advanced reasoning, and program synthesis tasks. Recognizing an opportunity, our research delves into leveraging the emergent capabilities of Generative Pre-trained Transformers (GPTs) to address the existing gaps in SoC security, aiming for a more efficient, scalable, and adaptable methodology. By integrating LLMs into the SoC security verification paradigm, we open a new frontier of possibilities and challenges to ensure the security of increasingly complex SoCs. This paper offers an in-depth analysis of existing works, showcases practical case studies, demonstrates comprehensive experiments, and provides useful promoting guidelines. We also present the achievements, prospects, and challenges of employing LLM in different SoC security verification tasks
Error Detection and Diagnosis for System-on-Chip in Space Applications
Tesis por compendio de publicacionesLos componentes electrónicos comerciales, comúnmente llamados componentes
Commercial-Off-The-Shelf (COTS) están presentes en multitud de dispositivos habituales
en nuestro día a día. Particularmente, el uso de microprocesadores y sistemas en chip (SoC)
altamente integrados ha favorecido la aparición de dispositivos electrónicos cada vez más
inteligentes que sostienen el estilo de vida y el avance de la sociedad moderna. Su uso se
ha generalizado incluso en aquellos sistemas que se consideran críticos para la seguridad,
como vehículos, aviones, armamento, dispositivos médicos, implantes o centrales eléctricas.
En cualquiera de ellos, un fallo podría tener graves consecuencias humanas o económicas.
Sin embargo, todos los sistemas electrónicos conviven constantemente con factores internos
y externos que pueden provocar fallos en su funcionamiento. La capacidad de un sistema
para funcionar correctamente en presencia de fallos se denomina tolerancia a fallos, y es
un requisito en el diseño y operación de sistemas críticos.
Los vehículos espaciales como satélites o naves espaciales también hacen uso de
microprocesadores para operar de forma autónoma o semi autónoma durante su vida útil,
con la dificultad añadida de que no pueden ser reparados en órbita, por lo que se consideran
sistemas críticos. Además, las duras condiciones existentes en el espacio, y en particular
los efectos de la radiación, suponen un gran desafío para el correcto funcionamiento de los
dispositivos electrónicos. Concretamente, los fallos transitorios provocados por radiación
(conocidos como soft errors) tienen el potencial de ser una de las mayores amenazas para
la fiabilidad de un sistema en el espacio.
Las misiones espaciales de gran envergadura, típicamente financiadas públicamente
como en el caso de la NASA o la Agencia Espacial Europea (ESA), han tenido
históricamente como requisito evitar el riesgo a toda costa por encima de cualquier
restricción de coste o plazo. Por ello, la selección de componentes resistentes a la radiación
(rad-hard) específicamente diseñados para su uso en el espacio ha sido la metodología
imperante en el paradigma que hoy podemos denominar industria espacial tradicional, u
Old Space. Sin embargo, los componentes rad-hard tienen habitualmente un coste mucho
más alto y unas prestaciones mucho menores que otros componentes COTS equivalentes.
De hecho, los componentes COTS ya han sido utilizados satisfactoriamente en misiones
de la NASA o la ESA cuando las prestaciones requeridas por la misión no podían ser
cubiertas por ningún componente rad-hard existente.
En los últimos años, el acceso al espacio se está facilitando debido en gran parte a la
entrada de empresas privadas en la industria espacial. Estas empresas no siempre buscan
evitar el riesgo a toda costa, sino que deben perseguir una rentabilidad económica, por
lo que hacen un balance entre riesgo, coste y plazo mediante gestión del riesgo en un
paradigma denominado Nuevo Espacio o New Space. Estas empresas a menudo están
interesadas en entregar servicios basados en el espacio con las máximas prestaciones y el mayor beneficio posibles, para lo cual los componentes rad-hard son menos atractivos
debido a su mayor coste y menores prestaciones que los componentes COTS existentes.
Sin embargo, los componentes COTS no han sido específicamente diseñados para su uso
en el espacio y típicamente no incluyen técnicas específicas para evitar que los efectos de
la radiación afecten su funcionamiento. Los componentes COTS se comercializan tal cual
son, y habitualmente no es posible modificarlos para mejorar su resistencia a la radiación.
Además, los elevados niveles de integración de los sistemas en chip (SoC) complejos
de altas prestaciones dificultan su observación y la aplicación de técnicas de tolerancia
a fallos. Este problema es especialmente relevante en el caso de los microprocesadores.
Por tanto, existe un gran interés en el desarrollo de técnicas que permitan conocer y
mejorar el comportamiento de los microprocesadores COTS bajo radiación sin modificar
su arquitectura y sin interferir en su funcionamiento para facilitar su uso en el espacio y
con ello maximizar las prestaciones de las misiones espaciales presentes y futuras.
En esta Tesis se han desarrollado técnicas novedosas para detectar, diagnosticar y
mitigar los errores producidos por radiación en microprocesadores y sistemas en chip
(SoC) comerciales, utilizando la interfaz de traza como punto de observación. La interfaz de
traza es un recurso habitual en los microprocesadores modernos, principalmente enfocado
a soportar las tareas de desarrollo y depuración del software durante la fase de diseño. Sin
embargo, una vez el desarrollo ha concluido, la interfaz de traza típicamente no se utiliza
durante la fase operativa del sistema, por lo que puede ser reutilizada sin coste. La interfaz
de traza constituye un punto de conexión viable para observar el comportamiento de un
microprocesador de forma no intrusiva y sin interferir en su funcionamiento.
Como resultado de esta Tesis se ha desarrollado un módulo IP capaz de recabar
y decodificar la información de traza de un microprocesador COTS moderno de altas
prestaciones. El IP es altamente configurable y personalizable para adaptarse a diferentes
aplicaciones y tipos de procesadores. Ha sido diseñado y validado utilizando el dispositivo
Zynq-7000 de Xilinx como plataforma de desarrollo, que constituye un dispositivo COTS
de interés en la industria espacial. Este dispositivo incluye un procesador ARM Cortex-A9
de doble núcleo, que es representativo del conjunto de microprocesadores hard-core
modernos de altas prestaciones. El IP resultante es compatible con la tecnología ARM
CoreSight, que proporciona acceso a información de traza en los microprocesadores ARM.
El IP incorpora técnicas para detectar errores en el flujo de ejecución y en los datos de la
aplicación ejecutada utilizando la información de traza, en tiempo real y con muy baja
latencia. El IP se ha validado en campañas de inyección de fallos y también en radiación con
protones y neutrones en instalaciones especializadas. También se ha combinado con otras
técnicas de tolerancia a fallos para construir técnicas híbridas de mitigación de errores.
Los resultados experimentales obtenidos demuestran su alta capacidad de detección y
potencialidad en el diagnóstico de errores producidos por radiación.
El resultado de esta Tesis, desarrollada en el marco de un Doctorado Industrial entre
la Universidad Carlos III de Madrid (UC3M) y la empresa Arquimea, se ha transferido satisfactoriamente al entorno empresarial en forma de un proyecto financiado por la
Agencia Espacial Europea para continuar su desarrollo y posterior explotación.Commercial electronic components, also known as Commercial-Off-The-Shelf (COTS),
are present in a wide variety of devices commonly used in our daily life. Particularly, the
use of microprocessors and highly integrated System-on-Chip (SoC) devices has fostered
the advent of increasingly intelligent electronic devices which sustain the lifestyles and the
progress of modern society. Microprocessors are present even in safety-critical systems,
such as vehicles, planes, weapons, medical devices, implants, or power plants. In any of
these cases, a fault could involve severe human or economic consequences. However, every
electronic system deals continuously with internal and external factors that could provoke
faults in its operation. The capacity of a system to operate correctly in presence of faults
is known as fault-tolerance, and it becomes a requirement in the design and operation of
critical systems.
Space vehicles such as satellites or spacecraft also incorporate microprocessors to
operate autonomously or semi-autonomously during their service life, with the additional
difficulty that they cannot be repaired once in-orbit, so they are considered critical systems.
In addition, the harsh conditions in space, and specifically radiation effects, involve a big
challenge for the correct operation of electronic devices. In particular, radiation-induced
soft errors have the potential to become one of the major risks for the reliability of systems
in space.
Large space missions, typically publicly funded as in the case of NASA or European
Space Agency (ESA), have followed historically the requirement to avoid the risk at any
expense, regardless of any cost or schedule restriction. Because of that, the selection of
radiation-resistant components (known as rad-hard) specifically designed to be used in
space has been the dominant methodology in the paradigm of traditional space industry,
also known as “Old Space”. However, rad-hard components have commonly a much higher
associated cost and much lower performance that other equivalent COTS devices. In fact,
COTS components have already been used successfully by NASA and ESA in missions
that requested such high performance that could not be satisfied by any available rad-hard
component.
In the recent years, the access to space is being facilitated in part due to the irruption
of private companies in the space industry. Such companies do not always seek to avoid
the risk at any cost, but they must pursue profitability, so they perform a trade-off between
risk, cost, and schedule through risk management in a paradigm known as “New Space”.
Private companies are often interested in deliver space-based services with the maximum
performance and maximum benefit as possible. With such objective, rad-hard components
are less attractive than COTS due to their higher cost and lower performance.
However, COTS components have not been specifically designed to be used in space
and typically they do not include specific techniques to avoid or mitigate the radiation effects in their operation. COTS components are commercialized “as is”, so it is not
possible to modify them to improve their susceptibility to radiation effects. Moreover,
the high levels of integration of complex, high-performance SoC devices hinder their
observability and the application of fault-tolerance techniques. This problem is especially
relevant in the case of microprocessors. Thus, there is a growing interest in the development
of techniques allowing to understand and improve the behavior of COTS microprocessors
under radiation without modifying their architecture and without interfering with their
operation. Such techniques may facilitate the use of COTS components in space and
maximize the performance of present and future space missions.
In this Thesis, novel techniques have been developed to detect, diagnose, and
mitigate radiation-induced errors in COTS microprocessors and SoCs using the trace
interface as an observation point. The trace interface is a resource commonly found
in modern microprocessors, mainly intended to support software development and
debugging activities during the design phase. However, it is commonly left unused
during the operational phase of the system, so it can be reused with no cost. The trace
interface constitutes a feasible connection point to observe microprocessor behavior in a
non-intrusive manner and without disturbing processor operation.
As a result of this Thesis, an IP module has been developed capable to gather and
decode the trace information of a modern, high-end, COTS microprocessor. The IP is highly
configurable and customizable to support different applications and processor types. The
IP has been designed and validated using the Xilinx Zynq-7000 device as a development
platform, which is an interesting COTS device for the space industry. This device features a
dual-core ARM Cortex-A9 processor, which is a good representative of modern, high-end,
hard-core microprocessors. The resulting IP is compatible with the ARM CoreSight
technology, which enables access to trace information in ARM microprocessors. The IP is
able to detect errors in the execution flow of the microprocessor and in the application data
using trace information, in real time and with very low latency. The IP has been validated
in fault injection campaigns and also under proton and neutron irradiation campaigns in
specialized facilities. It has also been combined with other fault-tolerance techniques
to build hybrid error mitigation approaches. Experimental results demonstrate its high
detection capabilities and high potential for the diagnosis of radiation-induced errors.
The result of this Thesis, developed in the framework of an Industrial Ph.D. between the
University Carlos III of Madrid (UC3M) and the company Arquimea, has been successfully
transferred to the company business as a project sponsored by European Space Agency to
continue its development and subsequent commercialization.Programa de Doctorado en Ingeniería Eléctrica, Electrónica y Automática por la Universidad Carlos III de MadridPresidenta: María Luisa López Vallejo.- Secretario: Enrique San Millán Heredia.- Vocal: Luigi Di Lill
Mixed-Criticality Systems on Commercial-Off-the-Shelf Multi-Processor Systems-on-Chip
Avionics and space industries are struggling with the adoption of technologies
like multi-processor system-on-chips (MPSoCs) due to strict safety requirements.
This thesis propose a new reference architecture for MPSoC-based mixed-criticality
systems (MCS) - i.e., systems integrating applications with different level of criticality - which are a common use case for aforementioned industries.
This thesis proposes a system architecture capable of granting partitioning -
which is, for short, the property of fault containment. It is based on the detection
of spatial and temporal interference, and has been named the online detection of
interference (ODIn) architecture.
Spatial partitioning requires that an application is not able to corrupt resources
used by a different application. In the architecture proposed in this thesis, spatial
partitioning is implemented using type-1 hypervisors, which allow definition of
resource partitions. An application running in a partition can only access resources
granted to that partition, therefore it cannot corrupt resources used by applications
running in other partitions.
Temporal partitioning requires that an application is not able to unexpectedly
change the execution time of other applications. In the proposed architecture, temporal partitioning has been solved using a bounded interference approach, composed of
an offline analysis phase and an online safety net.
The offline phase is based on a statistical profiling of a metric sensitive to
temporal interference’s, performed in nominal conditions, which allows definition of
a set of three thresholds:
1. the detection threshold TD;
2. the warning threshold TW ;
3. the α threshold.
Two rules of detection are defined using such thresholds:
Alarm rule When the value of the metric is above TD.
Warning rule When the value of the metric is in the warning region [TW ;TD] for
more than α consecutive times.
ODIn’s online safety-net exploits performance counters, available in many MPSoC architectures; such counters are configured at bootstrap to monitor the selected
metric(s), and to raise an interrupt request (IRQ) in case the metric value goes above
TD, implementing the alarm rule. The warning rule is implemented in a software detection module, which reads the value of performance counters when the monitored
task yields control to the scheduler and reset them if there is no detection.
ODIn also uses two additional detection mechanisms:
1. a control flow check technique, based on compile-time defined block signatures, is implemented through a set of watchdog processors, each monitoring
one partition.
2. a timeout is implemented through a system watchdog timer (SWDT), which is
able to send an external signal when the timeout is violated.
The recovery actions implemented in ODIn are:
• graceful degradation, to react to IRQs of WDPs monitoring non-critical applications or to warning rule violations; it temporarily stops non-critical applications
to grant resources to the critical application;
• hard recovery, to react to the SWDT, to the WDP of the critical application, or
to alarm rule violations; it causes a switch to a hot stand-by spare computer.
Experimental validation of ODIn was performed on two hardware platforms: the
ZedBoard - dual-core - and the Inventami board - quad-core.
A space benchmark and an avionic benchmark were implemented on both platforms, composed by different modules as showed in Table 1
Each version of the final application was evaluated through fault injection (FI)
campaigns, performed using a specifically designed FI system. There were three
types of FI campaigns:
1. HW FI, to emulate single event effects;
2. SW FI, to emulate bugs in non-critical applications;
3. artificial bug FI, to emulate a bug in non-critical applications introducing
unexpected interference on the critical application.
Experimental results show that ODIn is resilient to all considered types of faul
A General Methodology to Optimize and Benchmark Edge Devices
The explosion of Internet Of Things (IoT), embedded and “smart” devices has also seen the addition of “general purpose” single board computers also referred to as “edge devices.” Determining if one of these generic devices meets the need of a new given task however can be challenging. Software generically written to be portable or plug and play may be too bloated to work properly without significant modification due to much tighter hardware resources. Previous work in this area has been focused on micro or chip-level benchmarking which is mainly useful for chip designers or low level system integrators. A higher or macro level method is needed to not only observe the behavior of these devices under a load but ensure they are appropriately configured for the new task, especially as they begin being integrated on platforms with higher cost of failure like self driving cars or drones. In this research we propose a macro level methodology that iteratively benchmarks and optimizes specific workloads on edge devices. With automation provided by Ansible, a multi stage 2k full factorial experiment and robust analysis process ensures the test workload is maximizing the use of available resources before establishing a final benchmark score. By framing the validation tests with a family of network security monitoring applications an end to end scenario fully exercises and validates the developed process. This also provides an additional vector for future research in the realm of network security. The analysis of the results show the developed process met its original design goals and intentions, with the added fact that the latest edge devices like the XAVIER, TX2 and RPi4 can easily perform as an edge network sensor
Microkernel mechanisms for improving the trustworthiness of commodity hardware
The thesis presents microkernel-based software-implemented mechanisms for improving the trustworthiness of computer systems based on commercial off-the-shelf (COTS) hardware that can malfunction when the hardware is impacted by transient hardware faults. The hardware anomalies, if undetected, can cause data corruptions, system crashes, and security vulnerabilities, significantly undermining system dependability. Specifically, we adopt the single event upset (SEU) fault model and address transient CPU or memory faults.
We take advantage of the functional correctness and isolation guarantee provided by the formally verified seL4 microkernel and hardware redundancy provided by multicore processors, design the redundant co-execution (RCoE) architecture that replicates a whole software system (including the microkernel) onto different CPU cores, and implement two variants, loosely-coupled redundant co-execution (LC-RCoE) and closely-coupled redundant co-execution (CC-RCoE), for the ARM and x86 architectures. RCoE treats each replica of the software system as a state machine and ensures that
the replicas start from the same initial state, observe consistent inputs, perform equivalent state transitions, and thus produce consistent outputs during error-free executions. Compared with other software-based error detection approaches, the distinguishing feature of RCoE is that the microkernel and device drivers are also included in redundant co-execution, significantly extending the sphere of replication (SoR).
Based on RCoE, we introduce two kernel mechanisms, fingerprint validation and kernel barrier timeout, detecting fault-induced execution divergences between the replicated systems, with the flexibility of tuning the error detection latency and coverage. The kernel error-masking mechanisms built on RCoE enable downgrading from triple modular redundancy (TMR) to dual modular redundancy (DMR) without service interruption. We run synthetic benchmarks and system benchmarks to evaluate the performance overhead of the approach, observe that the overhead varies based on the characteristics of workloads and the variants (LC-RCoE or CC-RCoE), and conclude that the approach is applicable for real-world applications. The effectiveness of the error detection mechanisms is assessed by conducting fault injection campaigns on real hardware, and the results demonstrate compelling improvement
- …