489 research outputs found

    Application of reinforcement learning for security enhancement in cognitive radio networks

    A cognitive radio network (CRN) enables unlicensed users (or secondary users, SUs) to sense for and opportunistically operate in underutilized licensed channels, which are owned by the licensed users (or primary users, PUs). The CRN has been regarded as a next-generation wireless network centered on the application of artificial intelligence, which helps the SUs to learn about, as well as to adaptively and dynamically reconfigure, their operating parameters, including the sensing and transmission channels, for network performance enhancement. This motivates the use of artificial intelligence to enhance security schemes for CRNs. Provisioning security in CRNs is challenging since existing techniques, such as entity authentication, are not feasible in the dynamic environment that a CRN presents, as they require pre-registration. In addition, these techniques cannot prevent an authenticated node from acting maliciously. In this article, we advocate the use of reinforcement learning (RL) to achieve optimal or near-optimal solutions for security enhancement through the detection of various malicious nodes and their attacks in CRNs. RL, an artificial intelligence technique, has the ability to learn new attacks and to detect previously learned ones, and has been perceived as a promising approach to enhance the overall security of CRNs. RL, which has been applied to address the dynamic aspects of security schemes in other wireless networks, such as wireless sensor networks and wireless mesh networks, can be leveraged to design security schemes in CRNs. We believe that these RL solutions will complement and enhance existing security solutions applied to CRNs. To the best of our knowledge, this is the first survey article that focuses on the use of RL-based techniques for security enhancement in CRNs.
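The survey above centers on RL agents that learn channel and attack dynamics online. As a minimal, purely illustrative sketch (not any scheme from the article), a stateless Q-learning secondary user can learn to avoid a channel degraded by a hypothetical jammer; the channel count, jammed index, and hyperparameters are all assumptions:

```python
import random

# Illustrative stateless Q-learning for channel selection under jamming.
# The channel count, the jammed channel, and all hyperparameters are
# assumptions for this sketch, not values from the surveyed article.
N_CHANNELS = 4
JAMMED = 2            # hypothetical always-jammed channel
ALPHA, EPSILON = 0.1, 0.1

def reward(channel):
    # transmission fails (0) on the jammed channel, succeeds (1) elsewhere
    return 0.0 if channel == JAMMED else 1.0

def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    q = [0.0] * N_CHANNELS          # single-state Q-table: one value per channel
    for _ in range(episodes):
        if rng.random() < EPSILON:  # explore a random channel
            a = rng.randrange(N_CHANNELS)
        else:                       # exploit the best current estimate
            a = max(range(N_CHANNELS), key=q.__getitem__)
        q[a] += ALPHA * (reward(a) - q[a])  # stateless Q-update
    return q

q = train()
best = max(range(N_CHANNELS), key=q.__getitem__)
```

After training, the learned values steer the SU away from the jammed channel without any pre-registration or prior signature of the attack.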

    Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense

    The increasing instances of advanced attacks call for a new defense paradigm that is active, autonomous, and adaptive, termed the \texttt{`3A'} defense paradigm. This chapter introduces three defense schemes that actively interact with attackers to increase the attack cost and gather threat information: defensive deception for detection and counter-deception, feedback-driven Moving Target Defense (MTD), and adaptive honeypot engagement. Due to cyber deception, external noise, and the absence of knowledge about the other players' behaviors and goals, these schemes possess three progressive levels of information restriction, from parameter uncertainty and payoff uncertainty to environmental uncertainty. To estimate the unknown and reduce uncertainty, we adopt three different strategic learning schemes that fit the associated information restrictions. All three learning schemes share the same feedback structure of sensation, estimation, and action, so that the most rewarding policies are reinforced and converge to the optimal ones in an autonomous and adaptive fashion. This work aims to shed light on proactive defense strategies, lay a solid foundation for strategic learning under incomplete information, and quantify the tradeoff between security and cost.
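The sensation-estimation-action feedback loop described above can be sketched with a simple exponential-weights learner (an illustrative assumption, not the chapter's actual algorithm): a defender samples MTD configurations, observes only its own noisy payoffs, and reinforces the configurations that reward it, with the configuration count, compromised configuration, and learning rate ETA all hypothetical:

```python
import math
import random

# Illustrative exponential-weights (EXP3-style) defender. The two MTD
# configurations, the compromised configuration, and the learning rate ETA
# are assumptions for this sketch, not the chapter's model.
CONFIGS = 2
ETA = 0.1

def defend(rounds=500, seed=1):
    rng = random.Random(seed)
    weights = [1.0] * CONFIGS
    for _ in range(rounds):
        total = sum(weights)
        probs = [w / total for w in weights]
        # sensation: deploy a sampled configuration and observe its payoff
        c = rng.choices(range(CONFIGS), probs)[0]
        payoff = 0.0 if c == 0 else 1.0   # configuration 0 is assumed compromised
        # estimation + action: importance-weighted exponential reinforcement
        weights[c] *= math.exp(ETA * payoff / probs[c])
    total = sum(weights)
    return [w / total for w in weights]

probs = defend()   # probability of deploying each configuration after learning
```

The defender never observes the attacker's strategy directly; the rewarding configuration is reinforced purely through the feedback loop, mirroring the information-restricted setting the chapter studies.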

    Aprendizagem de coordenação em sistemas multi-agente

    The ability for an agent to coordinate with others within a system is a valuable property in multi-agent systems. Agents either cooperate as a team to accomplish a common goal, or adapt to opponents to complete different goals without being exploited. Research has shown that learning multi-agent coordination is significantly more complex than learning policies in single-agent environments, and requires a variety of techniques to deal with the properties of a system where agents learn concurrently. This thesis aims to determine how machine learning can be used to achieve coordination within a multi-agent system. It asks what techniques can be used to tackle the increased complexity of such systems and their credit-assignment challenges, how to achieve coordination, and how to use communication to improve the behavior of a team. Many algorithms for competitive environments are tabular-based, preventing their use with high-dimensional or continuous state spaces, and may be biased against specific equilibrium strategies. This thesis proposes multiple deep learning extensions for competitive environments, allowing algorithms to reach equilibrium strategies in complex and partially observable environments, relying only on local information. A tabular algorithm is also extended with a new update rule that eliminates its bias against deterministic strategies. Current state-of-the-art approaches for cooperative environments rely on deep learning to handle the environment's complexity and benefit from a centralized learning phase. Solutions that incorporate communication between agents often prevent agents from being executed in a distributed manner. This thesis proposes a multi-agent algorithm where agents learn communication protocols to compensate for local partial observability, and remain independently executed.
A centralized learning phase can incorporate additional environment information to increase the robustness and the speed with which a team converges to successful policies. The algorithm outperforms current state-of-the-art approaches in a wide variety of multi-agent environments. A permutation-invariant network architecture is also proposed to increase the scalability of the algorithm to large team sizes. Further research is needed to identify how the techniques proposed in this thesis for cooperative and competitive environments can be used in unison for mixed environments, and whether they are adequate for general artificial intelligence. Financial support from FCT and the FSE under the III Quadro Comunitário de Apoio. Programa Doutoral em Informática.
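The permutation-invariant architecture mentioned above can be illustrated with symmetric pooling over a shared per-agent encoder: because the pooling operation ignores ordering, the team encoding is identical under any reordering of the agents. The affine encoder below is a stand-in for a learned network, not the thesis's actual architecture:

```python
# Illustration of permutation invariance via symmetric pooling; the affine
# per-agent encoder is a stand-in for a learned network, not the thesis's
# actual architecture.

def encode_agent(obs):
    # shared per-agent encoder (the same function applied to every agent)
    return [2.0 * x + 1.0 for x in obs]

def encode_team(observations):
    encoded = [encode_agent(o) for o in observations]
    n = len(encoded)
    # mean pooling is symmetric, so the result ignores agent ordering
    return [sum(e[i] for e in encoded) / n for i in range(len(encoded[0]))]

team = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
shuffled = [team[2], team[0], team[1]]
```

Sharing one encoder and pooling symmetrically is also what lets such an architecture scale to larger teams: the parameter count is independent of the number of agents.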

    Docitive Networks. A Step Beyond Cognition

    Project carried out in collaboration with the Centre Tecnològic de Telecomunicacions de Catalunya. Docitive networks take the idea of drawing intelligent decisions a step further by sharing information between nodes, with the prime aim of reducing the complexity and enhancing the performance of cognitive networks. To this end, we review some important concepts from machine learning, paying special attention to reinforcement learning, and we also give an overview of evolutionary game theory and replicator dynamics. Finally, simulations based on the ICT-BUNGEE project are shown to validate the introduced concepts.
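The replicator dynamics reviewed above can be simulated in a few lines. The sketch below uses a hypothetical two-strategy game with one strictly dominant strategy (the payoff matrix and step size are illustrative assumptions, not taken from the ICT-BUNGEE simulations); the dominant strategy's population share grows according to the replicator equation:

```python
# Discrete-time replicator dynamics for a two-strategy game. The payoff
# matrix and step size are illustrative assumptions, not taken from the
# ICT-BUNGEE simulations.
PAYOFF = [[1.0, 1.0],   # payoffs of strategy 0 against (0, 1)
          [2.0, 2.0]]   # payoffs of strategy 1 against (0, 1): dominant

def replicate(x0, steps=400, dt=0.05):
    """Evolve population shares from an initial share x0 of strategy 0."""
    pop = [x0, 1.0 - x0]
    for _ in range(steps):
        fitness = [sum(PAYOFF[i][j] * pop[j] for j in range(2)) for i in range(2)]
        avg = sum(p * f for p, f in zip(pop, fitness))
        # replicator equation: dx_i/dt = x_i * (f_i - f_avg)
        pop = [p + dt * p * (f - avg) for p, f in zip(pop, fitness)]
        total = sum(pop)
        pop = [p / total for p in pop]   # renormalize against numeric drift
    return pop

shares = replicate(0.9)   # dominated strategy 0 starts with a 90% share
```

Even from a 90% initial share, the dominated strategy is driven toward extinction, which is the selection behavior evolutionary game theory predicts.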

    Meta-learning applications for machine-type wireless communications

    Abstract. Machine Type Communication (MTC) emerged as a key enabling technology for 5G wireless networks and beyond, towards 6G networks. MTC provides two service modes. Massive MTC (mMTC) provides connectivity to a huge number of users, while Ultra-Reliable Low Latency Communication (URLLC) achieves the stringent reliability and latency requirements of industrial and interactive applications. Recently, data-driven learning-based approaches have been proposed to optimize the operation of various MTC applications and attain the desired strict performance metrics. In our work, we propose implementing meta-learning alongside other deep learning models in MTC applications. First, we analyze the model-agnostic meta-learning (MAML) algorithm and its convergence for regression and reinforcement learning (RL) problems. Then, we discuss uncrewed aerial vehicle (UAV) trajectory planning as a case study in mMTC and RL, illustrating the system model and the main challenges, and propose a MAML-RL formulation to solve the UAV path-learning problem. Moreover, we address the MAML-based few-pilot demodulation problem in massive IoT deployments. Finally, we extend the problem to include interference cancellation with Non-Orthogonal Multiple Access (NOMA), a paradigm shift towards non-orthogonal communication thanks to its potential to scale well in massive deployments. We propose a novel, data-driven, meta-learning-aided NOMA uplink model that minimizes the channel estimation overhead and does not require perfect channel knowledge. Unlike conventional deep learning successive interference cancellation (SICNet), meta-learning-aided SIC (meta-SICNet) can share experiences across different devices, facilitating learning for new incoming devices while reducing training overhead. Our results show the superiority of MAML over other deep learning schemes across many of these problems. The simulations also show that MAML can successfully solve the few-pilot demodulation problem and achieve better performance in terms of symbol error rate (SER) and convergence latency. Moreover, the analysis confirms that the proposed meta-SICNet outperforms classical SIC and conventional SICNet, as it achieves a lower SER with fewer pilots.
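The core idea of MAML analyzed above, training a meta-parameter so that one gradient step adapts it well to any task in a family, can be sketched on a toy problem. The example below uses first-order MAML (FOMAML, a common simplification that drops second-order terms) on scalar regression tasks; every numeric value is an illustrative assumption, not a setting from the thesis:

```python
import random

# First-order MAML (FOMAML) on a toy scalar regression family: each task
# asks the model y = w * x to match a task-specific slope a. Every value
# here is an illustrative assumption, not a setting from the thesis.
INNER_LR, META_LR = 0.1, 0.05
XS = [0.5, 1.0, 1.5]    # shared sample points

def loss_grad(w, a):
    # gradient of the mean squared error of (w*x - a*x) with respect to w
    return sum(2.0 * (w - a) * x * x for x in XS) / len(XS)

def meta_train(tasks, steps=300, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        a = rng.choice(tasks)                     # sample a task
        adapted = w - INNER_LR * loss_grad(w, a)  # inner adaptation step
        w -= META_LR * loss_grad(adapted, a)      # first-order meta update
    return w

tasks = [1.0, 3.0]
w0 = meta_train(tasks)
# one inner step from the meta-parameter moves toward a given task's slope
adapted = w0 - INNER_LR * loss_grad(w0, 1.0)
```

The meta-parameter settles between the task slopes, so a single inner step on a few samples (the "few-pilot" regime in spirit) moves closer to the sampled task than the unadapted meta-parameter.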