58 research outputs found

    Adaptive reinforcement learning for heterogeneous network selection

    Next generation 5G mobile wireless networks will consist of multiple technologies for devices to access the network at the edge. One of the keys to 5G is therefore the ability of a device to intelligently select its Radio Access Technology (RAT). Current fully distributed algorithms for RAT selection, although guaranteeing convergence to equilibrium states, are often slow, require long exploration times and may converge to undesirable equilibria. In this dissertation, we propose three novel reinforcement learning (RL) frameworks to improve the efficiency of existing distributed RAT selection algorithms in a heterogeneous environment, where users may potentially apply a number of different RAT selection procedures. Although our research focuses on solutions for RAT selection in current and future mobile wireless networks, the solutions proposed in this dissertation are general and can be applied to any large-scale distributed multi-agent system. In the first framework, called RL with Non-positive Regret, we propose a novel adaptive RL procedure for multi-agent non-cooperative repeated games. The main contribution is to use both positive and negative regrets in RL to improve the convergence speed and fairness of the well-known regret-based RL procedure. Significant improvements in performance compared to other related algorithms in the literature are demonstrated. In the second framework, called RL with Network-Assisted Feedback (RLNF), our core contribution is to develop a network feedback model that uses network-assisted information to improve the performance of distributed RL for RAT selection. RLNF guarantees a no-regret payoff in the long run for any user adopting it, regardless of what other users might do, and so can work in an environment where not all users use the same learning strategy. This is an important implementation advantage, as RLNF can be implemented within current mobile network standards. 
In the third framework, we propose a novel adaptive RL-based mechanism for RAT selection that can effectively handle user mobility. The key contribution is to leverage forgetting methods to react rapidly to changes in the radio conditions when users move. We show that our solution improves the performance of wireless networks and converges much faster than non-adaptive solutions when users move. Another objective of the research is to study the impact of various network models on the performance of different RAT selection approaches. We propose a unified benchmark to compare the performance of different algorithms under the same computational environment. The comparative studies reveal that, among all the important network parameters that influence the performance of RAT selection algorithms, the number of base stations that a user can connect to has the most significant impact. This finding provides some guidelines for the proper design of RAT selection algorithms for future 5G. Our evaluation benchmark can serve as a reference for researchers, network developers, and engineers. Overall, the thesis provides different reinforcement learning frameworks to improve the efficiency of current fully distributed algorithms for heterogeneous RAT selection. We prove the convergence of the proposed reinforcement learning procedures using the differential inclusion (DI) technique. The theoretical analyses demonstrate that the use of DI not only provides an effective method to study the convergence properties of adaptive procedures in game-theoretic learning, but also yields a much more concise and extensible proof than the classical approaches. Thesis (Ph.D.) -- University of Adelaide, School of Electrical and Electronic Engineering, 201
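
    The regret-based learning that the abstract above builds on can be illustrated with a minimal sketch. This is the textbook unconditional regret-matching rule, not the thesis's own procedure: the equal-share throughput model, the function names, and the `forgetting` discount (a stand-in for the "forgetting methods" mentioned above) are all assumptions made for illustration.

```python
import random


def regret_matching_probs(regrets):
    """Play each action with probability proportional to its positive regret."""
    positive = [max(r, 0.0) for r in regrets]
    total = sum(positive)
    if total == 0.0:
        # No action has positive regret: fall back to uniform exploration.
        return [1.0 / len(regrets)] * len(regrets)
    return [p / total for p in positive]


def simulate(capacities, n_users, rounds, forgetting=1.0, seed=0):
    """Users repeatedly pick a RAT; each RAT's capacity is shared equally.

    `forgetting` < 1 discounts old regrets (hypothetical parameter), so a
    user reacts faster when conditions change. Returns the final load per RAT.
    """
    rng = random.Random(seed)
    n_rats = len(capacities)
    regrets = [[0.0] * n_rats for _ in range(n_users)]
    choices = [0] * n_users
    for _ in range(rounds):
        choices = [
            rng.choices(range(n_rats), weights=regret_matching_probs(r))[0]
            for r in regrets
        ]
        load = [choices.count(k) for k in range(n_rats)]
        for u, a in enumerate(choices):
            realized = capacities[a] / load[a]
            for k in range(n_rats):
                # Throughput user u would have had on RAT k, others unchanged.
                alt_load = load[k] if k == a else load[k] + 1
                counterfactual = capacities[k] / alt_load
                regrets[u][k] = forgetting * regrets[u][k] + (counterfactual - realized)
    return [choices.count(k) for k in range(n_rats)]


if __name__ == "__main__":
    print(simulate(capacities=[30.0, 10.0], n_users=8, rounds=500, forgetting=0.95))
```

    The sketch assumes each user can evaluate the payoff of the RATs it did not choose; a fully distributed setting would have to estimate that quantity from the user's own observations.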

    Management of Temporally and Spatially Correlated Failures in Federated Message Oriented Middleware for Resilient and QoS-Aware Messaging Services.

    PhD. Message Oriented Middleware (MOM) is widely recognized as a promising solution for communication between heterogeneous distributed systems. Because the resilience and quality of service (QoS) of the messaging substrate play a critical role in overall system performance, the evolution of these distributed systems has introduced new requirements for MOM, such as inter-domain federation, resilience and QoS support. This thesis focuses on a management framework, called RQMOM, that enhances the resilience and QoS-awareness of MOM for federated enterprise systems. A common hierarchical MOM architecture for the federated messaging service is assumed. Each bottom-level local domain comprises a cluster of neighbouring brokers that carry a local messaging service, and inter-domain messages are routed through the gateway brokers of the different local domains over the top-level federated overlay. Challenges and solutions for both intra-domain and inter-domain messaging are researched. In local domain messaging, since a local domain is often within a well-connected network, the common cause of performance degradation is fluctuation of workloads, which can produce a surge in the total workload on a broker and exceed its processing capacity. To counter this degradation, a novel proactive risk-aware workload allocation scheme, which exploits the co-variation between workloads, is designed and evaluated in combination with existing reactive load balancing. In federated inter-domain messaging, an overlay network of federated gateway brokers distributed across separate geographical locations, on top of the heterogeneous physical network, is considered. Geographically correlated failures threaten to cause major interruptions and damage to such systems. To mitigate this rarely addressed challenge, a novel geographical-location-aware route selection algorithm to support uninterrupted messaging is introduced. 
It is used with existing overlay routing mechanisms to maintain routes and hence provide more resilient messaging against geographically correlated failures.

    Analysis of Embodied and Situated Systems from an Antireductionist Perspective

    The analysis of embodied and situated agents from a dynamical systems perspective is often limited to a geometrical and qualitative description. However, a quantitative analysis is necessary to achieve a deep understanding of cognitive facts. The field of embodied cognition is multifaceted, and the first part of this thesis is devoted to exploring the diverse meanings proposed in the existing literature. This is a fundamental preliminary step, as the creation of synthetic models requires well-founded theoretical and foundational boundaries for operationalising the concept of embodied and situated cognition in a concrete neuro-robotic model. By accepting the dynamical systems view, the agent is conceived as highly integrated and strictly coupled with the surrounding environment. Therefore the antireductionist framework is followed during the analysis of such systems, using chaos theory to unveil global properties and information theory to describe the complex network of interactions among the heterogeneous sub-components. In the experimental section, several evolutionary robotics experiments are discussed. This class of adaptive systems is consistent with the proposed definition of embodied and situated cognition. In fact, such neuro-robotic platforms autonomously develop a solution to a problem by exploiting the continuous sensorimotor interaction with the environment. The first experiment is a stress test for chaos theory, a mathematical framework that studies erratic behaviour in low-dimensional and deterministic dynamical systems. The recorded dataset consists of the robots’ positions in the environment during the execution of the task. Subsequently, the time series is projected onto a multidimensional phase space in order to study the underlying dynamics using chaotic numerical descriptors. Finally, such measures are correlated and compared with the robots’ behavioural strategy and their performance in novel and unpredictable environments. 
The second experiment explores the possible applications of information-theoretic measures for the analysis of embodied and situated systems. Data is recorded from perceptual and motor neurons while robots are executing a wall-following task, and pairwise estimates of the mutual information and the transfer entropy are calculated in order to create an exhaustive map of the nonlinear interactions among variables. Results show that the set of information-theoretic measures employed in this study unveils characteristics of the agent-environment interaction and the functional neural structure. This work aims at testing the explanatory power and impotence of nonlinear time series analysis applied to observables recorded from neuro-robotic embodied and situated systems.
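
    The pairwise measures named in the abstract above can be sketched with simple plug-in (frequency-count) estimators on discretised time series. The abstract does not say which estimators, discretisation, or history lengths the thesis uses; the function names and the history length of one sample below are assumptions for illustration only.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from two equal-length symbol series."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in pxy.items()
    )


def transfer_entropy(source, target):
    """Plug-in estimate of TE(source -> target) in bits, history length 1.

    TE = sum p(y1, y0, x0) * log2( p(y1 | y0, x0) / p(y1 | y0) ),
    where y1 is the next target sample and y0, x0 are the current ones.
    """
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_cond = Counter((y0, x0) for _, y0, x0 in triples)
    c_pair = Counter((y1, y0) for y1, y0, _ in triples)
    c_past = Counter(y0 for _, y0, _ in triples)
    return sum(
        (c / n) * log2((c / c_cond[(y0, x0)]) / (c_pair[(y1, y0)] / c_past[y0]))
        for (y1, y0, x0), c in c_full.items()
    )


if __name__ == "__main__":
    sensor = [0, 1, 1, 0, 1, 0, 0, 1, 1, 0]
    motor = [0] + sensor[:-1]  # motor copies the sensor with a one-step delay
    print(mutual_information(sensor, motor))
    print(transfer_entropy(sensor, motor), transfer_entropy(motor, sensor))
```

    Plug-in estimators are biased on short recordings, so values from a sketch like this are only meaningful relative to a shuffled-series baseline.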

    Centralized learning and planning: for cognitive robots operating in human domains

    Advances in Human-Robot Interaction

    Rapid advances in the field of robotics have made it possible to use robots not just in industrial automation but also in entertainment, rehabilitation, and home service. Since robots will likely affect many aspects of human existence, fundamental questions of human-robot interaction must be formulated and, if at all possible, resolved. Some of these questions are addressed in this collection of papers by leading HRI researchers.

    Infrastructure-less D2D Communications through Opportunistic Networks

    International Mention in the doctoral degree. In recent years, we have experienced several social media blackouts, which have shown how much our daily experiences depend on high-quality communication services. Blackouts have occurred because of technical problems, natural disasters, hacker attacks or even deliberate censorship by governments. In all cases, the spontaneous reaction of people consisted in finding alternative channels and media so as to reach out to their contacts and share their experiences. Thus, it has clearly emerged that infrastructured networks, and cellular networks in particular, are well engineered and have been extremely successful so far, although other paradigms should be explored to connect people. The most promising of today’s alternative paradigms is Device-to-Device (D2D), because it allows for building networks almost freely, and because 5G standards are (for the first time) seriously addressing the possibility of using D2D communications. In this dissertation I look at opportunistic D2D networking, possibly operating in an infrastructure-less environment, and I investigate several schemes through modeling and simulation, deriving metrics that characterize their performance. In particular, I consider variations of the Floating Content (FC) paradigm, which was previously proposed in the technical literature. Using FC, it is possible to probabilistically store information over a given restricted local area of interest, by opportunistically spreading it to mobile users while they are in the area. In more detail, a piece of information is injected into the area by delivering it to one or more of the mobile users, and is then opportunistically exchanged among mobile users whenever they come into proximity of one another, progressively reaching most (ideally all) users in the area and thus making the information dwell in the area of interest, as in a sort of distributed storage. 
While previous works on FC almost exclusively concentrated on the communication component, in this dissertation I look at the storage and computing components of FC, as well as its capability of transferring information from one area of interest to another. I first present background work, including a brief review of my Master's thesis activity, devoted to the design, implementation and validation of a smartphone opportunistic information sharing application. The goal of the app was to collect experimental data that permitted a detailed analysis of the occurring events, and a careful assessment of the performance of opportunistic information sharing services. Through experiments, I showed that many key assumptions commonly adopted in analytical and simulation works do not hold with current technologies. I also showed that a high density of devices and the enforcement of long transmission ranges for links at the edge might counter-intuitively impair performance. The insight obtained during my Master's thesis work was extremely useful for devising smart operating procedures for the opportunistic D2D communications considered in this dissertation. In the core of this dissertation, I first propose and study a set of schemes that explore and combine different information dissemination paradigms, along with real user mobility and predictions, focused on the smart diffusion of content over disjoint areas of interest. To analyze the viability of such schemes, I have implemented a Python simulator to evaluate the average availability and lifetime of a piece of information, as well as storage usage and network utilization metrics. Comparing the performance of these predictive schemes with state-of-the-art approaches, the results demonstrate the need for smart usage of communication opportunities and storage. 
The proposed algorithms allow for an important reduction in network activity, decreasing the number of data exchanges by up to 92% and requiring up to 50% less on-device storage, while guaranteeing the dissemination of information with performance similar to legacy epidemic dissemination protocols. In a second step, I have worked on the analysis of the storage capacity of probabilistic distributed storage systems, developing a simple yet powerful information-theoretic analysis based on a mean field model of opportunistic information exchange. I have also extended the previous simulator to compare the numerical results generated by the analytical model with the predictions of realistic simulations under different setups, showing in this way the accuracy of the analytical approach, and characterizing the properties of the system storage capacity. I conclude from the analysis and simulation results that when the density of contents seeded in a floating system is larger than the maximum amount which can be sustained by the system in steady state, the mean content availability decreases, and the stored information saturates due to the effects of resource contention. With the presence of static nodes, in a system with infinite host memory and at the mean field limit, there is no upper bound on the amount of injected contents which a floating system can sustain. However, as in the case with no static nodes, when the injected information is increased, the amount of stored information eventually reaches a saturation value, which corresponds to the injected information at which the mean amount of time spent exchanging content during a contact is equal to the mean duration of a contact. 
As a final step of my dissertation, I have also explored by simulation the computing and learning capabilities of an infrastructure-less opportunistic communication, storage and computing system, considering an environment that hosts a distributed Machine Learning (ML) paradigm that uses observations collected in the area over which the FC system operates to infer properties of the area. Results show that the ML system can operate in two regimes, depending on the load of the FC scheme. At low FC load, the ML system in each node operates on observations collected by all users and opportunistically shared among nodes. At high FC load, especially when the data to be opportunistically exchanged becomes too large to be transmitted during the average contact time between nodes, the ML system can only exploit the observations endogenous to each user, which are much less numerous. As a result, I conclude that such setups are adequate to support general instances of distributed ML algorithms with continuous learning only under low to medium loads of the FC system. While the load of the FC system induces a sort of phase transition in the ML system performance, the effect of computing load is more progressive. When the computing capacity is not sufficient to train on all observations, some will be skipped, and performance progressively declines. In summary, with respect to traditional studies of the FC opportunistic information diffusion paradigm, which only look at the communication component over one area of interest, I have considered three types of extensions by looking at the performance of FC: over several disjoint areas of interest; in terms of information storage capacity; and in terms of computing capacity that supports distributed learning. 
The three topics are treated respectively in Chapters 3 to 5. This work has been supported by IMDEA Networks Institute. Doctoral Programme in Telematics Engineering, Universidad Carlos III de Madrid. Chair: Claudio Ettori Casetti.- Secretary: Antonio de la Oliva Delgado.- Member: Christoph Somme
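
    The Floating Content mechanism described in the abstract above (content injected into an area of interest, copied between nearby users, and lost when a user leaves the area) can be sketched as a toy simulation. This is not the dissertation's Python simulator: the random-walk mobility, the circular zone, the drop-on-exit rule, and every parameter name and default value are assumptions chosen to keep the example short.

```python
import random


def simulate_floating_content(n_nodes=60, area=20.0, zone_radius=6.0,
                              comm_range=1.5, step=1.0, rounds=200, seed=1):
    """Return, per round, the fraction of in-zone nodes holding the content."""
    rng = random.Random(seed)
    centre = area / 2.0

    def in_zone(p):
        return (p[0] - centre) ** 2 + (p[1] - centre) ** 2 <= zone_radius ** 2

    def in_range(p, q):
        return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= comm_range ** 2

    pos = [(rng.uniform(0, area), rng.uniform(0, area)) for _ in range(n_nodes)]
    # Injection: every node that starts inside the zone receives the content.
    holds = [in_zone(p) for p in pos]
    availability = []
    for _ in range(rounds):
        # Random-walk mobility, clamped to the square simulation area.
        pos = [(min(max(x + rng.uniform(-step, step), 0.0), area),
                min(max(y + rng.uniform(-step, step), 0.0), area))
               for x, y in pos]
        inside = [in_zone(p) for p in pos]
        # A node that leaves the area of interest discards the content.
        holds = [h and i for h, i in zip(holds, inside)]
        # Opportunistic exchange: holders copy the content to in-zone neighbours.
        updated = holds[:]
        for a in range(n_nodes):
            if not holds[a]:
                continue
            for b in range(n_nodes):
                if inside[b] and not holds[b] and in_range(pos[a], pos[b]):
                    updated[b] = True
        holds = updated
        n_inside = sum(inside)
        availability.append(sum(holds) / n_inside if n_inside else 0.0)
    return availability


if __name__ == "__main__":
    trace = simulate_floating_content()
    print(f"mean availability: {sum(trace) / len(trace):.2f}")
```

    Transfers here are instantaneous and storage is unlimited, so the sketch cannot reproduce the saturation effects the abstract reports; modelling those would require a per-contact transfer budget.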