77 research outputs found

    Policy-based power consumption management in smart energy community using single agent and multi agent Q learning algorithms

    Power consumption in the residential sector has increased due to population growth, economic growth, and the invention of many electrical appliances, and is therefore becoming a growing concern in the power industry. Managing power consumption in the residential sector without sacrificing user comfort has recently become one of the main research areas. The complexity of the power system keeps growing due to the penetration of alternative sources of electric energy, such as solar plants, hydro, biomass, geothermal, and wind farms, to meet the growing demand for electricity. To overcome the challenges arising from this complexity, the power grid needs to be intelligent in all aspects. As the grid gets smarter, considerable efforts are being undertaken to make houses and businesses smarter in consuming electrical energy, so as to minimize and level electricity demand; this is known as Demand Side Management (DSM). It also requires that the conventional way of modelling, control and energy management in all sectors be enhanced or replaced by intelligent information processing techniques. Our research work was carried out in several stages. We proposed a policy-based framework that allows intelligent and flexible energy management of home appliances in a smart home, which is complex and dynamic, in ways that save energy automatically. We considered the challenges in formalizing the behaviour of the appliances using their states and managing energy consumption using policies. Policies are rules that are created and edited by a house agent to deal with situations or power problems that are likely to occur. Each time a power problem arises, the house agent refers to the policy, and one rule or a set of rules is executed to overcome that situation. Our policy-based smart home can manage energy efficiently and can contribute significantly to reducing peak energy demand (and may thereby reduce carbon emissions). 
Our proposed policy-based framework achieves peak shaving so that power consumption adapts to the available power, while ensuring the comfort level of the inhabitants and taking device characteristics into account. Our simulation results in MATLAB indicate that the proposed policy-driven homes can effectively contribute to demand-side power management by decreasing the peak-hour usage of the appliances, and can efficiently manage energy in a smart home in a user-friendly way. We proposed and developed peak demand management algorithms for a Smart Energy Community, using different types of coordination mechanisms to coordinate multiple house agents working in the same environment. These algorithms use a centralized model, a decentralized model, a hybrid model and a Pareto resource allocation model for resource allocation. We modelled user comfort for each appliance based on user preference, the power reduction capability, and the important household activities associated with that appliance. Moreover, we compared these algorithms with respect to their peak reduction capability, the overall comfort of the community, the simplicity of the algorithm and community involvement, and were finally able to identify the best-performing algorithm among them. Our simulation results show that the proposed coordination algorithms can effectively reduce peak demand while maintaining user comfort. With the help of our proposed algorithms, the electricity demand of a smart community can be managed intelligently and sustainably. This work aims not only at peak reduction management, but at achieving it while keeping the discomfort of the inhabitants to a minimum. It can learn the user's behaviour and establish the set of optimal rules dynamically. If the power available to a house is kept at a certain level, the house agent will learn to use this notional power to operate all the appliances according to the requirements and comfort level of the household. 
In this way, consumers are forced to keep their power use below the set level, so that overall power consumption can be maintained at a certain rate or level; this makes sustainability possible and can reduce the depletion of the natural resources used to generate electricity. Temporal interactions between the energy demand of local users and renewable energy sources could also be handled more efficiently by adding a set of new policy rules to switch between the utility and the renewable source of energy, but this is beyond the scope of this thesis. We applied Q-learning techniques to a home energy management agent, where the agent learns to find the optimal sequence for turning off appliances so that appliances with higher priority are not switched off during the peak demand period or during power consumption management. The policy-based home energy management determines the optimal policy dynamically at every instant by learning through interaction with the environment, using a reinforcement learning approach called Q-learning. The Q-learning formulation of the home power consumption problem, consisting of the state space, actions and reward function, is presented. The simulation results imply that the proposed Q-learning based power consumption management is very effective and enables users to experience minimum discomfort while participating in peak demand management, or at times when power consumption management is essential because the available power is rationed. This work is extended to a group of 10 houses, and three multi-agent Q-learning algorithms are proposed and developed for improving individual and community comfort while keeping the power consumption below the available power level or the electricity price below the set price. The proposed algorithms are the weighted strategy sharing algorithm, the concurrent Q-learning algorithm and the cooperative distributive learning algorithm. 
These proposed algorithms were coded and tested for managing the power consumption of a group of 10 houses, and the performance of all three algorithms with respect to power management and community comfort was studied and compared. The actual power consumption of a community, the modified power consumption curves obtained using the Weighted Strategy Sharing, Concurrent Q-learning and Distributive Q-learning algorithms, and the user comfort results are presented and analysed in this thesis.
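
The single-agent formulation described in this abstract (state space, actions, reward) can be illustrated with a minimal tabular Q-learning sketch. Everything concrete here is an assumption made for illustration and is not taken from the thesis: the three appliances, their power ratings and priorities, the 3 kW cap, the reward (minus the priority of the appliance switched off), and the hyperparameters.

```python
import random
from itertools import product

# Hypothetical appliances: (name, power in kW, user priority; higher = more important).
APPLIANCES = [("heater", 2.0, 3), ("washer", 1.5, 1), ("oven", 2.5, 2)]
AVAILABLE_KW = 3.0  # assumed power cap during the peak period


def load(state):
    """Total power drawn by the appliances that are on (state[i] == 1)."""
    return sum(power for on, (_, power, _) in zip(state, APPLIANCES) if on)


def step(state, action):
    """Switch off appliance `action`; the reward penalizes the priority lost."""
    nxt = tuple(0 if i == action else s for i, s in enumerate(state))
    # Switching off an appliance that is already off is a wasted action.
    reward = -float(APPLIANCES[action][2]) if state[action] else -10.0
    done = load(nxt) <= AVAILABLE_KW
    return nxt, reward, done


def train(episodes=2000, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    n = len(APPLIANCES)
    q = {s: [0.0] * n for s in product((0, 1), repeat=n)}
    for _ in range(episodes):
        state, done = (1,) * n, False  # every episode starts with all appliances on
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(n)
            else:
                action = max(range(n), key=lambda i: q[state][i])
            nxt, reward, done = step(state, action)
            target = reward if done else reward + gamma * max(q[nxt])
            q[state][action] += alpha * (target - q[state][action])
            state = nxt
    return q


def greedy_shutdown_order(q):
    """Follow the learned greedy policy until the load fits under the cap."""
    state, order = (1,) * len(APPLIANCES), []
    for _ in range(len(APPLIANCES)):  # guard against looping on a bad policy
        if load(state) <= AVAILABLE_KW:
            break
        action = max(range(len(APPLIANCES)), key=lambda i: q[state][i])
        order.append(APPLIANCES[action][0])
        state, _, _ = step(state, action)
    return order


if __name__ == "__main__":
    print(greedy_shutdown_order(train()))
```

In this toy setting the learned policy should leave the highest-priority appliance (the heater) running and shed the two lower-priority loads, which mirrors the abstract's goal of not switching off high-priority appliances during peak demand.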

    Docitive Networks. A Step Beyond Cognition

    Project carried out in collaboration with the Centre Tecnològic de Telecomunicacions de Catalunya. Docitive Networks take the idea of making intelligent decisions a step further by sharing information between nodes, with the prime aim of reducing the complexity and enhancing the performance of Cognitive Networks. To this end, we review some important concepts from the foundations of Machine Learning, paying special attention to Reinforcement Learning. We also give an overview of Evolutionary Game Theory and Replicator Dynamics. Finally, simulations based on the ICT-BUNGEE project are shown to validate the introduced concepts.

    Behaviour design in microrobots:hierarchical reinforcement learning under resource constraints

    In order to verify models of the collective behaviors of animals, robots can be programmed to implement the model and interact with real animals in a mixed society. This thesis describes the design of the behavioral hierarchy of a miniature robot that is able to interact with cockroaches and participate in their collective decision making. The robots are controlled via a hierarchical behavior-based controller in which more complex behaviors are built by combining simpler behaviors through fusion and arbitration mechanisms. The experiments in the mixed society confirm the similarity between the collective patterns of the mixed society and those of the real society. Moreover, the robots are able to induce new collective patterns by modulating some behavioral parameters. Difficulties in manually extracting the behavioral hierarchy, and the inability to revise it, lead us to use machine learning techniques to devise the composition hierarchy and coordination in an automated way. We derive a Compact Q-Learning method for micro-robots with processing and memory constraints, and try to learn behavior coordination through it. The behavior composition part is still done manually. However, the curse of dimensionality makes this kind of flat learning technique unsuitable. Even though optimizing such techniques could temporarily speed up the learning process and widen their range of applications, their scalability to real-world applications remains in question. In the next steps, we apply hierarchical learning techniques to automate both the behavior coordination and the composition parts. In some situations, many features of the state space might be irrelevant to what the robot is currently learning. Abstracting these features and discovering the hierarchy among them can help the robot learn the behavioral hierarchy faster. 
We formalize the automatic state abstraction problem with different heuristics, and derive three new splitting criteria that adapt decision tree learning techniques to state abstraction. Their performance is supported by strong evidence from simulation results in deterministic and non-deterministic environments. Simulation results show encouraging improvements in the required number of learning trials, the robot's performance, the size of the learned abstraction trees, and the computation time of the algorithms. On the other hand, learning in a group provides free sources of knowledge that, if communicated, can broaden the scale of learning, both temporally and spatially. We present two approaches for combining the output or the structure of abstraction trees. The trees may be stored in different RL robots in a multi-robot system, or learned by the same robot using different methods. Simulation results in a non-deterministic football learning task provide strong evidence of improvement in convergence rate and policy performance, especially in heterogeneous cooperation.

    Collective Machine Learning: Team Learning and Classification in Multi-Agent Systems

    This dissertation focuses on the collaboration of multiple heterogeneous, intelligent agents (hardware or software) which collaborate to learn a task and are capable of sharing knowledge. The concept of collaborative learning in multi-agent and multi-robot systems is largely understudied, and represents an area where further research is needed to gain a deeper understanding of team learning. This work presents experimental results which illustrate the importance of heterogeneous teams of collaborative learning agents, and outlines heuristics which govern the successful construction of teams of classifiers. A number of application domains are studied in this dissertation. One approach focuses on the effects of knowledge sharing and collaboration among multiple heterogeneous, intelligent agents (hardware or software) which work together to learn a task. As each agent employs a different machine learning technique, the system consists of multiple knowledge sources and their respective heterogeneous knowledge representations. Collaboration between agents involves sharing knowledge both to speed up team learning and to refine the team's overall performance and group behavior. Experiments have been performed that vary the team composition in terms of machine learning algorithms, the learning strategies employed by the agents, and the sharing frequency for a predator-prey cooperative pursuit task. For lifelong learning, heterogeneous learning teams were more successful than their homogeneous counterparts. Interestingly, sharing increased the learning rate, but sharing with higher frequency showed diminishing returns. Lastly, knowledge conflicts are reduced over time as more sharing takes place. These results support further investigation of the merits of heterogeneous learning. 
This dissertation also focuses on discovering heuristics for constructing successful teams of heterogeneous classifiers, including many aspects of team learning and collaboration. In one application, multi-agent machine learning and classifier combination are utilized to learn rock facies sequences from wireline well log data. Gas and oil reservoirs have been the focus of modeling efforts for many years as an attempt to locate zones with high volumes. Certain subsurface layers and layer sequences, such as those containing shale, are known to be impermeable to gas and/or liquid. Oil and natural gas then become trapped by these layers, making it possible to drill wells to reach the supply, and extract for use. The drilling of these wells, however, is costly. Here, the focus is on how to construct a successful set of classifiers, which periodically collaborate, to increase the classification accuracy. Utilizing multiple, heterogeneous collaborative learning agents is shown to be successful for this classification problem. We were able to obtain 84.5% absolute accuracy using the Multi-Agent Collaborative Learning Architecture, an improvement of about 6.5% over the best results achieved by Kansas Geological Survey with the same data set. Several heuristics are presented for constructing teams of multiple collaborative classifiers for predicting rock facies. Another application utilizes multi-agent machine learning and classifier combination to learn water presence using airborne polar radar data acquired from Greenland in 1999 and 2007. Ground and airborne depth-soundings of the Greenland and Antarctic ice sheets have been used for many years to determine characteristics such as ice thickness, subglacial topography, and mass balance of large bodies of ice. Ice coring efforts have supported these radar data to provide ground truth for validation of the state (wet or frozen) of the interface between the bottom of the ice sheet and the underlying bedrock. 
Subglacial state governs the friction, flow speed, transport of material, and overall change of the ice sheet. In this dissertation, we focus on how to construct a successful set of classifiers which periodically collaborate to increase classification accuracy. The underlying method results in radar independence, allowing model transfer from 1999 to 2007 to produce water presence maps of the Greenland ice sheet with differing radars. We were able to obtain 86% accuracy using the Multi-Agent Collaborative Learning Architecture with this data set. Utilizing multiple, heterogeneous collaborative learning agents is shown to be successful for this classification problem as well. Several heuristics, some of which agree with those found in the other applications, are presented for constructing teams of multiple collaborative classifiers for predicting subglacial water presence. General findings from these different experiments suggest that constructing a team of classifiers using a heterogeneous mixture of homogeneous teams is preferred. Larger teams generally perform better, as decisions from multiple learners can be combined to arrive at a consensus decision. Employing heterogeneous learning algorithms integrates different error models to arrive at higher accuracy classification from complementary knowledge bases. Collaboration, although not found to be universally useful, offers certain team configurations an advantage. Collaboration with low to medium frequency was found to be beneficial, while high frequency collaboration was found to be detrimental to team classification accuracy. Full mode learning, where each learner receives the entire training set for the learning phase, consistently outperforms independent mode learning, where the training set is distributed to all learners in a team in a non-overlapping fashion. 
Results presented in this dissertation support the application of multi-agent machine learning and collaboration to current challenging, real-world classification problems.
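
The idea that decisions from multiple learners can be combined to arrive at a consensus decision can be sketched with a simple plurality vote. This is only an illustrative stand-in, not the Multi-Agent Collaborative Learning Architecture itself; the function name and the "wet"/"frozen" labels are hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one consensus label per sample.

    `predictions` is a list with one entry per classifier; each entry is a
    list of labels, one per sample, and all entries have the same length.
    Ties go to the label that appears first among the tied ones (Counter
    keeps first-seen order for equal counts in Python 3.7+).
    """
    consensus = []
    for votes in zip(*predictions):  # votes = labels from every classifier for one sample
        consensus.append(Counter(votes).most_common(1)[0][0])
    return consensus


# Hypothetical outputs of three heterogeneous classifiers on three samples.
classifier_a = ["wet", "frozen", "frozen"]
classifier_b = ["wet", "wet", "frozen"]
classifier_c = ["frozen", "frozen", "frozen"]
combined = majority_vote([classifier_a, classifier_b, classifier_c])
```

A plurality vote treats every learner equally; weighting votes by each learner's validation accuracy would be one possible variation.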

    Multi-Objective Optimization for Speed and Stability of a Sony Aibo Gait

    Locomotion is a fundamental facet of mobile robotics that many higher-level aspects rely on. However, it is not a simple problem for legged robots with many degrees of freedom. For this reason, machine learning techniques have been applied to the domain. Although impressive results have been achieved, there remains a fundamental problem with using most machine learning methods. The learning algorithms usually require a large dataset, which is prohibitively hard to collect on an actual robot. Further, learning in simulation has had limited success in transitioning to the real world. Also, many learning algorithms optimize a single fitness function, neglecting many of the effects on other parts of the system. As part of the RoboCup 4-legged league, many researchers have worked on increasing the walking speed of Sony AIBO robots. Recently, the effort shifted from developing a quick gait to developing a gait that also provides a stable sensing platform. To date, however, optimization of both velocity and camera stability has only been performed using a single fitness function that incorporates the two objectives with a weighting that defines the desired tradeoff between them. The true nature of this tradeoff is not understood because the Pareto front has never been charted, so this a priori decision is uninformed. This project applies the Nondominated Sorting Genetic Algorithm-II (NSGA-II) to find a Pareto set of fast, stable gait parameters. This allows a user to select the best tradeoff between balance and speed for a given application. Three fitness functions are defined: one speed measure and two stability measures. A plot of evolved gaits shows a Pareto front that indicates speed and stability are indeed conflicting goals. Interestingly, the results also show that tradeoffs exist between different measures of stability.
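
The notion of a Pareto set used in this abstract can be illustrated with a small non-dominated filter. This sketch is not NSGA-II itself (it omits ranking into successive fronts, crowding distance and the genetic operators); it only extracts the first non-dominated front. The gait scores are hypothetical (speed, stability) pairs, and both objectives are assumed to be maximized.

```python
def dominates(a, b):
    """True if `a` is at least as good as `b` in every objective and strictly
    better in at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(points):
    """Return the points that no other point dominates, in input order."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical gait evaluations: (speed, stability), higher is better for both.
gaits = [(0.30, 0.50), (0.25, 0.80), (0.20, 0.70), (0.35, 0.40), (0.30, 0.45)]
front = pareto_front(gaits)
```

Here the gaits (0.20, 0.70) and (0.30, 0.45) drop out because another gait is at least as good in both objectives, while the three remaining gaits each trade speed against stability.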

    Measuring the generative power of an organisational routine with design theories: the case of design thinking in a large firm

    This article studies how a large firm uses Design Thinking (DT) as a core process in a specific design and development team whose mission is to bridge the gap between unidentified market needs and the research and development efforts of business units. We analyse two cases in which new concepts were developed and promoted to business units for implementation by following the DT methodology. Our study shows that the DT routine has some generative power for exploring the user perspective, yet it appears uncontrolled when it comes to generating a wider variety of ideas and knowledge that challenge the ontology of the design ecosystem, which is omitted and made invariant by the user focus. Hence, it faces difficulties in engaging with stakeholders and other organisational routines for enhanced creativity and organisational change.

    A Unified Framework for Solving Multiagent Task Assignment Problems

    Multiagent task assignment problem descriptors do not fully represent the complex interactions in a multiagent domain, and algorithmic solutions vary widely depending on how the domain is represented. This issue is compounded as related research fields contain descriptors that similarly describe multiagent task assignment problems, including complex domain interactions, but generally do not provide the mechanisms needed to solve the multiagent aspect of task assignment. This research presents a unified approach to representing and solving the multiagent task assignment problem for complex problem domains. Ideas central to multiagent task allocation, project scheduling, constraint satisfaction, and coalition formation are combined to form the basis of the constrained multiagent task scheduling (CMTS) problem. Basic analysis reveals the exponential size of the solution space for a CMTS problem, approximated by O(2n(m+n)) based on the number of agents and tasks involved in a problem. The shape of the solution space is shown to contain numerous discontinuous regions due to the complexities involved in relational constraints defined between agents and tasks. The CMTS descriptor represents a wide range of classical and modern problems, such as job shop scheduling, the traveling salesman problem, vehicle routing, and cooperative multi-object tracking. Problems using the CMTS representation are solvable by a suite of algorithms, with varying degrees of suitability. Solution generating methods range from simple random scheduling to state-of-the-art biologically inspired approaches. Techniques from classical task assignment solvers are extended to handle multiagent task problems where agents can also multitask. 
Additional ideas are incorporated from constraint satisfaction, project scheduling, evolutionary algorithms, dynamic coalition formation, auctioning, and behavior-based robotics to highlight how different solution generation strategies apply to the complex problem space.