
    Low Emission Building Control with Zero-Shot Reinforcement Learning

    Heating and cooling systems in buildings account for 31% of global energy use, much of which is regulated by Rule Based Controllers (RBCs) that neither maximise energy efficiency nor minimise emissions by interacting optimally with the grid. Control via Reinforcement Learning (RL) has been shown to significantly improve building energy efficiency, but existing solutions require access to building-specific simulators or data that cannot be expected for every building in the world. In response, we show it is possible to obtain emission-reducing policies without such knowledge a priori, a paradigm we call zero-shot building control. We combine ideas from system identification and model-based RL to create PEARL (Probabilistic Emission-Abating Reinforcement Learning) and show that a short period of active exploration is all that is required to build a performant model. In experiments across three varied building energy simulations, we show PEARL outperforms an existing RBC in one case and popular RL baselines in all cases, reducing building emissions by as much as 31% whilst maintaining thermal comfort. Our source code is available online via https://enjeeneer.io/projects/pearl/. Comment: Accepted at AAAI 2023.

    MDP Homomorphic Networks: Group Symmetries in Reinforcement Learning

    This paper introduces MDP homomorphic networks for deep reinforcement learning. MDP homomorphic networks are neural networks that are equivariant under symmetries in the joint state-action space of an MDP. Current approaches to deep reinforcement learning do not usually exploit knowledge about such structure. By building this prior knowledge into policy and value networks using an equivariance constraint, we can reduce the size of the solution space. We specifically focus on group-structured symmetries (invertible transformations). Additionally, we introduce an easy method for constructing equivariant network layers numerically, so the system designer need not solve the constraints by hand, as is typically done. We construct MDP homomorphic MLPs and CNNs that are equivariant under either a group of reflections or a group of rotations. We show that such networks converge faster than unstructured baselines on CartPole, a grid world and Pong.
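
    The equivariance constraint described in this abstract can be illustrated with a small numerical sketch. It is not taken from the paper and is not necessarily the authors' construction: it uses plain group averaging, a standard way to project an arbitrary weight matrix onto the equivariant ones. It assumes a CartPole-like reflection symmetry in which mirroring negates the 4-dimensional state and swaps the two actions. All names (`symmetrize`, `IN_REPS`, `OUT_REPS`, `SWAP`) are hypothetical.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(m):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*m)]


def identity(n, scale=1.0):
    """Return scale times the n x n identity matrix."""
    return [[scale if i == j else 0.0 for j in range(n)] for i in range(n)]


def apply(m, v):
    """Apply matrix m to vector v."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m]


def symmetrize(w, in_reps, out_reps):
    """Average K_g^T W L_g over the group.

    Assumes the output representations K_g are orthogonal, so that the
    transpose equals the inverse (true for permutation matrices).
    """
    n_out, n_in = len(w), len(w[0])
    total = [[0.0] * n_in for _ in range(n_out)]
    for l_g, k_g in zip(in_reps, out_reps):
        term = matmul(matmul(transpose(k_g), w), l_g)
        for i in range(n_out):
            for j in range(n_in):
                total[i][j] += term[i][j] / len(in_reps)
    return total


# Assumed two-element group {identity, reflection}:
# the reflection negates the state and swaps the two actions.
IN_REPS = [identity(4), identity(4, -1.0)]
SWAP = [[0.0, 1.0], [1.0, 0.0]]
OUT_REPS = [identity(2), SWAP]

rng = random.Random(0)
w = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(2)]
w_eq = symmetrize(w, IN_REPS, OUT_REPS)

state = [0.1, -0.2, 0.03, 0.4]          # arbitrary illustrative state
mirrored = apply(IN_REPS[1], state)     # reflected state
logits = apply(w_eq, state)
logits_mirrored = apply(w_eq, mirrored)
```

    Under these assumptions, feeding the mirrored state through the averaged layer gives the original action logits with their order swapped, which is the equivariance property the abstract refers to. The unconstrained matrix `w` has no such guarantee.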

    A Case Study Investigating Rule Based Design in an Industrial Setting

    This thesis presents a case study on the implementation of a rule based design (RBD) process for an engineer-to-order (ETO) company. The time taken for programming and the challenges associated with this process are documented in order to understand the benefits and limitations of RBD. These times are obtained while developing RBD programs for grid assemblies of bottle packaging machines that are designed and manufactured by Hartness International (HI). In this project, commercially available computer-aided design (CAD) and RBD software are integrated to capture the design and manufacturing knowledge used to automate the grid design process of HI. The stages involved in RBD automation are identified as CAD modeling, knowledge acquisition, capturing parameters, RBD programming, debugging and testing, and production deployment. The stages and associated times in the RBD program development process are recorded for eighteen different grid products. Empirical models are developed to predict the development times of RBD programs, specifically enabling HI to estimate its return on investment. The models are demonstrated for an additional grid product, where the predicted time and the actual RBD program time fall within 20% of each other. This builds confidence in the accuracy of the models. Modeling guidelines for preparing CAD models are also presented to help in RBD program development. An important observation from this case study is that a majority of the time is spent capturing information about the product during the knowledge acquisition stage, where the programmer's development of an RBD program is dependent upon the designer's product knowledge. Finally, refining these models to include other factors, such as the time for building CAD models and the programmer's experience with the RBD software (learning curve), and extending these models to other product domains are identified as possible areas of future work.

    The Knowledge Life Cycle for e-learning

    In this paper, we examine the semantic aspects of e-learning from both pedagogical and technological points of view. We suggest that if semantics are to fulfil their potential in the learning domain, then a paradigm shift in perspective is necessary, from information-based content delivery to knowledge-based collaborative learning services. We propose a semantics-driven Knowledge Life Cycle that characterises the key phases in managing semantics and knowledge, show how this can be applied to the learning domain, and demonstrate the value of semantics via an example of knowledge reuse in learning assessment management.

    Technologies may help thinking

    The objective of teachers’ personal and professional development is an excellent reason to reflect upon innovation issues in education and a rare opportunity to implement the use of portfolios in teaching practices. The most recent developments in digital technologies allow new forms of organisation and knowledge building that reflect the diversity and multiplicity of purposes, both alone and as a group. From the reflection on these two aspects comes the present proposal for the analysis and evaluation of the technologies which may easily be accessed by the educational community and may be used in the process of building electronic portfolios. As far as teachers are concerned, the use of portfolios can become a powerful means of helping to change educational practices (Cardoso, Peixoto, Serrano and Moreira, 1996) if it is adopted as a metacognitive and reflective strategy about teaching (Galvão, 2005). However, there is a lack of information about what portfolios are, which technologies can be used, how they are prepared and how to take advantage of them. All these questions point to the need for specific training in this field. Accordingly, this chapter especially aims at helping teachers in that process, providing a grid for analysing and evaluating technologies based on their pedagogical potential for building digital portfolios.