Oppositional Reinforcement Learning with Applications

Author: Shokri Maryam
Publication venue: University of Waterloo
Publication date: 05/09/2008

Machine intelligence techniques contribute to solving real-world problems. Reinforcement learning (RL) is one such technique, with several characteristics that make it suitable for applications in which the model of the environment is not available to the agent. In real-world applications, intelligent agents generally face a very large state space, which limits the usability of reinforcement learning. The convergence condition for reinforcement learning requires that each state-action pair be visited infinitely often, a condition that is impossible to satisfy in many practical situations. The goal of this work is to propose a class of new techniques to overcome this problem for off-policy, step-by-step (incremental), model-free reinforcement learning with discrete state and action spaces. The focus of this research is on using the design characteristics of the RL agent to improve its performance in terms of running time while maintaining an acceptable level of accuracy. One way of improving the performance of intelligent agents is to use a model of the environment. In this work, a special type of knowledge about the agent's actions is employed instead, because in many applications the model of the environment may be known only partially or not at all. The concept of opposition is employed in the framework of reinforcement learning to achieve this goal. One of the components of an RL agent is the action. For each action we define its associated opposite action. The actions and opposite actions are used within the reinforcement learning framework to update the value function, resulting in faster convergence. At the beginning of this research, the concept of opposition is incorporated into the components of reinforcement learning (states, actions, and reinforcement signal), which results in the introduction of the oppositional target domain estimation algorithm, OTE.
OTE reduces the search and navigation area and accelerates the search for a target. The OTE algorithm is limited to applications in which the model of the environment is provided to the agent. Hence, further investigation is conducted to extend the concept of opposition to model-free reinforcement learning algorithms. This extension leads to several algorithms that apply the concept of opposition to the Q(lambda) technique. The design of a reinforcement learning agent depends on the application. The emphasis of this research is on the characteristics of the actions. Hence, the primary challenge of this work is the design and incorporation of opposite actions in the framework of RL agents. In this research, three different applications, namely grid navigation, elevator control, and image thresholding, are implemented to address this challenge in different contexts. The design challenges and some solutions to overcome the problems and improve the algorithms are also investigated. The opposition-based Q(lambda) algorithms are tested on the applications mentioned earlier. The general idea behind the opposition-based Q(lambda) algorithms is that in Q-value updating, the agent updates the value of an action in a given state. Hence, if the agent knows the value of the opposite action, then instead of one value it can update two Q-values at the same time, without taking the opposite action and causing an explicit transition to the opposite state. This accelerates the learning process in general and the exploration phase in particular. Several algorithms are outlined in this work. OQ(lambda) is introduced to accelerate the Q(lambda) algorithm in discrete state spaces.
The NOQ(lambda) method is an extension of OQ(lambda) that operates in a broader range of non-deterministic environments. The update of the opposition trace in OQ(lambda) depends on the next state of the opposite action (which generally is not taken by the agent). This limits the usability of the technique to deterministic environments, because the next state must be known to the agent. NOQ(lambda) is presented to update the opposition trace without knowing the next state for the opposite action. The results show that the proposed algorithms improve running time compared to the standard Q(lambda) technique.
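The idea of updating two Q-values per step can be sketched roughly as follows. This is not the thesis's OQ(lambda) algorithm: it uses plain one-step Q-learning without eligibility or opposition traces. The corridor task, the reward values, the `opposite` definition, and all function names are assumptions made for illustration. The sketch also assumes a deterministic environment whose transition for the opposite action can be evaluated without moving the agent, which is the restriction the abstract attributes to OQ(lambda).

```python
import random

N_STATES = 7                 # corridor cells 0..6 (hypothetical toy task)
GOAL = N_STATES - 1
ACTIONS = (-1, +1)           # step left, step right
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1


def opposite(action):
    """Assumed definition: the opposite of 'left' is 'right' and vice versa."""
    return -action


def step(state, action):
    """Deterministic model of the corridor: returns (next_state, reward)."""
    nxt = min(max(state + action, 0), GOAL)
    reward = 0.0 if nxt == GOAL else -1.0
    return nxt, reward


def q_update(q, state, action, reward, nxt):
    """Standard one-step Q-learning update for a single state-action pair."""
    target = reward + GAMMA * max(q[(nxt, a)] for a in ACTIONS)
    q[(state, action)] += ALPHA * (target - q[(state, action)])


def train(episodes=200, use_opposition=True, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0
        while state != GOAL:
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward = step(state, action)
            q_update(q, state, action, reward, nxt)
            if use_opposition:
                # Second update: the opposite action is evaluated through the
                # known deterministic model, but the agent does not take it.
                opp = opposite(action)
                opp_next, opp_reward = step(state, opp)
                q_update(q, state, opp, opp_reward, opp_next)
            state = nxt
    return q


def greedy_policy(q):
    """Greedy action for every non-goal state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
```

Calling `train(use_opposition=False)` gives the ordinary single-update baseline, so the two variants can be compared on the same toy task.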

University of Waterloo's Institutional Repository

Graph kernels based on tree patterns for molecules

Authors: Mahé Pierre; Vert Jean-Philippe
Publication date: 15/09/2006

Motivated by chemical applications, we revisit and extend a family of positive definite kernels for graphs based on the detection of common subtrees, initially proposed by Ramon et al. (2003). We propose new kernels with a parameter to control the complexity of the subtrees used as features to represent the graphs. This parameter allows smooth interpolation between classical graph kernels based on the count of common walks, on the one hand, and kernels that emphasize the detection of large common subtrees, on the other hand. We also propose two modular extensions to this formulation. The first extension increases the number of subtrees that define the feature space, and the second one removes noisy features from the graph representations. We experimentally validate these new kernels on binary classification tasks that consist of discriminating toxic from non-toxic molecules with support vector machines.
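The walk-count end of that interpolation can be illustrated with a short sketch. It counts pairs of walks with identical vertex-label sequences by dynamic programming over matching vertex pairs. It is not the tree-pattern kernel proposed in the paper. The graph encoding (a label dictionary plus an adjacency dictionary) and the name `walk_kernel` are assumptions for this example.

```python
def walk_kernel(g1, g2, n_steps):
    """Count pairs of walks with n_steps edges, one walk from each graph,
    whose vertex-label sequences are identical.

    Each graph is a pair (labels, adjacency): labels maps vertex -> label,
    adjacency maps vertex -> list of neighbouring vertices.
    """
    labels1, adj1 = g1
    labels2, adj2 = g2
    # counts[(u, v)] = number of common walks of the current length that
    # start at u in g1 and at v in g2; only label-matching pairs are kept.
    counts = {
        (u, v): 1
        for u in labels1
        for v in labels2
        if labels1[u] == labels2[v]
    }
    for _ in range(n_steps):
        counts = {
            (u, v): sum(
                counts.get((u2, v2), 0)
                for u2 in adj1[u]
                for v2 in adj2[v]
            )
            for (u, v) in counts
        }
    return sum(counts.values())


# Two toy molecular graphs (heavy atoms only): C-C-O and C-O.
g_cco = ({0: "C", 1: "C", 2: "O"}, {0: [1], 1: [0, 2], 2: [1]})
g_co = ({0: "C", 1: "O"}, {0: [1], 1: [0]})
```

For example, `walk_kernel(g_cco, g_co, 1)` counts the one-edge walks the two graphs share, here the label sequences C-O and O-C.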

arXiv.org e-Print Archive

HAL-MINES ParisTech

Kolmogorov Complexity in perspective. Part II: Classification, Information Processing and Duality

Author: Ferbus-Zanda Marie
Publication date: 01/01/2010

We survey diverse approaches to the notion of information, from Shannon entropy to Kolmogorov complexity. Two of the main applications of Kolmogorov complexity are presented: randomness and classification. The survey is divided into two parts published in the same volume. Part II is dedicated to the relation between logic and information systems, within the scope of Kolmogorov algorithmic information theory. We present a recent application of Kolmogorov complexity: classification using compression, an idea with provocative implementations by authors such as Bennett, Vitanyi and Cilibrasi. This stresses how Kolmogorov complexity, besides being a foundation for randomness, is also related to classification. Another approach to classification is also considered: the so-called "Google classification". It uses another original and attractive idea, which is connected to classification using compression and to Kolmogorov complexity from a conceptual point of view. We present and unify these different approaches to classification in terms of Bottom-Up versus Top-Down operational modes, of which we point out the fundamental principles and the underlying duality. We look at the way these two dual modes are used in different approaches to information systems, particularly the relational model for databases introduced by Codd in the 1970s. This allows us to point out diverse forms of a fundamental duality. These operational modes are also reinterpreted in the context of the comprehension schema of axiomatic set theory ZF. This leads us to develop how Kolmogorov complexity is linked to intensionality, abstraction, classification and information systems. Comment: 43 pages
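Classification using compression is commonly illustrated with the normalized compression distance (NCD), in which a real compressor stands in for the uncomputable Kolmogorov complexity. The sketch below is a minimal example under that assumption. The choice of `zlib` as the compressor, the helper names, and the sample data are all illustrative and are not taken from the survey.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Length of the zlib-compressed data, used as a stand-in for complexity."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: near 0 for similar inputs, near 1 for unrelated ones."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


def nearest_label(query: bytes, labelled):
    """1-nearest-neighbour classification over (label, sample) pairs using NCD."""
    return min(labelled, key=lambda item: ncd(query, item[1]))[0]


# Hypothetical sample data: two near-identical texts and one block of noise.
english_a = b"the quick brown fox jumps over the lazy dog " * 20
english_b = b"the quick brown fox leaps over the lazy dog " * 20
rng = random.Random(0)
noise = bytes(rng.randrange(256) for _ in range(900))
```

With these samples, `nearest_label(english_b, [("text", english_a), ("noise", noise)])` picks the label whose sample compresses best together with the query.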

arXiv.org e-Print Archive

Hal-Diderot

Synchronous Online Philosophy Courses: An Experiment in Progress

Author: McDonald Fritz
Publication date: 01/01/2018

There are two main ways to teach a course online: synchronously or asynchronously. In an asynchronous course, students can log on at their convenience and do the course work. In a synchronous course, all students are required to be online at specific times, to allow for a shared course environment. In this article, the author discusses the strengths and weaknesses of synchronous online learning for the teaching of undergraduate philosophy courses. The author discusses specific strategies and technologies he uses in the teaching of online philosophy courses. In particular, the author discusses how he uses videoconferencing to create a classroom-like environment in an online class.

PhilPapers

Theoretical analysis of the electrical aspects of the basic electro-impulse problem in aircraft de-icing applications

Authors: Henderson Robert A.; Schrag Robert L.

A method of modelling a system consisting of a cylindrical coil with its axis perpendicular to a metal plate of finite thickness, and a simple electrical circuit for producing a transient current in the coil, is discussed in the context of using such a system for de-icing aircraft surfaces. A transmission line model of the coil and metal plate is developed as the heart of the system model. It is shown that this transmission line model is central to the calculation of the coil impedance, the coil current, the magnetic fields established on the surfaces of the metal plate, and the resultant total force between the coil and the plate. FORTRAN algorithms were developed for numerical calculation of each of these quantities, and the algorithms were applied to an experimental prototype system in which these quantities had been measured. Good agreement is seen between the predicted and measured results.

NASA Technical Reports Server

Another Look at the Confidence Intervals for the Noncentral T Distribution

Author: Lecoutre Bruno
Publication venue: DigitalCommons@WayneState
Publication date: 01/05/2007

An alternative approach to the computation of confidence intervals for the noncentrality parameter of the noncentral t distribution is proposed. It involves the percent points of a statistical distribution. This conceptual improvement renders the technical process for deriving the limits more comprehensible. Accurate approximations can be derived and easily used.

Crossref

Digital Commons@Wayne State University

A programmatic approach to finding document edges in images

A valid ID is very commonly required when signing up for websites, services and apps. As more and more of our daily tasks move to smartphones, users should be able to simply take a picture of their form of identification, regardless of the background and orientation. The proposed algorithm processes images coming from cameras, identifies the borders of paper documents, and then crops and straightens the selected area to fit a rectangular container. The proposed solution does not use artificial intelligence models, as it would be extremely difficult to gather a large enough dataset of valid IDs, even after considering data augmentation techniques.

Padua Thesis and Dissertation Archive

Anonymity Analysis of Cryptocurrencies

Author: Morris Liam
Publication venue: RIT Scholar Works
Publication date: 20/04/2015

Cash in the real world allows parties to exchange currency without the need to go through a central authority. One person, Alice, can simply hand cash over to another person, Bob. In this transaction the only two people who have knowledge of the exchange are Alice and Bob. Until recently there was no electronic equivalent to this exchange. In 1982 David Chaum proposed a system of anonymous electronic cash based on blind signatures, and in 1990 founded DigiCash as an electronic cash company. A few banks implemented electronic cash systems, but these banks and DigiCash ultimately went bankrupt in 1997 and 1998 despite the enthusiasm surrounding anonymous electronic cash. Between 1998 and 2008 there were no successful implementations of electronic cash that offered a decentralized, anonymous, and untraceable system. In 2008 a paper was published by Satoshi Nakamoto on the cryptocurrency known as Bitcoin. A cryptocurrency is a form of electronic cash backed by mathematical and cryptographic constructs, unlike traditional currency, which was historically backed by gold or silver. Cryptocurrencies have seen rising popularity in recent years due to their decentralized, distributed, peer-to-peer protocols. Part of this rising popularity is also attributable to the supposed anonymity of these protocols; however, because these protocols require a public transaction history and transactions are pseudonymous rather than purely anonymous, this supposed anonymity does not exist. While the systems may achieve the goal of a decentralized currency, they do not achieve the goal of untraceability. In this thesis we analyze the technical implementations of Bitcoin and other cryptocurrencies to determine the level of anonymity provided by these protocols. We also analyze proposed improvements for their feasibility.

RIT Scholar Works

Cork oak woodland land-cover types classification: a comparison between UAV sensed imagery and field survey

Authors: Cerasoli Sofia; Gómez-Candón David; Heuschmidt Florence; Silva Joao M.N.; Soares Cristina
Publication venue: Informa UK Limited
Publication date: 29/07/2021

This work assesses the use of aerial imagery for vegetation cover characterization in cork oak woodlands. The study was conducted in a cork oak woodland in central Portugal during the summer of 2017. Two supervised classification methods, pixel-based and object-based image analysis (OBIA), were tested using a high spatial resolution image mosaic. Images were captured by an unmanned aerial vehicle (UAV) equipped with a red, green, blue (RGB) camera. Four different vegetation covers were distinguished: cork oak, shrubs, grass and other (bare soil and tree shadow). Results were compared with field data obtained by the point-intercept (PI) method. The comparison reveals the reliability of aerial imagery classification methods in cork oak woodlands. Results show that cork oak was classified with an accuracy of 82.7% by the pixel-based method and 79.5% by OBIA. OBIA identified 96.7% of shrubs, whereas the pixel-based approach overestimated them by 21.7%. Grass was overestimated by 22.7% with OBIA and 12.0% with the pixel-based method. Limitations arise from using only spectral information in the visible range. Thus, further research using additional bands (vegetation indices or height information) could result in better land-cover type classification.

IRTA Pubpro