
    Discovering Restricted Regular Expressions with Interleaving

    Discovering a concise schema from given XML documents is an important problem in XML applications. In this paper, we focus on the problem of learning an unordered schema from a given set of XML examples, which is in fact the problem of learning a restricted regular expression with interleaving from positive example strings. Schemas with interleaving can express meaningful information that cannot be discovered by previous inference techniques. Moreover, inferring the minimal schema with interleaving is challenging: we show that the problem of finding a minimal schema with interleaving is NP-hard. We therefore develop an approximation algorithm and a heuristic solution that tackle the problem using techniques different from known inference algorithms. We conduct experiments on real-world data sets to demonstrate the effectiveness of our approaches; our heuristic algorithm is shown to produce results that are very close to optimal. Comment: 12 pages.
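    As a rough illustration of the kind of inference involved (not the paper's approximation or heuristic algorithm), the following Python sketch learns an interleaving of symbol chains from positive example strings: it keeps the pairwise orderings that hold in every example and greedily covers the alphabet with ordered chains, combined with an interleaving operator `&`. The function name and output notation are hypothetical.

```python
def learn_interleaving(examples):
    """Toy inference of an interleaving expression from positive example strings.

    Each example is a sequence of symbols (e.g. child-element names). We keep the
    pairwise order constraints that hold in every example and greedily cover the
    alphabet with chains that respect them. This is only a sketch of the general
    idea, not the algorithms of the paper.
    """
    symbols = sorted({s for ex in examples for s in ex})

    def always_before(a, b):
        # 'a precedes b' if a occurs before b in every example containing both.
        for ex in examples:
            if a in ex and b in ex and ex.index(a) > ex.index(b):
                return False
        return True

    remaining = set(symbols)
    chains = []
    while remaining:
        # Greedily grow one totally ordered chain from the remaining symbols.
        chain = []
        for s in sorted(remaining):
            if all(always_before(c, s) for c in chain):
                chain.append(s)
        chains.append(chain)
        remaining -= set(chain)
    # Render as an interleaving (&) of sequences.
    return " & ".join("".join(c) for c in chains)

if __name__ == "__main__":
    # 'a' always precedes 'b', while 'c' interleaves freely with them.
    print(learn_interleaving(["abc", "acb", "cab"]))   # -> 'ab & c'
```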

    Feedback Generation for Performance Problems in Introductory Programming Assignments

    Providing feedback on programming assignments manually is a tedious, error-prone, and time-consuming task. In this paper, we motivate and address the problem of generating feedback on performance aspects of introductory programming assignments. We studied a large number of functionally correct student solutions to introductory programming assignments and observed: (1) there are different algorithmic strategies, with varying levels of efficiency, for solving a given problem, and these different strategies merit different feedback; (2) the same algorithmic strategy can be implemented in countless different ways that are not relevant for reporting feedback on the student program. We propose a lightweight programming language extension that allows a teacher to define an algorithmic strategy by specifying certain key values that should occur during the execution of an implementation. We describe a dynamic-analysis-based approach to test whether a student's program matches a teacher's specification. Our experimental results illustrate the effectiveness of both our specification language and our dynamic analysis. On one of our benchmarks, consisting of 2316 functionally correct implementations for 3 programming problems, we identified 16 strategies that we were able to describe using our specification language (in 95 minutes, after inspecting 66 implementations, i.e., around 3%). Our dynamic analysis correctly matched each implementation with its corresponding specification, thereby automatically producing the intended feedback. Comment: Tech report/extended version of the FSE 2014 paper.
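    A minimal sketch of the underlying idea, checking teacher-specified key values against a dynamic execution trace, might look as follows in Python; the observe/matches API and the instrumentation of the student code are invented for illustration and are not the paper's language extension.

```python
# Hypothetical sketch: a teacher specifies key values that must occur during
# execution; a dynamic check records observed values and matches them in order.

TRACE = []

def observe(label, value):
    """Record a key value produced while the student's program runs."""
    TRACE.append((label, value))
    return value

def matches(trace, spec):
    """Check that the spec's key values occur in the trace, in the given order."""
    it = iter(trace)
    return all(any(event == wanted for event in it) for wanted in spec)

# Student solution to "sum of 1..n", instrumented with observations.
def student_sum(n):
    total = 0
    for i in range(1, n + 1):
        total = observe("partial_sum", total + i)
    return total

# Teacher's specification of the iterative strategy for n = 4:
# the partial sums 1, 3, 6, 10 should appear during execution.
spec = [("partial_sum", k * (k + 1) // 2) for k in range(1, 5)]

TRACE.clear()
student_sum(4)
print("matches iterative strategy:", matches(TRACE, spec))  # True
```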

    The Complexity of SORE-definability Problems

    Single occurrence regular expressions (SOREs) are a special kind of deterministic regular expressions, which are extensively used in the schema languages DTD and XSD for XML documents. In this paper, motivated by the simplification of XML schemas, we consider the SORE-definability problem: given a regular expression, decide whether it has an equivalent SORE. We investigate the complexity of the SORE-definability problem extensively: we consider both (standard) regular expressions and regular expressions with counting, and distinguish between alphabets of size at least two and unary alphabets. In all cases, we obtain tight complexity bounds. In addition, we consider another variant of this problem, the bounded SORE-definability problem, which is to decide, given a regular expression E and a number M (encoded in unary or binary), whether there is a SORE that is equivalent to E on the set of words of length at most M. We show that in several cases there is an exponential decrease in complexity when switching from the SORE-definability problem to its bounded variant.
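    For intuition about the restriction itself (not the definability decision procedure studied in the paper), one can check syntactically whether a given expression is single-occurrence, i.e., whether every alphabet symbol occurs at most once in it. A small Python sketch, treating letters as the alphabet:

```python
import re
from collections import Counter

def is_sore(expr):
    """Return True if every alphabet symbol occurs at most once in expr.

    This only tests the *syntactic* single-occurrence property; deciding whether
    an arbitrary regular expression has an *equivalent* SORE is the much harder
    definability problem studied in the paper.
    """
    symbols = re.findall(r"[A-Za-z]", expr)   # letters are the alphabet symbols
    return all(count == 1 for count in Counter(symbols).values())

print(is_sore("(a|b)c*"))   # True: each of a, b, c occurs once
print(is_sore("a(b|a)*"))   # False: 'a' occurs twice
```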

    Regular Boardgames

    We propose a new General Game Playing (GGP) language called Regular Boardgames (RBG), which is based on the theory of regular languages. The objective of RBG is to combine, in one GGP formalism, key properties such as expressiveness, efficiency, and naturalness of description, compensating for certain drawbacks of the existing languages. This often makes RBG more suitable for various research and practical developments in GGP. While intended mostly for describing board games, RBG is universal for the class of all finite deterministic turn-based games with perfect information. We establish the foundations of RBG and analyze it theoretically and experimentally, focusing on the efficiency of reasoning. Regular Boardgames is the first GGP language that allows efficient encoding and playing of games with complex rules and with large branching factors (e.g., amazons, arimaa, large chess variants, go, international checkers, paper soccer). Comment: AAAI 2019.
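    As a loose illustration of the "game rules as regular languages" idea (with a made-up single-character action alphabet, not actual RBG syntax), a legal-move pattern can be written as a regular expression over primitive board actions, and candidate action sequences checked against it:

```python
import re

# Hypothetical action alphabet: 'u' = step up, 'e' = target square is empty,
# 'x' = capture an enemy piece. A toy pawn rule: move one step up onto an empty
# square, or one step up with a capture, written as a regular expression.
PAWN_MOVE = re.compile(r"ue|ux")

def is_legal(action_sequence):
    """Check whether a candidate sequence of primitive actions matches the rule."""
    return PAWN_MOVE.fullmatch(action_sequence) is not None

print(is_legal("ue"))   # True: step up onto an empty square
print(is_legal("ux"))   # True: step up with capture
print(is_legal("uu"))   # False: two steps are not allowed by this rule
```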

    Minimal model of associative learning for cross-situational lexicon acquisition

    One explanation for the acquisition of word-object mappings is associative learning in a cross-situational scenario. Here we present analytical results on the performance of a simple associative learning algorithm for acquiring a one-to-one mapping between $N$ objects and $N$ words based solely on the co-occurrence between objects and words. In particular, a learning trial in our learning scenario consists of the presentation of $C + 1 < N$ objects together with a target word, which refers to one of the objects in the context. We find that the learning times are distributed exponentially and that the learning rates are given by $\ln\left[\frac{N(N-1)}{C + (N-1)^{2}}\right]$ in the case where the $N$ target words are sampled randomly, and by $\frac{1}{N} \ln\left[\frac{N-1}{C}\right]$ in the case where they follow a deterministic presentation sequence. This learning performance is much superior to that exhibited by humans and by more realistic learning algorithms in cross-situational experiments. We show that the introduction of discrimination limitations using Weber's law, together with forgetting, reduces the performance of the associative algorithm to the human level.
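    A simple simulation of the co-occurrence counting scenario described above (a sketch of the setting, not the analytical calculation of the paper) can be written in Python; here a word counts as learned once its true referent has the strictly largest co-occurrence count, and all parameter names are illustrative.

```python
import random

def simulate(N=10, C=2, trials=2000, seed=0):
    """Simulate cross-situational learning of a one-to-one word-object mapping.

    On each trial a random target word w is presented together with its referent
    plus C confounding objects; co-occurrence counts are updated. Returns the
    first trial at which every word's referent has the strictly largest count,
    or None if that never happens within `trials`.
    """
    rng = random.Random(seed)
    counts = [[0] * N for _ in range(N)]          # counts[word][object]
    for t in range(1, trials + 1):
        w = rng.randrange(N)                      # random target word
        context = {w} | set(rng.sample([o for o in range(N) if o != w], C))
        for obj in context:                       # referent of word i is object i
            counts[w][obj] += 1
        learned = all(
            counts[i][i] > max(c for j, c in enumerate(counts[i]) if j != i)
            for i in range(N)
        )
        if learned:
            return t
    return None

print(simulate())   # learning time for N=10 words/objects, context size C+1 = 3
```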

    Self-Optimizing and Pareto-Optimal Policies in General Environments based on Bayes-Mixtures

    The problem of making sequential decisions in unknown probabilistic environments is studied. In cycle $t$, action $y_t$ results in perception $x_t$ and reward $r_t$, where all quantities in general may depend on the complete history. The perception $x_t$ and reward $r_t$ are sampled from the (reactive) environmental probability distribution $\mu$. This very general setting includes, but is not limited to, (partially observable, $k$-th order) Markov decision processes. Sequential decision theory tells us how to act in order to maximize the total expected reward, called the value, if $\mu$ is known. Reinforcement learning is usually used if $\mu$ is unknown. In the Bayesian approach one defines a mixture distribution $\xi$ as a weighted sum of distributions $\nu \in \mathcal{M}$, where $\mathcal{M}$ is any class of distributions that includes the true environment $\mu$. We show that the Bayes-optimal policy $p^\xi$ based on the mixture $\xi$ is self-optimizing in the sense that its average value converges asymptotically, for all $\mu \in \mathcal{M}$, to the optimal value achieved by the (infeasible) Bayes-optimal policy $p^\mu$, which knows $\mu$ in advance. We show that the necessary condition that $\mathcal{M}$ admits self-optimizing policies at all is also sufficient. No other structural assumptions are made on $\mathcal{M}$. As an example application, we discuss ergodic Markov decision processes, which allow for self-optimizing policies. Furthermore, we show that $p^\xi$ is Pareto-optimal in the sense that there is no other policy yielding a higher or equal value in {\em all} environments $\nu \in \mathcal{M}$ and a strictly higher value in at least one. Comment: 15 pages.
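    In symbols (the notation here is schematic and only loosely follows the abstract; the precise value definitions are in the paper), the Bayes mixture and the self-optimizing property read:

```latex
% Bayes mixture over a class M of candidate environments (prior weights w_nu > 0):
\[
  \xi(x_{1:t}) \;=\; \sum_{\nu \in \mathcal{M}} w_\nu \, \nu(x_{1:t}),
  \qquad \sum_{\nu \in \mathcal{M}} w_\nu \le 1 .
\]
% Self-optimizingness: the gap between the average value of the informed policy
% p^mu and that of the mixture policy p^xi vanishes in every environment mu in M:
\[
  \frac{1}{m}\Bigl( V^{p^\mu}_{\mu}(1{:}m) - V^{p^\xi}_{\mu}(1{:}m) \Bigr)
  \;\xrightarrow{\;m\to\infty\;}\; 0
  \qquad \text{for all } \mu \in \mathcal{M}.
\]
```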