17,217 research outputs found

    H-TSP: Hierarchically Solving the Large-Scale Travelling Salesman Problem

    Full text link
    We propose an end-to-end learning framework based on hierarchical reinforcement learning, called H-TSP, for addressing the large-scale Travelling Salesman Problem (TSP). The proposed H-TSP constructs a solution of a TSP instance starting from the scratch relying on two components: the upper-level policy chooses a small subset of nodes (up to 200 in our experiment) from all nodes that are to be traversed, while the lower-level policy takes the chosen nodes as input and outputs a tour connecting them to the existing partial route (initially only containing the depot). After jointly training the upper-level and lower-level policies, our approach can directly generate solutions for the given TSP instances without relying on any time-consuming search procedures. To demonstrate effectiveness of the proposed approach, we have conducted extensive experiments on randomly generated TSP instances with different numbers of nodes. We show that H-TSP can achieve comparable results (gap 3.42% vs. 7.32%) as SOTA search-based approaches, and more importantly, we reduce the time consumption up to two orders of magnitude (3.32s vs. 395.85s). To the best of our knowledge, H-TSP is the first end-to-end deep reinforcement learning approach that can scale to TSP instances of up to 10000 nodes. Although there are still gaps to SOTA results with respect to solution quality, we believe that H-TSP will be useful for practical applications, particularly those that are time-sensitive e.g., on-call routing and ride hailing service.Comment: Accepted by AAAI 2023, February 202

    Bayesian networks for disease diagnosis: What are they, who has used them and how?

    Full text link
    A Bayesian network (BN) is a probabilistic graph based on Bayes' theorem, used to show dependencies or cause-and-effect relationships between variables. They are widely applied in diagnostic processes since they allow the incorporation of medical knowledge to the model while expressing uncertainty in terms of probability. This systematic review presents the state of the art in the applications of BNs in medicine in general and in the diagnosis and prognosis of diseases in particular. Indexed articles from the last 40 years were included. The studies generally used the typical measures of diagnostic and prognostic accuracy: sensitivity, specificity, accuracy, precision, and the area under the ROC curve. Overall, we found that disease diagnosis and prognosis based on BNs can be successfully used to model complex medical problems that require reasoning under conditions of uncertainty.Comment: 22 pages, 5 figures, 1 table, Student PhD first pape

    Self-Supervised Learning to Prove Equivalence Between Straight-Line Programs via Rewrite Rules

    Full text link
    We target the problem of automatically synthesizing proofs of semantic equivalence between two programs made of sequences of statements. We represent programs using abstract syntax trees (AST), where a given set of semantics-preserving rewrite rules can be applied on a specific AST pattern to generate a transformed and semantically equivalent program. In our system, two programs are equivalent if there exists a sequence of application of these rewrite rules that leads to rewriting one program into the other. We propose a neural network architecture based on a transformer model to generate proofs of equivalence between program pairs. The system outputs a sequence of rewrites, and the validity of the sequence is simply checked by verifying it can be applied. If no valid sequence is produced by the neural network, the system reports the programs as non-equivalent, ensuring by design no programs may be incorrectly reported as equivalent. Our system is fully implemented for a given grammar which can represent straight-line programs with function calls and multiple types. To efficiently train the system to generate such sequences, we develop an original incremental training technique, named self-supervised sample selection. We extensively study the effectiveness of this novel training approach on proofs of increasing complexity and length. Our system, S4Eq, achieves 97% proof success on a curated dataset of 10,000 pairs of equivalent programsComment: 30 pages including appendi

    Offline and Online Models for Learning Pairwise Relations in Data

    Get PDF
    Pairwise relations between data points are essential for numerous machine learning algorithms. Many representation learning methods consider pairwise relations to identify the latent features and patterns in the data. This thesis, investigates learning of pairwise relations from two different perspectives: offline learning and online learning.The first part of the thesis focuses on offline learning by starting with an investigation of the performance modeling of a synchronization method in concurrent programming using a Markov chain whose state transition matrix models pairwise relations between involved cores in a computer process.Then the thesis focuses on a particular pairwise distance measure, the minimax distance, and explores memory-efficient approaches to computing this distance by proposing a hierarchical representation of the data with a linear memory requirement with respect to the number of data points, from which the exact pairwise minimax distances can be derived in a memory-efficient manner. Then, a memory-efficient sampling method is proposed that follows the aforementioned hierarchical representation of the data and samples the data points in a way that the minimax distances between all data points are maximally preserved. Finally, the thesis proposes a practical non-parametric clustering of vehicle motion trajectories to annotate traffic scenarios based on transitive relations between trajectories in an embedded space.The second part of the thesis takes an online learning perspective, and starts by presenting an online learning method for identifying bottlenecks in a road network by extracting the minimax path, where bottlenecks are considered as road segments with the highest cost, e.g., in the sense of travel time. Inspired by real-world road networks, the thesis assumes a stochastic traffic environment in which the road-specific probability distribution of travel time is unknown. Therefore, it needs to learn the parameters of the probability distribution through observations by modeling the bottleneck identification task as a combinatorial semi-bandit problem. The proposed approach takes into account the prior knowledge and follows a Bayesian approach to update the parameters. Moreover, it develops a combinatorial variant of Thompson Sampling and derives an upper bound for the corresponding Bayesian regret. Furthermore, the thesis proposes an approximate algorithm to address the respective computational intractability issue.Finally, the thesis considers contextual information of road network segments by extending the proposed model to a contextual combinatorial semi-bandit framework and investigates and develops various algorithms for this contextual combinatorial setting

    Identifying Student Profiles Within Online Judge Systems Using Explainable Artificial Intelligence

    Get PDF
    Online Judge (OJ) systems are typically considered within programming-related courses as they yield fast and objective assessments of the code developed by the students. Such an evaluation generally provides a single decision based on a rubric, most commonly whether the submission successfully accomplished the assignment. Nevertheless, since in an educational context such information may be deemed insufficient, it would be beneficial for both the student and the instructor to receive additional feedback about the overall development of the task. This work aims to tackle this limitation by considering the further exploitation of the information gathered by the OJ and automatically inferring feedback for both the student and the instructor. More precisely, we consider the use of learning-based schemes—particularly, Multi-Instance Learning and classical Machine Learning formulations—to model student behaviour. Besides, Explainable Artificial Intelligence is contemplated to provide human-understandable feedback. The proposal has been evaluated considering a case of study comprising 2,500 submissions from roughly 90 different students from a programming-related course in a Computer Science degree. The results obtained validate the proposal: the model is capable of significantly predicting the user outcome (either passing or failing the assignment) solely based on the behavioural pattern inferred by the submissions provided to the OJ. Moreover, the proposal is able to identify prone-to-fail student groups and profiles as well as other relevant information, which eventually serves as feedback to both the student and the instructor.This work has been partially funded by the “Programa Redes-I3CE de investigacion en docencia universitaria del Instituto de Ciencias de la Educacion (REDES-I3CE-2020-5069)” of the University of Alicante. The third author is supported by grant APOSTD/2020/256 from “Programa I+D+I de la Generalitat Valenciana”

    Neural Architecture Search: Insights from 1000 Papers

    Full text link
    In the past decade, advances in deep learning have resulted in breakthroughs in a variety of areas, including computer vision, natural language understanding, speech recognition, and reinforcement learning. Specialized, high-performing neural architectures are crucial to the success of deep learning in these areas. Neural architecture search (NAS), the process of automating the design of neural architectures for a given task, is an inevitable next step in automating machine learning and has already outpaced the best human-designed architectures on many tasks. In the past few years, research in NAS has been progressing rapidly, with over 1000 papers released since 2020 (Deng and Lindauer, 2021). In this survey, we provide an organized and comprehensive guide to neural architecture search. We give a taxonomy of search spaces, algorithms, and speedup techniques, and we discuss resources such as benchmarks, best practices, other surveys, and open-source libraries

    On the existence of highly organized communities in networks of locally interacting agents

    Full text link
    In this paper we investigate phenomena of spontaneous emergence or purposeful formation of highly organized structures in networks of related agents. We show that the formation of large organized structures requires exponentially large, in the size of the structures, networks. Our approach is based on Kolmogorov, or descriptional, complexity of networks viewed as finite size strings. We apply this approach to the study of the emergence or formation of simple organized, hierarchical, structures based on Sierpinski Graphs and we prove a Ramsey type theorem that bounds the number of vertices in Kolmogorov random graphs that contain Sierpinski Graphs as subgraphs. Moreover, we show that Sierpinski Graphs encompass close-knit relationships among their vertices that facilitate fast spread and learning of information when agents in their vertices are engaged in pairwise interactions modelled as two person games. Finally, we generalize our findings for any organized structure with succinct representations. Our work can be deployed, in particular, to study problems related to the security of networks by identifying conditions which enable or forbid the formation of sufficiently large insider subnetworks with malicious common goal to overtake the network or cause disruption of its operation

    Continuous eigenfunctions of the transfer operator for the Dyson model

    Full text link
    In this article we prove that there exists a continuous eigenfunction for the transfer operator corresponding to potentials for the classical Dyson model in the subcritical regime for which the parameter α\alpha is greater than 3/23/2, and we conjecture that this value is sharp. This is a significant improvement on previous results where the existence of a continuous eigenfunction of the transfer operator was only established for general potentials satisfying summable variations, which would correspond to the parameter range α>2\alpha > 2. Moreover, this complements as result by Bissacot, Endo, van Enter and Le Ny \cite{vanenter}, who showed that there is no continuous eigenfunction at low temperatures. Our approach to obtaining these new results involves a novel approach based on random cluster models

    Ideograph: A Language for Expressing and Manipulating Structured Data

    Full text link
    We introduce Ideograph, a language for expressing and manipulating structured data. Its types describe kinds of structures, such as natural numbers, lists, multisets, binary trees, syntax trees with variable binding, directed multigraphs, and relational databases. Fully normalized terms of a type correspond exactly to members of the structure, analogous to a Church-encoding. Moreover, definable operations over these structures are guaranteed to respect the structures' equivalences. In this paper, we give the syntax and semantics of the non-polymorphic subset of Ideograph, and we demonstrate how it can represent and manipulate several interesting structures.Comment: In Proceedings TERMGRAPH 2022, arXiv:2303.1421

    Satisfiability of Non-Linear Transcendental Arithmetic as a Certificate Search Problem

    Full text link
    For typical first-order logical theories, satisfying assignments have a straightforward finite representation that can directly serve as a certificate that a given assignment satisfies the given formula. For non-linear real arithmetic with transcendental functions, however, no general finite representation of satisfying assignments is available. Hence, in this paper, we introduce a different form of satisfiability certificate for this theory, formulate the satisfiability verification problem as the problem of searching for such a certificate, and show how to perform this search in a systematic fashion. This does not only ease the independent verification of results, but also allows the systematic design of new, efficient search techniques. Computational experiments document that the resulting method is able to prove satisfiability of a substantially higher number of benchmark problems than existing methods
    • …
    corecore