Algorithmic Programming Language Identification
Motivated by the amount of code that goes unidentified on the web, we
introduce a practical method for algorithmically identifying the programming
language of source code. Our work is based on supervised learning and
intelligent statistical features. We also explored, but abandoned, a
grammatical approach. In testing, our implementation greatly outperforms that
of an existing tool that relies on a Bayesian classifier. Code is written in
Python and available under an MIT license. Comment: 11 pages. Code:
https://github.com/simon-weber/Programming-Language-Identificatio
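The abstract above does not say which features or learner the authors use, so the following is only a minimal sketch of the general idea of supervised identification from statistical features. It assumes token-frequency profiles compared by cosine similarity, which is a stand-in technique and not the paper's method; the names `features`, `train`, `cosine`, and `identify` are hypothetical.

```python
import math
import re
from collections import Counter

# Assumption: a token is a run of letters/underscores or one punctuation character.
# Digits and whitespace are ignored.
TOKEN = re.compile(r"[A-Za-z_]+|[^\sA-Za-z_0-9]")


def features(source: str) -> Counter:
    """Count how often each token occurs in a piece of source code."""
    return Counter(TOKEN.findall(source))


def train(samples):
    """Build one summed token-count profile per language.

    `samples` is an iterable of (language, source) pairs with known labels.
    """
    profiles = {}
    for language, source in samples:
        profiles.setdefault(language, Counter()).update(features(source))
    return profiles


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items() if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def identify(profiles, source: str) -> str:
    """Return the language whose profile is most similar to `source`."""
    feats = features(source)
    return max(profiles, key=lambda language: cosine(profiles[language], feats))
```

Given a handful of labelled snippets, `train` yields the profiles and `identify` picks the closest one. A realistic system would need far more training data and richer features than raw token counts.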
The statistical mechanics of multi-index matching problems with site disorder
We study the statistical mechanics of multi-index matching problems where the
quenched disorder is a geometric site disorder rather than a link disorder. A
recently developed functional formalism is exploited which yields exact results
in the finite temperature thermodynamic limit. Particular attention is paid to
the zero temperature limit of maximal matching problems where the method allows
us to obtain the average value of the optimal match and also sheds light on the
algorithmic heuristics leading to that optimal match. Comment: 11 pages, 11 figures, RevTe
O.R., Statistics, A.I. - the potential for interdisciplinary progress
This paper examines the need for O.R. workers to become more involved in the development of A.I. A brief outline of A.I. is provided noting problems, techniques and objectives similar to those found in O.R. This outline gives an indication of how interdisciplinary development might proceed and indicates the direction in which O.R. training should be progressin
Designing optimal- and fast-on-average pattern matching algorithms
Given a pattern $w$ and a text $t$, the speed of a pattern matching algorithm
over $t$ with regard to $w$ is the ratio of the length of $t$ to the number of
text accesses performed to search $w$ into $t$. We first propose a general
method for computing the limit of the expected speed of pattern matching
algorithms, with regard to $w$, over iid texts. Next, we show how to determine
the greatest speed which can be achieved among a large class of algorithms,
together with an algorithm achieving this speed. Since the complexity of this
determination makes it impossible to deal with patterns of length greater than
4, we propose a polynomial heuristic. Finally, our approaches are compared with
9 pre-existing pattern matching algorithms from both a theoretical and a
practical point of view, i.e. both in terms of limit expected speed on iid
texts, and in terms of observed average speed on real data. In all cases, the
pre-existing algorithms are outperformed.
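To make the speed measure defined in this abstract concrete, here is a minimal sketch that counts text accesses for two textbook searchers, a naive left-to-right scan and Horspool's algorithm, and returns the text length divided by the number of accesses. Neither is the optimal algorithm or the heuristic proposed in the paper; the function names and the access-counting convention (one access per character read from the text) are assumptions made for illustration.

```python
def _speed(text_length: int, accesses: int) -> float:
    """Speed = length of the text / number of text accesses."""
    if accesses == 0:
        raise ValueError("no text access was performed (pattern longer than text?)")
    return text_length / accesses


def naive_speed(pattern: str, text: str) -> float:
    """Speed of the naive search: try every window, compare left to right."""
    m, n = len(pattern), len(text)
    accesses = 0
    for start in range(n - m + 1):
        for offset in range(m):
            accesses += 1  # one read of text[start + offset]
            if text[start + offset] != pattern[offset]:
                break
    return _speed(n, accesses)


def horspool_speed(pattern: str, text: str) -> float:
    """Speed of Horspool's search: compare right to left, shift on the last window character."""
    m, n = len(pattern), len(text)
    # Shift for each character of the pattern except its last position.
    shift = {char: m - 1 - i for i, char in enumerate(pattern[:-1])}
    accesses = 0
    pos = 0
    while pos + m <= n:
        last = text[pos + m - 1]
        accesses += 1  # one read of the last character of the window
        if last == pattern[-1]:
            for offset in range(m - 2, -1, -1):
                accesses += 1  # one read of text[pos + offset]
                if text[pos + offset] != pattern[offset]:
                    break
        pos += shift.get(last, m)  # characters absent from the table skip the whole window
    return _speed(n, accesses)
```

For example, searching `"abc"` in `"xxxxxxxxx"` costs the naive scan 7 accesses (speed 9/7) but Horspool only 3 (speed 3.0), which illustrates why a speed above 1 is possible when an algorithm skips text positions.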
Code Generation = A* + BURS
A system called BURS that is based on term rewrite systems and a search algorithm A* are combined to produce a code generator that generates optimal code. The theory underlying BURS is re-developed, formalised and explained in this work. The search algorithm uses a cost heuristic that is derived from the term rewrite system to direct the search. The advantage of using a search algorithm is that we need to compute only those costs that may be part of an optimal rewrite sequence