MARKUP OPTIMISATION BY PROBABILISTIC PARSING: LALR(5000000), THE FIRST PRIZE WINNER IN THE 2001 ICFP PROGRAMMING CONTEST (DRAFT)

By Chung-chieh Shan and Dylan Thurston

Abstract

[…] will become clear below. Our team is named “Haskell Carrots”, for reasons that will not become clear below. Our entry won the first prize. The challenge task of the contest was to write an optimisation program for a markup language called “SML/NG”. Given an SML/NG document as input and a time limit as a parameter, the program has to produce a semantically equivalent document that is as small as possible. We describe this task in more detail in §1. Our approach was probably quite different from that of many groups: we spent a day and a half thinking about the algorithm before writing any code. The key insight was to think of the task as an optimal parsing problem. This observation was made relatively quickly; we explain it below in §2. Having made the observation, we then spent some time reading up on the standard methods for context-free parsing. We decided to use CYK parsing, a dynamic programming algorithm. We searched the literature for techniques to improve parsing performance, and specialised the general algorithms we learned about to the specific situation of SML/NG. Our program had to use available time well without exceeding it; this called for an incrementa[…]

Year: 2012
OAI identifier: oai:CiteSeerX.psu:10.1.1.215.5098
Provided by: CiteSeerX