Use of Weighted Finite State Transducers in Part of Speech Tagging

Author: Dragomir R. Radev
Evelyne Tzoukermann
Publication date: 01/01/1997

This paper addresses issues in part of speech disambiguation using finite-state transducers and presents two main contributions to the field. One of them is the use of finite-state machines for part of speech tagging. Linguistic and statistical information is represented in terms of weights on transitions in weighted finite-state transducers. Another contribution is the successful combination of techniques -- linguistic and statistical -- for word disambiguation, compounded with the notion of word classes.

Comment: uses psfig, ipamac
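To illustrate the idea of carrying tagging information as weights on transitions, here is a minimal sketch of a lowest-cost path search over tag transitions. It is not the authors' system: the tag set, the `EMIT` and `TRANS` cost tables, and the `DEFAULT` penalty are all hypothetical values invented for the example, and a plain dynamic-programming search stands in for real transducer composition.

```python
# Hypothetical costs (lower is better); none of these numbers come from the paper.
EMIT = {
    "the": {"DET": 0.1},
    "can": {"NOUN": 1.2, "VERB": 0.9, "AUX": 0.4},
    "rusts": {"VERB": 0.5, "NOUN": 1.5},
}
TRANS = {
    ("<s>", "DET"): 0.2,
    ("DET", "NOUN"): 0.3, ("DET", "VERB"): 3.0, ("DET", "AUX"): 3.0,
    ("NOUN", "VERB"): 0.4, ("NOUN", "NOUN"): 1.5,
    ("VERB", "VERB"): 2.0, ("VERB", "NOUN"): 1.0,
    ("AUX", "VERB"): 0.3, ("AUX", "NOUN"): 2.0,
}
DEFAULT = 5.0  # assumed penalty for a tag pair with no listed transition


def best_tags(words):
    """Return the tag sequence with the lowest total transition + emission cost."""
    # paths maps the last tag to (cost so far, tag sequence so far)
    paths = {"<s>": (0.0, [])}
    for word in words:
        nxt = {}
        for tag, emit_cost in EMIT[word].items():
            cost, prev = min(
                (c + TRANS.get((p, tag), DEFAULT) + emit_cost, p)
                for p, (c, _) in paths.items()
            )
            nxt[tag] = (cost, paths[prev][1] + [tag])
        paths = nxt
    return min(paths.values())[1]


print(best_tags(["the", "can", "rusts"]))
```

With these toy weights the ambiguous word "can" is resolved by the cheap DET-to-NOUN transition rather than by its own cheapest emission, which is the kind of interaction between lexical and contextual weights the abstract describes.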

arXiv.org e-Print Archive

CiteSeerX

An implementation and analysis of the Abstract Syntax Notation One and the basic encoding rules

Author: James D. Harvey
Alfred C. Weaver

The details of the Abstract Syntax Notation One standard (ASN.1) and the Basic Encoding Rules standard (BER), which collectively solve the problem of data transfer across incompatible host environments, are presented, and a compiler that was built to automate their use is described. Experiences with this compiler are also discussed, providing a quantitative analysis of the performance costs associated with applying these standards. An evaluation is offered of how well suited ASN.1 and BER are to solving the common data representation problem.
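As a concrete picture of what BER does, the sketch below hand-encodes a few ASN.1 values as tag-length-value octets. It is only an illustration and is unrelated to the compiler the report describes; the helper names `ber_length`, `ber_integer`, `ber_octet_string`, and `ber_sequence` are made up for this example, and only definite-length encodings of three universal types are covered.

```python
def ber_length(n: int) -> bytes:
    """Definite length: short form below 128, long form otherwise."""
    if n < 0x80:
        return bytes([n])
    body = n.to_bytes((n.bit_length() + 7) // 8, "big")
    return bytes([0x80 | len(body)]) + body


def ber_integer(value: int) -> bytes:
    """Universal INTEGER (tag 0x02) with minimal two's-complement contents."""
    magnitude = value if value >= 0 else ~value
    size = magnitude.bit_length() // 8 + 1  # leaves room for the sign bit
    contents = value.to_bytes(size, "big", signed=True)
    return b"\x02" + ber_length(len(contents)) + contents


def ber_octet_string(data: bytes) -> bytes:
    """Universal OCTET STRING (tag 0x04), primitive form."""
    return b"\x04" + ber_length(len(data)) + data


def ber_sequence(*items: bytes) -> bytes:
    """Universal SEQUENCE (tag 0x30, constructed) of already-encoded items."""
    contents = b"".join(items)
    return b"\x30" + ber_length(len(contents)) + contents


record = ber_sequence(ber_integer(1), ber_octet_string(b"hi"))
print(record.hex())
```

Because every value carries its own tag and length, a receiver on a different host architecture can decode the octets without knowing the sender's native byte order or word size.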

NASA Technical Reports Server

Morphological Disambiguation by Voting Constraints

Author: Kemal Oflazer
Gokhan Tur
Publication date: 01/01/1997

We present a constraint-based morphological disambiguation system in which individual constraints vote on matching morphological parses, and disambiguation of all the tokens in a sentence is performed at the end by selecting parses that receive the highest votes. This constraint application paradigm makes the outcome of the disambiguation independent of the rule sequence, and hence relieves the rule developer from worrying about potentially conflicting rule sequencing. Our results for disambiguating Turkish indicate that using about 500 constraint rules and some additional simple statistics, we can attain a recall of 95-96% and a precision of 94-95% with about 1.01 parses per token. Our system is implemented in Prolog and we are currently investigating an efficient implementation based on finite state transducers.

Comment: 8 pages, Latex source. To appear in Proceedings of ACL/EACL'97. Compressed postscript also available as ftp://ftp.cs.bilkent.edu.tr/pub/ko/acl97.ps.
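The voting scheme in the abstract can be sketched roughly as follows. This is a simplified illustration, not the authors' Prolog system: parses are reduced to bare tag strings, and the two constraints and their vote values are hypothetical. Every constraint votes on every candidate parse first, and selection happens only afterwards, which is why rule order does not matter.

```python
from typing import Callable, List

Sentence = List[List[str]]                      # candidate parses per token
Constraint = Callable[[Sentence, int, str], int]


def after_determiner_prefer_noun(sent: Sentence, i: int, parse: str) -> int:
    # Hypothetical rule: reward a Noun reading right after a possible Det.
    return 2 if i > 0 and "Det" in sent[i - 1] and parse == "Noun" else 0


def sentence_initial_avoid_verb(sent: Sentence, i: int, parse: str) -> int:
    # Hypothetical rule: penalise a Verb reading in sentence-initial position.
    return -1 if i == 0 and parse == "Verb" else 0


def disambiguate(sent: Sentence, constraints: List[Constraint]) -> Sentence:
    """Collect all votes, then keep the top-scoring parse(s) of each token."""
    result = []
    for i, parses in enumerate(sent):
        votes = {p: sum(c(sent, i, p) for c in constraints) for p in parses}
        best = max(votes.values())
        result.append([p for p in parses if votes[p] == best])
    return result


sentence = [["Det"], ["Noun", "Verb"], ["Verb", "Noun"]]
rules = [after_determiner_prefer_noun, sentence_initial_avoid_verb]
print(disambiguate(sentence, rules))
```

Ties are kept rather than broken, so a token can retain more than one parse; this matches the abstract's report of slightly more than one parse per token on average.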

arXiv.org e-Print Archive

CiteSeerX

Crossref

A Verified Information-Flow Architecture

Author: Nathan Collins
Arthur Azevedo de Amorim
André DeHon
Delphine Demange
Catalin Hritcu
David Pichardie
Benjamin C. Pierce
Randy Pollack
Andrew Tolmach
Publication date: 01/01/2016

SAFE is a clean-slate design for a highly secure computer system, with pervasive mechanisms for tracking and limiting information flows. At the lowest level, the SAFE hardware supports fine-grained programmable tags, with efficient and flexible propagation and combination of tags as instructions are executed. The operating system virtualizes these generic facilities to present an information-flow abstract machine that allows user programs to label sensitive data with rich confidentiality policies. We present a formal, machine-checked model of the key hardware and software mechanisms used to dynamically control information flow in SAFE and an end-to-end proof of noninterference for this model. We use a refinement proof methodology to propagate the noninterference property of the abstract machine down to the concrete machine level. We use an intermediate layer in the refinement chain that factors out the details of the information-flow control policy and devise a code generator for compiling such information-flow policies into low-level monitor code. Finally, we verify the correctness of this generator using a dedicated Hoare logic that abstracts from low-level machine instructions into a reusable set of verified structured code generators.
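The tag propagation and noninterference property described above can be pictured with a toy stack machine. This is only a sketch under strong assumptions: the two-point `LOW`/`HIGH` lattice, the three-instruction set, and the `low_view` observer are all invented for illustration and are not SAFE's actual instruction set or policy language.

```python
from dataclasses import dataclass

LOW, HIGH = 0, 1  # hypothetical two-point confidentiality lattice


@dataclass(frozen=True)
class Tagged:
    value: int
    label: int


def run(program, inputs):
    """Execute a toy program; every result carries the join of its operands' labels."""
    stack, outputs = [], []
    for op, *args in program:
        if op == "push":
            stack.append(inputs[args[0]])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(Tagged(a.value + b.value, max(a.label, b.label)))
        elif op == "output":
            outputs.append(stack.pop())
    return outputs


def low_view(outputs):
    """What an observer cleared only for LOW data can see."""
    return [o.value for o in outputs if o.label == LOW]


program = [("push", "x"), ("push", "secret"), ("add",), ("output",),
           ("push", "x"), ("output",)]
run_a = run(program, {"x": Tagged(5, LOW), "secret": Tagged(1, HIGH)})
run_b = run(program, {"x": Tagged(5, LOW), "secret": Tagged(99, HIGH)})
print(low_view(run_a), low_view(run_b))
```

Changing only the HIGH input leaves the LOW view unchanged, which is the informal content of noninterference; the paper establishes this property by machine-checked proof rather than by testing individual runs.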

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

PDXScholar (Portland State University)

HAL-Rennes 1

Error-tolerant Finite State Recognition with Applications to Morphological Analysis and Spelling Correction

Author: Kemal Oflazer
Publication date: 21/07/1995

Error-tolerant recognition enables the recognition of strings that deviate mildly from any string in the regular set recognized by the underlying finite state recognizer. Such recognition has applications in error-tolerant morphological processing, spelling correction, and approximate string matching in information retrieval. After a description of the concepts and algorithms involved, we give examples from two applications: In the context of morphological analysis, error-tolerant recognition allows misspelled input word forms to be corrected, and morphologically analyzed concurrently. We present an application of this to error-tolerant analysis of agglutinative morphology of Turkish words. The algorithm can be applied to morphological analysis of any language whose morphology is fully captured by a single (and possibly very large) finite state transducer, regardless of the word formation processes and morphographemic phenomena involved. In the context of spelling correction, error-tolerant recognition can be used to enumerate correct candidate forms from a given misspelled string within a certain edit distance. Again, it can be applied to any language with a word list comprising all inflected forms, or whose morphology is fully described by a finite state transducer. We present experimental results for spelling correction for a number of languages. These results indicate that such recognition works very efficiently for candidate generation in spelling correction for many European languages such as English, Dutch, French, German, Italian (and others) with very large word lists of root and inflected forms (some containing well over 200,000 forms), generating all candidate solutions within 10 to 45 milliseconds (with edit distance 1) on a SparcStation 10/41. For spelling correction in Turkish, error-tolerant ...

Comment: Replaces 9504031. Gzipped, uuencoded PostScript file. To appear in Computational Linguistics, Volume 22, No. 1, 1996. Also available as ftp://ftp.cs.bilkent.edu.tr/pub/ko/clpaper9512.ps.
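Candidate generation within a bounded edit distance, as described in the abstract, can be approximated with a trie walk that abandons any branch whose best possible distance already exceeds the threshold. This is a stand-in sketch rather than the paper's algorithm: it uses a word-list trie instead of a general finite state recognizer and plain Levenshtein distance (insert, delete, substitute), and the function names and the tiny word list are hypothetical.

```python
END = ""  # sentinel key marking a complete word; never collides with a character


def build_trie(words):
    root = {}
    for word in words:
        node = root
        for ch in word:
            node = node.setdefault(ch, {})
        node[END] = word
    return root


def candidates(trie, query, max_dist):
    """Return (word, distance) pairs for trie words within max_dist of query."""
    found = []

    def walk(node, row):
        # row[j] is the edit distance between the path so far and query[:j]
        if END in node and row[-1] <= max_dist:
            found.append((node[END], row[-1]))
        for ch, child in node.items():
            if ch == END:
                continue
            new_row = [row[0] + 1]
            for j in range(1, len(query) + 1):
                cost = 0 if query[j - 1] == ch else 1
                new_row.append(min(new_row[j - 1] + 1,   # insertion
                                   row[j] + 1,           # deletion
                                   row[j - 1] + cost))   # match or substitution
            if min(new_row) <= max_dist:                  # cut off hopeless branches
                walk(child, new_row)

    walk(trie, list(range(len(query) + 1)))
    return sorted(found)


lexicon = build_trie(["cat", "cart", "card", "dog"])
print(candidates(lexicon, "crd", 1))
```

The cut-off test is what keeps the search fast on large word lists: most of the trie is never visited because an early mismatch pushes every entry of the row past the threshold.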

arXiv.org e-Print Archive

CiteSeerX

Bilkent University Institutional Repository

A formally verified compiler back-end

Author: A Dold
A Hobor
A Pnueli
ACJ Fox
AJ Chlipala
AW Appel
BK Rosen
C Lindig
CW Barrett
D Cachera
D Lacey
D Leinenbach
E Eide
F Henderson
G Barthe
G Clemmensen
G Goos
G Klein
G Li
G Morrisett
GA Kildall
GC Necula
GJ Chaitin
GP Huet
H-J Boehm
IBM Corporation
J Chen
J Guttman
J Knoop
J McCarthy
J-B Tristan
JO Blech
JR Ellis
JS Moore
L Beringer
L Chirica
L George
L Rideau
LD Zuck
M Huisman
M Müller-Olm
M Strecker
MA Dave
N Benton
P Letouzey
PH Hartel
PW O’Hearn
Q Huang
R Milner
R Stärk
S Beyer
S Blazy
S Coupet-Grimal
S Gulwani
S Lerner
SL Peyton Jones
SS Muchnick
TC Hales
WM McKeeman
X Feng
X Leroy
X Rival
Xavier Leroy
Y Bertot
Z Shao
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2008

This article describes the development and formal verification (proof of semantic preservation) of a compiler back-end from Cminor (a simple imperative intermediate language) to PowerPC assembly code, using the Coq proof assistant both for programming the compiler and for proving its correctness. Such a verified compiler is useful in the context of formal methods applied to the certification of critical software: the verification of the compiler guarantees that the safety properties proved on the source code hold for the executable compiled code as well.

arXiv.org e-Print Archive

CiteSeerX

Crossref

INRIA a CCSD electronic archive server