A type-logical treebank for French
The goal of the current paper is to describe the TLGbank, a treebank of type-logical proofs semi-automatically extracted from the French Treebank. Though the framework chosen for the treebank is that of multimodal type-logical grammars, we have ensured that the analysis is compatible with other modern type-logical grammars, such as the displacement calculus and first-order linear logic. We describe the extraction procedure, analyse first results and compare the treebank to the CCGbank.
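ABC Treebank: A Foundational Resource for Interdisciplinary Linguistic Research
Conference: National Institute for Japanese Language and Linguistics Open House 2021, Venue: online, Dates: September 10, 2021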
Semi-automated Extraction of a Wide-Coverage Type-Logical Grammar for French
The paper describes the development of a wide-coverage type-logical grammar for French, which has been extracted from the Paris 7 treebank and received a significant amount of manual verification and cleanup. The resulting treebank is evaluated using a supertagger and performs at a level comparable to the best supertagging results for English.
Perspectives on neural proof nets
In this paper I will present a novel way of combining proof net proof search
with neural networks. It contrasts with the 'standard' approach which has been
applied to proof search in type-logical grammars in various different forms. In
the standard approach, we first transform words to formulas (supertagging) then
match atomic formulas to obtain a proof. I will introduce an alternative way to
split the task into two: first, we generate the graph structure in a way which
guarantees it corresponds to a lambda-term, then we obtain the detailed
structure using vertex labelling. Vertex labelling is a well-studied task in
graph neural networks, and different ways of implementing graph generation
using neural networks will be explored.
Comment: This is an extended version of an invited talk for the workshop
End-to-End Compositional Models of Vector-Based Semantic
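The 'standard' two-step pipeline mentioned in the abstract above (supertagging, then matching atomic formulas) can be illustrated with a minimal Python sketch. It is not the paper's method: the toy lexicon, the tuple encoding of formulas and the function names are all assumptions made for illustration, and a dictionary lookup stands in for a real supertagger. The check shown is only the count invariant, a necessary condition for the atomic formulas to be matchable, not a full proof search.

```python
from collections import Counter

# Formula encoding (an assumption of this sketch):
#   an atom is a str; ("/", a, b) is a/b; ("\\", b, a) is b\a.
# Hypothetical toy lexicon standing in for a supertagger.
LEXICON = {
    "Jean": "np",
    "Marie": "np",
    "dort": ("\\", "np", "s"),               # np\s
    "aime": ("/", ("\\", "np", "s"), "np"),  # (np\s)/np
}

def atom_counts(formula, sign, counts):
    """Add the signed occurrences of each atom in `formula` to `counts`."""
    if isinstance(formula, str):
        counts[formula] += sign
        return
    op, left, right = formula
    if op == "/":
        # a/b: the result a keeps the polarity, the argument b flips it.
        atom_counts(left, sign, counts)
        atom_counts(right, -sign, counts)
    else:
        # b\a: the argument b flips the polarity, the result a keeps it.
        atom_counts(left, -sign, counts)
        atom_counts(right, sign, counts)

def is_balanced(words, goal="s"):
    """Count invariant: every atom occurs equally often with each polarity."""
    counts = Counter()
    for word in words:
        atom_counts(LEXICON[word], -1, counts)  # step 1: word -> formula
    atom_counts(goal, +1, counts)
    return all(total == 0 for total in counts.values())

print(is_balanced(["Jean", "dort"]))
print(is_balanced(["Jean", "aime", "Marie"]))
print(is_balanced(["Jean", "aime"]))
```

A sequence that fails this check cannot have its atoms paired up at all, so it can be rejected before any proof search; a sequence that passes still needs the actual matching step to decide whether a proof exists.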
Development of a General-Purpose Categorial Grammar Treebank
National Institute for Japanese Language and Linguistics; Ochanomizu University; The University of Tokyo; The University of Toky
Trustworthy Formal Natural Language Specifications
Interactive proof assistants are computer programs carefully constructed to
check a human-designed proof of a mathematical claim with high confidence in
the implementation. However, this only validates truth of a formal claim, which
may have been mistranslated from a claim made in natural language. This is
especially problematic when using proof assistants to formally verify the
correctness of software with respect to a natural language specification. The
translation from informal to formal remains a challenging, time-consuming
process that is difficult to audit for correctness.
This paper shows that it is possible to build support for specifications
written in expressive subsets of natural language, within existing proof
assistants, consistent with the principles used to establish trust and
auditability in proof assistants themselves. We implement a means to provide
specifications in a modularly extensible formal subset of English, and have
them automatically translated into formal claims, entirely within the Lean
proof assistant. Our approach is extensible (placing no permanent restrictions
on grammatical structure), modular (allowing information about new words to be
distributed alongside libraries), and produces proof certificates explaining
how each word was interpreted and how the sentence's structure was used to
compute the meaning.
We apply our prototype to the translation of various English descriptions of
formal specifications from a popular textbook into Lean formalizations; all can
be translated correctly with a modest lexicon with only minor modifications
related to lexicon size.
Comment: arXiv admin note: substantial text overlap with arXiv:2205.0781