Representing Online Handwriting for Recognition in Large Vision-Language Models
The adoption of tablets with touchscreens and styluses is increasing, and a
key feature is converting handwriting to text, enabling search, indexing, and
AI assistance. Meanwhile, vision-language models (VLMs) are now the go-to
solution for image understanding, thanks to both their state-of-the-art
performance across a variety of tasks and the simplicity of a unified approach
to training, fine-tuning, and inference. While VLMs obtain high performance on
image-based tasks, they perform poorly on handwriting recognition when applied
naively, i.e., by rendering handwriting as an image and performing optical
character recognition (OCR). In this paper, we study online handwriting
recognition with VLMs, going beyond naive OCR. We propose a novel tokenized
representation of digital ink (online handwriting) that includes a
time-ordered sequence of strokes both as text and as an image. We show that this
representation yields results comparable to or better than state-of-the-art
online handwriting recognizers. Wide applicability is shown through results
with two different VLM families, on multiple public datasets. Our approach can
be applied to off-the-shelf VLMs, does not require any changes in their
architecture, and can be used in both fine-tuning and parameter-efficient
tuning. We perform a detailed ablation study to identify the key elements of
the proposed representation.
The many-valued theorem prover 3TAP. 3rd edition
This is the 3TAP handbook. 3TAP is a many-valued tableau-based
theorem prover developed at the University of Karlsruhe.
The handbook serves a triple purpose: first, it documents the
history and development of the prover 3TAP; second, it provides a
user's manual; and third, it is intended as a reference manual for
future developers, including porting hints.
This version of the handbook describes 3TAP Version 3.0 as of
September 30, 1994.
On the execution of high level formal specifications
Executable specifications can serve as prototypes of the specified system and as oracles for automated testing of implementations, and so are more useful than non-executable specifications. Executable specifications can also be debugged in much the same way as programs, allowing errors to be detected and corrected at the specification level rather than in later stages of software development. However, existing executable specification languages often force the specifier to work at a low level of abstraction, which negates many of the advantages of non-executable specifications. This dissertation shows how to execute specifications written at a level of abstraction comparable to that found in specifications written in non-executable specification languages. The key innovation is an algorithm for evaluating and satisfying first-order predicate logic assertions written over abstract model types. This is important because many specification languages use such assertions. Some of the features of this algorithm were inspired by techniques from constraint logic programming.