Handwritten Word Spotting with Corrected Attributes
We propose an approach to multi-writer word spotting, where the goal is to find a query word in a dataset of document images. Our attribute-based approach leads to a low-dimensional, fixed-length representation of the word images that is fast to compute and, especially, fast to compare. This approach naturally leads to a unified representation of word images and strings, which seamlessly supports both query-by-example, where the query is an image, and query-by-string, where the query is a string. We also propose a calibration scheme, based on Canonical Correlation Analysis, that corrects the attribute scores and greatly improves the results on a challenging dataset. We test our approach on two public datasets, showing state-of-the-art results.
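To make the idea of a shared image/string representation concrete, here is a minimal sketch. The abstract does not define the attributes, so this assumes a hypothetical attribute set ("does character c occur in region r of the word?") and cosine similarity for comparison. The image-side attribute scores, which a trained model would predict, are simulated from the true transcription, and the Canonical Correlation Analysis calibration step is not shown.

```python
import math
from typing import Dict, List, Sequence

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def string_attributes(word: str, levels: Sequence[int] = (1, 2)) -> List[float]:
    """Hypothetical attributes: 1.0 if a character occurs in a region of the word.

    At level k the word is split into k equal regions; a character belongs to
    the region that contains its centre. The vector length is fixed
    (26 * sum(levels)) regardless of the word length.
    """
    word = word.lower()
    vector: List[float] = []
    for level in levels:
        for region in range(level):
            lo, hi = region / level, (region + 1) / level
            present = {
                ch for i, ch in enumerate(word)
                if lo <= (i + 0.5) / len(word) < hi
            }
            vector.extend(1.0 if c in present else 0.0 for c in ALPHABET)
    return vector


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def rank(query: Sequence[float], dataset: Dict[str, List[float]]) -> List[str]:
    """Return dataset keys sorted from most to least similar to the query."""
    return sorted(dataset, key=lambda name: cosine(query, dataset[name]), reverse=True)


# ASSUMPTION: stand-in for attribute scores predicted from word images.
dataset = {f"img_{w}": string_attributes(w) for w in ("word", "spot", "query")}

by_string = rank(string_attributes("word"), dataset)   # query-by-string
by_example = rank(dataset["img_spot"], dataset)        # query-by-example
```

Because both kinds of query end up as vectors in the same space, the same `rank` function serves query-by-string and query-by-example.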
Natural Language Commanding via Program Synthesis
We present Semantic Interpreter, a natural language-friendly AI system for
productivity software such as Microsoft Office that leverages large language
models (LLMs) to execute user intent across application features. While LLMs
are excellent at understanding user intent expressed as natural language, they
are not sufficient for fulfilling application-specific user intent that
requires more than text-to-text transformations. We therefore introduce the
Office Domain Specific Language (ODSL), a concise, high-level language
specialized for performing actions in and interacting with entities in Office
applications. Semantic Interpreter leverages an Analysis-Retrieval prompt
construction method with LLMs for program synthesis, translating natural
language user utterances to ODSL programs that can be transpiled to application
APIs and then executed. We focus our discussion primarily on a research
exploration for Microsoft PowerPoint.
Textual Membership Queries
Human labeling of data can be very time-consuming and expensive, yet, in many
cases it is critical for the success of the learning process. In order to
minimize human labeling efforts, we propose a novel active learning solution
that does not rely on existing sources of unlabeled data. It uses a small
amount of labeled data as the core set for the synthesis of useful membership
queries (MQs): unlabeled instances generated by an algorithm for human
labeling. Our solution uses modification operators, functions that modify
instances to some extent. We apply the operators on a small set of instances
(core set), creating a set of new membership queries. Using this framework, we
look at the instance space as a search space and apply search algorithms in
order to generate new examples highly relevant to the learner. We implement
this framework in the textual domain and test it on several text classification
tasks, showing improved classifier performance as more MQs are labeled and
incorporated into the training set. To the best of our knowledge, this is the
first work on membership queries in the textual domain.
Comment: Accepted to IJCAI 2020. Code is available at
github.com/jonzarecki/textual-mqs. Additional material is available at
tinyurl.com/sup-textualmqs. The sole copyright holder is IJCAI (International
Joint Conferences on Artificial Intelligence), all rights reserved.
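The framework described in this abstract (modification operators applied to a core set, with the instance space treated as a search space) can be sketched roughly as follows. The abstract does not name the operators, the search algorithm, or the learner, so everything concrete here is an assumption made for illustration: a single-word replacement operator, a small beam search, a toy bag-of-words learner, and an uncertainty score used to judge relevance. This is not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence, Tuple

Instance = Tuple[str, ...]  # a tokenized sentence


def replace_word_operator(instance: Instance, vocabulary: Sequence[str]) -> List[Instance]:
    """ASSUMED modification operator: change exactly one token of the instance."""
    neighbours = []
    for i, token in enumerate(instance):
        for word in vocabulary:
            if word != token:
                neighbours.append(instance[:i] + (word,) + instance[i + 1:])
    return neighbours


def positive_probability(instance: Instance, weights: Dict[str, float]) -> float:
    """Hypothetical learner: logistic function of summed per-word weights."""
    score = sum(weights.get(token, 0.0) for token in instance)
    return 1.0 / (1.0 + math.exp(-score))


def uncertainty(p: float) -> float:
    """1.0 when the learner is undecided (p = 0.5), 0.0 when it is certain."""
    return 1.0 - abs(2.0 * p - 1.0)


def generate_membership_queries(
    core_set: Sequence[Instance],
    vocabulary: Sequence[str],
    weights: Dict[str, float],
    steps: int = 2,
    beam_width: int = 3,
) -> List[Instance]:
    """Beam search from the core set towards instances the learner is unsure about."""
    seen = set(core_set)
    frontier = list(core_set)
    queries: List[Instance] = []
    for _ in range(steps):
        candidates = []
        for instance in frontier:
            for neighbour in replace_word_operator(instance, vocabulary):
                if neighbour not in seen:
                    seen.add(neighbour)
                    candidates.append(neighbour)
        # Keep the candidates the learner is least sure about.
        candidates.sort(
            key=lambda c: uncertainty(positive_probability(c, weights)),
            reverse=True,
        )
        frontier = candidates[:beam_width]
        queries.extend(frontier)
    return queries  # unlabeled instances to hand to a human annotator


# Toy example with made-up data.
weights = {"great": 2.0, "bad": -2.0}
queries = generate_membership_queries(
    core_set=[("great", "movie")],
    vocabulary=["great", "bad", "movie", "plot"],
    weights=weights,
    steps=1,
    beam_width=3,
)
```

In a full active learning loop, the returned queries would be labeled by a human, added to the training set, and the learner retrained before the next round of generation.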
Using contour information and segmentation for object registration, modeling and retrieval
This thesis considers different aspects of the use of contour information and of syntactic and semantic image segmentation for object registration, modeling and retrieval in the context of content-based indexing and retrieval in large collections of images. Target applications include retrieval in collections of closed silhouettes, holistic word recognition in handwritten historical manuscripts, and shape registration. The thesis also explores the feasibility of contour-based syntactic features for improving the correspondence between the output of bottom-up segmentation and the semantic objects present in the scene, and discusses the feasibility of different strategies for image analysis utilizing contour information in selected application scenarios, e.g. segmentation driven by visual features versus segmentation driven by shape models, or semi-automatic segmentation.
There are three contributions in this thesis. The first contribution considers structure analysis based on the shape and spatial configuration of image regions (so-called syntactic visual features) and their utilization for automatic image segmentation. The second contribution is the study of novel shape features, matching algorithms and similarity measures. Various applications of the proposed solutions are presented throughout the thesis, providing the basis for the third contribution, which is a discussion of the feasibility of different recognition strategies utilizing contour information. In each case, the performance and generality of the proposed approach have been analyzed through extensive, rigorous experimentation using test collections that are as large as possible.