CrossBeam: Learning to Search in Bottom-Up Program Synthesis
Many approaches to program synthesis perform a search within an enormous
space of programs to find one that satisfies a given specification. Prior works
have used neural models to guide combinatorial search algorithms, but such
approaches still explore a huge portion of the search space and quickly become
intractable as the size of the desired program increases. To tame the search
space blowup, we propose training a neural model to learn a hands-on search
policy for bottom-up synthesis, instead of relying on a combinatorial search
algorithm. Our approach, called CrossBeam, uses the neural model to choose how
to combine previously-explored programs into new programs, taking into account
the search history and partial program executions. Motivated by work in
structured prediction on learning to search, CrossBeam is trained on-policy
using data extracted from its own bottom-up searches on training tasks. We
evaluate CrossBeam in two very different domains, string manipulation and logic
programming. We observe that CrossBeam learns to search efficiently, exploring
much smaller portions of the program space compared to the state of the art.
Comment: Published at ICLR 2022
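To make the search strategy concrete, below is a minimal, runnable Python sketch of neural-guided bottom-up search in the spirit described above, using a toy two-operation string DSL. The `policy_score` function is a hypothetical stand-in for the learned model (CrossBeam's actual policy is a trained neural network conditioned on the search history and partial program executions), so the heuristic here is purely illustrative.

```python
import itertools

# Toy DSL: each operation maps argument values to a new value, or None on failure.
OPS = {
    "concat": (2, lambda a, b: a + b if isinstance(a, str) and isinstance(b, str) else None),
    "left":   (2, lambda s, n: s[:n] if isinstance(s, str) and isinstance(n, int) else None),
}

def policy_score(program, value, target):
    # Hypothetical stand-in for the learned policy: prefer short programs
    # whose output shares a long prefix with the target output.
    overlap = sum(1 for a, b in zip(str(value), str(target)) if a == b)
    return overlap - 0.1 * len(program)

def neural_bottom_up_search(inputs, constants, target, max_iters=100, beam=10):
    # Map each execution result to the first program found producing it,
    # so semantically duplicate programs are pruned (bottom-up search).
    explored = {v: repr(v) for v in list(inputs) + list(constants)}
    for _ in range(max_iters):
        candidates = []
        for name, (arity, fn) in OPS.items():
            for args in itertools.product(list(explored.items()), repeat=arity):
                value = fn(*(v for v, _ in args))
                if value is None or value in explored:
                    continue
                program = f"{name}({', '.join(p for _, p in args)})"
                candidates.append((policy_score(program, value, target), value, program))
        if not candidates:
            return None
        # Instead of keeping every combination, as classic bottom-up
        # enumeration would, keep only the top-`beam` candidates the
        # policy proposes, then continue combining from those.
        candidates.sort(key=lambda c: c[0], reverse=True)
        for _, value, program in candidates[:beam]:
            explored[value] = program
            if value == target:
                return program
    return None

print(neural_bottom_up_search(["hello", "world"], [1, 2, 3], "helloworld"))
# -> concat('hello', 'world')
```

The key design point the sketch mirrors is that pruning happens inside the search loop: the policy decides which argument combinations are worth keeping, rather than a combinatorial procedure enumerating the full cross product of explored values.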
Linguistically-driven framework for computationally efficient and scalable sign recognition
We introduce a new general framework for sign recognition from monocular video using limited quantities of annotated data. The novelty of the hybrid framework we describe here is that we exploit state-of-the-art learning methods while also incorporating features based on what we know about the linguistic composition of lexical signs. In particular, we analyze hand shape, orientation, location, and motion trajectories, and then use CRFs to combine this linguistically significant information for purposes of sign recognition. Our robust modeling and recognition of these sub-components of sign production allow an efficient parameterization of the sign recognition problem as compared with purely data-driven methods. This parameterization enables a scalable and extendable time-series learning approach that advances the state of the art in sign recognition, as shown by the results reported here for recognition of isolated, citation-form, lexical signs from American Sign Language (ASL).
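As a concrete illustration of the CRF step, here is a minimal sketch (not the authors' implementation) of fusing linguistically motivated sub-component features with a linear-chain CRF via the sklearn-crfsuite package. All feature values and the two toy sign tokens are invented placeholders; a real system would extract them from tracked monocular video.

```python
import sklearn_crfsuite

def frame_features(handshape, orientation, location, movement):
    # One feature dict per video frame, covering the four linguistically
    # significant sub-components the framework combines.
    return {"handshape": handshape, "orientation": orientation,
            "location": location, "movement": movement}

# Two tiny invented training sequences, one per citation-form sign token.
X_train = [
    [frame_features("B", "palm-up", "chin", "none"),
     frame_features("B", "palm-up", "neutral", "outward")],
    [frame_features("5", "palm-in", "chest", "circular"),
     frame_features("5", "palm-in", "chest", "circular")],
]
y_train = [["THANK-YOU", "THANK-YOU"], ["PLEASE", "PLEASE"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X_train, y_train)
print(crf.predict(X_train))  # recovers the two gloss sequences
```

Because each sub-component enters as a separate feature, the CRF can learn which combinations of handshape, orientation, location, and movement discriminate between signs, which is the efficient parameterization the abstract contrasts with purely data-driven methods.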
A new framework for sign language recognition based on 3D handshape identification and linguistic modeling
Current approaches to sign recognition by computer generally have at least some of the following limitations: they rely on laboratory conditions for sign production, are limited to a small vocabulary, rely on 2D modeling (and therefore cannot deal with occlusions and off-plane rotations), and/or achieve limited success. Here we propose a new framework that (1) provides a new tracking method less dependent than others on laboratory conditions and able to deal with variations in background and skin regions (such as the face, forearms, or other hands); (2) allows for identification of 3D hand configurations that are linguistically important in American Sign Language (ASL); and (3) incorporates statistical information reflecting linguistic constraints in sign production. For purposes of large-scale computer-based sign language recognition from video, the ability to distinguish hand configurations accurately is critical. Our current method estimates the 3D hand configuration to distinguish among 77 hand configurations linguistically relevant for ASL. Constraining the problem in this way makes recognition of 3D hand configuration more tractable and provides the information specifically needed for sign recognition. Further improvements are obtained by incorporating statistical information about linguistic dependencies among handshapes within a sign, derived from an annotated corpus of almost 10,000 sign tokens.
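The use of handshape co-occurrence statistics can be illustrated with a small sketch: appearance-based scores for the start and end handshapes of a sign are re-ranked by a corpus-derived prior over (start, end) handshape pairs. All numbers and the three-handshape inventory below are invented for illustration; the paper's actual model covers 77 ASL handshapes, with statistics drawn from roughly 10,000 annotated sign tokens.

```python
import itertools

# Appearance scores P(observation | handshape) from a hypothetical 3D
# handshape estimator, for the first and last frames of a sign.
start_scores = {"B": 0.5, "5": 0.3, "A": 0.2}
end_scores   = {"B": 0.4, "5": 0.4, "A": 0.2}

# Invented corpus-derived prior P(end handshape | start handshape):
# many ASL signs keep the same handshape throughout, so the
# same-handshape entries dominate.
transition = {
    ("B", "B"): 0.8,  ("B", "5"): 0.1,  ("B", "A"): 0.1,
    ("5", "B"): 0.2,  ("5", "5"): 0.7,  ("5", "A"): 0.1,
    ("A", "B"): 0.05, ("A", "5"): 0.05, ("A", "A"): 0.9,
}

def best_handshape_pair(start_scores, end_scores, transition):
    # Jointly choose the (start, end) pair maximizing appearance score
    # times linguistic co-occurrence prior, rather than trusting each
    # frame's classifier in isolation.
    pairs = itertools.product(start_scores, end_scores)
    return max(pairs, key=lambda p: start_scores[p[0]] * transition[p] * end_scores[p[1]])

print(best_handshape_pair(start_scores, end_scores, transition))
# Appearance alone ties "B" and "5" for the end handshape; the prior
# breaks the tie toward ("B", "B") because B->B is far more common.
```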