Treebank-based acquisition of Chinese LFG resources for parsing and generation
This thesis describes a treebank-based approach to automatically acquire robust, wide-coverage Lexical-Functional Grammar (LFG) resources for Chinese parsing
and generation, which is part of a larger project on the rapid construction of deep, large-scale, constraint-based, multilingual grammatical resources. I present an application-oriented LFG analysis for Chinese core linguistic phenomena and (in cooperation with PARC) develop a gold-standard dependency-bank of Chinese f-structures for evaluation. Based on the Penn Chinese Treebank, I design and implement two architectures for inducing Chinese LFG resources, one annotation-based and the other dependency conversion-based. I then apply the f-structure acquisition algorithm together with external, state-of-the-art parsers to parsing new text into "proto" f-structures. In order to convert "proto" f-structures into "proper" f-structures or deep dependencies, I present a novel Non-Local Dependency (NLD) recovery algorithm using subcategorisation frames and f-structure paths linking antecedents and traces in NLDs extracted from the automatically-built LFG f-structure treebank. Based on the grammars extracted from the f-structure annotated treebank, I develop a PCFG-based chart generator and a new n-gram based pure dependency generator to realise Chinese sentences from LFG f-structures.
The work reported in this thesis is the first effort to scale treebank-based, probabilistic Chinese LFG resources from proof-of-concept research to unrestricted, real
text. Although this thesis concentrates on Chinese and LFG, many of the methodologies, e.g. the acquisition of predicate-argument structures, NLD resolution and
the PCFG- and dependency n-gram-based generation models, are largely language- and formalism-independent and should generalise to diverse languages as well as to labelled bilexical dependency representations other than LFG.
Requirements for a Research-oriented IC Design System
Computer-aided design techniques for integrated circuits have grown in an incremental way, responding to various perceived needs, so that today there are a number of useful programs for logic generation, simulation at various levels, test preparation, artwork generation and
analysis (including design rule checking), and interactive graphical editing. While the design of many circuits has benefitted from these programs, when industry wants to produce a high-volume part, the design and layout are done manually, followed by digitizing and
perhaps some graphic editing before it is converted to pattern generation format, leading to the often heard statement that computer-aided design of integrated circuits doesn't work. If progress is to be made, it seems clear that the entire design process has to be thought through in basic terms, and much more attention must
be paid to the way in which computational techniques can complement the designer's abilities. Currently, it is appropriate to try to characterize the design process in abstract terms, so that implementation and technological biases don't cloud the view of a desired system. In this paper, we briefly describe the conversion of
algorithms to masks at a very general level, and then describe several projects at MIT which aim to provide contributions to an integrated design system. It is emphasized that no complete system design exists
now at MIT, and that we believe that general design considerations must constantly be tested by building (and rebuilding) the various subcomponents, the structure of which is guided by our view of the overall design process.
Gauge sector statistics of intersecting D-brane models
In this article, which is based on the first part of my PhD thesis, I review
the statistics of the open string sector in T^6/(Z_2xZ_2) orientifold
compactifications of the type IIA string. After an introduction to the
orientifold setup, I discuss the two different techniques that have been
developed to analyse the gauge sector statistics, using either a saddle point
approximation or a direct computer based method. The two approaches are
explained and compared by means of eight- and six-dimensional toy models. In
the four-dimensional case the results are presented in detail. Special emphasis
is put on models containing phenomenologically interesting gauge groups and
chiral matter, in particular those containing a standard model or SU(5) part.
Comment: 51 pages, 29 figures; v2: ref. added, version to appear in Fortsch.
Phys.; v3: ref. added
Procedural function-based modelling of volumetric microstructures
We propose a new approach to modelling heterogeneous objects containing internal volumetric structures whose details are orders of magnitude smaller than the overall size of the object. The proposed function-based procedural representation provides compact, precise, and arbitrarily parameterised models of coherent microstructures, which can undergo blending, deformations, and other geometric operations, and can be directly rendered and fabricated without generating any auxiliary representations (such as polygonal meshes and voxel arrays). In particular, modelling of regular lattices and cellular microstructures as well as irregular porous media is discussed and illustrated. We also present a method to estimate parameters of the given model by fitting it to microstructure data obtained with magnetic resonance imaging and other measurements of natural and artificial objects. Examples of rendering and digital fabrication of microstructure models are presented.
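To make the idea of a function-based model concrete, here is a minimal sketch of a regular lattice defined purely by a scalar function, with no mesh or voxel array. It assumes the common convention that a point is inside the solid when the function value is non-negative, and that union and intersection are taken with max and min. The rod lattice, the function names, and the parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def fold(t, period):
    """Map a coordinate into one lattice cell centred at 0."""
    return (t + period / 2) % period - period / 2

def rod(a, b, radius):
    """Infinite cylinder around an axis: value >= 0 means inside."""
    return radius - math.hypot(a, b)

def lattice(x, y, z, period=1.0, radius=0.1):
    """Periodic lattice of rods along the x, y and z axes (assumed shape)."""
    x, y, z = fold(x, period), fold(y, period), fold(z, period)
    # Union of the three rod families is the maximum of their functions.
    return max(rod(x, y, radius), rod(y, z, radius), rod(x, z, radius))

def block(x, y, z, half=2.0):
    """Axis-aligned cube of half-width `half`, the overall object shape."""
    return min(half - abs(x), half - abs(y), half - abs(z))

def model(x, y, z):
    """Microstructured object: intersection of the block and the lattice."""
    return min(block(x, y, z), lattice(x, y, z))
```

Evaluating `model` at any point answers the inside/outside question directly, which is what allows such a representation to be sampled at whatever resolution a renderer or fabrication device needs.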
Tensor Product Generation Networks for Deep NLP Modeling
We present a new approach to the design of deep networks for natural language
processing (NLP), based on the general technique of Tensor Product
Representations (TPRs) for encoding and processing symbol structures in
distributed neural networks. A network architecture --- the Tensor Product
Generation Network (TPGN) --- is proposed which is capable in principle of
carrying out TPR computation, but which uses unconstrained deep learning to
design its internal representations. Instantiated in a model for image-caption
generation, TPGN outperforms LSTM baselines when evaluated on the COCO dataset.
The TPR-capable structure enables interpretation of internal representations
and operations, which prove to contain considerable grammatical content. Our
caption-generation model can be interpreted as generating sequences of
grammatical categories and retrieving words by their categories from a plan
encoded as a distributed representation.
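As a rough illustration of the Tensor Product Representation idea the abstract builds on, the sketch below binds filler vectors to role vectors with outer products and sums the results, then recovers a filler by contracting with its role vector. It assumes orthonormal role vectors, for which unbinding is exact. The function names and example vectors are hypothetical; this is not the TPGN architecture, which learns its representations.

```python
def outer(filler, role):
    """Outer product of a filler vector and a role vector, as a list of rows."""
    return [[f * r for r in role] for f in filler]

def bind(fillers, roles):
    """TPR of a structure: the sum over i of filler_i (x) role_i."""
    tpr = [[0.0] * len(roles[0]) for _ in range(len(fillers[0]))]
    for filler, role in zip(fillers, roles):
        bound = outer(filler, role)
        tpr = [[a + b for a, b in zip(row_t, row_b)]
               for row_t, row_b in zip(tpr, bound)]
    return tpr

def unbind(tpr, role):
    """Recover the filler bound to `role`; exact when roles are orthonormal."""
    return [sum(t * r for t, r in zip(row, role)) for row in tpr]

# Two orthonormal roles (e.g. positions 0 and 1) and two illustrative fillers.
roles = [[1.0, 0.0], [0.0, 1.0]]
fillers = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]

tpr = bind(fillers, roles)
recovered = unbind(tpr, roles[1])
```

With these inputs, `recovered` equals the second filler, showing how a single distributed tensor can hold several symbol-role bindings at once.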