A Labelled Analytic Theorem Proving Environment for Categorial Grammar
We present a system for the investigation of computational properties of
categorial grammar parsing based on a labelled analytic tableaux theorem
prover. This proof method allows us to take a modular approach, in which the
basic grammar can be kept constant, while a range of categorial calculi can be
captured by assigning different properties to the labelling algebra. The
theorem proving strategy is particularly well suited to the treatment of
categorial grammar, because it allows us to distribute the computational cost
between the algorithm which deals with the grammatical types and the algebraic
checker which constrains the derivation.
Comment: 11 pages, LaTeX2e, uses examples.sty and a4wide.sty
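The modular division of labor this abstract describes, a rule engine that combines grammatical types plus a separate algebraic checker that validates labels, can be caricatured in a few lines. This is an illustrative simplification, not the paper's tableaux prover; the type notation and label composition below are assumptions for the sketch:

```python
# Toy label-carrying categorial reduction (a simplification of the
# "types + label algebra" split; not the paper's actual prover).
# Each item is (type, label); reductions compose labels with "·" so a
# separate checker could later constrain the label algebra.

def reduce_once(seq):
    """Try one application step anywhere in the sequence."""
    for i in range(len(seq) - 1):
        (t1, l1), (t2, l2) = seq[i], seq[i + 1]
        # backward application: A, A\B => B
        if t2.startswith(t1 + "\\"):
            new = (t2[len(t1) + 1:], l1 + "·" + l2)
            return seq[:i] + [new] + seq[i + 2:]
        # forward application: B/A, A => B
        if t1.endswith("/" + t2):
            new = (t1[:-(len(t2) + 1)], l1 + "·" + l2)
            return seq[:i] + [new] + seq[i + 2:]
    return None

def derive(seq, goal):
    while len(seq) > 1:
        seq = reduce_once(seq)
        if seq is None:
            return False
    return seq[0][0] == goal

# "John sleeps": np, np\s => s succeeds; the reversed order does not.
print(derive([("np", "john"), ("np\\s", "sleeps")], "s"))  # True
```

In the paper's setting the interesting variation lives in the label side: keeping the reduction rules fixed and tightening or relaxing the label algebra is what selects among categorial calculi.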
Using WordNet for Building WordNets
This paper summarises a set of methodologies and techniques for the fast
construction of multilingual WordNets. The English WordNet is used in this
approach as a backbone for Catalan and Spanish WordNets and as a lexical
knowledge resource for several subtasks.
Comment: 8 pages, postscript file. In workshop on Usage of WordNet in NL
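The backbone idea, projecting English synset membership onto another language through translation links, can be sketched with toy data. The dictionaries and synset names below are made up for illustration and are not the paper's actual resources:

```python
# Hypothetical sketch of the "expand" approach: a Catalan word inherits
# the English WordNet synsets of its translations. Toy data only.
english_wordnet = {
    "dog.n.01": {"dog", "domestic_dog"},
    "frump.n.01": {"frump", "dog"},
}
bilingual = {"gos": {"dog"}}  # Catalan -> English translations

def candidate_synsets(word_ca):
    """English synsets containing any translation of the Catalan word."""
    translations = bilingual.get(word_ca, set())
    return {synset for synset, members in english_wordnet.items()
            if members & translations}

print(sorted(candidate_synsets("gos")))  # ['dog.n.01', 'frump.n.01']
```

Note the ambiguity this surfaces: "gos" maps to both the animal sense and the derogatory sense of "dog", which is exactly why such methodologies need filtering heuristics on top of raw translation links.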
The Phyre2 web portal for protein modeling, prediction and analysis
Phyre2 is a suite of tools available on the web to predict and analyze protein structure, function and mutations. The focus of Phyre2 is to provide biologists with a simple and intuitive interface to state-of-the-art protein bioinformatics tools. Phyre2 replaces Phyre, the original version of the server for which we previously published a paper in Nature Protocols. In this updated protocol, we describe Phyre2, which uses advanced remote homology detection methods to build 3D models, predict ligand binding sites and analyze the effect of amino acid variants (e.g., nonsynonymous SNPs (nsSNPs)) for a user's protein sequence. Users are guided through results by a simple interface at a level of detail they determine. This protocol will guide users from submitting a protein sequence to interpreting the secondary and tertiary structure of their models, their domain composition and model quality. A range of additional available tools is described to find a protein structure in a genome, to submit large numbers of sequences at once and to automatically run weekly searches for proteins that are difficult to model. The server is available at http://www.sbg.bio.ic.ac.uk/phyre2. A typical structure prediction will be returned between 30 min and 2 h after submission.
An Adaptive Locally Connected Neuron Model: Focusing Neuron
This paper presents a new artificial neuron model capable of learning its
receptive field in the topological domain of inputs. The model provides
adaptive and differentiable local connectivity (plasticity) applicable to any
domain. It requires no other tool than the backpropagation algorithm to learn
its parameters which control the receptive field locations and apertures. This
research explores whether this ability makes the neuron focus on informative
inputs and yields any advantage over fully connected neurons. The experiments
include tests of focusing neuron networks of one or two hidden layers on
synthetic and well-known image recognition data sets. The results demonstrated
that the focusing neurons can move their receptive fields towards more
informative inputs. In the simple two-hidden layer networks, the focusing
layers outperformed the dense layers in the classification of the 2D spatial
data sets. Moreover, the focusing networks performed better than the dense
networks even when 70% of the weights were pruned. The tests on
convolutional networks revealed that using focusing layers instead of dense
layers for the classification of convolutional features may work better in some
data sets.
Comment: 45 pages, a national patent filed, submitted to Turkish Patent
Office, No: -2017/17601, Date: 09.11.201
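One plausible reading of a focusing neuron (the abstract does not give the exact formulation, so the Gaussian form below is an assumption) is a standard linear unit whose weights are gated by a differentiable aperture over normalized input positions, making the focus location and width ordinary backprop-trainable parameters:

```python
import math
import random

# Illustrative sketch only: a neuron with a Gaussian aperture
# g(i) = exp(-(u_i - mu)^2 / (2 * sigma^2)) over positions u_i in [0, 1].
# mu (focus location) and sigma (aperture width) are differentiable, so
# plain backpropagation can move the receptive field.

class FocusingNeuron:
    def __init__(self, n_inputs, mu=0.5, sigma=0.25, seed=0):
        rng = random.Random(seed)
        self.n = n_inputs
        self.mu, self.sigma = mu, sigma
        self.w = [rng.uniform(-0.1, 0.1) for _ in range(n_inputs)]

    def aperture(self, i):
        u = i / (self.n - 1)  # input index scaled to [0, 1]
        return math.exp(-(u - self.mu) ** 2 / (2 * self.sigma ** 2))

    def forward(self, x):
        return sum(self.w[i] * self.aperture(i) * x[i] for i in range(self.n))

    def grad_mu(self, x):
        # d(output)/d(mu), chain rule through the Gaussian aperture
        return sum(self.w[i] * x[i] * self.aperture(i)
                   * ((i / (self.n - 1)) - self.mu) / self.sigma ** 2
                   for i in range(self.n))

neuron = FocusingNeuron(5)
print(neuron.aperture(2))  # center position u = 0.5 is fully weighted: 1.0
```

Inputs far from `mu` are effectively pruned (their gates decay toward zero), which is consistent with the abstract's observation that heavily pruned focusing networks remain competitive.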
Towards Incremental Parsing of Natural Language using Recursive Neural Networks
In this paper we develop novel algorithmic ideas for building a natural language
parser grounded upon the hypothesis of incrementality. Although widely accepted
and experimentally supported under a cognitive perspective as a model of the human
parser, the incrementality assumption has never been exploited for building automatic
parsers of unconstrained real texts. The essentials of the hypothesis are that words are
processed in a left-to-right fashion, and the syntactic structure is kept totally connected
at each step.
Our proposal relies on a machine learning technique for predicting the correctness of
partial syntactic structures that are built during the parsing process. A recursive neural
network architecture is employed for computing predictions after a training phase on
examples drawn from a corpus of parsed sentences, the Penn Treebank. Our results
indicate the viability of the approach and lay out the premises for a novel generation of
algorithms for natural language processing which more closely model human parsing.
These algorithms may prove very useful in the development of efficient parsers.
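The incremental regime described above, one connected structure maintained left to right with a learned scorer ranking attachments, can be sketched schematically. This is not the paper's algorithm; the dependency-style tree and the trivial stand-in scorer (in place of the recursive neural network) are assumptions for illustration:

```python
# Schematic incremental parser: process words left to right, always keep
# a single connected tree, and attach each new word where a pluggable
# scorer (stand-in for the learned correctness predictor) ranks highest.

def parse_incrementally(words, score):
    tree = {0: None}  # node -> parent; word 0 roots the partial structure
    for i in range(1, len(words)):
        # candidates: hang word i off any node already in the tree, so the
        # structure stays totally connected at every step
        best = max(tree, key=lambda head: score(words, head, i))
        tree[i] = best
    return tree

# toy scorer: prefer attaching to the nearest preceding word
adjacency = lambda ws, head, dep: -abs(head - dep)
print(parse_incrementally(["the", "cat", "sleeps"], adjacency))
# {0: None, 1: 0, 2: 1}
```

The interesting part in the paper is precisely what replaces `adjacency`: a trained network that judges whether a partial structure looks like a prefix of a correct parse.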
What Makes a Good Plan? An Efficient Planning Approach to Control Diffusion Processes in Networks
In this paper, we analyze the quality of a large class of simple dynamic
resource allocation (DRA) strategies which we name priority planning. Their aim
is to control an undesired diffusion process by distributing resources to the
contagious nodes of the network according to a predefined priority-order. In
our analysis, we reduce the DRA problem to the linear arrangement of the nodes
of the network. Under this perspective, we shed light on the role of a
fundamental characteristic of this arrangement, the maximum cutwidth, for
assessing the quality of any priority planning strategy. Our theoretical
analysis validates the role of the maximum cutwidth by deriving bounds for the
extinction time of the diffusion process. Finally, using the results of our
analysis, we propose a novel and efficient DRA strategy, called Maximum
Cutwidth Minimization, that outperforms other competing strategies in our
simulations.
Comment: 18 pages, 3 figures
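The central quantity in the analysis, the maximum cutwidth of a linear arrangement, is easy to compute directly from its definition: for an ordering of the nodes, count the edges crossing each gap between consecutive positions and take the worst gap. A small sketch (graph and orderings are toy examples):

```python
# Maximum cutwidth of a linear arrangement: the largest number of edges
# crossing any single cut point of the node ordering.

def max_cutwidth(edges, order):
    pos = {v: i for i, v in enumerate(order)}
    worst = 0
    for k in range(len(order) - 1):
        # edges with one endpoint at position <= k and the other beyond it
        cut = sum(1 for u, v in edges
                  if min(pos[u], pos[v]) <= k < max(pos[u], pos[v]))
        worst = max(worst, cut)
    return worst

# path graph a-b-c-d: the natural order achieves cutwidth 1,
# a shuffled order pays for every edge it stretches
edges = [("a", "b"), ("b", "c"), ("c", "d")]
print(max_cutwidth(edges, ["a", "b", "c", "d"]))  # 1
print(max_cutwidth(edges, ["a", "c", "b", "d"]))  # 3
```

This is also why the proposed strategy is framed as cutwidth *minimization*: a priority order with a small maximum cutwidth bounds how many contagious edges can be active across the frontier at once.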
Domain-Specific Knowledge Acquisition for Conceptual Sentence Analysis
The availability of on-line corpora is rapidly changing the field of natural language processing (NLP) from one dominated by theoretical models of often very specific linguistic phenomena to one guided by computational models that simultaneously account for a wide variety of phenomena that occur in real-world text. Thus far, among the best-performing and most robust systems for reading and summarizing large amounts of real-world text are knowledge-based natural language systems. These systems rely heavily on domain-specific, handcrafted knowledge to handle the myriad syntactic, semantic, and pragmatic ambiguities that pervade virtually all aspects of sentence analysis. Not surprisingly, however, generating this knowledge for new domains is time-consuming, difficult, and error-prone, and requires the expertise of computational linguists familiar with the underlying NLP system. This thesis presents Kenmore, a general framework for domain-specific knowledge acquisition for conceptual sentence analysis. To ease the acquisition of knowledge in new domains, Kenmore exploits an on-line corpus using symbolic machine learning techniques and robust sentence analysis while requiring only minimal human intervention. Unlike most approaches to knowledge acquisition for natural language systems, the framework uniformly addresses a range of subproblems in sentence analysis, each of which traditionally had required a separate computational mechanism. The thesis presents the results of using Kenmore with corpora from two real-world domains: (1) to perform part-of-speech tagging, semantic feature tagging, and concept tagging of all open-class words in the corpus; (2) to acquire heuristics for part-of-speech disambiguation, semantic feature disambiguation, and concept activation; and (3) to find the antecedents of relative pronouns.
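The abstract names "symbolic machine learning" without fixing a mechanism, so one instance-based possibility (assumed here, not stated in the abstract) is to store tagged training contexts as cases and tag a new ambiguous word by retrieving the most similar stored case. All data and names below are hypothetical:

```python
# Instance-based disambiguation sketch (one symbolic-ML possibility, not
# necessarily Kenmore's mechanism): tag an ambiguous word by its most
# similar stored case of (local context features, tag).

def similarity(a, b):
    # overlap of aligned context features (previous word, word, next word)
    return sum(1 for x, y in zip(a, b) if x == y)

def tag_by_case(cases, context):
    best_feats, best_tag = max(cases, key=lambda c: similarity(c[0], context))
    return best_tag

cases = [
    (("the", "dog", "barks"), "noun"),
    (("they", "dog", "him"), "verb"),
]
print(tag_by_case(cases, ("the", "dog", "runs")))  # 'noun'
```

The appeal for knowledge acquisition is that the same retrieval loop serves part-of-speech, semantic-feature, and concept decisions, matching the abstract's point about one uniform mechanism replacing several handcrafted ones.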