Developmental Stages of Perception and Language Acquisition in a Perceptually Grounded Robot
The objective of this research is to develop a system for language learning based on a minimum of pre-wired language-specific functionality, that is compatible with observations of perceptual and language capabilities in the human developmental trajectory. In the proposed system, meaning (in terms of descriptions of events and spatial relations) is extracted from video images based on detection of position, motion, physical contact and their parameters. Mapping of sentence form to meaning is performed by learning grammatical constructions that are retrieved from a construction inventory based on the constellation of closed class items uniquely identifying the target sentence structure. The resulting system displays robust acquisition behavior that reproduces certain observations from developmental studies, with very modest "innate" language specificity.
Context-sensitive Spelling Correction Using Google Web 1T 5-Gram Information
In computing, spell checking is the process of detecting and sometimes
providing spelling suggestions for incorrectly spelled words in a text.
Basically, a spell checker is a computer program that uses a dictionary of
words to perform spell checking. The bigger the dictionary, the higher the
error detection rate. Because spell checkers are based on regular
dictionaries, they suffer from a data sparseness problem: they cannot capture
a large vocabulary that includes proper names, domain-specific terms,
technical jargon, special acronyms, and terminology. As a result, they
exhibit a low error detection rate and often fail to catch major errors in the
text. This paper proposes a new context-sensitive spelling correction method
for detecting and correcting non-word and real-word errors in digital text
documents. The approach relies on statistics from the Google Web 1T 5-gram
data set, which consists of a large volume of n-gram word sequences extracted
from the World Wide Web. Fundamentally, the proposed method comprises an error
detector that detects misspellings, a candidate spellings generator based on a
character 2-gram model that generates correction suggestions, and an error
corrector that performs contextual error correction. Experiments conducted on a
set of text documents from different domains and containing misspellings
showed an outstanding spelling error correction rate and a drastic reduction of
both non-word and real-word errors. In a further study, the proposed algorithm
is to be parallelized so as to lower the computational cost of the error
detection and correction processes. Comment: LACSC - Lebanese Association for Computational Sciences -
http://www.lacsc.or
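The three stages named in the abstract above (candidate generation from a character 2-gram model, then contextual correction from n-gram counts) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the Dice-coefficient ranking, and the toy `COUNTS` table, which merely stands in for the Google Web 1T 5-gram data. It is not the paper's implementation.

```python
def char_bigrams(word):
    """Return the set of character 2-grams of a word, padded with boundary marks."""
    padded = f"^{word}$"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def candidates(word, vocabulary, k=3):
    """Rank vocabulary words by character 2-gram overlap (Dice coefficient).

    Assumption: the abstract only says a character 2-gram model is used;
    the Dice score is one plausible way to rank candidates.
    """
    target = char_bigrams(word)

    def dice(other):
        grams = char_bigrams(other)
        return 2 * len(target & grams) / (len(target) + len(grams))

    # Highest overlap first; ties are broken alphabetically.
    return sorted(vocabulary, key=lambda w: (-dice(w), w))[:k]


def correct(tokens, index, vocabulary, ngram_counts, k=3):
    """Pick the candidate whose surrounding word n-gram has the highest count."""
    left = tuple(tokens[max(0, index - 2):index])
    right = tuple(tokens[index + 1:index + 3])
    options = candidates(tokens[index], vocabulary, k)
    return max(options, key=lambda w: ngram_counts.get(left + (w,) + right, 0))


# Hypothetical toy data standing in for a dictionary and web-scale n-gram counts.
VOCAB = {"a", "peace", "piece", "place", "of", "cake"}
COUNTS = {
    ("a", "piece", "of", "cake"): 120,
    ("a", "peace", "of", "cake"): 3,
}

fixed = correct(["a", "peace", "of", "cake"], 1, VOCAB, COUNTS)
```

In this toy run the real-word error "peace" is a valid dictionary word, so a dictionary lookup alone would not flag it; only the context counts favor "piece".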
Bayesian Information Extraction Network
Dynamic Bayesian networks (DBNs) offer an elegant way to integrate various
aspects of language in one model. Many existing algorithms developed for
learning and inference in DBNs are applicable to probabilistic language
modeling. To demonstrate the potential of DBNs for natural language processing,
we employ a DBN in an information extraction task. We show how to assemble a
wealth of emerging linguistic instruments for shallow parsing, syntactic and
semantic tagging, morphological decomposition, named entity recognition, etc., in
order to incrementally build a robust information extraction system. Our method
outperforms previously published results on an established benchmark domain. Comment: 6 page
The Narrow Conception of Computational Psychology
One particularly successful approach to modeling within cognitive science is computational psychology. Computational psychology explores psychological processes by building and testing computational models with human data. In this paper, it is argued that a specific approach to understanding computation, what is called the "narrow conception", has problematically limited the kinds of models, theories, and explanations that are offered within computational psychology. After raising two problems for the narrow conception, an alternative, "wide approach" to computational psychology is proposed.
Collecting Diverse Natural Language Inference Problems for Sentence Representation Evaluation
We present a large-scale collection of diverse natural language inference
(NLI) datasets that help provide insight into how well a sentence
representation captures distinct types of reasoning. The collection results
from recasting 13 existing datasets from 7 semantic phenomena into a common NLI
structure, resulting in over half a million labeled context-hypothesis pairs in
total. We refer to our collection as the DNC: Diverse Natural Language
Inference Collection. The DNC is available online at https://www.decomp.net,
and will grow over time as additional resources are recast and added from novel
sources. Comment: To be presented at EMNLP 2018. 15 page
Genetic Algorithm for Grammar Induction and Rules Verification through a PDA Simulator
The focus of this paper is on developing a grammatical inference system that uses a genetic algorithm (GA), which has a powerful global exploration capability and can exploit the optimum offspring. The implemented system runs in two phases: first, it generates and verifies grammar rules, and then it applies the GA's operations to optimize the rules. A pushdown automaton (PDA) simulator has been developed, which parses the training data over the grammar's rules. An inverted mutation with a random mask, followed by an "XOR" operator, has been applied to introduce diversity in the population, which helps the GA avoid getting trapped at a local optimum. The Taguchi method has been incorporated to tune the parameters, which makes the proposed approach more robust, statistically sound and quickly convergent. The performance of the proposed system has been compared with classical GA, random offspring GA and crowding algorithms. Overall, a grammatical inference system has been developed that employs a PDA simulator for verification.
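The abstract above does not spell out the mutation operator, so the following is only one possible reading of "random mask and then XOR", sketched on a bit-string chromosome. The function name `mask_xor_mutation`, the `rate` parameter, and the bit-list encoding are all assumptions made for illustration.

```python
import random


def mask_xor_mutation(chromosome, rate=0.1, rng=random):
    """Flip bits of a chromosome by XOR-ing it with a random binary mask.

    Assumption: each mask bit is set independently with probability `rate`;
    a bit is flipped exactly where the mask holds a 1.
    """
    mask = [1 if rng.random() < rate else 0 for _ in chromosome]
    mutated = [bit ^ m for bit, m in zip(chromosome, mask)]
    return mutated, mask


# Usage with a seeded generator so the run is repeatable.
rng = random.Random(0)
parent = [0, 1, 0, 1, 0, 1, 0, 1]
child, mask = mask_xor_mutation(parent, rate=0.25, rng=rng)
```

Because XOR is its own inverse, applying the same mask to the child recovers the parent, which makes the operator easy to check in isolation.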
Context-free Grammar Extraction form Web Document using Probabilities Association
The explosive growth of the World Wide Web has resulted in the largest knowledge base ever developed and made available to the public. These documents are typically formatted for human viewing (HTML) and vary widely from document to document. Because we cannot construct a global schema, discovering rules from them is a complex and tedious process. Most existing systems use hand-coded wrappers to extract information, which is monotonous and time consuming. Learning grammatical information from a given set of Web pages (HTML) has attracted a lot of attention in the past decades. In this paper I propose a method of learning context-free grammar rules from HTML documents using the probabilistic association of HTML tags.
DOI: 10.17762/ijritcc2321-8169.160410