21,534 research outputs found
Genetic Algorithm (GA) in Feature Selection for CRF Based Manipuri Multiword Expression (MWE) Identification
This paper deals with the identification of Multiword Expressions (MWEs) in
Manipuri, a highly agglutinative Indian Language. Manipuri is listed in the
Eight Schedule of Indian Constitution. MWE plays an important role in the
applications of Natural Language Processing(NLP) like Machine Translation, Part
of Speech tagging, Information Retrieval, Question Answering etc. Feature
selection is an important factor in the recognition of Manipuri MWEs using
Conditional Random Field (CRF). The disadvantage of manual selection and
choosing of the appropriate features for running CRF motivates us to think of
Genetic Algorithm (GA). Using GA we are able to find the optimal features to
run the CRF. We have tried with fifty generations in feature selection along
with three fold cross validation as fitness function. This model demonstrated
the Recall (R) of 64.08%, Precision (P) of 86.84% and F-measure (F) of 73.74%,
showing an improvement over the CRF based Manipuri MWE identification without
GA application.Comment: 14 pages, 6 figures, see
http://airccse.org/journal/jcsit/1011csit05.pd
SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model
To realize human-like robot intelligence, a large-scale cognitive
architecture is required for robots to understand the environment through a
variety of sensors with which they are equipped. In this paper, we propose a
novel framework named Serket that enables the construction of a large-scale
generative model and its inference easily by connecting sub-modules to allow
the robots to acquire various capabilities through interaction with their
environments and others. We consider that large-scale cognitive models can be
constructed by connecting smaller fundamental models hierarchically while
maintaining their programmatic independence. Moreover, connected modules are
dependent on each other, and parameters are required to be optimized as a
whole. Conventionally, the equations for parameter estimation have to be
derived and implemented depending on the models. However, it becomes harder to
derive and implement those of a larger scale model. To solve these problems, in
this paper, we propose a method for parameter estimation by communicating the
minimal parameters between various modules while maintaining their programmatic
independence. Therefore, Serket makes it easy to construct large-scale models
and estimate their parameters via the connection of modules. Experimental
results demonstrated that the model can be constructed by connecting modules,
the parameters can be optimized as a whole, and they are comparable with the
original models that we have proposed
Combined optimization of feature selection and algorithm parameters in machine learning of language
Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing (i) to investigate which machine learning algorithms have the 'right bias' to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that with the methodology currently used in comparative machine learning experiments, the results may often not be reliable because of the role of and interaction between feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach, and more reliable comparisons
- …