21,534 research outputs found

Genetic Algorithm (GA) in Feature Selection for CRF Based Manipuri Multiword Expression (MWE) Identification

Authors: Bandyopadhyay Sivaji; Nongmeikapam Kishorjit
Publication venue: Academy and Industry Research Collaboration Center (AIRCC)
Publication date: 10/11/2011

This paper deals with the identification of Multiword Expressions (MWEs) in Manipuri, a highly agglutinative Indian language listed in the Eighth Schedule of the Indian Constitution. MWEs play an important role in Natural Language Processing (NLP) applications such as Machine Translation, Part of Speech tagging, Information Retrieval, and Question Answering. Feature selection is an important factor in recognizing Manipuri MWEs with a Conditional Random Field (CRF). The drawbacks of manually choosing appropriate features for the CRF motivate the use of a Genetic Algorithm (GA). Using the GA, we are able to find the optimal features to run the CRF. We ran the feature selection for fifty generations, with three-fold cross-validation as the fitness function. The model achieved a Recall (R) of 64.08%, a Precision (P) of 86.84%, and an F-measure (F) of 73.74%, an improvement over CRF-based Manipuri MWE identification without the GA.

Comment: 14 pages, 6 figures, see http://airccse.org/journal/jcsit/1011csit05.pd
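The GA-driven feature selection described in this abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: a nearest-centroid classifier and plain accuracy stand in for the CRF and its evaluation, the data are synthetic, and every name (`make_data`, `cv_fitness`, `evolve`) and every setting other than the fifty generations and three folds is hypothetical.

```python
import random

def make_data(rng, n=90, n_features=8, informative=(0, 3)):
    """Synthetic binary data (assumption): only `informative` columns carry signal."""
    data = []
    for i in range(n):
        y = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            x[j] += 2.5 * y  # shift the informative features for class 1
        data.append((x, y))
    return data

def centroid_accuracy(train, test, mask):
    """Accuracy of a nearest-centroid classifier restricted to the masked features."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature set cannot classify anything
    cents = {}
    for label in (0, 1):
        pts = [x for x, y in train if y == label]
        cents[label] = [sum(p[j] for p in pts) / len(pts) for j in cols]
    correct = 0
    for x, y in test:
        dist = {c: sum((x[j] - m) ** 2 for j, m in zip(cols, cents[c])) for c in cents}
        correct += min(dist, key=dist.get) == y
    return correct / len(test)

def cv_fitness(mask, data, folds=3):
    """Fitness of a feature mask: mean accuracy over `folds`-fold cross-validation."""
    scores = []
    for k in range(folds):
        test = data[k::folds]
        train = [d for i, d in enumerate(data) if i % folds != k]
        scores.append(centroid_accuracy(train, test, mask))
    return sum(scores) / folds

def evolve(data, n_features, rng, pop_size=20, generations=50, p_mut=0.1):
    """Evolve bit masks over the features; returns (best_mask, best_fitness)."""
    cache = {}

    def fit(mask):
        if mask not in cache:
            cache[mask] = cv_fitness(mask, data)
        return cache[mask]

    pop = [tuple(rng.randint(0, 1) for _ in range(n_features))
           for _ in range(pop_size - 1)]
    pop.append((1,) * n_features)  # seed with the full feature set as a baseline
    for _ in range(generations):
        ranked = sorted(pop, key=fit, reverse=True)
        parents = ranked[: pop_size // 2]      # truncation selection
        nxt = [ranked[0]]                      # elitism: keep the best mask unchanged
        while len(nxt) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            nxt.append(tuple(bit ^ (rng.random() < p_mut) for bit in child))
        pop = nxt
    best = max(pop, key=fit)
    return best, fit(best)

rng = random.Random(0)
data = make_data(rng)
mask, score = evolve(data, n_features=8, rng=rng)
print("selected features:", [j for j, bit in enumerate(mask) if bit])
print("3-fold CV fitness:", round(score, 3))
```

Because the initial population is seeded with the all-features mask and the best individual is always carried over, the returned mask never scores below the full-feature baseline under this fitness function.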

arXiv.org e-Print Archive

Crossref

Analytical Challenges in Modern Tax Administration: A Brief History of Analytics at the IRS

Author: Butler Jeff
Publication venue: Ohio State University. Moritz College of Law
Publication date: 01/01/2020

KnowledgeBank at OSU

SERKET: An Architecture for Connecting Stochastic Models to Realize a Large-Scale Cognitive Model

Authors: Nagai Takayuki; Nakamura Tomoaki; Taniguchi Tadahiro
Publication date: 05/12/2017

To realize human-like robot intelligence, robots require a large-scale cognitive architecture that lets them understand their environment through the variety of sensors with which they are equipped. In this paper, we propose a novel framework named Serket that makes it easy to construct a large-scale generative model and perform inference on it by connecting sub-modules, allowing robots to acquire various capabilities through interaction with their environments and with others. We consider that large-scale cognitive models can be constructed by connecting smaller fundamental models hierarchically while maintaining their programmatic independence. However, connected modules depend on each other, and their parameters must be optimized as a whole. Conventionally, the equations for parameter estimation have to be derived and implemented for each model, which becomes harder as the model grows larger. To solve these problems, we propose a method for parameter estimation in which the modules communicate only the minimal parameters while maintaining their programmatic independence. Serket therefore makes it easy to construct large-scale models and estimate their parameters by connecting modules. Experimental results demonstrated that the model can be constructed by connecting modules, that the parameters can be optimized as a whole, and that the results are comparable with those of the original models that we have proposed.

arXiv.org e-Print Archive

Directory of Open Access Journals

Frontiers - Publisher Connector

Combined optimization of feature selection and algorithm parameters in machine learning of language

Authors: Daelemans Walter; De Meulder Fien; Hoste Veronique; Naudts Bart
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2003

Comparative machine learning experiments have become an important methodology in empirical approaches to natural language processing, used (i) to investigate which machine learning algorithms have the 'right bias' to solve specific natural language processing tasks, and (ii) to investigate which sources of information add to accuracy in a learning approach. Using automatic word sense disambiguation as an example task, we show that, with the methodology currently used in comparative machine learning experiments, the results may often be unreliable because of the role of, and interaction between, feature selection and algorithm parameter optimization. We propose genetic algorithms as a practical approach to achieve both higher accuracy within a single approach and more reliable comparisons.
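One way to let a genetic algorithm search features and algorithm parameters together is to place both in a single chromosome, so that selection acts on their combination rather than on each in isolation. The sketch below shows only such an encoding; it is an assumption for illustration, not the encoding used in the paper. `PARAM_GRID`, its parameter names and values, and the helper functions are all hypothetical.

```python
import random

# Hypothetical parameter grid for illustration; not taken from the paper.
PARAM_GRID = {"k": [1, 3, 5, 7], "weighting": ["uniform", "distance"]}

def random_chromosome(n_features, rng):
    """Feature bits followed by one index gene per algorithm parameter."""
    bits = [rng.randint(0, 1) for _ in range(n_features)]
    genes = [rng.randrange(len(values)) for values in PARAM_GRID.values()]
    return bits + genes

def decode(chromosome, n_features):
    """Split a chromosome into a feature mask and a parameter dictionary."""
    mask = chromosome[:n_features]
    genes = chromosome[n_features:]
    params = {name: values[g]
              for (name, values), g in zip(PARAM_GRID.items(), genes)}
    return mask, params

def mutate(chromosome, n_features, rng, p_mut=0.1):
    """Flip feature bits and resample parameter genes, each with probability p_mut."""
    out = list(chromosome)
    names = list(PARAM_GRID)
    for i in range(len(out)):
        if rng.random() >= p_mut:
            continue
        if i < n_features:
            out[i] = 1 - out[i]                      # toggle a feature
        else:
            choices = PARAM_GRID[names[i - n_features]]
            out[i] = rng.randrange(len(choices))     # pick a new parameter value
    return out

rng = random.Random(0)
chromosome = random_chromosome(n_features=6, rng=rng)
mask, params = decode(chromosome, n_features=6)
print(mask, params)
```

A fitness function would decode each chromosome, train the learner on the masked features with the decoded parameters, and return a cross-validated score, so a single GA loop evaluates feature subsets and parameter settings jointly.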

CiteSeerX

Ghent University Academic Bibliography