
INDIAN LANGUAGE TEXT MINING

Abstract

India is home to many languages, owing to its cultural and geographical diversity. The Constitution of India makes provision for each state to choose its own official language for official communication at the state level. Consumption of Indian-language content has grown with the spread of electronic devices and technology, and the amount of textual data available in electronic form in the various Indian regional languages is increasing constantly. However, little work has been done on text processing for Indian languages, so there is a wide gap between the stored data and the knowledge that could be constructed from it. This transition will not occur automatically; this is where text mining comes into the picture. This research studies and analyzes text mining for Indian regional languages. Text mining refers to the knowledge discovery process in which the source data under consideration is text. It is a new and exciting research area that tries to solve the information overload problem by using techniques from information retrieval, information extraction, and natural language processing (NLP), and by connecting them with the algorithms and methods of knowledge discovery in databases (KDD), data mining, machine learning, and statistics. Some applications of text mining are document classification, information retrieval, document clustering, information extraction, and performance evaluation. In this paper we attempt to show the need for text mining for Indian languages.
