Enhancing a Web Crawler with Arabic Search

By Qui V. Nguyen


The Internet's many advantages (ease of access, limited regulation, a vast potential audience, and the fast flow of information) have turned it into the most popular way to communicate and exchange ideas. Criminal and terrorist groups exploit the same advantages, turning the Internet into a new playing field and battlefield for their illegal and terrorist activities. There are millions of Web sites in different languages on the Internet, but the lack of foreign-language search engines makes it impossible to analyze foreign-language Web sites efficiently. This thesis will enhance an open source Web crawler with Arabic search capability, thus improving an existing social networking tool so that it can perform page correlation and analysis of Arabic Web sites. A social networking tool with Arabic search capabilities could become a valuable tool for the intelligence community. Its page correlation and analysis results could be used to collect open source intelligence and build a network of Web sites that are related to terrorist or criminal activities.

http://hdl.handle.net/10945/7288

Lieutenant, United States Nav

Topics: Nutch, Lucene, Web Crawler, Information Retrieval in Arabic, Stemming in Arabic
Publisher: Monterey, California: Naval Postgraduate School
Year: 2012
OAI identifier: oai:calhoun.nps.edu:10945/7288
