This short paper reports the first phase of a long-term project that probes American urban culture by examining the hyperlinks and text of personal weblogs. It discusses methods for extracting geographic location information from weblogs and for indexing weblogs to city units. After a brief introduction to the broader research plan, the paper proposes a process for automatically extracting geographic information from different weblogs. From both theoretical and practical perspectives, we explain and justify the rationale for using 3-digit zip codes as the units for comparing urban cultures. We then present the distribution of American bloggers registered with LiveJournal and Diaryland, two popular blog hosting services, to demonstrate the geocoding of the blogosphere and to compare the two hosts in terms of population concentrations and demographic profiles. Finally, we discuss how to further improve the indexing methods.
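To make the idea of indexing weblogs to 3-digit zip code units concrete, the following is a minimal sketch and not the paper's actual extraction process. It assumes each blogger profile exposes a free-text location string containing a 5-digit US zip code; the function names `zip3` and `count_by_zip3` and the sample profile strings are hypothetical.

```python
import re
from collections import Counter

# Matches a 5-digit US zip code, optionally followed by a ZIP+4 suffix.
ZIP_RE = re.compile(r"\b(\d{5})(?:-\d{4})?\b")


def zip3(location_text):
    """Return the 3-digit zip prefix found in a location string, or None."""
    match = ZIP_RE.search(location_text)
    return match.group(1)[:3] if match else None


def count_by_zip3(locations):
    """Count bloggers per 3-digit zip unit, skipping unparseable locations."""
    counts = Counter()
    for text in locations:
        prefix = zip3(text)
        if prefix is not None:
            counts[prefix] += 1
    return counts


# Hypothetical profile location strings, for illustration only.
sample_locations = [
    "Brooklyn, NY 11215",
    "Brooklyn, NY 11217",
    "Cambridge, MA 02139-4307",
    "somewhere on the internet",
]
zip3_counts = count_by_zip3(sample_locations)
```

Grouping by the first three digits collapses neighboring 5-digit zip codes into one coarser unit, so the two Brooklyn profiles above fall into the same bucket. Profiles without a recognizable zip code are simply left out of the counts in this sketch.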