In this thesis I present an approach that uses conceptual spaces to visualize text corpora. The thesis is divided into two parts: the first gives an overview of methods for analyzing text corpora, and the second presents ways of visualizing the results.
Because the amount of electronic data keeps growing, there is an increasing need to analyze this data automatically and organize it into groups that are not known in advance.
Several algorithms that make this possible, such as latent semantic analysis, probabilistic latent semantic analysis, and latent Dirichlet allocation, are presented later in the thesis.
We look for previously unknown topics that arise in a text corpus. The corpus is analyzed with a selected algorithm, and the result is presented in a conceptual space. A conceptual space represents information by geometric structures: the semantics of words are represented by points, and the relations between them are represented by regions. This suggests that word semantics is generated from concepts, which are represented as regions in the conceptual space.
To visualize the conceptual space of a text corpus, I chose a three-dimensional representation based on self-organizing maps and a two-dimensional representation based on a Voronoi diagram. Both representations allow spatial interaction, which can make the conceptual space easier to imagine.