
    Triple Viz: A tool to explore document content from a graphical representation of subject-verb-object triples

    Most of the data available is unstructured. Text mining is the process of automatically extracting information from text. This thesis combines text mining with visualization to develop TripleViz, a lightweight, web-based tool that processes and analyzes documents, extracts subject-verb-object (SVO) triples from them, and visualizes the triples as graphs. The extracted SVO triples are visualized using the open-source visualization tools Turtled and Gephi. TripleViz extracts noun phrases and can display them in either full or head format; the head format reduces overcrowding on the screen. For the same reason, TripleViz provides an option to select only those triples that contain words of interest, which the user supplies as a word list. Within TripleViz, the user can also view color-coded output text that highlights words from the word list. The thesis also presents an experiment in classifying newspaper articles and blogs as either "specific event" or "generic", which shows a moderate improvement over a strong baseline.
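The word-list selection and head-format options described in the abstract can be illustrated with a small sketch. This is not TripleViz's code: the `Triple` type, the function names, and the rule that the head of a noun phrase is its last token are all assumptions made for illustration, and the triples are taken as already extracted.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Triple(NamedTuple):
    """A subject-verb-object triple (hypothetical type, not from TripleViz)."""
    subject: str
    verb: str
    obj: str


def head_of(phrase: str) -> str:
    # Assumption: approximate the head of an English noun phrase by its
    # last whitespace-separated token. A real system would use a parser.
    tokens = phrase.split()
    return tokens[-1] if tokens else phrase


def to_head_format(triple: Triple) -> Triple:
    # Shorten subject and object to their (approximate) heads.
    return Triple(head_of(triple.subject), triple.verb, head_of(triple.obj))


def filter_by_words(triples: Iterable[Triple], words: Iterable[str]) -> List[Triple]:
    # Keep only triples in which at least one token matches the word list,
    # compared case-insensitively.
    wanted = {w.lower() for w in words}
    kept = []
    for t in triples:
        tokens = {tok.lower() for part in t for tok in part.split()}
        if tokens & wanted:
            kept.append(t)
    return kept


def to_edges(triples: Iterable[Triple]) -> List[Tuple[str, str, str]]:
    # Represent each triple as a (source, target, label) edge, a shape that
    # graph tools can typically import as an edge list.
    return [(t.subject, t.obj, t.verb) for t in triples]
```

For example, filtering two hand-written triples with the word list `["budget"]` and then applying `to_head_format` would leave a single edge from `council` to `budget` labeled `approved`, given the triple `("the city council", "approved", "a new budget")`.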