12 research outputs found

    Effect of Visualization of News Articles in Data Driven Games

    The popularity of prediction games such as fantasy sports has been on the rise, and the amount of data available to players for making predictions in such games is growing rapidly. Prediction games, a class of data-driven games, require the user to interpret archival data along with real-time data from a domain to make a prediction about a future event. This work is being done in the context of a prediction game where players select geographic locations as part of the game. News articles can serve as a source of information and have the potential to improve engagement and learning. Making sense of the millions of news artifacts published online is not possible manually. Moreover, keyword-based search is very limited when it comes to exploring data rather than just searching for something particular. The proposed work will develop a visualization and user interface to represent the news articles in an activity-appropriate manner and allow the users to explore news data using approximate search along with keyword-based search. The first component of this system extracts news articles related to the game, then geotags and clusters them based on the geographic references in the articles. Displaying these clusters on a map takes advantage of the spatially referenced news data. This thesis will compare alternative visualizations of the geo-tagged and clustered news articles and their value to players of the game. A user study was conducted to evaluate the visualizations and their effect on the engagement of the players with a data-driven game. The results show that the map visualization is very effective in engaging players with the game when compared to the regular list form of news representation. Moreover, the overall performance of the players who used the map visualization was better than the performance of the users who used the list visualization. 
Future work will explore fine-tuning the data sources that provide the input to the map visualization, as well as variations in the display and accessible features of the map interface that let users control the data visualization according to their imagination and preference.

    Design of a Prediction Game in the Domain of Computer Security

    Prediction games are games where players analyze historical data and make predictions about future events. The predictions are scored, which gives the players an idea of where they stand. This kind of game is appropriate for domains where data arrives frequently and where there is a reasonable quantity of historical data for participants to explore. Knowledge of computer security carries high value personally and professionally, and statistics in this domain are collected by many organizations for various reasons. This thesis explores the design of a prediction game in the field of computer security. The goals of this project include identifying data sets that could be used for a prediction game, designing a prediction activity that will be helpful to players, and developing a prototype version of the prediction game. A heuristic evaluation of the prototype will provide feedback for improvements to the game mechanics, such as the user interface and data visualizations. At the end of the project, there will be a greater understanding of the availability of computer security data, how it can be used for developing prediction games, and the tradeoffs in the design of computer security prediction games.

    Evaluating Layout and Clustering Algorithms for Visualizing Named Entity Graph

    A myriad of layout and clustering algorithms exist to generate visual graphs of named entities. Consequently, it is hard for researchers to select the algorithms that fulfill their needs. This paper intends to assist researchers by presenting a performance evaluation of combinations of a graph layout algorithm followed by a clustering algorithm. The layout algorithms are OpenOrd and Hu’s algorithm, and the clustering algorithms are Chinese Whispers and Girvan-Newman. The evaluation is carried out on bio-named entities that are linked by annotated relations. The results of the experiments highlight the strengths and weaknesses of the four combinations regarding running time, loss of relations (or edges), edge crossing, and cluttered display.

    Revealing Hidden Community Structures and Identifying Bridges in Complex Networks: An Application to Analyzing Contents of Web Pages for Browsing

    The emergence of scale-free and small-world properties in real-world complex networks has stimulated a great deal of activity in the field of network analysis. An example of such a network comes from the field of Content Analysis (CA) and Text Mining, where the goal is to analyze the contents of a set of web pages. The network can be represented with the words appearing in the web pages as nodes and with an edge between two words if they appear in a document together. In this paper we present a CA system that helps users visually analyze these networks representing the textual contents of a set of web pages. Major contributions include a methodology to cluster complex networks based on duplication of nodes and identification of bridges, i.e., words that might be of user interest but have a low frequency in the document corpus. We have tested this system with a number of data sets, and users have found it very useful for the exploration of data. One of the case studies, based on browsing a collection of web pages on Wikipedia (http://en.wikipedia.org/wiki/Main_Page), is presented in detail.
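
The word network described in this abstract (words as nodes, an edge between two words that appear in the same document) can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `cooccurrence_edges`, the whitespace tokenization, and the sample documents are hypothetical and are not taken from the paper, whose clustering and bridge-identification methods are not reproduced here.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_edges(documents):
    """Count, for each unordered word pair, how many documents contain both words.

    Hypothetical helper for illustration; tokenization is a plain
    lowercase whitespace split, which a real system would replace.
    """
    edges = Counter()
    for text in documents:
        # Each distinct word is a node; sort so every pair has one canonical order.
        words = sorted(set(text.lower().split()))
        for a, b in combinations(words, 2):
            edges[(a, b)] += 1
    return edges


# Made-up sample documents, for illustration only.
docs = ["graph clustering layout", "graph layout", "news clustering"]
edges = cooccurrence_edges(docs)
for pair, weight in sorted(edges.items()):
    print(pair, weight)
```

The edge weight here is the number of documents in which both words occur; low-frequency words that connect otherwise separate clusters would be candidates for the "bridges" the abstract mentions, although how the paper detects them is not specified in this listing.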

    Prediction Games: Encouraging Engagement with Data

    Prediction games, data-driven games modeled after fantasy sports, aim to motivate people to explore, analyze, and develop their own understanding of large data sets. They revolve around activities where players examine historical data and information resources to make predictions about future events. As a result, they may help improve the players’ domain knowledge and data interpretation skills. But what matters in the design of such games? And, as we envision prediction games created by instructors in an educational environment, what forms of support aid the authoring of prediction activities yet involve little to no programming? To answer these questions, we first conducted a survey of fantasy sports players, which showed that many seek out information including news and data. They analyze this content to make predictions, and in doing so learn more about the sport. Next, we developed Fantasy Forecaster, a prediction game prototype, to gather system requirements and user feedback. Lessons from the survey and the development of the prototype informed our prediction games framework and its implementation in the climate domain: Fantasy Climate. Fantasy Climate is a prediction game based on weather data where players select a location from a set of choices based on their assessment of upcoming weather. In particular, they are asked to select which location will be warmest and which coolest compared to its historic norms on an upcoming date. The game also featured communication tools, integrated climate-related news, and historical weather data with visualizations to make sense of them. User studies of Fantasy Climate revealed that social interaction, particularly asynchronous discussions, made the game more engaging and helped players gather information for prediction making. Also, the in-game presentation of domain-related news had an effect on engagement and players' performance. 
From our prediction games framework and the implementation of Fantasy Climate, we identified a set of necessary and valuable prediction activity specifications which led to the development of the Activity Creation Wizard (ACW). The ACW is an environment that guides the author through a series of steps to author their prediction activity. Features of the ACW included a help system that provides the author with explanations, tutorials and examples during the authoring process. Also included were a template component that allows the author to reuse the customizations of a previously created prediction activity, and tools to automate repetitive and tedious tasks such as building the prediction schedule. The evaluation of the ACW showed no background knowledge was required to use the ACW to author a prediction activity. The help system was in general adequate in assisting the participants in their information needs, templates were found useful by many, and automation reduced the time taken for repetitive tasks. Some authors did not want to use templates or automation in order to have more control over the design of their activity. However, the help system, templates, and automation tools of the ACW were not sufficient in helping the participants understand the consequences of their customization on the prediction activity. Reasoning about the effects of their choices on gameplay was noted as the primary challenge during the authoring task by several participants. Additionally, the evaluation identified alternative ways of authoring the prediction activity that challenged our current design of the ACW, including the potential value of co-dependent customizations and collaborative authoring. Finally, the ACW evaluation also involved a task where participants created a prediction game in the domain of their choice. Interviews with participants on their created prediction games revealed two major findings. 
One finding was that educational, social, and socio-cultural factors play an important role in what makes prediction games engaging. The other finding was that authoring led participants to recognize the educational benefits of prediction games, which aligns well with the primary motives of this research work.

    Unsupervised discovery of relations for analysis of textual data in digital forensics

    This dissertation addresses the problem of analysing digital data in digital forensics. It will be shown that text mining methods can be adapted and applied to digital forensics to aid analysts to more quickly, efficiently and accurately analyse data to reveal truly useful information. Investigators who wish to utilise digital evidence must examine and organise the data to piece together events and facts of a crime. The difficulty with finding relevant information quickly using the current tools and methods is that these tools rely very heavily on background knowledge for query terms and do not fully utilise the content of the data. A novel framework in which to perform evidence discovery is proposed in order to reduce the quantity of data to be analysed, aid the analysts' exploration of the data and enhance the intelligibility of the presentation of the data. The framework combines information extraction techniques with visual exploration techniques to provide a novel approach to performing evidence discovery, in the form of an evidence discovery system. By utilising unrestricted, unsupervised information extraction techniques, the investigator does not require input queries or keywords for searching, thus enabling the investigator to analyse portions of the data that may not have been identified by keyword searches. The evidence discovery system produces text graphs of the most important concepts and associations extracted from the full text to establish ties between the concepts and provide an overview and general representation of the text. Through an interactive visual interface the investigator can explore the data to identify suspects, events and the relations between suspects. Two models are proposed for performing the relation extraction process of the evidence discovery framework. The first model takes a statistical approach to discovering relations based on co-occurrences of complex concepts. 
The second model utilises a linguistic approach using named entity extraction and information extraction patterns. A preliminary study was performed to assess the usefulness of a text mining approach to digital forensics compared with the traditional information retrieval approach. It was concluded that the novel approach to text analysis for evidence discovery presented in this dissertation is a viable and promising approach. The preliminary experiment showed that the results obtained from the evidence discovery system, using either of the relation extraction models, are sensible and useful. The approach advocated in this dissertation can therefore be successfully applied to the analysis of textual data for digital forensics. Dissertation (MSc), University of Pretoria, 2010. Computer Science.

    On Two Web IR Boosting Tools: Clustering and Ranking

    This thesis investigates several research problems which arise in modern Web Information Retrieval (WebIR). The Holy Grail of modern WebIR is to find a way to organize and rank results so that the most "relevant" come first. The first breakthrough technique was the exploitation of the link structure of the Web graph in order to rank the result pages, using the well-known HITS and PageRank algorithms. These link-analysis approaches have been improved and extended, yet they seem to be insufficient in providing a satisfying search experience. In a number of situations a flat list of search results is not enough, and users might desire to have search results grouped on-the-fly in folders of similar topics. In addition, the folders should be annotated with meaningful labels for rapid identification of the desired group of results. In other situations, users may have different search goals even when they express them with the same query. In this case the search results should be personalized according to the users' on-line activities. In order to address this need, we will discuss the algorithmic ideas behind SnakeT, a hierarchical clustering meta-search engine which personalizes searches according to the clusters selected by users on-the-fly. There are also situations where users might desire to access fresh information. In these cases, traditional link analysis may not be suitable. In fact, it is possible that there has not been enough time for many links to point to a recently produced piece of information. In order to address this need, we will discuss the algorithmic and numerical ideas behind a new ranking algorithm suitable for ranking fresh types of information, such as news articles or blogs. When link analysis suffices to produce good quality search results, the huge amount of Web information calls for fast ranking methodologies. 
We will discuss numerical methodologies for accelerating the eigenvector-like computation commonly used by link analysis. An important result of this thesis is that we show how to address the above predominant issues of Web Information Retrieval by using clustering and ranking methodologies. We will demonstrate that clustering and ranking have a mutual reinforcement property which has not yet been studied intensively. This property can be exploited to boost the precision of both methodologies.
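
For readers unfamiliar with the "eigenvector-like computation" this abstract refers to, the following is a minimal sketch of the standard PageRank power iteration. It is the textbook baseline, not the accelerated methods developed in the thesis. The function name `pagerank`, the damping factor of 0.85, the fixed iteration count, and the three-page sample graph are assumptions chosen for illustration, and every link target is assumed to appear as a key of `links`.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Plain power iteration for PageRank on a small link graph.

    `links` maps each page to the list of pages it links to.
    Illustrative baseline only; no acceleration technique is applied.
    """
    nodes = sorted(links)
    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        # Every page receives the same teleportation share.
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        for node in nodes:
            # A page without outgoing links spreads its rank over all pages.
            targets = links[node] or nodes
            share = damping * rank[node] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical three-page graph: "a" and "b" both link to "c", and "c" links back to "a".
ranks = pagerank({"a": ["c"], "b": ["c"], "c": ["a"]})
print(ranks)
```

Each pass redistributes rank along the links, so the vector converges toward the dominant eigenvector of the damped link matrix. The number of passes needed on a Web-scale graph is what makes faster numerical methods attractive.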