    Networked Data Analytics: Network Comparison And Applied Graph Signal Processing

    Networked data structures have become big, ubiquitous, and pervasive. As our day-to-day activities become more incorporated with and influenced by the digital world, we rely more on our intuition to provide us with a high-level idea and subconscious understanding of the data we encounter. This thesis aims to translate the qualitative intuitions we have about networked data into quantitative and formal tools by designing rigorous yet reasonable algorithms. In a nutshell, this thesis constructs models to compare and cluster networked data, to simplify a complicated networked structure, and to formalize the notion of smoothness and variation for domain-specific signals on a network. This thesis consists of two interrelated thrusts, which explore both the scenario where networks have intrinsic value and are themselves the object of study, and the scenario where the interest lies in signals defined on top of the networks, so that we leverage the information in the network to analyze the signals. Our results suggest that the intuition we have in analyzing huge data can be transformed into rigorous algorithms, and that the intuition often results in superior performance, new observations, better complexity, and/or a bridge between two commonly implemented methods. Although they differ in the principles they investigate, both thrusts are built on what we regard as a contemporary shift in data analytics: from building an algorithm and then understanding it to having an intuition and then building an algorithm around it. We show that, to formalize the intuitive idea of measuring the difference between a pair of networks of arbitrary sizes, we can design two algorithms based on the intuition of finding mappings between the node sets or of mapping one network into a subset of another network. Such methods also lead to a clustering algorithm to categorize networked data structures.
In addition, we can define the notion of frequencies of a given network by ordering features in the network according to how important they are to the overall information conveyed by the network. These proposed algorithms succeed in comparing collaboration histories of researchers, clustering research communities via their publication patterns, categorizing moving objects from uncertain measurements, and separating networks constructed from different processes. In the context of data analytics on top of networks, we design domain-specific tools by leveraging recent advances in graph signal processing, which formalizes the intuitive notion of smoothness and variation of signals defined on top of networked structures and generalizes conventional Fourier analysis to the graph domain. Specifically, we show how these tools can be used to better classify cancer subtypes by considering genetic profiles as signals on top of gene-to-gene interaction networks; to gain new insights into the differences between human beings in learning new tasks and switching attention by considering brain activities as signals on top of brain connectivity networks; and to demonstrate that common methods in rating prediction are special graph filters, an observation on which we base novel recommendation system algorithms.
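
To make the notion of smoothness and variation of a signal on a network concrete, the sketch below computes the Laplacian quadratic form, a measure commonly used in graph signal processing. It is an illustration only and is not taken from the thesis: the function name `total_variation`, the 4-node path graph, and the two example signals are assumptions made for this example.

```python
from typing import Dict, Tuple

Edge = Tuple[int, int]


def total_variation(weights: Dict[Edge, float], signal: Dict[int, float]) -> float:
    """Laplacian quadratic form: sum over edges of w_ij * (x_i - x_j)^2.

    Small values mean the signal changes little across connected nodes
    (it is "smooth" on the graph); large values mean it varies a lot.
    """
    return sum(w * (signal[i] - signal[j]) ** 2 for (i, j), w in weights.items())


# Hypothetical undirected path graph 0 - 1 - 2 - 3 with unit edge weights.
path = {(0, 1): 1.0, (1, 2): 1.0, (2, 3): 1.0}

# A constant signal does not vary across any edge.
smooth = {0: 1.0, 1: 1.0, 2: 1.0, 3: 1.0}

# An alternating signal changes sign across every edge.
rough = {0: 1.0, 1: -1.0, 2: 1.0, 3: -1.0}

smooth_variation = total_variation(path, smooth)
rough_variation = total_variation(path, rough)
```

Here the constant signal has zero variation, while the alternating signal accumulates a contribution of 4 on each of the three edges. Ranking signals by this quantity is one way to speak of "low" and "high" frequencies on a graph.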

    Network Geometry

    Networks are finite metric spaces, with distances defined by the shortest paths between nodes. However, this is not the only form of network geometry: two others are the geometry of latent spaces underlying many networks and the effective geometry induced by dynamical processes in networks. These three approaches to network geometry are intimately related, and all three of them have been found to be exceptionally efficient in discovering fractality, scale invariance, self-similarity and other forms of fundamental symmetries in networks. Network geometry is also of great use in a variety of practical applications, from understanding how the brain works to routing in the Internet. We review the most important theoretical and practical developments dealing with these approaches to network geometry and offer perspectives on future research directions and challenges in this frontier in the study of complexity
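
The first statement above, that shortest-path lengths turn a network into a finite metric space, can be checked on a small example. The sketch below assumes an unweighted, undirected, connected graph and uses breadth-first search. The graph `adj` and the helper names are hypothetical and do not come from the review.

```python
from collections import deque
from itertools import product
from typing import Dict, List


def shortest_path_lengths(adj: Dict[str, List[str]], source: str) -> Dict[str, int]:
    """Hop distances from `source` to every reachable node, via breadth-first search."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adj[node]:
            if neighbor not in dist:
                dist[neighbor] = dist[node] + 1
                queue.append(neighbor)
    return dist


# Hypothetical undirected graph, stored as symmetric adjacency lists.
adj = {
    "a": ["b"],
    "b": ["a", "c", "d"],
    "c": ["b", "d"],
    "d": ["b", "c", "e"],
    "e": ["d"],
}

# All-pairs distance table d[u][v].
d = {u: shortest_path_lengths(adj, u) for u in adj}

# Check the metric axioms on this example.
is_symmetric = all(d[u][v] == d[v][u] for u, v in product(adj, repeat=2))
satisfies_triangle = all(
    d[u][w] <= d[u][v] + d[v][w] for u, v, w in product(adj, repeat=3)
)
```

Symmetry relies on the graph being undirected, and every pair having a finite distance relies on it being connected. With directed or disconnected graphs the table would no longer define a metric in the usual sense.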

    BeSocratic: An Intelligent Tutoring System for the Recognition, Evaluation, and Analysis of Free-form Student Input

    This dissertation describes a novel intelligent tutoring system, BeSocratic, which aims to help fill the gap between simple multiple-choice systems and free-response systems. BeSocratic targets questions that are free-form in nature yet defined well enough to allow automatic evaluation and analysis. The system includes a set of modules that provide instructors with tools to assess student performance. Beyond text boxes and multiple-choice questions, BeSocratic contains several modules that recognize, evaluate, provide feedback on, and analyze student-drawn structures, including Euclidean graphs, chemistry molecules, computer science graphs, and simple drawings. Our system uses a visual, rule-based authoring system that enables the creation of activities for use within science, technology, engineering, and mathematics classrooms. BeSocratic records each action that students make within the system. Using a set of post-analysis tools, teachers can examine both individual and group performance. We accomplish this using hidden Markov model-based clustering techniques and visualizations. These visualizations can help teachers quickly identify common strategies and errors for large groups of students. Furthermore, analysis results can be used directly to improve activities through advanced detection of student errors and refined feedback. BeSocratic activities have been created and tested at several universities. We report specific results from several activities and discuss how BeSocratic's analysis tools are being used with data from other systems. We specifically detail two chemistry activities and one computer science activity: (1) an activity focused on improving mechanism use, (2) an activity that assesses student understanding of Gibbs energy, and (3) an activity that teaches students the fundamentals of splay trees.
In addition to analyzing data collected from students within BeSocratic, we share our visualizations and results from analyzing data gathered with another educational system, PhET

    Network communities and the foreign exchange market

    Many systems studied in the biological, physical, and social sciences are composed of multiple interacting components. Often the number of components and interactions is so large that attaining an understanding of the system necessitates some form of simplification. A common representation that captures the key connection patterns is a network in which the nodes correspond to system components and the edges represent interactions. In this thesis we use network techniques and more traditional clustering methods to coarse-grain systems composed of many interacting components and to identify the most important interactions. This thesis focuses on two main themes: the analysis of financial systems and the study of network communities, an important mesoscopic feature of many networks. In the first part of the thesis, we discuss some of the issues associated with the analysis of financial data and investigate the potential for risk-free profit in the foreign exchange market. We then use principal component analysis (PCA) to identify common features in the correlation structure of different financial markets. In the second part of the thesis, we focus on network communities. We investigate the evolving structure of foreign exchange (FX) market correlations by representing the correlations as time-dependent networks and investigating the evolution of network communities. We employ a node-centric approach that allows us to track the effects of the community evolution on the functional roles of individual nodes and that uncovers major trading changes that occurred in the market. Finally, we consider the community structure of networks from a wide variety of different disciplines. We introduce a framework for comparing network communities and use this technique to identify networks with similar mesoscopic structures. Based on this similarity, we create taxonomies of a large set of networks from different fields and of individual families of networks from the same field.

    INVESTIGATIONS ON COGNITIVE COMPUTATION AND COMPUTATIONAL COGNITION

    This Thesis describes our work at the boundary between Computer Science and Cognitive (Neuro)Science. In particular, (1) we have worked on methodological improvements to clustering-based meta-analysis of neuroimaging data, a technique that makes it possible to collectively assess, in a quantitative way, activation peaks from several functional imaging studies, in order to extract the most robust results in the cognitive domain of interest. Hierarchical clustering is often used in this context, yet it is prone to the problem of non-uniqueness of the solution: a different permutation of the same input data might produce a different clustering result. In this Thesis, we propose a new version of hierarchical clustering that solves this problem. We also show the results of a meta-analysis, carried out using this algorithm, aimed at identifying specific cerebral circuits involved in single word reading. Moreover, (2) we describe preliminary work on a new connectionist model of single word reading, named the two-component model because it postulates a cascaded information flow from a more cognitive component, which computes a distributed internal representation for the input word, to an articulatory component, which translates this code into the corresponding sequence of phonemes. Output production starts when the internal code, which evolves in time, reaches a sufficient degree of clarity; this mechanism has been advanced as a possible explanation for behavioral effects consistently reported in the literature on reading, with a specific focus on the so-called serial effects. The strengths and weaknesses of this model are discussed here. Finally, (3) we have turned to consider how features that are typical of human cognition can inform the design of improved artificial agents; here, we have focused on modelling concepts inspired by emotion theory.
A model of emotional interaction between artificial agents, based on probabilistic finite state automata, is presented: in this model, agents have personalities and attitudes that can change through the course of interaction (e.g. by reinforcement learning) to achieve autonomous adaptation to the interaction partner. Markov chain properties are then applied to derive reliable predictions of the outcome of an interaction. Taken together, these works show how the interplay between Cognitive Science and Computer Science can be fruitful, both for advancing our knowledge of the human brain and for designing more and more intelligent artificial systems
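
The abstract does not say which Markov chain properties the authors apply. As one hedged illustration, the sketch below estimates the long-run distribution of a chain by repeatedly applying its transition matrix. The two states ("friendly", "hostile"), the transition probabilities, and the function names are hypothetical and are not taken from the thesis.

```python
from typing import List

Matrix = List[List[float]]


def step(dist: List[float], transition: Matrix) -> List[float]:
    """One step of the chain: new_dist[j] = sum_i dist[i] * transition[i][j]."""
    n = len(transition)
    return [sum(dist[i] * transition[i][j] for i in range(n)) for j in range(n)]


def long_run_distribution(dist: List[float], transition: Matrix, iters: int = 200) -> List[float]:
    """Apply the transition matrix repeatedly (power iteration)."""
    for _ in range(iters):
        dist = step(dist, transition)
    return dist


# Hypothetical interaction states: index 0 = "friendly", index 1 = "hostile".
# Each row gives the probabilities of the next state and sums to 1.
transition = [
    [0.9, 0.1],  # from friendly
    [0.5, 0.5],  # from hostile
]

# Start from a certainly hostile interaction and iterate.
prediction = long_run_distribution([0.0, 1.0], transition)
```

For this particular matrix the iteration converges to the stationary distribution (5/6, 1/6) regardless of the starting state, which is what would make such a prediction reliable. Convergence of this kind is only guaranteed when the chain is irreducible and aperiodic.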

    Explorative Graph Visualization

    Network structures (graphs) have become a natural part of everyday life, and their analysis helps to gain an understanding of their inherent structure and of the real-world aspects they express. The exploration of graphs is largely supported and driven by visual means. The aim of this thesis is to give a comprehensive view of the problems associated with these visual means and to detail concrete solution approaches for them. Concrete visualization techniques are introduced to underline the value of this comprehensive discussion for supporting explorative graph visualization.

    Sensing and Visualizing Social Context from Spatial Proximity

    The concept of pervasive computing, introduced by Mark Weiser under the name ubiquitous computing in the early 90s, spurred research into various kinds of context-aware systems and applications. A wide range of contextual parameters, including location, time, temperature, and devices and people in proximity, were part of the initial ideas about context-aware computing. While locational context is already a well-understood concept, social context, based on the people around us, proves harder to grasp and to operationalize. This work continues the line of research into social context that is based on the proximity and meeting patterns of people in physical space. It takes this research out of the lab and out of well-controlled situations into our urban environments, which are full of ambiguity and opportunities. The key to this research is the tool that caused dramatic change in individual and collective behavior during the last 20 years and that is a manifestation of many of the ideas of the pervasive computing paradigm: the mobile phone. In this work, the mobile phone is regarded as a proxy for people. Through it, the social environment becomes accessible to digital measurement and processing. To understand the large amount of data that now becomes available to automatic measurement, we turn to the discipline of social network analysis. It provides powerful methods that are able to condense data and extract relevant meaning. Visualization helps to understand and interpret the results. This thesis contains a number of experiments that demonstrate how the automatic measurement of social proximity data through Bluetooth can be used to measure variables of personal behavior, group behavior, and the behavior of groups in relation to places. The principal contributions are:
    * A methodology to visualize personal social context by using an ego proximity network. Specific episodes can be localized and compared.
    * A method to compare different days in terms of social context, e.g. to support automatic diary applications.
    * A method to compose social geographic maps. Locations of similar social context are detected and combined.
    * Functions to measure short-term changes in social activity, based on the distinction between strange and familiar devices.
    * The characterization of Bluetooth inquiries for social proximity sensing.
    * A dataset of Bluetooth sightings from an ego perspective in seven different settings. Additionally, some settings feature multiple stationary scanners and Cell-ID measurements.
    * Software and hardware to capture, collect, store, and analyze Bluetooth proximity data.

    Graph-level operations: A high-level interface for graph visualization technique specification

    More and more, the world is being described as graphs (connections between people, places, and ideas), since they provide a richer model than understanding each item in isolation. To help analysts understand these graphs, researchers have developed and studied a large number of graph visualization techniques. This variety of techniques offers solutions to a breadth of graph analysis tasks, but it introduces a new issue: complexity. The variety introduces both the complexity of comparing techniques in an objective way and the engineering complexity of implementing so many techniques. In this thesis, I present graph-level operations models (or GLO models) as an elegant solution to these challenges. A GLO model consists of a model of visual elements and a set of functions (GLOs) that manipulate those elements. I introduce GLOv1 and GLOv2, GLO models derived from six hand-picked graph visualization techniques and from twenty-nine techniques drawn from a review of 430 graph visualization publications, respectively. I show how to use GLOs to define graph visualization techniques, including a model's original seed techniques as well as novel techniques. I demonstrate the analysis potential of the GLO model by clustering the twenty-nine seed techniques using two different GLO-based schemes. Finally, I demonstrate the practical engineering potential of the model through an open-source JavaScript implementation (GLO.js) and two applications built atop the implementation for exploring a graph and discovering novel techniques using GLOs (GLO-STIX and GLO-CLI). Ph.D.