13 research outputs found

    Multi-dimensional mining of unstructured data with limited supervision

    As one of the most important data forms, unstructured text data plays a crucial role in data-driven decision making in domains ranging from social networking and information retrieval to healthcare and scientific research. In many emerging applications, people's information needs from text data are becoming multi-dimensional: they demand useful insights on multiple aspects of a given text corpus. However, turning massive text data into multi-dimensional knowledge remains a challenge that cannot be readily addressed by existing data mining techniques. In this thesis, we propose algorithms that turn unstructured text data into multi-dimensional knowledge with limited supervision. We investigate two core questions: (1) How can task-relevant data be identified with declarative queries in multiple dimensions? (2) How can knowledge be distilled from data in a multi-dimensional space? To address these questions, we propose an integrated cube construction and exploitation framework. First, we develop a cube construction module that organizes unstructured data into a cube structure by discovering latent multi-dimensional and multi-granular structure in the unstructured text corpus and allocating documents into that structure. Second, we develop a cube exploitation module that models multiple dimensions in the cube space, thereby distilling multi-dimensional knowledge from data to provide insights along multiple dimensions. Together, these two modules constitute an integrated pipeline: leveraging the cube structure, users can perform multi-dimensional, multi-granular data selection with declarative queries; and with cube exploitation algorithms, users can make accurate cross-dimension predictions or extract multi-dimensional patterns for decision making. The proposed framework has two distinctive advantages when turning text data into multi-dimensional knowledge: flexibility and label-efficiency. First, it enables acquiring multi-dimensional knowledge flexibly, as the cube structure allows users to easily identify task-relevant data along multiple dimensions at varied granularities and further distill multi-dimensional knowledge. Second, the algorithms for cube construction and exploitation require little supervision; this makes the framework appealing for many applications where labeled data are expensive to obtain.
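As a rough illustration of multi-dimensional, multi-granular selection with declarative queries, the sketch below assumes documents have already been allocated to cube cells. The `Doc` class, the `select` helper, the dimension names, and the label paths are all hypothetical and are not taken from the thesis; the thesis's construction algorithms, which discover the structure and allocate documents with little supervision, are not shown.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    """A document plus its (assumed, pre-assigned) label path in each dimension."""
    text: str
    labels: dict = field(default_factory=dict)  # dimension -> tuple path, coarse to fine


def select(docs, **query):
    """Return docs whose label path starts with the queried path in every queried dimension.

    A shorter path selects a coarser granularity; omitting a dimension leaves it unconstrained.
    """
    def matches(doc):
        return all(
            doc.labels.get(dim, ())[: len(path)] == tuple(path)
            for dim, path in query.items()
        )
    return [d for d in docs if matches(d)]


# Hypothetical toy corpus with two dimensions: topic and location.
corpus = [
    Doc("Open final goes to five sets", {"topic": ("sports", "tennis"), "location": ("usa", "new york")}),
    Doc("Bulls win at home", {"topic": ("sports", "basketball"), "location": ("usa", "illinois")}),
    Doc("State budget vote delayed", {"topic": ("politics",), "location": ("usa", "illinois")}),
    Doc("Derby ends in a draw", {"topic": ("sports", "soccer"), "location": ("uk", "manchester")}),
]

us_sports = select(corpus, topic=("sports",), location=("usa",))
illinois_all = select(corpus, location=("usa", "illinois"))
```

Here `us_sports` constrains both dimensions at a coarse level, while `illinois_all` constrains only the location dimension at a finer level.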

    Topology Reconstruction of Dynamical Networks via Constrained Lyapunov Equations

    The network structure (or topology) of a dynamical network is often unavailable or uncertain. Hence, we consider the problem of network reconstruction. Network reconstruction aims at inferring the topology of a dynamical network using measurements obtained from the network. In this technical note, we define the notion of solvability of the network reconstruction problem. Subsequently, we provide necessary and sufficient conditions under which the network reconstruction problem is solvable. Finally, using constrained Lyapunov equations, we establish novel network reconstruction algorithms, applicable to general dynamical networks. We also provide specialized algorithms for specific network dynamics, such as the well-known consensus and adjacency dynamics. Comment: 8 pages
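To give a feel for how a Lyapunov equation can be turned around to recover a topology, here is a toy sketch. It is not the algorithm from the technical note. It assumes a noise-driven linear system dx = A x dt + dw with a symmetric, stable state matrix A and known noise covariance Q, so that the stationary covariance P satisfies A P + P A^T + Q = 0. With the symmetry constraint A = A^T, the same equation is linear in A and has a unique solution when P is positive definite. The leak term added to the Laplacian, the choice Q = I, and both function names are assumptions made for illustration.

```python
import numpy as np


def stationary_covariance(A, Q):
    """Solve A P + P A^T + Q = 0 for P (forward problem) by vectorisation."""
    n = A.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(A P) = (A kron I) vec(P), vec(P A^T) = (I kron A) vec(P).
    K = np.kron(A, I) + np.kron(I, A)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)


def reconstruct_symmetric(P, Q):
    """Solve A P + P A + Q = 0 for A (inverse problem, assuming A is symmetric)."""
    n = P.shape[0]
    I = np.eye(n)
    # Row-major vec: vec(A P) = (I kron P^T) vec(A), vec(P A) = (P kron I) vec(A).
    K = np.kron(I, P.T) + np.kron(P, I)
    return np.linalg.solve(K, -Q.reshape(-1)).reshape(n, n)


# Toy network: an undirected path graph 0 - 1 - 2.
L = np.array([[1.0, -1.0, 0.0],
              [-1.0, 2.0, -1.0],
              [0.0, -1.0, 1.0]])
A_true = -(L + np.eye(3))   # consensus dynamics plus an assumed leak term, so A is stable
Q = np.eye(3)               # assumed noise covariance

P = stationary_covariance(A_true, Q)   # stands in for a covariance estimated from measurements
A_hat = reconstruct_symmetric(P, Q)

# Positive off-diagonal entries of A_hat correspond to edges of the graph.
edges = {(i, j) for i in range(3) for j in range(i + 1, 3) if A_hat[i, j] > 0.5}
```

In practice P would be estimated from data rather than computed from the true A, and the estimate's error would carry over into `A_hat`; the thresholding step is a crude stand-in for whatever edge-detection rule a real method would use.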

    Mobility and interaction patterns in social networks

    The predictability of human behavior has been widely studied in the literature, to reveal how individuals move, how they can be mobilized and, more philosophically, to what extent our decisions are random or freely chosen. Because humans relate to one another, we also tend to live in social groups at different hierarchical levels, so it is interesting to analyze how individual features and choices affect the global structure of a society. In this work, we explore the limits of human predictability in terms of shopping behavior, observing that, even when we are constrained to a limited set of places where we can make a purchase, the location of the next purchase cannot be predicted accurately from past observations alone. The next question is how individual decisions affect emergent phenomena such as the economy or information diffusion across a country. We analyze the content, temporal, and mobility patterns extracted from users’ social media posts to build a profile of geographical regions that allows the unemployment rate to be predicted. Finally, we use a mobile phone call dataset to test whether dynamics at the urban level, that is, how people create and destroy links within a city, affect the inter-urban diffusion of diseases, viruses, or rumors. Our results suggest that the inter-regional structure is robust and does not vary significantly over time, so diffusion processes can be well modeled in terms of static properties of the inter-urban network. Official Doctoral Program in Mathematical Engineering. President: Javier Borge Holthoefer. Secretary: Rubén Cuevas Rumín. Member: Josep Perelló Palo