954 research outputs found

    Representing Social Networks as Dynamic Heterogeneous Graphs

    Graph representations of real-world social networks have historically missed two important elements: the multiplexity of connections and the representation of time. To this end, this paper presents a new dynamic heterogeneous graph representation for social networks that includes time in every component of the graph, i.e., nodes and edges, each of different types, capturing heterogeneity. We illustrate the power of this representation by presenting four time-dependent queries and deep learning problems that cannot easily be handled in the conventional homogeneous graph representations commonly used. As a proof of concept, we present a detailed representation of a new social media platform (Steemit), which we use to illustrate both the dynamic querying capability and prediction tasks using graph neural networks (GNNs). The results illustrate the power of the dynamic heterogeneous graph representation for modeling social networks. Given that this is a relatively understudied area, we also highlight opportunities for future work in query optimization and in new dynamic prediction tasks on heterogeneous graph structures.
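The core idea of the representation can be sketched in a few lines: every node and every edge carries both a type (heterogeneity) and a timestamp (dynamics), which is what makes time-dependent queries expressible. The schema below is a minimal illustrative sketch, not the paper's actual Steemit schema; all type names are assumptions.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Node:
    id: str
    type: str        # e.g. "user", "post" (hypothetical type names)
    created: float   # creation timestamp

@dataclass(frozen=True)
class Edge:
    src: str
    dst: str
    type: str        # e.g. "authored", "upvoted"
    time: float      # timestamp of the interaction

@dataclass
class DynHetGraph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)

    def add_node(self, n: Node):
        self.nodes[n.id] = n

    def add_edge(self, e: Edge):
        self.edges.append(e)

    def edges_of_type_before(self, etype: str, t: float):
        """Time-dependent query: edges of a given type created before time t."""
        return [e for e in self.edges if e.type == etype and e.time < t]

g = DynHetGraph()
g.add_node(Node("alice", "user", 0.0))
g.add_node(Node("p1", "post", 1.0))
g.add_edge(Edge("alice", "p1", "authored", 1.0))
g.add_edge(Edge("bob", "p1", "upvoted", 5.0))
# The upvote happens at t=5, so it is invisible to a query at t=3.
print(len(g.edges_of_type_before("upvoted", 3.0)))  # → 0
```

A homogeneous, static graph would collapse "authored" and "upvoted" into one edge set with no timestamps, which is exactly the query this representation makes trivial and the conventional one cannot express.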

    Manufacturing as a Data-Driven Practice: Methodologies, Technologies, and Tools

    In recent years, the introduction and exploitation of innovative information technologies in industrial contexts have led to the continuous growth of digital shop floor environments. The new Industry 4.0 model allows smart factories to become very advanced IT industries, generating an ever-increasing amount of valuable data. As a consequence, the necessity of powerful and reliable software architectures is becoming prominent, along with data-driven methodologies to extract useful and hidden knowledge supporting the decision-making process. This paper discusses the latest software technologies needed to collect, manage, and elaborate all data generated through innovative IoT architectures deployed over the production line, with the aim of extracting useful knowledge for the orchestration of high-level control services that can generate added business value. This survey covers the entire data life-cycle in manufacturing environments, discussing key functional and methodological aspects along with a rich and properly classified set of technologies and tools useful for adding intelligence to data-driven services. Therefore, it serves both as a first guided step towards the rich landscape of literature for readers approaching this field, and as a global yet detailed overview of the current state of the art in the Industry 4.0 domain for experts. As a case study, we discuss in detail the deployment of the proposed solutions in two research project demonstrators, showing their ability to mitigate manufacturing line interruptions and reduce the corresponding impacts and costs.
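One step of the data life-cycle the survey describes, turning a raw IoT sensor stream into an actionable signal for mitigating line interruptions, can be sketched as follows. This is a hedged illustration only: the class name, window size, and threshold are invented for the example and do not come from the paper.

```python
from collections import deque

class LineMonitor:
    """Toy monitor: smooth incoming sensor readings with a rolling
    window and flag when the smoothed signal breaches a limit."""

    def __init__(self, window: int = 3, limit: float = 80.0):
        self.readings = deque(maxlen=window)  # keeps only the last `window` values
        self.limit = limit

    def ingest(self, value: float) -> bool:
        """Store a reading; return True if the rolling average exceeds the limit."""
        self.readings.append(value)
        avg = sum(self.readings) / len(self.readings)
        return avg > self.limit

mon = LineMonitor()
alarms = [mon.ingest(v) for v in [70.0, 75.0, 78.0, 95.0, 99.0]]
print(alarms)  # → [False, False, False, True, True]
```

The rolling average suppresses one-off spikes, so an alarm only fires when the smoothed trend crosses the limit; real deployments would sit many such rules behind the data-collection and orchestration layers the survey classifies.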

    Flood hazard hydrology: interdisciplinary geospatial preparedness and policy

    Thesis (Ph.D.), University of Alaska Fairbanks, 2017. Floods rank as the deadliest and most frequently occurring natural hazard worldwide, and in 2013 floods in the United States ranked second only to wind storms in accounting for loss of life and damage to property. While flood disasters remain difficult to predict accurately, more precise forecasts and a better understanding of the frequency, magnitude, and timing of floods can help reduce the loss of life and the costs associated with the impact of flood events. There is a common perception that 1) local-to-national-level decision makers do not have the accurate, reliable, and actionable data and knowledge they need in order to make informed flood-related decisions, and 2) because of science-policy disconnects, critical flood and scientific analyses and insights are failing to influence policymakers in national water resource and flood-related decisions that have significant local impact. This dissertation explores these perceived information gaps and disconnects, and seeks to answer the question of whether flood data can be accurately generated, transformed into useful, actionable knowledge for local flood event decision makers, and then effectively communicated to influence policy. Using an interdisciplinary mixed-methods research design, this thesis develops a methodological framework and interpretative lens for each of three distinct stages of flood-related information interaction: 1) data generation, using machine learning to estimate streamflow flood data for forecasting and response; 2) knowledge development and sharing, creating a geoanalytic visualization decision support system for flood events; and 3) knowledge actualization, using heuristic toolsets for translating scientific knowledge into policy action.
Each stage is elaborated in three distinct research papers, incorporated as chapters in this dissertation, that focus on developing practical data and methodologies useful to scientists, local flood event decision makers, and policymakers. The data and analytical results of this research indicate that, if certain conditions are met, it is possible to provide local decision makers and policymakers with the useful, actionable knowledge they need to make timely and informed decisions.
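The first stage, estimating streamflow with machine learning, can be illustrated at its simplest by a least-squares fit. This sketch is not the dissertation's model; the gauge values are invented, and an ordinary linear regression stands in for whatever learning method the thesis actually uses.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit: return (slope, intercept)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return slope, my - slope * mx

# Hypothetical gauge data (m^3/s): upstream reading vs. observed downstream flow.
upstream = [10.0, 20.0, 30.0, 40.0]
downstream = [25.0, 45.0, 65.0, 85.0]

m, b = fit_line(upstream, downstream)
estimate = m * 25.0 + b  # forecast downstream flow for an upstream reading of 25
print(round(estimate, 1))  # → 55.0
```

The point of the stage is precisely this mapping: observed inputs (here one upstream gauge, in practice many hydrologic variables) to an estimated streamflow that forecasters and responders can act on.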

    Learning with Attributed Networks: Algorithms and Applications

    Abstract: Attributes, which delineate the properties of data, and connections, which describe the dependencies of data, are two essential components for characterizing most real-world phenomena. The synergy between these two principal elements renders a unique data representation: the attributed network. In many cases, people are inundated with vast amounts of data that can be structured into attributed networks, and their use has been attractive to researchers and practitioners in different disciplines. For example, in social media, users interact with each other and also post personalized content; in scientific collaboration, researchers cooperate and are distinguished from their peers by their unique research interests; in complex disease studies, rich gene expression data complement the gene-regulatory networks. Clearly, attributed networks are ubiquitous and form a critical component of modern information infrastructure. Gaining deep insights from such networks requires a fundamental understanding of their unique characteristics and an awareness of the related computational challenges. My dissertation research aims to develop a suite of novel learning algorithms to understand, characterize, and gain actionable insights from attributed networks, to benefit high-impact real-world applications. In the first part of this dissertation, I mainly focus on developing learning algorithms for attributed networks in a static environment at two different levels: (i) the attribute level, by designing feature selection algorithms to find high-quality features that are tightly correlated with the network topology; and (ii) the node level, by presenting network embedding algorithms that learn discriminative node embeddings by preserving node proximity with respect to network topology and node attribute similarity.
As changes are essential components of attributed networks and the results of learning algorithms become stale over time, in the second part of this dissertation I propose a family of online algorithms for attributed networks in a dynamic environment that continuously update the learning results on the fly. Developing application-aware learning algorithms is more desirable with a clear understanding of the application domains and their unique intents. As such, in the third part of this dissertation, I am also committed to advancing real-world applications of attributed networks by incorporating the objectives of external tasks into the learning process. Doctoral Dissertation, Computer Science, 201
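The attribute-level task, selecting features that are tightly correlated with the network topology, can be sketched with the simplest possible proxy: correlate each feature with node degree and rank. This is an illustrative stand-in, not one of the dissertation's algorithms; the graph and feature names are invented.

```python
def pearson(x, y):
    """Pearson correlation of two equal-length lists (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy) if vx and vy else 0.0

# A star graph: node 0 is the hub, nodes 1-3 are leaves.
adj = [[0, 1, 1, 1],
       [1, 0, 0, 0],
       [1, 0, 0, 0],
       [1, 0, 0, 0]]
degree = [sum(row) for row in adj]          # topology signal: [3, 1, 1, 1]

features = {"activity": [9, 2, 1, 2],       # hypothetical attribute tracking the hub
            "noise":    [5, 5, 5, 5]}       # constant, carries no topological signal

# Rank features by |correlation with degree|: topology-aligned features first.
ranked = sorted(features, key=lambda f: abs(pearson(features[f], degree)), reverse=True)
print(ranked[0])  # → activity
```

Real feature selection on attributed networks uses far richer topology signals than degree, but the shape of the computation, scoring each attribute against the graph structure and keeping the best, is the same.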

    Modeling a Longitudinal Relational Research Data System

    A study was conducted to propose a research-based model for a longitudinal research data system that addressed recommendations from a synthesis of literature related to: (1) needs reported by the U.S. Department of Education; (2) the twelve mandatory elements that define federally approved state longitudinal data systems (SLDS); (3) the constraints experienced by seven Midwestern states in providing access to essential educational and employment data; and (4) constraints reported by experts in data warehousing systems. The review of literature investigated U.S. government legislation related to SLDS and the protection of personally identifiable information, SLDS design and complexity, the repurposing of business data warehouse systems for educational outcomes research, and the use of longitudinal research systems for education and employment outcomes. The results were integrated with practitioner experience to derive design objectives and design elements for a model system optimized for longitudinal research. The resulting model incorporated a design-build engineering approach to achieve a cost-effective, obsolescence-resistant, and scalable design. The software application has robust security features, is compatible with Macintosh and PC computers, and is capable of two-way live connections with industry-standard database hardware and software.
Design features included: (1) an inverted formal planning process to connect decision makers and data users to the sources of data through the development of local interactive research planning tools; (2) a data processing module that replaced personally identifiable information with a system-generated code to support the use of de-identified disaggregate raw data across tables and agencies in all phases of data storage, retrieval, analysis, visualization, and reporting, in compliance with restrictions on the disclosure of personally identifiable information; (3) functionality to support complex statistical analysis across data tables using knowledge discovery in databases and data mining techniques; and (4) integrated training for users. The longitudinal research database model demonstrates the result of a top-down, bottom-up design process that starts with defining strategic and operational planning goals and the data that must be collected and analyzed to support them. The process continues with analyzing and reporting data in a mathematically programmed, fully functional system operated by users at multiple levels, which could be more effective and less costly than repurposed business data warehouse systems.
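Design feature (2), replacing personally identifiable information with a system-generated code so that de-identified records can still be linked across tables and agencies, can be sketched as follows. The class and field names are hypothetical; the described system's actual module is not specified at this level of detail.

```python
import secrets

class Pseudonymizer:
    """Replace PII values with stable system-generated codes.
    The PII-to-code mapping lives only here, in a restricted table."""

    def __init__(self):
        self._codes = {}  # PII value -> opaque code (restricted access)

    def code_for(self, pii: str) -> str:
        if pii not in self._codes:
            self._codes[pii] = "ID-" + secrets.token_hex(4)  # e.g. "ID-9f2c01ab"
        return self._codes[pii]

p = Pseudonymizer()
record = {"name": "Jane Doe", "score": 87}

# The released record carries only the code and the analytic fields.
deidentified = {"subject": p.code_for(record["name"]), "score": record["score"]}

# The same person always maps to the same code, so de-identified records
# can still be joined across tables without ever disclosing the PII.
assert p.code_for("Jane Doe") == deidentified["subject"]
```

Keeping the mapping in one restricted module is what lets every downstream phase (storage, retrieval, analysis, visualization, reporting) operate on disaggregate raw data without touching PII.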

    Interpreting mechanism of Synergism of drug combinations using attention-based hierarchical graph pooling

    Synergistic drug combinations offer huge potential to enhance therapeutic efficacy and reduce adverse reactions. However, effective prediction of synergistic drug combinations remains an open question because the causal disease signaling pathways are unknown. Although various deep learning (AI) models have been proposed to quantitatively predict the synergism of drug combinations, the major limitation of existing deep learning methods is that they are inherently not interpretable, which makes the conclusions of AI models opaque to human experts, thereby limiting the robustness of the model conclusions and the applicability of these models in real-world human-AI healthcare. In this paper, we develop an interpretable graph neural network (GNN) that reveals the underlying essential therapeutic targets and the mechanism of synergy (MoS) by mining the sub-molecular network of greatest importance. The key component of the interpretable GNN prediction model is a novel graph pooling layer, the Self-Attention based Node and Edge pool (henceforth SANEpool), which computes attention scores (importance) for nodes and edges based on node features and graph topology. As such, the proposed GNN model provides a systematic way to predict and interpret drug combination synergism based on the detected crucial sub-molecular network. We evaluate SANEpool on molecular networks formulated from genes in 46 core cancer signaling pathways and drug combinations from the NCI ALMANAC drug combination screening data. The experimental results indicate that 1) SANEpool achieves state-of-the-art performance among popular graph neural networks; and 2) the sub-molecular networks detected by SANEpool are self-explanatory and salient for identifying synergistic drug combinations.
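The shape of such an attention-based pooling layer, score each node from its features and the graph topology, then keep only the top-scoring nodes, can be sketched in plain Python. This is emphatically not the authors' SANEpool implementation (which is a learned GNN layer); degree stands in for the topology term and feature magnitude for the learned feature term.

```python
import math

def attention_pool(features, adj, k):
    """Score nodes from features + topology, softmax the scores,
    and pool by keeping the indices of the top-k nodes."""
    n = len(features)
    degree = [sum(adj[i]) for i in range(n)]
    # Attention-style raw score: feature magnitude weighted by connectivity.
    raw = [sum(features[i]) * (1 + degree[i]) for i in range(n)]
    z = sum(math.exp(s) for s in raw)
    attn = [math.exp(s) / z for s in raw]  # softmax over all nodes
    keep = sorted(range(n), key=lambda i: attn[i], reverse=True)[:k]
    return sorted(keep), attn

# Toy 3-node molecular network: node 0 is a low-signal hub.
feats = [[0.1], [0.9], [0.5]]
adj = [[0, 1, 1],
       [1, 0, 0],
       [1, 0, 0]]

kept, scores = attention_pool(feats, adj, k=2)
print(kept)  # → [1, 2]
```

The pooled subgraph (here nodes 1 and 2) is what makes the model self-explanatory: the retained nodes and edges are themselves the candidate therapeutic targets, rather than an opaque hidden representation.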

    Towards better organizational analytics capability: a maturity model

    Abstract. Data and analytics are changing the markets. Significant improvements in competitiveness can be achieved by utilizing data and analytics, which can support decision making at all levels, from operational to strategic. However, studies suggest that organizations are failing to realize these benefits: many analytics initiatives fail, and only a small portion of organizations' data is used in decision making. This happens mostly because utilizing data and analytics at a larger scale is a difficult and complex matter. Companies need to harness multiple resources and capabilities in a business context and use them synergistically to deliver value. Capabilities cannot be bought; they must be developed step by step. Bottlenecks like siloed data, lack of commitment, and lack of understanding slow down this development. The focus of this thesis is to gain insight into how these resources and capabilities can be managed and understood better, in pursuit of a position where modern applications of data and analytics can be utilized even more effectively. The study is conducted in two parts. In the first part, the terminology, disciplines, analytics capabilities, and success factors of data and analytics development are examined through the literature. A comprehensive tool for identifying and reviewing these analytics capabilities is then built by analyzing and combining existing tools and earlier insights. This tool, the organizational analytics maturity model, and the other findings are then reviewed and complemented with empirical interviews. The main findings of this thesis are the mapped analytics capabilities, the success factors of analytics, and the organizational analytics maturity model.
These results help practitioners and researchers to better understand the complexity of the subject and which dimensions must be taken into account when pursuing success with data and analytics.

    Supply chain resilience and risk management strategies and methods

    Abstract. The changing global market, driven by Industry 4.0 and the recent pandemic, has created a need for more responsiveness in an organization's supply chain. Supply chain resilience enables a firm not only to avoid disruptions but also to withstand the losses a disruption causes. The objective of this research is to find out how resilience has been defined in the literature so far and to identify the strategies available for achieving a resilience fit for an organization. First, in the literature review, previous studies on resilience were examined to understand what supply chain resilience means. Then, the key results and findings are discussed and conclusions are presented. The research found several interesting strategies for gaining the resilience fit, and the benefits and stakeholders of each strategy are pointed out. These strategies can be applied in line with the organization's business strategy; so aligned, they can make a large difference in withstanding potential disruptions and gaining a competitive advantage over market competitors.