22 research outputs found

    SSumM: Sparse Summarization of Massive Graphs

    Full text link
    Given a graph G and the desired size k in bits, how can we summarize G within k bits, while minimizing the information loss? Large-scale graphs have become omnipresent, posing considerable computational challenges. Analyzing such large graphs can be fast and easy if they are compressed sufficiently to fit in main memory or even cache. Graph summarization, which yields a coarse-grained summary graph with merged nodes, stands out with several advantages among graph compression techniques. Thus, a number of algorithms have been developed for obtaining a concise summary graph with little information loss or equivalently small reconstruction error. However, the existing methods focus solely on reducing the number of nodes, and they often yield dense summary graphs, failing to achieve better compression rates. Moreover, due to their limited scalability, they can be applied only to moderate-size graphs. In this work, we propose SSumM, a scalable and effective graph-summarization algorithm that yields a sparse summary graph. SSumM not only merges nodes together but also sparsifies the summary graph, and the two strategies are carefully balanced based on the minimum description length principle. Compared with state-of-the-art competitors, SSumM is (a) Concise: yields up to 11.2X smaller summary graphs with similar reconstruction error, (b) Accurate: achieves up to 4.2X smaller reconstruction error with similarly concise outputs, and (c) Scalable: summarizes 26X larger graphs while exhibiting linear scalability. We validate these advantages through extensive experiments on 10 real-world graphs. Comment: to be published in the 26th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '20).
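    The minimum-description-length trade-off described above can be sketched in a few lines: a candidate merge of supernodes is kept only if it reduces the total encoding cost of the summary graph. The cost model below is a deliberately simplified assumption for illustration, not SSumM's actual objective.

```python
import math

def size_in_bits(num_supernodes, num_superedges):
    """Simplified description length of a summary graph: each superedge
    stores its two endpoint ids, at log2(#supernodes) bits per id.
    (Assumed cost model, not the one used by SSumM.)"""
    if num_supernodes <= 1:
        return 0.0
    return 2 * num_superedges * math.log2(num_supernodes)

def accept_merge(cost_before, cost_after):
    # A merge is kept only when it shrinks the total description length.
    return cost_after < cost_before

# Merging two supernodes (10 -> 9) that also collapses superedges (20 -> 18):
before = size_in_bits(10, 20)
after = size_in_bits(9, 18)
```

    In SSumM this pressure toward fewer, sparser superedges is what balances merging against sparsification.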

    Real-World Outcomes of Glucose Sensor Use in Type 1 Diabetes – Findings from a Large UK Centre

    Get PDF
    From MDPI via Jisc Publications Router. History: accepted 2021-11-10, pub-electronic 2021-11-15. Publication status: Published. Flash glucose monitoring (FGM) and real-time continuous glucose monitoring (RT-CGM) are increasingly used in clinical practice, with improvements in HbA1c and time in range (TIR) reported in clinical studies. We aimed to evaluate the impact of FGM and RT-CGM use on glycaemic outcomes in adults with type 1 diabetes (T1DM) under routine clinical care. We performed a retrospective data analysis from electronic outpatient records and proprietary web-based glucose monitoring platforms. We measured HbA1c (pre-sensor vs. on-sensor data) and sensor-based outcomes from the previous three months as per the international consensus on RT-CGM reporting guidelines. Amongst the 789 adults with T1DM, HbA1c level decreased from 61.0 (54.0, 71.0) mmol/mol to 57 (49, 65.8) mmol/mol in 561 people using FGM, and from 60.0 (50.0, 70.0) mmol/mol to 58.8 (50.3, 66.8) mmol/mol in 198 using RT-CGM (p < 0.001 for both). We found that 23% of FGM users and 32% of RT-CGM users achieved a time-in-range (TIR) (3.9 to 10 mmol/L) of >70%. For time-below-range (TBR) < 4 mmol/L, 70% of RT-CGM users and 58% of FGM users met international recommendations of <4%. Our data add to the growing body of evidence supporting the use of FGM and RT-CGM in T1DM.
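    The sensor-based outcomes quoted above are simple proportions of readings. A minimal sketch, using the thresholds stated in the text (in range 3.9 to 10 mmol/L, below range < 4 mmol/L) and entirely synthetic readings:

```python
def glycaemic_metrics(readings_mmol):
    """Percentage of sensor readings in range (3.9-10 mmol/L) and
    below range (< 4 mmol/L). Thresholds follow the abstract; the
    readings below are made up for illustration."""
    n = len(readings_mmol)
    tir = 100 * sum(3.9 <= g <= 10.0 for g in readings_mmol) / n
    tbr = 100 * sum(g < 4.0 for g in readings_mmol) / n
    return tir, tbr

tir, tbr = glycaemic_metrics(
    [5.2, 6.8, 11.4, 3.5, 7.9, 9.1, 4.4, 12.0, 6.0, 5.5]
)
# Consensus targets quoted in the abstract: TIR > 70%, TBR < 4%
```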

    PERSONA: Personality-Based Deep Learning for Detecting Hate Speech

    No full text
    Hate speech in an online environment has detrimental impacts on the wellbeing of individuals, online communities, and social network platforms. Consequently, the automated detection of hate speech has become a significant issue for various stakeholders. While previous studies have proposed many approaches to this issue, we identify an important research gap: they have neglected a plethora of studies from psychology investigating the relationship between personality and hate. To fill this gap, we adopt a text-mining approach that fully automates the process of personality inference. Based on its results, we build a personality-based deep learning model for detecting online hate speech (i.e., PERSONA). We validated our model with two real-world cases. The results show that our model significantly outperforms state-of-the-art baselines, including a method proposed by Google. Our study paves the way for future research by incorporating psychological aspects into the design of deep-learning models for hate speech detection.
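    The abstract does not specify the architecture; as a hedged illustration of the core idea, personality features inferred from an author's text can be fused with text features before scoring. The linear scorer below is a stand-in for the paper's deep model, and every feature and weight here is made up.

```python
import math

def hate_score(text_feats, personality_feats, w_text, w_pers, bias=0.0):
    """Minimal sketch of fusing text features with inferred personality
    features in one scorer (illustrative only; PERSONA itself is a deep
    learning model, not a linear one)."""
    z = bias
    z += sum(x * w for x, w in zip(text_feats, w_text))
    z += sum(p * w for p, w in zip(personality_feats, w_pers))
    return 1 / (1 + math.exp(-z))  # probability-like hate score

# Toy call: one "slur present" text feature plus one personality score.
score = hate_score([1.0, 0.0], [0.5], [2.0, 0.0], [1.0], bias=-1.0)
```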

    Reconsidering the measurement of tie strength in online social networks

    No full text
    Tie strength has long been one of the most popular network measures. However, how tie strength should be measured has not been discussed much in previous studies; in particular, few studies focus on a theoretical understanding of tie-strength measurements. In this study, we suggest a theoretical framework that clearly distinguishes different measures of tie strength. Then, by introducing the concept of theory of mind (ToM), we point out a current issue with the measurement of tie strength. Through empirical verification, we propose a new way to measure tie strength that will improve the validity of future research.
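    The abstract does not give the proposed measure itself; for context, one standard structural proxy that such frameworks typically contrast with interaction-based measures is neighbourhood overlap (Jaccard similarity of the two endpoints' neighbourhoods). This is a common baseline, not the measure the study proposes.

```python
def neighbourhood_overlap(g, u, v):
    """Structural tie strength between u and v: Jaccard overlap of their
    neighbourhoods, excluding u and v themselves. A standard baseline
    proxy, not the paper's proposed measure."""
    nu, nv = g[u] - {v}, g[v] - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0

# Toy undirected graph as adjacency sets.
g = {"a": {"b", "c", "d"}, "b": {"a", "c"}, "c": {"a", "b"}, "d": {"a"}}
# a and b share neighbour c out of candidate neighbours {c, d}.
ov = neighbourhood_overlap(g, "a", "b")
```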

    Intention-based Deep Learning Approach for Detecting Online Fake News

    No full text
    One effective approach to fighting fake news is to filter it out automatically using computational methods. However, current approaches have neglected to identify the intention behind posting fake news, leading to errors in flagging it. In this study, following the design-science approach, we propose a novel deep-learning framework for detecting online fake news by incorporating theories of deceptive intention. Specifically, we first develop a transfer-learning model that identifies deceptive intention reflected in text and apply it to distinguish two subclasses of fake news: deceptive and non-deceptive fake news. These two classes of fake news, along with an observed class of non-fake news (i.e., true news), are then used to train deep bidirectional transformers whose goal is to determine news veracity. Our framework is empirically evaluated and benchmarked against cutting-edge deep learning models. Our analysis reveals that the models incorporating our deceptive-intention-based design significantly outperform state-of-the-art baselines.
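    The two-stage labelling step described above, splitting fake items by predicted deceptive intent before training a three-class veracity model, can be sketched as follows. The `intent_model` interface and the toy keyword rule are assumptions for illustration, not the paper's transfer-learning model.

```python
def build_training_labels(news_items, intent_model):
    """Stage-1 labelling sketch: fake items are split by predicted
    deceptive intent into two subclasses, yielding three classes
    overall for the downstream veracity model."""
    labelled = []
    for text, is_fake in news_items:
        if not is_fake:
            label = "true"
        elif intent_model(text):          # stage 1: deceptive intent?
            label = "deceptive_fake"
        else:
            label = "non_deceptive_fake"
        labelled.append((text, label))
    return labelled

# Toy intent model: flags texts containing the word "guaranteed".
toy = build_training_labels(
    [("sun rises", False), ("guaranteed cure", True), ("honest mistake", True)],
    lambda t: "guaranteed" in t,
)
```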

    Are Edge Weights in Summary Graphs Useful? -- A Comparative Study

    Full text link
    Which one is better between two representative graph summarization models, with and without edge weights? From web graphs to online social networks, large graphs are everywhere. Graph summarization, which is an effective graph compression technique, aims to find a compact summary graph that accurately represents a given large graph. Two versions of the problem, where one allows edge weights in summary graphs and the other does not, have been studied in parallel without direct comparison between their underlying representation models. In this work, we conduct a systematic comparison by extending three search algorithms to both models and evaluating their outputs on eight datasets in five aspects: (a) reconstruction error, (b) error in node importance, (c) error in node proximity, (d) the size of reconstructed graphs, and (e) compression ratios. Surprisingly, using unweighted summary graphs leads to outputs significantly better in all the aspects than using weighted ones, and this finding is supported theoretically. Notably, we show that a state-of-the-art algorithm can be improved substantially (specifically, 8.2X, 7.8X, and 5.9X in terms of (a), (b), and (c), respectively, when (e) is fixed) based on this observation. Comment: To be published in the 26th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2022).
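    The comparison hinges on how a graph is reconstructed from a summary: under the weighted model each superedge carries an edge density, while the unweighted model rounds that density to 0 or 1. A minimal sketch of L1 reconstruction error under both models follows; this is a simplified formulation for illustration, not the paper's exact one.

```python
import itertools

def reconstruction_error(adj, supernode_of, weighted=True):
    """L1 reconstruction error of a summary graph given by a node
    partition. Weighted model: every node pair inside a superedge is
    predicted with the superedge's average edge density; unweighted
    model: that density rounded to 0 or 1. (Simplified illustration.)"""
    nodes = sorted(adj)
    pair_count, pair_total = {}, {}
    for u, v in itertools.combinations(nodes, 2):
        key = tuple(sorted((supernode_of[u], supernode_of[v])))
        pair_total[key] = pair_total.get(key, 0) + 1
        pair_count[key] = pair_count.get(key, 0) + (v in adj[u])
    err = 0.0
    for u, v in itertools.combinations(nodes, 2):
        key = tuple(sorted((supernode_of[u], supernode_of[v])))
        density = pair_count[key] / pair_total[key]
        pred = density if weighted else round(density)
        err += abs((v in adj[u]) - pred)
    return err

# Toy graph: edge 0-1 only; all three nodes merged into one supernode,
# giving a superedge density of 1/3.
toy_adj = {0: {1}, 1: {0}, 2: set()}
toy_part = {0: 0, 1: 0, 2: 0}
```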