SSumM: Sparse Summarization of Massive Graphs
Given a graph G and the desired size k in bits, how can we summarize G within
k bits, while minimizing the information loss?
Large-scale graphs have become omnipresent, posing considerable computational
challenges. Analyzing such large graphs can be fast and easy if they are
compressed sufficiently to fit in main memory or even cache. Graph
summarization, which yields a coarse-grained summary graph with merged nodes,
stands out with several advantages among graph compression techniques. Thus, a
number of algorithms have been developed for obtaining a concise summary graph
with little information loss or, equivalently, small reconstruction error.
However, existing methods focus solely on reducing the number of nodes, and
they often yield dense summary graphs, failing to achieve high compression
rates. Moreover, due to their limited scalability, they can be applied only to
moderate-size graphs.
In this work, we propose SSumM, a scalable and effective graph-summarization
algorithm that yields a sparse summary graph. SSumM not only merges nodes
together but also sparsifies the summary graph, and the two strategies are
carefully balanced based on the minimum description length principle. Compared
with state-of-the-art competitors, SSumM is (a) Concise: yields up to 11.2X
smaller summary graphs with similar reconstruction error, (b) Accurate:
achieves up to 4.2X smaller reconstruction error with similarly concise
outputs, and (c) Scalable: summarizes 26X larger graphs while exhibiting linear
scalability. We validate these advantages through extensive experiments on 10
real-world graphs.

Comment: To be published in the 26th ACM SIGKDD International Conference on
Knowledge Discovery and Data Mining (KDD '20).
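The core trade-off above, merging nodes versus keeping the summary sparse, can be illustrated with a minimal MDL-style cost sketch. The cost model below is a deliberate simplification, not SSumM's exact formulation: it assumes one hypothetical per-edge bit cost and compares encoding a supernode pair as a single superedge (plus corrections for absent node pairs) against listing its edges individually.

```python
def encoding_cost(num_edges_present, possible_pairs, bits_per_edge):
    """Bits needed to encode the edges between one pair of supernodes,
    under a simplified MDL-style model (hypothetical cost constants).
    Option 1: one superedge, plus a correction per absent node pair.
    Option 2: list every present edge individually.
    The summarizer keeps whichever option is cheaper."""
    absent = possible_pairs - num_edges_present
    as_superedge = bits_per_edge + absent * bits_per_edge
    as_edges = num_edges_present * bits_per_edge
    return min(as_superedge, as_edges)
```

With 10 possible pairs and a dense block (9 edges present), the superedge encoding wins; with a sparse block (1 edge), listing edges wins. This is the intuition behind balancing merging and sparsification against a bit budget.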
Real-World Outcomes of Glucose Sensor Use in Type 1 Diabetes: Findings from a Large UK Centre
From MDPI via Jisc Publications Router. History: accepted 2021-11-10, pub-electronic 2021-11-15. Publication status: Published.

Flash glucose monitoring (FGM) and real-time continuous glucose monitoring (RT-CGM) are increasingly used in clinical practice, with improvements in HbA1c and time in range (TIR) reported in clinical studies. We aimed to evaluate the impact of FGM and RT-CGM use on glycaemic outcomes in adults with type 1 diabetes (T1DM) under routine clinical care. We performed a retrospective data analysis from electronic outpatient records and proprietary web-based glucose monitoring platforms. We measured HbA1c (pre-sensor vs. on-sensor data) and sensor-based outcomes from the previous three months as per the international consensus on RT-CGM reporting guidelines. Amongst the 789 adults with T1DM, HbA1c decreased from 61.0 (54.0, 71.0) mmol/mol to 57.0 (49.0, 65.8) mmol/mol in 561 people using FGM, and from 60.0 (50.0, 70.0) mmol/mol to 58.8 (50.3, 66.8) mmol/mol in 198 using RT-CGM (p < 0.001 for both). We found that 23% of FGM users and 32% of RT-CGM users achieved a TIR (3.9 to 10 mmol/L) of >70%. For time-below-range (TBR, <4 mmol/L), 70% of RT-CGM users and 58% of FGM users met the international recommendation of <4%. Our data add to the growing body of evidence supporting the use of FGM and RT-CGM in T1DM.
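The sensor-based outcomes reported above are simple percentages over the raw glucose readings. A minimal sketch, using only the two ranges quoted in the abstract (the consensus reporting guidelines define further ranges not shown here):

```python
def glycaemic_metrics(readings_mmol):
    """Percent time-in-range and time-below-range from a list of sensor
    glucose readings in mmol/L, using the thresholds quoted in the
    abstract: TIR 3.9-10 mmol/L, TBR < 4 mmol/L."""
    n = len(readings_mmol)
    tir = sum(3.9 <= g <= 10.0 for g in readings_mmol) / n * 100
    tbr = sum(g < 4.0 for g in readings_mmol) / n * 100
    return tir, tbr
```

For example, readings of 3.0, 5.0, 8.0, and 12.0 mmol/L give a TIR of 50% and a TBR of 25%, which would miss both the >70% TIR target and the <4% TBR recommendation.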
Theory-Driven Design Science Research: Addressing Social Media Issues through Predictive Analytics
Design science research (DSR) is one of the important research paradigms in information systems (IS) that focuses on addressing business problems by building and implementing design artifacts. Recently, predictive analytics has become one major stream of DSR, thanks to improvements in computational power and methods and the increase in available datasets. While predictive analytics studies have been effective in producing practical implications, they have often been criticized for lacking theoretical contributions. In response to this criticism, in this dissertation, I present two different approaches for theory-driven predictive analytics in the context of social media issues: theory-driven feature engineering and theory-driven model designing. Social media has been a mixed blessing to our society, presenting both merits (e.g., easy access to information, social support, etc.) and demerits (e.g., fake news, hate speech, etc.). Lately, predictive analytics has been largely adopted as an effective methodological approach to create solutions to the issues of social media. Through the collection of three essays included in this dissertation, I demonstrate how the two theory-driven approaches for predictive analytics can be implemented in real-world settings. In detail, the first essay proposes a novel method to represent online text as signed networks based on the structural balance theory, which are fed as an important information source into a deep learning model that identifies false information. The second essay develops an automated fake news detection model that takes into account the deceptive intention behind news publishers. The third essay leverages previous studies on personality traits and hate behavior to develop a deep learning model for identifying online hate speech.
Rigorous evaluation reveals that the set of predictive models proposed in this dissertation not only addresses important social challenges but also contributes broadly to the literature on predictive analytics, design science, and IS research.
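The first essay's use of structural balance theory rests on a classic property of signed networks: a triangle is balanced when the product of its edge signs is positive ("a friend of my friend is my friend"). A minimal, illustrative sketch of measuring how balanced a small signed network is (the representation below is an assumption, not the essay's actual method):

```python
from itertools import combinations

def balance_ratio(signs):
    """Fraction of closed triangles that are balanced in a signed graph.
    `signs` maps frozenset({u, v}) -> +1 (positive tie) or -1 (negative).
    A triangle is balanced iff the product of its three signs is positive,
    per structural balance theory."""
    nodes = set()
    for edge in signs:
        nodes |= edge
    balanced = total = 0
    for a, b, c in combinations(sorted(nodes), 3):
        edges = [frozenset({a, b}), frozenset({b, c}), frozenset({a, c})]
        if all(e in signs for e in edges):  # only count closed triangles
            total += 1
            product = signs[edges[0]] * signs[edges[1]] * signs[edges[2]]
            balanced += product > 0
    return balanced / total if total else 0.0
```

Two mutual friends who both dislike a third party form a balanced triangle (+, -, -), whereas two friends who disagree about a mutual friend (+, +, -) do not; deviations from balance in text-derived signed networks are the kind of signal a downstream model can exploit.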
PERSONA: Personality-Based Deep Learning for Detecting Hate Speech
Hate speech in an online environment has detrimental impacts on the wellbeing of individuals, online communities, and social network platforms. Consequently, the automated detection of hate speech has become a significant issue for various stakeholders. While previous studies have proposed many approaches to this problem, we identify an important research gap: they have neglected a plethora of studies from psychology investigating the relationship between personality and hate. To fill this gap, we adopt a text-mining approach that fully automates the process of personality inference. Based on its results, we build a personality-based deep learning model for detecting online hate speech (i.e., PERSONA). We validated our model with two real-world cases. The results show that our model significantly outperforms state-of-the-art baselines, including a method proposed by Google. Our study paves the way for future research by incorporating psychological aspects into the design of a deep-learning model for hate speech detection.
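One common way to let a classifier condition on inferred personality, alongside the text itself, is simple feature fusion. The sketch below is a hypothetical illustration of that idea only; the trait names (Big Five) and the concatenation strategy are assumptions, not PERSONA's actual architecture:

```python
def build_input_features(text_embedding, personality_scores):
    """Concatenate a text representation with automatically inferred
    Big Five personality scores, so a downstream classifier sees both.
    Hypothetical sketch; the paper's exact fusion design may differ."""
    traits = ["openness", "conscientiousness", "extraversion",
              "agreeableness", "neuroticism"]
    return list(text_embedding) + [personality_scores[t] for t in traits]
```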
Reconsidering the measurement of tie strength in online social networks
Tie strength has long been one of the most popular network measures. However, how tie strength should be measured has received little discussion in previous studies; in particular, few studies focus on a theoretical understanding of tie strength measurements. In this study, we suggest a theoretical framework that clearly distinguishes different measures of tie strength. Then, by introducing the concept of theory of mind (ToM), we point out a current issue with the measurement of tie strength. Through an empirical verification, we propose a new way to measure tie strength that will improve the validity of future research.
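One widely used structural proxy for tie strength, of the kind such a framework would need to distinguish from self-reported or interaction-based measures, is neighborhood overlap: the Jaccard similarity of two users' contact sets. A minimal sketch (this is a standard measure offered for illustration, not the new measure the study proposes):

```python
def neighborhood_overlap(adj, u, v):
    """Jaccard overlap of u's and v's neighborhoods (excluding each
    other), a common structural proxy for the strength of tie (u, v).
    `adj` maps each node to its set of neighbors."""
    nu, nv = adj[u] - {v}, adj[v] - {u}
    union = nu | nv
    return len(nu & nv) / len(union) if union else 0.0
```

A pair sharing many mutual contacts scores near 1 (a strong, embedded tie); a bridge between otherwise disconnected groups scores near 0, even if the pair interacts often, which is exactly the kind of divergence between measures that motivates a theoretical framework.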
Intention-based Deep Learning Approach for Detecting Online Fake News
One effective approach to fighting fake news is to filter it out automatically using computational approaches. However, current approaches have neglected the intention behind posting fake news, leading to errors in flagging it. In this study, following the design science approach, we propose a novel deep-learning framework for detecting online fake news by incorporating theories of deceptive intention. Specifically, we first develop a transfer-learning model that identifies deceptive intention reflected in text and apply it to distinguish two subclasses of fake news: deceptive and non-deceptive fake news. These two classes of fake news, along with an observed class of non-fake news (i.e., true news), are then used to train deep bidirectional transformers whose goal is to determine news veracity. Our framework is empirically evaluated and benchmarked against cutting-edge deep learning models. Our analysis reveals that the models incorporating our deceptive-intention-based design significantly outperform state-of-the-art baselines.
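The first stage of the pipeline described above, splitting fake news into deceptive and non-deceptive subclasses before training the veracity model, amounts to a relabeling step. A minimal sketch of that step, where `intent_model` stands in for the transfer-learning intention classifier (its interface here is a hypothetical simplification):

```python
def relabel_for_training(articles, intent_model):
    """Turn binary fake/true labels into the three training classes the
    framework uses: true, deceptive fake, and non-deceptive fake.
    `articles` is a list of (text, is_fake) pairs; `intent_model` is a
    hypothetical callable returning True when text reflects deceptive
    intention."""
    labeled = []
    for text, is_fake in articles:
        if not is_fake:
            label = "true"
        elif intent_model(text):
            label = "deceptive_fake"
        else:
            label = "non_deceptive_fake"
        labeled.append((text, label))
    return labeled
```

The resulting three-class dataset is what the downstream bidirectional transformer would be trained on; at inference time its predictions collapse back to a binary veracity judgment.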
Are Edge Weights in Summary Graphs Useful? -- A Comparative Study
Which of the two representative graph summarization models, one with edge
weights and one without, is better? From web graphs to online social networks, large
graphs are everywhere. Graph summarization, which is an effective graph
compression technique, aims to find a compact summary graph that accurately
represents a given large graph. Two versions of the problem, where one allows
edge weights in summary graphs and the other does not, have been studied in
parallel without direct comparison between their underlying representation
models. In this work, we conduct a systematic comparison by extending three
search algorithms to both models and evaluating their outputs on eight datasets
in five aspects: (a) reconstruction error, (b) error in node importance, (c)
error in node proximity, (d) the size of reconstructed graphs, and (e)
compression ratios. Surprisingly, using unweighted summary graphs yields outputs
that are significantly better in all five aspects than using weighted ones, and
this finding is supported theoretically. Notably, we show that a
state-of-the-art algorithm can be improved substantially (specifically, 8.2X,
7.8X, and 5.9X in terms of (a), (b), and (c), respectively, when (e) is fixed)
based on the observation.

Comment: To be published in the 26th Pacific-Asia Conference on Knowledge
Discovery and Data Mining (PAKDD 2022).
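The comparison above hinges on how a graph is reconstructed from its summary: a weighted superedge spreads its total weight uniformly over the node pairs it covers, while an unweighted superedge predicts every covered pair as a full edge. A minimal sketch of the L1 reconstruction error under both models (the definitions below are simplified assumptions; papers differ on self-loops and normalization):

```python
from collections import defaultdict

def reconstruction_error(original_edges, membership, superedges, weighted):
    """L1 error between an unweighted original graph and the graph
    reconstructed from its summary. `original_edges` is a set of
    frozenset node pairs, `membership` maps node -> supernode, and
    `superedges` maps a sorted (S, T) supernode pair -> total edge
    count between them. Simplified sketch, not any paper's exact model."""
    size = defaultdict(int)
    for node, s in membership.items():
        size[s] += 1

    def predicted(u, v):
        s, t = membership[u], membership[v]
        key = (min(s, t), max(s, t))
        if key not in superedges:
            return 0.0
        pairs = size[s] * size[t] if s != t else size[s] * (size[s] - 1) / 2
        return superedges[key] / pairs if weighted else 1.0

    nodes = sorted(membership)
    err = 0.0
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            actual = 1.0 if frozenset({u, v}) in original_edges else 0.0
            err += abs(predicted(u, v) - actual)
    return err
```

On a toy graph with supernodes A = {1, 2} and B = {3, 4} and three of the four possible cross edges present, the weighted model predicts 0.75 everywhere and accumulates error on every pair, while the unweighted model is exactly right on three pairs and wrong on one; which model wins in general is precisely the empirical and theoretical question the study answers.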