67,116 research outputs found
Efficient Subgraph Matching on Billion Node Graphs
The ability to handle large scale graph data is crucial to an increasing
number of applications. Much work has been dedicated to supporting basic graph
operations such as subgraph matching, reachability, regular expression
matching, etc. In many cases, graph indices are employed to speed up query
processing. Typically, most indices require either super-linear indexing time
or super-linear indexing space. Unfortunately, for very large graphs,
super-linear approaches are almost always infeasible. In this paper, we study
the problem of subgraph matching on billion-node graphs. We present a novel
algorithm that supports efficient subgraph matching for graphs deployed on a
distributed memory store. Instead of relying on super-linear indices, we use
efficient graph exploration and massive parallel computing for query
processing. Our experimental results demonstrate the feasibility of performing
subgraph matching on web-scale graph data.Comment: VLDB201
Portinari: A Data Exploration Tool to Personalize Cervical Cancer Screening
Socio-technical systems play an important role in public health screening
programs to prevent cancer. Cervical cancer incidence has significantly
decreased in countries that developed systems for organized screening engaging
medical practitioners, laboratories and patients. The system automatically
identifies individuals at risk of developing the disease and invites them for a
screening exam or a follow-up exam conducted by medical professionals. A triage
algorithm in the system aims to reduce unnecessary screening exams for
individuals at low-risk while detecting and treating individuals at high-risk.
Despite the general success of screening, the triage algorithm is a
one-size-fits all approach that is not personalized to a patient. This can
easily be observed in historical data from screening exams. Often patients rely
on personal factors to determine that they are either at high risk or not at
risk at all and take action at their own discretion. Can exploring patient
trajectories help hypothesize personal factors leading to their decisions? We
present Portinari, a data exploration tool to query and visualize future
trajectories of patients who have undergone a specific sequence of screening
exams. The web-based tool contains (a) a visual query interface (b) a backend
graph database of events in patients' lives (c) trajectory visualization using
sankey diagrams. We use Portinari to explore diverse trajectories of patients
following the Norwegian triage algorithm. The trajectories demonstrated
variable degrees of adherence to the triage algorithm and allowed
epidemiologists to hypothesize about the possible causes.Comment: Conference paper published at ICSE 2017 Buenos Aires, at the Software
Engineering in Society Track. 10 pages, 5 figure
- …